* [dm-devel] [QUESTION]: multipath device with wrong path lead to metadata err
From: lixiaokeng @ 2021-01-18 11:08 UTC
To: Martin Wilck, Benjamin Marzinski, Christophe Varoqui, dm-devel mailing list
Cc: linfeilong, liuzhiqiang (I), lihaotian9

Hi,

When we run an I/O stress test on multipath devices, a metadata error
occurs because a map contains a wrong path. There are three test
scripts.

First:

#!/bin/bash
disk_list="/dev/mapper/3600140531f063b3e19349bc82028e0cc
/dev/mapper/36001405ca5165367d67447ea68108e1d
/dev/mapper/3600140584e11eb1818c4afab12c17800
/dev/mapper/36001405b7679bd96b094bccbf971bc90"

for disk in ${disk_list}
do
    mkfs.ext4 -F $disk
done

while true
do
    for disk in ${disk_list}
    do
        test_dir=${disk##*/}
        [ -d $test_dir ] && umount $test_dir || mkdir $test_dir
        while true
        do
            mount -o data_err=abort,errors=remount-ro $disk $test_dir && break
            sleep 0.1
        done
        nohup fsstress -d $(pwd)/$test_dir -l 10 -n 1000 -p 10 -X &>/dev/null &
    done
    sleep 5
    while [ -n "`pidof fsstress`" ]
    do
        sleep 1
    done
done

Second:

#!/bin/bash
while true
do
    sleep 15
    i=0
    while [ $i -le 5 ]
    do
        iscsiadm -m node -p 100.1.1.1 -u
        iscsiadm -m node -p 100.1.1.1 -l
        sleep 1
        iscsiadm -m node -p 100.1.2.1 -u
        iscsiadm -m node -p 100.1.2.1 -l
        sleep 1
        ((i=i+1))
    done
done

Third:

#!/bin/bash
function iscsi_query() {
    interval=5
    while true
    do
        iscsiadm -m node -p 100.1.1.1 &> /dev/null
        iscsiadm -m node -p 100.1.2.1 &> /dev/null
        iscsiadm -m session &> /dev/null
        rescan-scsi-bus.sh &> /dev/null
        sleep $interval
    done
}

function multipath_query() {
    interval=1
    while true
    do
        multipath -F &> /dev/null
        multipath -r &> /dev/null
        multipath -v2 &> /dev/null
        multipath -ll &> /dev/null
        sleep $interval
    done
}

function multipathd_query() {
    disk_base=63 # sdc
    interval=1
    while true
    do
        multipathd show paths &> /dev/null
        multipathd show status &> /dev/null
        multipathd show daemon &> /dev/null
        multipathd show maps json &> /dev/null
        multipathd show config &> /dev/null
        multipathd show config local &> /dev/null
        multipathd show blacklist &> /dev/null
        multipathd show devices &> /dev/null
        multipathd reset maps stats &> /dev/null
        multipathd disablequeueing maps &> /dev/null
        multipathd restorequeueing maps &> /dev/null
        multipathd forcequeueing daemon &> /dev/null
        multipathd restorequeueing daemon &> /dev/null
        let disk_num=disk_base+RANDOM%8
        disk=sd`echo "$disk_num" | xxd -p -r`
        multipathd show path $disk &> /dev/null
        multipathd del path $disk &> /dev/null
        multipathd add path $disk &> /dev/null
        multipathd fail path $disk &> /dev/null
        multipathd reinstate path $disk &> /dev/null
        multipathd show path $disk &> /dev/null
        map_count=`multipathd show maps | grep -v name | wc -l`
        if [ $map_count -ge 1 ]; then
            let map_num=(RANDOM%map_count)+1
            map=`multipathd show maps | grep -v name | awk '{print $1}' | sed -n "$map_num"p`
            multipathd show map $map &> /dev/null
            multipathd suspend map $map &> /dev/null
            multipathd resume map $map &> /dev/null
            multipathd reload map $map &> /dev/null
            multipathd reset map $map &> /dev/null
        fi
        sleep $interval
    done
}

iscsi_query &
iscsi_query &
multipath_query &
multipath_query &
multipathd_query &
multipathd_query &

After the test scripts have been running for some time (about 24h), a
metadata error occurs. The reason is that a multipath device contains a
wrong path. The details of the first scene:

ip1:
node     disk   minor
4:0:0:0  [sdd]  48
4:0:0:1  [sdm]  192
4:0:0:2  [sdk]  160
4:0:0:3  [sdi]  128

ip2:
node     disk   minor
5:0:0:0  [sdc]  32
5:0:0:1  [sdj]  144
5:0:0:2  [sdg]  96
5:0:0:3  [sde]  64

Sequence of events:

(1) multipath -r, while ip1 logs out at the same time
The load table params of 36001405ca5165367d67447ea68108e1d are
"0 1 alua 1 1 service-time 0 1 1 8:128 1" (the reason 8:64 (sde) is
missing may be that ip2 had logged in not long before, so
path_discovery did not find sde). However, domap failed because ip1
logged out. The path sdi is still in gvecs->pathvec.
(2) multipathd add path sde
The load table params of 36001405ca5165367d67447ea68108e1d are
"0 1 alua 2 1 service-time 0 1 1 8:64 1 service-time 0 1 1 8:128 1"
and domap succeeds. At this time, 36001405ca5165367d67447ea68108e1d
has two paths (sde, sdi), but sdi is actually a path of
36001405b7679bd96b094bccbf971bc90.

(3) metadata of 36001405ca5165367d67447ea68108e1d is synced
The metadata of 36001405b7679bd96b094bccbf971bc90 is overwritten.

(4) umount 36001405b7679bd96b094bccbf971bc90
36001405b7679bd96b094bccbf971bc90 has no usable path at umount time, so
the correct metadata is not synced back.

(5) mount 36001405b7679bd96b094bccbf971bc90
The mount fails because of the corrupted metadata.

I think there may be other ways to cause the metadata error too. I have
no good idea how to deal with this. Can you give advice on this? Thanks
very much.

Regards,
Lixiaokeng

--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel
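[Editor's aside: the path members of the load-table strings quoted in
steps (1) and (2) are the major:minor tokens, which can be matched
against the minor numbers listed for ip1/ip2 above. A small sketch, not
part of the original report:]

```shell
#!/bin/bash
# Extract the major:minor path entries from a dm-multipath load-table
# string; here the table from step (2), which should list sde (8:64)
# and the wrongly included sdi (8:128).
table="0 1 alua 2 1 service-time 0 1 1 8:64 1 service-time 0 1 1 8:128 1"
echo "$table" | grep -oE '[0-9]+:[0-9]+'
```

Running this prints 8:64 and 8:128, one per line; 8:128 is sdi on ip1,
which confirms that the map mixes paths from two different LUNs.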
* Re: [dm-devel] [QUESTION]: multipath device with wrong path lead to metadata err
From: Martin Wilck @ 2021-01-19 9:41 UTC
To: lixiaokeng, Benjamin Marzinski, Christophe Varoqui, dm-devel mailing list
Cc: linfeilong, liuzhiqiang (I), lihaotian9

Hi Lixiaokeng,

On Mon, 2021-01-18 at 19:08 +0800, lixiaokeng wrote:
> Hi
>     When we run an I/O stress test on multipath devices, a metadata
> error occurs because a map contains a wrong path.

thanks for the detailed report and analysis, and also for sharing your
test scripts! I'll take a closer look ASAP. Meanwhile, please confirm
which multipath-tools version you were using, and whether this occurs
with the latest "queue" code:

https://github.com/openSUSE/multipath-tools/tree/queue

Regards
Martin
* Re: [dm-devel] [QUESTION]: multipath device with wrong path lead to metadata err
From: lixiaokeng @ 2021-01-19 12:46 UTC
To: Martin Wilck, Benjamin Marzinski, Christophe Varoqui, dm-devel mailing list
Cc: linfeilong, liuzhiqiang (I), lihaotian9

Hi Martin,

I have tested the 0.7.7 and 0.8.5 code. Both show the metadata error.

I find I missed some information in my last email. ip1 logged in
between (1) and (2) and sdi became a path of
36001405b7679bd96b094bccbf971bc90, but the uevents were not processed
until (3) finished.

The details I described are based on the 0.7.7 code. I think a similar
process exists in 0.8.5, but there may be another way to cause the
error there. The latest "queue" code isn't tested yet. I will try it.

Regards,
Lixiaokeng

On 2021/1/19 17:41, Martin Wilck wrote:
> Hi Lixiaokeng,
>
> On Mon, 2021-01-18 at 19:08 +0800, lixiaokeng wrote:
>> Hi
>>     When we run an I/O stress test on multipath devices, a metadata
>> error occurs because a map contains a wrong path.
>
> thanks for the detailed report and analysis, and also for sharing
> your test scripts! I'll take a closer look ASAP. Meanwhile, please
> confirm which multipath-tools version you were using, and whether this
> occurs with the latest "queue" code:
>
> https://github.com/openSUSE/multipath-tools/tree/queue
>
> Regards
> Martin
* Re: [dm-devel] [QUESTION]: multipath device with wrong path lead to metadata err
From: Martin Wilck @ 2021-01-19 21:57 UTC
To: lixiaokeng, Benjamin Marzinski, Christophe Varoqui, dm-devel mailing list
Cc: linfeilong, liuzhiqiang (I), lihaotian9

On Mon, 2021-01-18 at 19:08 +0800, lixiaokeng wrote:
> Hi
>     When we run an I/O stress test on multipath devices, a metadata
> error occurs because a map contains a wrong path.
>
> Sequence of events:
> (1) multipath -r, while ip1 logs out at the same time
> The load table params of 36001405ca5165367d67447ea68108e1d are
> "0 1 alua 1 1 service-time 0 1 1 8:128 1" (the reason 8:64 (sde) is
> missing may be that ip2 had logged in not long before, so
> path_discovery did not find sde). However, domap failed because ip1
> logged out. The path sdi is still in gvecs->pathvec.

verify_paths() would detect this. We do call verify_paths() in
coalesce_paths() before calling domap(), but not immediately before.
Perhaps we should move the verify_paths() call down to immediately
before the domap() call. That would at least minimize the time window
for this race. It's hard to avoid it entirely. The way multipathd is
written, the vecs lock is held all the time during coalesce_paths(),
and thus no uevents can be processed. We could also consider calling
verify_paths() before *and* after domap().

Was this a map creation or a map reload? Was the map removed after the
failure? Do you observe the message "ignoring map" or "removing map"?

Do you observe a "remove" uevent for sdi?

> (2) multipathd add path sde
> The load table params of 36001405ca5165367d67447ea68108e1d are
> "0 1 alua 2 1 service-time 0 1 1 8:64 1 service-time 0 1 1 8:128 1"
> and domap succeeds. At this time, 36001405ca5165367d67447ea68108e1d
> has two paths (sde, sdi), but sdi is actually a path of
> 36001405b7679bd96b094bccbf971bc90.
>
> (3) metadata of 36001405ca5165367d67447ea68108e1d is synced
> The metadata of 36001405b7679bd96b094bccbf971bc90 is overwritten.

Well, I suppose the "add" event for sde might come before the remove
event for sdi (if any).

I wonder if you'd see the issue also if you run the same test without
the "multipath -F; multipath -r" loop, or with just one. OK, one
multipath_query() loop simulates an admin working on the system, but 2
parallel loops mean 2 admins working in parallel, plus the intensive
sequence of actions done in multipathd_query() at the same time. The
repeated "multipath -r" calls and multipathd commands will cause
multipathd to spend a lot of time in reconfigure() and in cli_* calls
holding the vecs lock, which makes it likely that uevents are missed or
processed late.

Don't get me wrong, I don't argue against tough testing. But we should
be aware that there are always time intervals during which multipathd's
picture of the present devices is different from what the kernel sees.
There's definitely room for improvement in multipathd wrt locking and
event processing in general, but that's a BIG piece of work.

Thanks
Martin
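[Editor's aside: the condition verify_paths() guards against is, at
bottom, a WWID mismatch between a map and one of its slaves. The check
can be illustrated outside multipathd with a toy helper; check_map and
its "dev=wwid" input format are hypothetical, and on a real system the
per-path WWIDs would come from something like /lib/udev/scsi_id:]

```shell
#!/bin/bash
# Hypothetical helper: given a map WWID and "dev=wwid" pairs for its
# slave devices, report any slave whose WWID does not match the map.
check_map() {
    local map_wwid=$1; shift
    local entry dev wwid
    for entry in "$@"; do
        dev=${entry%%=*}
        wwid=${entry#*=}
        if [ "$wwid" != "$map_wwid" ]; then
            echo "wrong path: $dev has WWID $wwid, map is $map_wwid"
        fi
    done
}

# Simulated data from step (2) of the report: sde really belongs to the
# map, while sdi belongs to 36001405b7679bd96b094bccbf971bc90.
check_map 36001405ca5165367d67447ea68108e1d \
    sde=36001405ca5165367d67447ea68108e1d \
    sdi=36001405b7679bd96b094bccbf971bc90
```

With the simulated data above, only sdi is flagged, mirroring the
wrong-path state described in the report.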
* Re: [dm-devel] [QUESTION]: multipath device with wrong path lead to metadata err
From: lixiaokeng @ 2021-01-20 2:30 UTC
To: Martin Wilck, Benjamin Marzinski, Christophe Varoqui, dm-devel mailing list
Cc: linfeilong, liuzhiqiang (I), lihaotian9

Hi Martin,

Thanks for your reply.

> verify_paths() would detect this. We do call verify_paths() in
> coalesce_paths() before calling domap(), but not immediately before.
> Perhaps we should move the verify_paths() call down to immediately
> before the domap() call. That would at least minimize the time window
> for this race. It's hard to avoid it entirely. The way multipathd is
> written, the vecs lock is held all the time during coalesce_paths(),
> and thus no uevents can be processed. We could also consider calling
> verify_paths() before *and* after domap().

Can calling verify_paths() before *and* after domap() solve this
entirely?

> Was this a map creation or a map reload? Was the map removed after the
> failure? Do you observe the message "ignoring map" or "removing map"?
>
> Do you observe a "remove" uevent for sdi?

This was a map reload, but sdi was not in the old map. The "removing
map" message was observed. A "remove" uevent for sdi was not observed
here.

> I wonder if you'd see the issue also if you run the same test without
> the "multipath -F; multipath -r" loop, or with just one. OK, one
> multipath_query() loop simulates an admin working on the system, but 2
> parallel loops mean 2 admins working in parallel, plus the intensive
> sequence of actions done in multipathd_query() at the same time. The
> repeated "multipath -r" calls and multipathd commands will cause
> multipathd to spend a lot of time in reconfigure() and in cli_* calls
> holding the vecs lock, which makes it likely that uevents are missed
> or processed late.

As you said, there were lots of cli_* calls but no uevents when the
error occurred. After they finished, hundreds of uevents were found
(for example, "Forwarding 201 uevents" in the log).

> Don't get me wrong, I don't argue against tough testing. But we should
> be aware that there are always time intervals during which multipathd's
> picture of the present devices is different from what the kernel sees.

What you said is very reasonable. I think it is difficult to solve this
entirely, though it is hard to trigger. I will discuss the rationality
of the test scripts with the testers.

> There's definitely room for improvement in multipathd wrt locking and
> event processing in general, but that's a BIG piece of work.

Thanks again!

Regards,
Lixiaokeng
* Re: [dm-devel] [QUESTION]: multipath device with wrong path lead to metadata err
From: Martin Wilck @ 2021-01-20 14:07 UTC
To: lixiaokeng, Benjamin Marzinski, Christophe Varoqui, dm-devel mailing list
Cc: linfeilong, liuzhiqiang (I), lihaotian9

On Wed, 2021-01-20 at 10:30 +0800, lixiaokeng wrote:
> Hi Martin,
> Thanks for your reply.
>
>> verify_paths() would detect this. We do call verify_paths() in
>> coalesce_paths() before calling domap(), but not immediately before.
>> Perhaps we should move the verify_paths() call down to immediately
>> before the domap() call. That would at least minimize the time window
>> for this race. It's hard to avoid it entirely. The way multipathd is
>> written, the vecs lock is held all the time during coalesce_paths(),
>> and thus no uevents can be processed. We could also consider calling
>> verify_paths() before *and* after domap().
>
> Can calling verify_paths() before *and* after domap() solve this
> entirely?

Probably yes. If you look at update_map(), we do it there until domap()
succeeds. Could you try adding a verify_paths() call after domap() in
the failure case, before removing the map? I wouldn't go for a retry
loop like in update_map() at this point.

Martin
* Re: [dm-devel] [QUESTION]: multipath device with wrong path lead to metadata err
From: lixiaokeng @ 2021-01-25 1:33 UTC
To: Martin Wilck, Benjamin Marzinski, Christophe Varoqui, dm-devel mailing list
Cc: linfeilong, liuzhiqiang (I), lihaotian9

>>> verify_paths() before *and* after domap().
>>
>> Can calling verify_paths() before *and* after domap() solve this
>> entirely?

Hi,

Unfortunately, calling verify_paths() before *and* after domap() in
coalesce_paths() can't solve this problem. I think there is another way
that leads to a multipath map with a wrong path, but so far I can't
find it from the logs.

Regards,
Lixiaokeng
* Re: [dm-devel] [QUESTION]: multipath device with wrong path lead to metadata err
From: Martin Wilck @ 2021-01-25 12:28 UTC
To: lixiaokeng, Benjamin Marzinski, Christophe Varoqui, dm-devel mailing list
Cc: linfeilong, liuzhiqiang (I), lihaotian9

Hi lixiaokeng,

On Mon, 2021-01-25 at 09:33 +0800, lixiaokeng wrote:
>>>> verify_paths() before *and* after domap().
>>>
>>> Can calling verify_paths() before *and* after domap() solve this
>>> entirely?
>
> Hi,
> Unfortunately, calling verify_paths() before *and* after domap() in
> coalesce_paths() can't solve this problem. I think there is another
> way that leads to a multipath map with a wrong path, but so far I
> can't find it from the logs.

Can you provide multipathd -v3 logs, and kernel logs? Maybe I'll see
something.

Martin
* Re: [dm-devel] [QUESTION]: multipath device with wrong path lead to metadata err
From: lixiaokeng @ 2021-01-26 6:40 UTC
To: Martin Wilck, Benjamin Marzinski, Christophe Varoqui, dm-devel mailing list
Cc: linfeilong, liuzhiqiang (I), lihaotian9

> Can you provide multipathd -v3 logs, and kernel logs? Maybe I'll see
> something.

Hi Martin,

The first scene has not occurred again, so I can't provide useful logs
yet. I will try again to get more information from the logs. If
possible, please use the scripts to reproduce the problem.

Regards,
Lixiaokeng
* Re: [dm-devel] [QUESTION]: multipath device with wrong path lead to metadata err
From: lixiaokeng @ 2021-01-26 11:14 UTC
To: Martin Wilck, Benjamin Marzinski, Christophe Varoqui, dm-devel mailing list
Cc: linfeilong, liuzhiqiang (I), lihaotian9

>> Hi,
>> Unfortunately, calling verify_paths() before *and* after domap() in
>> coalesce_paths() can't solve this problem. I think there is another
>> way that leads to a multipath map with a wrong path, but so far I
>> can't find it from the logs.
>
> Can you provide multipathd -v3 logs, and kernel logs? Maybe I'll see
> something.

Logs in the first scene:

Jan 26 12:37:47 client1 multipathd[2532437]: reconfigure (operator)
Jan 25 12:37:47 client1 multipathd[2532437]: 3600140584e11eb1818c4afab12c17800: remove multipath map
Jan 25 12:37:47 client1 multipathd[2532437]: sdh: orphan path, map flushed
Jan 25 12:37:47 client1 multipathd[2532437]: BUG: orphaning path sdh that holds hwe of 3600140584e11eb1818c4afab12c17800
Jan 25 12:37:47 client1 multipathd[2532437]: 3600140531f063b3e19349bc82028e0cc: remove multipath map
Jan 25 12:37:47 client1 multipathd[2532437]: sdn: orphan path, map flushed
Jan 25 12:37:47 client1 multipathd[2532437]: BUG: orphaning path sdn that holds hwe of 3600140531f063b3e19349bc82028e0cc
Jan 25 12:37:47 client1 multipathd[2532437]: 36001405ca5165367d67447ea68108e1d: remove multipath map
Jan 25 12:37:47 client1 multipathd[2532437]: sdi: orphan path, map flushed
Jan 25 12:37:47 client1 multipathd[2532437]: BUG: orphaning path sdi that holds hwe of 36001405ca5165367d67447ea68108e1d
Jan 25 12:37:47 client1 multipathd[2532437]: 8:208: cannot find block device
Jan 25 12:37:47 client1 multipathd[2532437]: 3600140531f063b3e19349bc82028e0cc: remove multipath map
Jan 25 12:37:47 client1 multipathd[2532437]: 8:128: cannot find block device
Jan 25 12:37:47 client1 multipathd[2532437]: 36001405ca5165367d67447ea68108e1d: remove multipath map
Jan 25 12:37:47 client1 multipathd[2532437]: 8:112: cannot find block device
Jan 25 12:37:47 client1 multipathd[2532437]: 3600140584e11eb1818c4afab12c17800: remove multipath map
Jan 25 12:37:47 client1 multipathd[2532437]: sda: orphan path, blacklisted
Jan 25 12:37:47 client1 multipathd[2532437]: sdb: orphan path, blacklisted
Jan 25 12:37:47 client1 multipathd[2532437]: 3600140584e11eb1818c4afab12c17800: set ACT_CREATE (map does not exist)
Jan 25 12:37:47 client1 kernel: [1202538.089589] device-mapper: multipath: Reinstating path 8:208.
Jan 25 12:37:47 client1 kernel: [1202538.089606] lxk kernel params 0 1 alua 1 1 service-time 0 1 1 8:48 1
Jan 25 12:37:47 client1 kernel: [1202538.089607] name dm-3; majir:minor 253:3; dm-3
Jan 25 12:37:47 client1 kernel: [1202538.089688] device-mapper: multipath: Reinstating path 8:128.
Jan 25 12:37:47 client1 kernel: [1202538.089762] lxk load_table success 0
Jan 25 12:37:47 client1 kernel: [1202538.089768] device-mapper: multipath: Reinstating path 8:112.
Jan 25 12:37:47 client1 kernel: [1202538.090266] lxk kernel params 0 1 alua 1 1 service-time 0 1 1 8:48 1
Jan 25 12:37:47 client1 kernel: [1202538.090268] name dm-3; majir:minor 253:3; dm-3
Jan 25 12:37:47 client1 multipath[788570]: loading /lib64/multipath/libchecktur.so checker
Jan 25 12:37:47 client1 multipath[788570]: 8:112: cannot find block device
Jan 25 12:37:47 client1 multipath[788570]: 8:112: Empty device name
Jan 25 12:37:47 client1 multipath[788570]: 8:112: Empty device name
Jan 25 12:37:47 client1 multipath[788570]: get_udev_device: failed to look up 8:112 with type 1
Jan 25 12:37:47 client1 multipath[788570]: dm-3: no usable paths found
Jan 25 12:37:47 client1 multipath[788569]: loading /lib64/multipath/libchecktur.so checker
Jan 25 12:37:47 client1 multipath[788571]: loading /lib64/multipath/libchecktur.so checker
Jan 25 12:37:47 client1 multipath[788569]: 8:128: cannot find block device
Jan 25 12:37:47 client1 multipath[788569]: 8:128: Empty device name
Jan 25 12:37:47 client1 multipath[788569]: 8:128: Empty device name
Jan 25 12:37:47 client1 multipath[788569]: get_udev_device: failed to look up 8:128 with type 1
Jan 25 12:37:47 client1 multipath[788569]: dm-6: no usable paths found
Jan 25 12:37:47 client1 multipath[788571]: 8:208: cannot find block device
Jan 25 12:37:47 client1 multipath[788571]: 8:208: Empty device name
Jan 25 12:37:47 client1 multipath[788571]: 8:208: Empty device name
Jan 25 12:37:47 client1 multipath[788571]: get_udev_device: failed to look up 8:208 with type 1
Jan 25 12:37:47 client1 multipath[788571]: dm-4: no usable paths found
Jan 25 12:37:47 client1 multipath[788580]: loading /lib64/multipath/libchecktur.so checker
Jan 25 12:37:47 client1 kernel: [1202538.090340] lxk load_table success 0
Jan 25 12:37:47 client1 kernel: [1202538.128005] device-mapper: multipath: Failing path 8:208.
Jan 25 12:37:47 client1 kernel: [1202538.128937] device-mapper: multipath: Failing path 8:128.
Jan 25 12:37:47 client1 multipathd[2532437]: 3600140584e11eb1818c4afab12c17800: load table [0 20971520 multipath 0 1 alua 1 1 service-time 0 1 1 8:48 1]
Jan 25 12:37:47 client1 multipathd[2532437]: 3600140531f063b3e19349bc82028e0cc: set ACT_CREATE (map does not exist)
Jan 25 12:37:47 client1 kernel: [1202538.132361] scsi 3:0:0:0: alua: Detached
Jan 25 12:37:47 client1 kernel: [1202538.132370] lxk ref put after: name sdh; count 0
Jan 25 12:37:47 client1 kernel: [1202538.133024] EXT4-fs error: 1 callbacks suppressed
Jan 25 12:37:47 client1 kernel: [1202538.133027] EXT4-fs error (device dm-6): ext4_find_entry:1456: inode #131073: comm fsstress: reading directory lblock 0
Jan 25 12:37:47 client1 kernel: [1202538.133031] EXT4-fs: 12 callbacks suppressed
Jan 25 12:37:47 client1 kernel: [1202538.133033] EXT4-fs (dm-6): ext4_commit_super: ec42fbc81d447c94 78c83f8f086afab0 seed is 0xca479c5b
Jan 25 12:37:47 client1 kernel: [1202538.133550] lxk kernel params 0 1 alua 1 1 service-time 0 1 1 8:224 1
Jan 25 12:37:47 client1 kernel: [1202538.133552] name dm-4; majir:minor 253:4; dm-4
Jan 25 12:37:47 client1 kernel: [1202538.133682] lxk load_table success 0
Jan 25 12:37:47 client1 kernel: [1202538.133968] lxk kernel params 0 1 alua 1 1 service-time 0 1 1 8:224 1
Jan 25 12:37:47 client1 kernel: [1202538.133970] name dm-4; majir:minor 253:4; dm-4
Jan 25 12:37:47 client1 kernel: [1202538.134017] lxk load_table success 0
Jan 25 12:37:47 client1 kernel: [1202538.135805] EXT4-fs (dm-6): I/O error while writing superblock
Jan 25 12:37:47 client1 kernel: [1202538.136029] EXT4-fs error (device dm-6): ext4_find_entry:1456: inode #131073: comm fsstress: reading directory lblock 0
Jan 25 12:37:47 client1 kernel: [1202538.136032] EXT4-fs (dm-6): ext4_commit_super: ec42fbc81d447c94 78c83f8f086afab0 seed is 0xca479c5b
Jan 25 12:37:47 client1 kernel: [1202538.142127] EXT4-fs (dm-6): I/O error while writing superblock
Jan 25 12:37:47 client1 kernel: [1202538.143920] sd 4:0:0:0: alua: port group 00 state A non-preferred supports TOlUSNA
Jan 25 12:37:47 client1 kernel: [1202538.144911] EXT4-fs error (device dm-6): ext4_find_entry:1456: inode #131073: comm fsstress: reading directory lblock 0
Jan 25 12:37:47 client1 kernel: [1202538.144915] EXT4-fs (dm-6): ext4_commit_super: ec42fbc81d447c94 78c83f8f086afab0 seed is 0xca479c5b
Jan 25 12:37:47 client1 kernel: [1202538.145373] EXT4-fs (dm-6): I/O error while writing superblock
Jan 25 12:37:47 client1 kernel: [1202538.145960] EXT4-fs error (device dm-6): ext4_find_entry:1456: inode #131073: comm fsstress: reading directory lblock 0
Jan 25 12:37:47 client1 kernel: [1202538.145963] EXT4-fs (dm-6): ext4_commit_super: ec42fbc81d447c94 78c83f8f086afab0 seed is 0xca479c5b
Jan 25 12:37:47 client1 kernel: [1202538.146046] EXT4-fs (dm-6): I/O error while writing superblock
Jan 25 12:37:47 client1 kernel: [1202538.146195] EXT4-fs error (device dm-6): ext4_find_entry:1456: inode #131073: comm fsstress: reading directory lblock 0
Jan 25 12:37:47 client1 kernel: [1202538.146196] EXT4-fs (dm-6): ext4_commit_super: ec42fbc81d447c94 78c83f8f086afab0 seed is 0xca479c5b
Jan 25 12:37:47 client1 kernel: [1202538.148027] EXT4-fs (dm-6): I/O error while writing superblock
Jan 25 12:37:47 client1 kernel: [1202538.148920] EXT4-fs error (device dm-6): ext4_find_entry:1456: inode #131073: comm fsstress: reading directory lblock 0
Jan 25 12:37:47 client1 kernel: [1202538.149466] EXT4-fs error (device dm-6): ext4_find_entry:1456: inode #131073: comm fsstress: reading directory lblock 0
Jan 25 12:37:47 client1 kernel: [1202538.149578] EXT4-fs error (device dm-6): ext4_find_entry:1456: inode #131073: comm fsstress: reading directory lblock 0
Jan 25 12:37:48 client1 kernel: [1202538.149639] EXT4-fs error (device dm-6): ext4_find_entry:1456: inode #131073: comm fsstress: reading directory lblock 0
Jan 25 12:37:48 client1 kernel: [1202538.149673] EXT4-fs error (device dm-6): ext4_find_entry:1456: inode #131073: comm fsstress: reading directory lblock 0
Jan 25 12:37:48 client1 kernel: [1202538.149961] EXT4-fs warning (device dm-6): htree_dirblock_to_tree:995: inode #131073: lblock 0: comm fsstress: error -5 reading directory block
Jan 25 12:37:48 client1 kernel: [1202538.155906] scsi 3:0:0:1: alua: Detached
Jan 25 12:37:48 client1 kernel: [1202538.155914] lxk ref put after: name sdn; count 0
Jan 25 12:37:48 client1 kernel: [1202538.158130] lxk kernel params 0 1 alua 1 1 service-time 0 1 1 8:80 1
Jan 25 12:37:48 client1 kernel: [1202538.158132] name dm-5; majir:minor 253:5; dm-5
Jan 25 12:37:48 client1 kernel: [1202538.159310] dm destory name dm-7; majir:minor 253:7; dm-7
Jan 25 12:37:48 client1 kernel: [1202538.161448] lxk load_table success 0
Jan 25 12:37:48 client1 kernel: [1202538.163417] lxk kernel params 0 1 alua 1 1 service-time 0 1 1 8:64 1
Jan 25 12:37:48 client1 kernel: [1202538.163419] name dm-6; majir:minor 253:6; dm-6
Jan 25 12:37:48 client1 kernel: [1202538.163498] lxk load_table success 0
Jan 25 12:37:48 client1 kernel: [1202538.163972] dm destory name dm-5; majir:minor 253:5; dm-5
Jan 25 12:37:48 client1 kernel: [1202538.196041] sd 4:0:0:3: alua: port group 00 state A non-preferred supports TOlUSNA
Jan 25 12:37:48 client1 kernel: [1202538.200208] scsi 3:0:0:3: alua: Detached
Jan 25 12:37:48 client1 kernel: [1202538.224361] lxk ref put after: name sdi; count 0
Jan 25 12:37:48 client1 kernel: [1202538.225086] lxk kernel params 0 1 alua 1 1 service-time 0 1 1 8:64 1
Jan 25 12:37:48 client1 kernel: [1202538.225088] name dm-6; majir:minor 253:6; dm-6
Jan 25 12:37:48 client1 kernel: [1202538.226914] lxk load_table success 0
Jan 25 12:37:48 client1 kernel: [1202538.311125] sd 4:0:0:0: [sdd] Synchronizing SCSI cache
Jan 25 12:37:48 client1 kernel: [1202538.312397] sd 4:0:0:3: alua: port group 00 state A non-preferred supports TOlUSNA
Jan 25 12:37:48 client1 kernel: [1202538.350483] device-mapper: multipath: Failing path 8:48.
Jan 25 12:37:48 client1 kernel: [1202538.378721] sd 4:0:0:3: [sde] Synchronizing SCSI cache
Jan 25 12:37:48 client1 kernel: [1202538.404214] device-mapper: multipath: Failing path 8:64.
Jan 25 12:37:48 client1 kernel: [1202538.404656] EXT4-fs warning (device dm-6): htree_dirblock_to_tree:995: inode #131073: lblock 0: comm fsstress: error -5 reading directory block
Jan 25 12:37:48 client1 kernel: [1202538.405940] EXT4-fs warning (device dm-6): htree_dirblock_to_tree:995: inode #131073: lblock 0: comm fsstress: error -5 reading directory block
Jan 25 12:37:48 client1 kernel: [1202538.467014] sd 4:0:0:2: [sdf] Synchronizing SCSI cache
Jan 25 12:37:48 client1 kernel: [1202538.568195] scsi 4:0:0:2: alua: Detached
Jan 25 12:37:48 client1 kernel: [1202538.569470] sd 4:0:0:1: [sdo] Synchronizing SCSI cache
Jan 25 12:37:48 client1 kernel: [1202538.621389] scsi host2: iSCSI Initiator over TCP/IP
Jan 25 12:37:48 client1 kernel: [1202538.625805] scsi 2:0:0:0: Direct-Access LIO-ORG disk1 4.0 PQ: 0 ANSI: 5
Jan 25 12:37:48 client1 kernel: [1202538.627294] scsi 2:0:0:0: alua: supports implicit and explicit TPGS
Jan 25 12:37:48 client1 kernel: [1202538.627302] scsi 2:0:0:0: alua: device naa.600140584e11eb1818c4afab12c17800 port group 0 rel port 1
Jan 25 12:37:48 client1 kernel: [1202538.627875] sd 2:0:0:0: Attached scsi generic sg2 type 0
Jan 25 12:37:48 client1 kernel: [1202538.630344] scsi 2:0:0:3: Direct-Access LIO-ORG disk4 4.0 PQ: 0 ANSI: 5
Jan 25 12:37:48 client1 kernel: [1202538.630507] sd 2:0:0:0: [sdf] 20971520 512-byte logical blocks: (10.7 GB/10.0 GiB)
Jan 25 12:37:48 client1 kernel: [1202538.630655] sd 2:0:0:0: [sdf] Write Protect is off
Jan 25 12:37:48 client1 kernel: [1202538.630787] scsi 2:0:0:3: alua: supports implicit and explicit TPGS
Jan 25 12:37:48 client1 kernel: [1202538.630793] scsi 2:0:0:3: alua: device naa.6001405ca5165367d67447ea68108e1d port group 0 rel port 1
Jan 25 12:37:48 client1 kernel: [1202538.630914] sd 2:0:0:0: [sdf] Write cache: enabled, read cache: enabled, supports DPO and FUA
Jan 25 12:37:48 client1 kernel: [1202538.631010] sd 2:0:0:3: Attached scsi generic sg3 type 0
Jan 25 12:37:48 client1 kernel: [1202538.632201] scsi 2:0:0:2: Direct-Access LIO-ORG disk3 4.0 PQ: 0 ANSI: 5
Jan 25 12:37:48 client1 kernel: [1202538.632388] sd 2:0:0:0: [sdf] Optimal transfer size 33488896 bytes
Jan 25 12:37:48 client1 kernel: [1202538.632590] scsi 2:0:0:2: alua: supports implicit and explicit TPGS
Jan 25 12:37:48 client1 kernel: [1202538.632595] scsi 2:0:0:2: alua: device naa.6001405b7679bd96b094bccbf971bc90 port group 0 rel port 1
Jan 25 12:37:48 client1 kernel: [1202538.632798] sd 2:0:0:2: Attached scsi generic sg4 type 0
Jan 25 12:37:48 client1 kernel: [1202538.633665] scsi 2:0:0:1: Direct-Access LIO-ORG disk2 4.0 PQ: 0 ANSI: 5
Jan 25 12:37:48 client1 kernel: [1202538.633869] sd 2:0:0:3: [sdh] 20971520 512-byte logical blocks: (10.7 GB/10.0 GiB)
Jan 25 12:37:48 client1 kernel: [1202538.634858] scsi 2:0:0:1: alua: supports implicit and explicit TPGS
Jan 25 12:37:48 client1 kernel: [1202538.634864] scsi 2:0:0:1: alua: device naa.600140531f063b3e19349bc82028e0cc port group 0 rel port 1
Jan 25 12:37:48 client1 kernel: [1202538.635089] sd 2:0:0:1: Attached scsi generic sg5 type 0
Jan 25 12:37:48 client1 kernel: [1202538.635295] sd 2:0:0:3: [sdh] Write Protect is off
Jan 25 12:37:48 client1 kernel: [1202538.635851] sd 2:0:0:2: [sdi] 20971520 512-byte logical blocks: (10.7 GB/10.0 GiB)
Jan 25 12:37:48 client1 kernel: [1202538.636726] sd 2:0:0:2: [sdi] Write Protect is off
Jan 25 12:37:48 client1 kernel: [1202538.636763] sd 2:0:0:0: alua: port group 00 state A non-preferred supports TOlUSNA
Jan 25 12:37:48 client1 kernel: [1202538.636988] sd 2:0:0:2: [sdi] Write cache: enabled, read cache: enabled, supports DPO and FUA
Jan 25 12:37:48 client1 kernel: [1202538.638092] sd 2:0:0:1: [sdk] 20971520 512-byte logical blocks: (10.7 GB/10.0 GiB)
Jan 25 12:37:48 client1 kernel: [1202538.638188] sd 2:0:0:2: [sdi] Optimal transfer size 33488896 bytes
Jan 25 12:37:48 client1 kernel: [1202538.638271] sd 2:0:0:1: [sdk] Write Protect is off
Jan 25 12:37:48 client1 kernel: [1202538.638454] sd 2:0:0:1: [sdk] Write cache: enabled, read cache: enabled, supports DPO and FUA
Jan 25 12:37:48 client1 kernel: [1202538.638618] sd 2:0:0:1: [sdk] Optimal transfer size 33488896 bytes
Jan 25 12:37:48 client1 kernel: [1202538.639700] sd 2:0:0:3: [sdh] Write cache: enabled, read cache: enabled, supports DPO and FUA
Jan 25 12:37:48 client1 kernel: [1202538.640445] sd 2:0:0:3: alua: port group 00 state A non-preferred supports TOlUSNA
Jan 25 12:37:48 client1 kernel: [1202538.640798] sd 2:0:0:3: [sdh] Optimal transfer size 33488896 bytes
Jan 25 12:37:48 client1 kernel: [1202538.648538] sd 2:0:0:1: alua: port group 00 state A non-preferred supports TOlUSNA
Jan 25 12:37:48 client1 kernel: [1202538.648550] sd 2:0:0:2: alua: port group 00 state A non-preferred supports TOlUSNA
Jan 25 12:37:48 client1 kernel: [1202538.808805] sd 2:0:0:1: [sdk] Attached SCSI disk
Jan 25 12:37:48 client1 kernel: [1202538.814309] sd 2:0:0:2: [sdi] Attached SCSI disk
Jan 25 12:37:48 client1 kernel: [1202538.820568] sd 2:0:0:3: [sdh] Attached SCSI disk
Jan 25 12:37:48 client1 kernel: [1202538.825745] sd 2:0:0:0: [sdf] Attached SCSI disk
Jan 25 12:37:48 client1 kernel: [1202539.030796] EXT4-fs warning (device dm-6): htree_dirblock_to_tree:995: inode #131073: lblock 0: comm fsstress: error -5 reading directory block
Jan 25 12:37:48 client1 kernel: [1202539.033019] EXT4-fs warning (device dm-6): htree_dirblock_to_tree:995: inode #131073: lblock 0: comm fsstress: error -5 reading directory block
Jan 25 12:37:48 client1 kernel: [1202539.033378] EXT4-fs warning (device dm-6): htree_dirblock_to_tree:995: inode #131073: lblock 0: comm fsstress: error -5 reading directory block
Jan 25 12:37:48 client1 kernel: [1202539.039261] EXT4-fs warning (device dm-6): htree_dirblock_to_tree:995: inode #131073: lblock 0: comm fsstress: error -5 reading directory
block Jan 25 12:37:48 client1 kernel: [1202539.039337] EXT4-fs warning (device dm-6): htree_dirblock_to_tree:995: inode #131073: lblock 0: comm fsstress: error -5 reading directory block Jan 25 12:37:48 client1 kernel: [1202539.039377] EXT4-fs warning (device dm-6): htree_dirblock_to_tree:995: inode #131073: lblock 0: comm fsstress: error -5 reading directory block Jan 25 12:37:48 client1 kernel: [1202539.040013] device-mapper: multipath: Failing path 8:224. Jan 25 12:37:48 client1 kernel: [1202539.116181] EXT4-fs warning (device dm-6): htree_dirblock_to_tree:995: inode #131073: lblock 0: comm fsstress: error -5 reading directory block Jan 25 12:37:48 client1 multipath[788595]: loading /lib64/multipath/libchecktur.so checker Jan 25 12:37:48 client1 multipathd[2532437]: 3600140531f063b3e19349bc82028e0cc: load table [0 20971520 multipath 0 1 alua 1 1 service-time 0 1 1 8:224 1] Jan 25 12:37:48 client1 multipathd[2532437]: 36001405b7679bd96b094bccbf971bc90: set ACT_CREATE (map does not exist) Jan 25 12:37:48 client1 multipathd[2532437]: 36001405b7679bd96b094bccbf971bc90: load table [0 20971520 multipath 0 1 alua 1 1 service-time 0 1 1 8:80 1] Jan 25 12:37:48 client1 multipathd[2532437]: 36001405ca5165367d67447ea68108e1d: set ACT_CREATE (map does not exist) Jan 25 12:37:48 client1 multipathd[2532437]: 36001405ca5165367d67447ea68108e1d: load table [0 20971520 multipath 0 1 alua 1 1 service-time 0 1 1 8:64 1]

(1) multipath -r: sdf is found as a path of 36001405b7679bd96b094bccbf971bc90 (iscsi node 4:0:0:2).
(2) iscsi logout: sdf is removed in iscsi at system time [1202538.467014].
(3) iscsi login: sdf reappears in iscsi at system time [1202538.825745]. It is now a path of 3600140584e11eb1818c4afab12c17800 (iscsi node 2:0:0:0).

Here I have a doubt. When I stop in domap() using gdb and log iscsi out/in, sdf will not be reused because the disk refcount is not zero. I added a print in put_disk_and_module() that fires when the disk refcount reaches zero (for example "lxk ref put after: name sdi; count 0"), but this print never appears for sdf.

(4) domap succeeded.

Do you have any advice about this?

Regards,
Lixiaokeng

-- dm-devel mailing list dm-devel@redhat.com https://www.redhat.com/mailman/listinfo/dm-devel
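For readers decoding the load-table strings above: entries such as `8:224`, `8:80`, and `8:64` are the major:minor numbers of the underlying path devices. A small parser sketch (our illustration, not multipath-tools code) pulls those out of a dm-multipath params string:

```python
import re

def paths_in_table(params):
    """Return the 'major:minor' path entries found in a dm-multipath table string."""
    # within a path group, each path appears as "<major>:<minor> <status>"
    return re.findall(r"\b(\d+:\d+)\s+\d+", params)

# the single-path table quoted in this thread:
print(paths_in_table("0 1 alua 1 1 service-time 0 1 1 8:128 1"))  # → ['8:128']
```

A healthy two-path table would simply list more `major:minor status` pairs per group; a single entry, as in the failing case here, means only one path made it into the map.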
* Re: [dm-devel] [QUESTION]: multipath device with wrong path lead to metadata err 2021-01-26 11:14 ` lixiaokeng @ 2021-01-26 23:11 ` Martin Wilck 2021-01-28 8:27 ` lixiaokeng 2021-02-04 11:25 ` lixiaokeng 0 siblings, 2 replies; 18+ messages in thread From: Martin Wilck @ 2021-01-26 23:11 UTC (permalink / raw) To: lixiaokeng, Benjamin Marzinski, Christophe Varoqui, dm-devel mailing list Cc: linfeilong, liuzhiqiang (I), lihaotian9

On Tue, 2021-01-26 at 19:14 +0800, lixiaokeng wrote:
> > > Hi,
> > > Unfortunately the verify_path() called before *and* after domap() in coalesce_paths can't solve this problem. I think it is another way to lead multipath with wrong path, but now I can't find the way from log.
> > Can you provide multipathd -v3 logs, and kernel logs? Maybe I'll see something.

This is not a -v3 log, right? We can't see much of what multipathd is doing.

Anyway, I understand now that verify_paths() won't help. It looks only for paths that have been removed (i.e. don't exist any more in sysfs) since the last path detection. But then, when the error occurs, it seems that sdf has been removed *and re-added*. So the check whether the path still exists succeeds. The uevents were also missed because the uevent handler didn't get the lock.

> (1)multipath -r: The sdf is found as a path of 36001405b7679bd96b094bccbf971bc90 (iscsi node is 4:0:0:2)
> (2)iscsi logout: The sdf is removed in iscsi in system time [1202538.467014].
> (3)iscsi login: The sdf appears in iscsi in system time [1202538.825745]. It is a path of 3600140584e11eb1818c4afab12c17800 (iscsi node 2:0:0:0)
> Here I have a doubt. When I stop in domap using gdb and iscsi log out/in, the sdf will not be used again becasue the disk refcount is not zero. I add a print if the disk refcount is zero in put_disk_and_module (for example lxk ref put after: name sdi; count 0), but there is not this print about sdf.
Yes, this is a very good point, and it's indeed strange. multipathd should have opened a file descriptor to /dev/sdf in pathinfo(), and as long as that file is open, the use count shouldn't drop to 0, the disk devices (block device and scsi_disk device) shouldn't be released, and the major/minor number shouldn't be reused. Unless I'm missing something essential, that is.

> Jan 25 12:37:48 client1 kernel: [1202538.467014] sd 4:0:0:2: [sdf] Synchronizing SCSI cache
> Jan 25 12:37:48 client1 kernel: [1202538.568195] scsi 4:0:0:2: alua: Detached
> Jan 25 12:37:48 client1 kernel: [1202538.630507] sd 2:0:0:0: [sdf] 20971520 512-byte logical blocks: (10.7 GB/10.0 GiB)

Less than 0.1s between the disappearance of 4:0:0:2 as sdf and the reappearance of 2:0:0:0, without any sign of multipathd having noticed this change, is indeed quite strange. So we can only conclude that (if there's no kernel refcounting bug, which I doubt) either orphan_path()->uninitialize_path() had been called (closing the fd), or that opening the sd device had failed in the first place (in which case the path WWID should have been nulled in pathinfo()). In both cases it makes little sense that the path should still be part of a struct multipath.

Please increase the log level of the "Couldn't open device node" message in pathinfo(), and see if respective errors are logged. Can you verify in the debugger whether multipathd still has the fd to the disk device open? Perhaps you could trace scsi_disk_release() in the kernel?

Martin

-- dm-devel mailing list dm-devel@redhat.com https://www.redhat.com/mailman/listinfo/dm-devel
* Re: [dm-devel] [QUESTION]: multipath device with wrong path lead to metadata err 2021-01-26 23:11 ` Martin Wilck @ 2021-01-28 8:27 ` lixiaokeng 2021-01-28 21:15 ` Martin Wilck 1 sibling, 1 reply; 18+ messages in thread From: lixiaokeng @ 2021-01-28 8:27 UTC (permalink / raw) To: Martin Wilck, Benjamin Marzinski, Christophe Varoqui, dm-devel mailing list Cc: linfeilong, liuzhiqiang (I), lihaotian9

>> (1)multipath -r: The sdf is found as a path of 36001405b7679bd96b094bccbf971bc90 (iscsi node is 4:0:0:2)

Here is a log "dm destory name dm-5; majir:minor 253:5; dm-5" (dm-5 is 36001405b7679bd96b094bccbf971bc90) at system time [1202538.163972]. I added this print in dm_destroy.

>> (2)iscsi logout: The sdf is removed in iscsi in system time [1202538.467014].

I added a refcount print in put_disk instead of put_disk_and_module. The refcount of sdf is reduced to zero even though the fd to /dev/sdf is open (there is no "Couldn't open device node" for sdf in the log) when 36001405b7679bd96b094bccbf971bc90 is destroyed and iscsi logs out. I don't know why dm-5 is destroyed. I suspect there may be some issue in the kernel, so I added some prints. I have tested this on three computers, but the other two show no problem (they have been running for 96h and 48h respectively). I will replace the kernel on the computer with this issue and test it again. Do you have any idea about the dm-5 destruction? Thanks.

Regards,
Lixiaokeng

>> (3)iscsi login: The sdf appears in iscsi in system time [1202538.825745]. It is a path of 3600140584e11eb1818c4afab12c17800 (iscsi node 2:0:0:0)
>> Here I have a doubt. When I stop in domap using gdb and iscsi log out/in, the sdf will not be used again becasue the disk refcount is not zero. I add a print if the disk refcount is zero in put_disk_and_module (for example lxk ref put after: name sdi; count 0), but there is not this print about sdf.

> Yes, this is a very good point, and it's indeed strange. multipathd should have opened a file descriptor to /dev/sdf in pathinfo(), and as long as that file is open, the use count shouldn't drop to 0, the disk devices (block device and scsi_disk device) shouldn't be released, and the major/minor number shouldn't be reused. Unless I'm missing something essential, that is.

-- dm-devel mailing list dm-devel@redhat.com https://www.redhat.com/mailman/listinfo/dm-devel
* Re: [dm-devel] [QUESTION]: multipath device with wrong path lead to metadata err 2021-01-28 8:27 ` lixiaokeng @ 2021-01-28 21:15 ` Martin Wilck 0 siblings, 0 replies; 18+ messages in thread From: Martin Wilck @ 2021-01-28 21:15 UTC (permalink / raw) To: lixiaokeng, Benjamin Marzinski, Christophe Varoqui, dm-devel mailing list Cc: linfeilong, liuzhiqiang (I), lihaotian9 On Thu, 2021-01-28 at 16:27 +0800, lixiaokeng wrote: > I don't know why dm-5 is destoried. I doubt there may be some > issue in the kernel that I add some print. I have test this in > three computers, but the other two have no problem (they have been > runing for 96h and for 48h respectively). It's normal behavior of multipathd to destroy (flush) a map that has no paths any more, see ev_remove_path(). If all paths are *deleted*, that is. Martin -- dm-devel mailing list dm-devel@redhat.com https://www.redhat.com/mailman/listinfo/dm-devel ^ permalink raw reply [flat|nested] 18+ messages in thread
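As a side note, the flush-the-map-when-the-last-path-is-deleted behavior Martin describes is tunable: multipath.conf has a `flush_on_last_del` option in the `defaults` section. A sketch (the exact set of accepted values depends on the multipath-tools version, so check `man multipath.conf` for yours):

```
defaults {
        # keep the map in place even if its last path is deleted
        flush_on_last_del no
}
```

With flushing disabled, a transient all-paths-deleted window (as in the iscsi logout/login loop here) would not destroy the map, at the cost of leaving stale maps around when a LUN is genuinely unpresented.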
* Re: [dm-devel] [QUESTION]: multipath device with wrong path lead to metadata err 2021-01-26 23:11 ` Martin Wilck 2021-01-28 8:27 ` lixiaokeng @ 2021-02-04 11:25 ` lixiaokeng 2021-02-04 14:56 ` Martin Wilck 1 sibling, 1 reply; 18+ messages in thread From: lixiaokeng @ 2021-02-04 11:25 UTC (permalink / raw) To: Martin Wilck, Benjamin Marzinski, Christophe Varoqui, dm-devel mailing list Cc: linfeilong, liuzhiqiang (I), lihaotian9 Hi Martin, On 2021/1/27 7:11, Martin Wilck wrote: > So we can only conclude that (if there's no kernel refcounting bug, > which I doubt) either orphan_path()->uninitialize_path() had been > called (closing the fd), or that opening the sd device had failed in > the first place (in which case the path WWID should have been nulled in > pathinfo(). In both cases it makes little sense that the path should > still be part of a struct multipath. I have an idea. If pp->fd < 0 ("Couldn't open device node"), pathinfo() returns PATHINFO_FAILED. Don't close(pp->fd) in orphan_path(). It may solve the problem (device with wrong path). I will take some time to test it. However, I don’t know if there are potential risks. Do you have suggestions about this? Regards, Lixiaokeng -- dm-devel mailing list dm-devel@redhat.com https://www.redhat.com/mailman/listinfo/dm-devel ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [dm-devel] [QUESTION]: multipath device with wrong path lead to metadata err 2021-02-04 11:25 ` lixiaokeng @ 2021-02-04 14:56 ` Martin Wilck 2021-02-05 11:49 ` lixiaokeng 0 siblings, 1 reply; 18+ messages in thread From: Martin Wilck @ 2021-02-04 14:56 UTC (permalink / raw) To: lixiaokeng, Benjamin Marzinski, Christophe Varoqui, dm-devel mailing list Cc: linfeilong, liuzhiqiang (I), lihaotian9

On Thu, 2021-02-04 at 19:25 +0800, lixiaokeng wrote:
> Hi Martin,
> On 2021/1/27 7:11, Martin Wilck wrote:
> > So we can only conclude that (if there's no kernel refcounting bug, which I doubt) either orphan_path()->uninitialize_path() had been called (closing the fd), or that opening the sd device had failed in the first place (in which case the path WWID should have been nulled in pathinfo()). In both cases it makes little sense that the path should still be part of a struct multipath.
> I have an idea.
> If pp->fd < 0 ("Couldn't open device node"), pathinfo() returns PATHINFO_FAILED. Don't close(pp->fd) in orphan_path(). It may solve the problem (device with wrong path). I will take some time to test it.

Do you have evidence that the fd had been closed in your error case? The path in question wasn't orphaned, if I understood correctly. You said it was still a member of a map. In that case, the fd *must* be open.

> However, I don’t know if there are potential risks. Do you have suggestions about this?

Other than resource usage ... users might be irritated because, if we do this and a device is removed and reappears, it will *always* have a different device node attached. But the device nodes are random today, anyway. If we missed a delete event, we might keep this fd open forever, because a re-added path would never get the same sysfs path again; not sure if that might hurt in some scenarios. We shouldn't miss delete events anyway, of course. So no, at least off the top of my head, I can't think of anything serious.
Famous last words ;-)

We must make sure to close the fd in the free_path() code path, of course.

Btw, I just double-checked that the kernel really behaves as I thought. You can run e.g. in python:

>>> import os
>>> f=os.open("/dev/sdh", os.O_RDWR|os.O_EXCL)

This will keep an fd to the device open. Now if you delete the device and re-add it by scanning the scsi host, it will get a new device ID.

echo 1 >/sys/block/sdh/device/delete
echo - - - >/sys/class/scsi_host/host2/scan

If you close the fd in python and repeat the delete/re-add (and nothing else happened in the meantime), it will become "sdh" again.

Cheers, Martin

-- dm-devel mailing list dm-devel@redhat.com https://www.redhat.com/mailman/listinfo/dm-devel
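The keep-alive property Martin relies on here can be demonstrated without a SCSI host, using a regular file: an open fd pins the underlying object even after its name disappears. This is only an analogue of the device refcounting, not the sd code path itself:

```python
import os
import tempfile

# An open fd keeps the underlying object alive after its name is gone,
# just as multipathd's open fd on /dev/sdf should pin the disk object.
fd, path = tempfile.mkstemp()
os.write(fd, b"payload")
os.unlink(path)               # the name is gone, like the deleted sd device
os.lseek(fd, 0, os.SEEK_SET)
data = os.read(fd, 7)         # still readable through the open fd
os.close(fd)
print(data)  # b'payload'
```

Once the last fd is closed, the object is finally released, which mirrors Martin's observation that the name "sdh" only becomes reusable after the python fd is closed.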
* Re: [dm-devel] [QUESTION]: multipath device with wrong path lead to metadata err 2021-02-04 14:56 ` Martin Wilck @ 2021-02-05 11:49 ` lixiaokeng 0 siblings, 0 replies; 18+ messages in thread From: lixiaokeng @ 2021-02-05 11:49 UTC (permalink / raw) To: Martin Wilck, Benjamin Marzinski, Christophe Varoqui, dm-devel mailing list Cc: linfeilong, liuzhiqiang (I), lihaotian9

Hi Martin,

I have removed multipathd_query from my test script, and a flock is added before/after reconfigure() and iscsi login/logout.

Sequence of events:
(1) iscsi logout: /dev/sdi (36001405b7679bd96b094bccbf971bc90) is removed. multipath -r: sdi->fd is closed; the ref of sdi becomes 0.
(2) iscsi login: /dev/sdi (3600140531f063b3e19349bc82028e0cc) is added.
(3) multipath -r: /dev/sdi is discovered but added to 36001405b7679bd96b094bccbf971bc90.

This is very confusing. The only reason I can think of is that get_uid() in pathinfo() gets the wrong wwid. What do you think of this phenomenon?

Regards,
Lixiaokeng

The kernel log: 84087 Feb 5 16:13:12 client1 kernel: [618082.728130] sd 1:0:0:2: [sdi] Synchronizing SCSI cache multipath -r (first) 84103 Feb 5 16:13:12 client1 kernel: [618082.992107] lxk kernel params 0 1 alua 1 1 service-time 0 1 1 8:96 1 84104 Feb 5 16:13:12 client1 kernel: [618082.992109] name dm-6; majir:minor 253:6; dm-6 84153 Feb 5 16:13:12 client1 kernel: [618083.244950] sd 2:0:0:1: [sdi] Attached SCSI disk multipath -r (second) 84200 Feb 5 16:13:12 client1 kernel: [618084.453983] lxk kernel params 0 1 alua 1 1 service-time 0 1 1 8:128 1 84201 Feb 5 16:13:12 client1 kernel: [618084.453985] name dm-6; majir:minor 253:6; dm-6 The userspace log: multipath -r (first) 84334 Feb 5 16:13:12 client1 multipathd[3816730]: lock .mul_iscsi.lock 84335 Feb 5 16:13:12 client1 multipathd[3816730]: 36001405ca5165367d67447ea68108e1d: remove multipath map 84336 Feb 5 16:13:12 client1 multipathd[3816730]: sdh: orphan path, map flushed 84337 Feb 5 16:13:12 client1 multipathd[3816730]: BUG: orphaning path sdh that holds hwe
of 36001405ca5165367d67447ea68108e1d 84338 Feb 5 16:13:12 client1 multipathd[3816730]: close sdh pp->fd 84339 Feb 5 16:13:12 client1 multipathd[3816730]: 3600140531f063b3e19349bc82028e0cc: remove multipath map 84340 Feb 5 16:13:12 client1 multipathd[3816730]: sdj: orphan path, map flushed 84341 Feb 5 16:13:12 client1 multipathd[3816730]: BUG: orphaning path sdj that holds hwe of 3600140531f063b3e19349bc82028e0cc 84342 Feb 5 16:13:12 client1 multipathd[3816730]: close sdj pp->fd 84343 Feb 5 16:13:12 client1 multipathd[3816730]: 3600140584e11eb1818c4afab12c17800: remove multipath map 84344 Feb 5 16:13:12 client1 multipathd[3816730]: sde: orphan path, map flushed 84345 Feb 5 16:13:12 client1 multipathd[3816730]: BUG: orphaning path sde that holds hwe of 3600140584e11eb1818c4afab12c17800 84346 Feb 5 16:13:12 client1 multipathd[3816730]: close sde pp->fd 84347 Feb 5 16:13:12 client1 systemd[3714763]: opt-test-3600140584e11eb1818c4afab12c17800.mount: Succeeded. 84348 Feb 5 16:13:12 client1 multipathd[3816730]: 36001405b7679bd96b094bccbf971bc90: remove multipath map 84349 Feb 5 16:13:12 client1 systemd[1]: opt-test-3600140584e11eb1818c4afab12c17800.mount: Succeeded. 
84350 Feb 5 16:13:12 client1 multipathd[3816730]: sdi: orphan path, map flushed 84351 Feb 5 16:13:12 client1 multipathd[3816730]: BUG: orphaning path sdi that holds hwe of 36001405b7679bd96b094bccbf971bc90 84352 Feb 5 16:13:12 client1 multipathd[3816730]: close sdi pp->fd 84353 Feb 5 16:13:12 client1 multipathd[3816730]: 8:144: cannot find block device 84354 Feb 5 16:13:12 client1 multipathd[3816730]: 3600140531f063b3e19349bc82028e0cc: remove multipath map 84355 Feb 5 16:13:12 client1 multipathd[3816730]: 8:112: cannot find block device 84356 Feb 5 16:13:12 client1 multipathd[3816730]: 36001405ca5165367d67447ea68108e1d: remove multipath map 84357 Feb 5 16:13:12 client1 multipathd[3816730]: 8:64: cannot find block device 84358 Feb 5 16:13:12 client1 multipathd[3816730]: 3600140584e11eb1818c4afab12c17800: remove multipath map 84359 Feb 5 16:13:12 client1 multipathd[3816730]: 8:128: cannot find block device 84360 Feb 5 16:13:12 client1 multipathd[3816730]: 36001405b7679bd96b094bccbf971bc90: remove multipath map 84361 Feb 5 16:13:12 client1 multipathd[3816730]: sda: orphan path, blacklisted 84362 Feb 5 16:13:12 client1 multipathd[3816730]: sdb: orphan path, blacklisted 84363 Feb 5 16:13:12 client1 multipathd[3816730]: 3600140584e11eb1818c4afab12c17800: set ACT_CREATE (map does not exist) 84364 Feb 5 16:13:12 client1 multipathd[3816730]: 3600140584e11eb1818c4afab12c17800: load table [0 20971520 multipath 0 1 alua 1 1 service-time 0 1 1 8:32 1] 84365 Feb 5 16:13:12 client1 multipathd[3816730]: 36001405b7679bd96b094bccbf971bc90: set ACT_CREATE (map does not exist) 84366 Feb 5 16:13:12 client1 multipathd[3816730]: 36001405b7679bd96b094bccbf971bc90: load table [0 20971520 multipath 0 1 alua 1 1 service-time 0 1 1 8:96 1] 84367 Feb 5 16:13:12 client1 multipathd[3816730]: 36001405ca5165367d67447ea68108e1d: set ACT_CREATE (map does not exist) 84368 Feb 5 16:13:12 client1 iscsid[294822]: Connection61647:0 to [target: 
iqn.2003-01.org.linux-iscsi.openeuler.aarch64:sn.155b59349631, portal: 100.1.2.1,3260] through [iface: default] is shutdown. 84369 Feb 5 16:13:12 client1 multipathd[3816730]: 36001405ca5165367d67447ea68108e1d: load table [0 20971520 multipath 0 1 alua 1 1 service-time 0 1 1 8:48 1] 84370 Feb 5 16:13:12 client1 multipathd[3816730]: lock .mul_iscsi.lock 84371 Feb 5 16:13:12 client1 multipath[73791]: lock .mul_iscsi.lock multipath -r (second) 84439 Feb 5 16:13:12 client1 multipathd[3816730]: reconfigure (operator) 84440 Feb 5 16:13:12 client1 multipath[73858]: loading /lib64/multipath/libchecktur.so checker 84441 Feb 5 16:13:12 client1 multipathd[3816730]: lock .mul_iscsi.lock 84442 Feb 5 16:13:12 client1 multipathd[3816730]: 3600140584e11eb1818c4afab12c17800: remove multipath map 84443 Feb 5 16:13:12 client1 multipathd[3816730]: sdc: orphan path, map flushed 84444 Feb 5 16:13:12 client1 multipathd[3816730]: BUG: orphaning path sdc that holds hwe of 3600140584e11eb1818c4afab12c17800 84445 Feb 5 16:13:12 client1 multipathd[3816730]: close sdc pp->fd 84446 Feb 5 16:13:12 client1 multipathd[3816730]: 36001405b7679bd96b094bccbf971bc90: remove multipath map 84447 Feb 5 16:13:12 client1 multipathd[3816730]: sdg: orphan path, map flushed 84448 Feb 5 16:13:12 client1 multipathd[3816730]: BUG: orphaning path sdg that holds hwe of 36001405b7679bd96b094bccbf971bc90 84449 Feb 5 16:13:12 client1 multipathd[3816730]: close sdg pp->fd 84450 Feb 5 16:13:12 client1 multipathd[3816730]: 36001405ca5165367d67447ea68108e1d: remove multipath map 84451 Feb 5 16:13:12 client1 multipathd[3816730]: sdd: orphan path, map flushed 84452 Feb 5 16:13:12 client1 multipathd[3816730]: BUG: orphaning path sdd that holds hwe of 36001405ca5165367d67447ea68108e1d 84453 Feb 5 16:13:12 client1 multipathd[3816730]: close sdd pp->fd 84454 Feb 5 16:13:12 client1 multipathd[3816730]: 8:144: cannot find block device 84455 Feb 5 16:13:12 client1 multipathd[3816730]: 3600140531f063b3e19349bc82028e0cc: remove 
multipath map 84456 Feb 5 16:13:12 client1 multipathd[3816730]: 8:48: cannot find block device 84457 Feb 5 16:13:12 client1 multipathd[3816730]: 36001405ca5165367d67447ea68108e1d: remove multipath map 84458 Feb 5 16:13:12 client1 multipathd[3816730]: 8:32: cannot find block device 84459 Feb 5 16:13:12 client1 multipathd[3816730]: 3600140584e11eb1818c4afab12c17800: remove multipath map 84460 Feb 5 16:13:12 client1 multipathd[3816730]: 8:96: cannot find block device 84461 Feb 5 16:13:12 client1 multipathd[3816730]: 36001405b7679bd96b094bccbf971bc90: remove multipath map 84462 Feb 5 16:13:12 client1 multipathd[3816730]: sda: orphan path, blacklisted 84463 Feb 5 16:13:12 client1 multipathd[3816730]: sdb: orphan path, blacklisted 84464 Feb 5 16:13:12 client1 multipathd[3816730]: 36001405b7679bd96b094bccbf971bc90: set ACT_CREATE (map does not exist) 84465 Feb 5 16:13:12 client1 multipathd[3816730]: 36001405b7679bd96b094bccbf971bc90: load table [0 20971520 multipath 0 1 alua 1 1 service-time 0 1 1 8:128 1] 84466 Feb 5 16:13:12 client1 multipathd[3816730]: lock .mul_iscsi.lock -- dm-devel mailing list dm-devel@redhat.com https://www.redhat.com/mailman/listinfo/dm-devel ^ permalink raw reply [flat|nested] 18+ messages in thread
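The failure sequence lixiaokeng describes above boils down to a name-reuse race, which a toy model makes explicit (plain Python, ours — not multipath-tools code): a check that the node name still exists passes, while the cached WWID no longer matches what the kernel sees.

```python
# Toy model of the race: the kernel frees the name "sdi" on iscsi logout and
# immediately reuses it for a different LUN on login, so a stale cached entry
# keyed by device name survives an existence check with the wrong WWID.
OLD_WWID = "36001405b7679bd96b094bccbf971bc90"
NEW_WWID = "3600140531f063b3e19349bc82028e0cc"

pathvec = {"sdi": OLD_WWID}         # cached before the iscsi logout
sysfs = {"sdi": OLD_WWID}           # kernel's view

sysfs.pop("sdi")                    # (1) logout: sdi disappears
sysfs["sdi"] = NEW_WWID             # (2) login: same name, different LUN

exists = "sdi" in sysfs             # a verify_paths()-style check passes...
stale = pathvec["sdi"] != sysfs["sdi"]  # ...but the cached WWID is wrong
print(exists, stale)  # True True
```

The model suggests why re-running get_uid()/pathinfo() on the live device, rather than trusting a cached pathvec entry, is the only reliable check here.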
* Re: [dm-devel] [QUESTION]: multipath device with wrong path lead to metadata err 2021-01-19 21:57 ` Martin Wilck 2021-01-20 2:30 ` lixiaokeng @ 2021-01-20 13:02 ` Roger Heflin 2021-01-20 20:45 ` Martin Wilck 1 sibling, 1 reply; 18+ messages in thread From: Roger Heflin @ 2021-01-20 13:02 UTC (permalink / raw) To: Martin Wilck Cc: lixiaokeng, lihaotian9, linfeilong, dm-devel mailing list, liuzhiqiang (I)

> verify_paths() would detect this. We do call verify_paths() in coalesce_paths() before calling domap(), but not immediately before. Perhaps we should move the verify_paths() call down to immediately before the domap() call. That would at least minimize the time window for this race. It's hard to avoid it entirely. The way multipathd is written, the vecs lock is held all the time during coalesce_paths(), and thus no uevents can be processed. We could also consider calling verify_paths() before *and* after domap().
>
> Was this a map creation or a map reload? Was the map removed after the failure? Do you observe the message "ignoring map" or "removing map"?
>
> Do you observe a "remove" uevent for sdi?
>
> I wonder if you'd see the issue also if you run the same test without the "multipath -F; multipath -r" loop, or with just one. Ok, one multipath_query() loop simulates an admin working on the system, but 2 parallel loops - 2 admins working in parallel, plus the intensive sequence of actions done in multipathd_query at the same time? The repeated "multipath -r" calls and multipathd commands will cause multipathd to spend a lot of time in reconfigure() and in cli_* calls holding the vecs lock, which makes it likely that uevents are missed or processed late.
>
> Don't get me wrong, I don't argue against tough testing. But we should be aware that there are always time intervals during which multipathd's picture of the present devices is different from what the kernel sees.
> There's definitely room for improvement in multipathd wrt locking and event processing in general, but that's a BIG piece of work.

I don't know if this helps, or is exactly like what he is duplicating: I debugged and verified a corruption issue a few years ago where this was what happened:

DiskA was presented at, say, sdh (via SAN) and a multipath device was created on top of its paths; then diskA was unpresented and new disks were put back in the same zone. DiskB was now presented in the same slot (zone + lunid / sdh) and inherited by the still-in-place multipath device/mapping. In this case I don't believe there was ever a device-level event for sdh.

In our case we did not log a case with our vendor, as this was never supposed to happen: it seemed at the time to be how we expected it to work, and during the unpresent a script was supposed to have been run to clean up all of the dead paths before any new storage was presented.

You might have to verify that the path is still the same device any time you are recovering from a path-like failure, since in the above case the sdh device always existed but had the underlying storage/lun changed. I am not sure in his case whether this is what is going on, or whether the sdX device he is using actually goes away in his test.

-- dm-devel mailing list dm-devel@redhat.com https://www.redhat.com/mailman/listinfo/dm-devel
* Re: [dm-devel] [QUESTION]: multipath device with wrong path lead to metadata err 2021-01-20 13:02 ` Roger Heflin @ 2021-01-20 20:45 ` Martin Wilck 0 siblings, 0 replies; 18+ messages in thread From: Martin Wilck @ 2021-01-20 20:45 UTC (permalink / raw) To: Roger Heflin Cc: lixiaokeng, lihaotian9, linfeilong, dm-devel mailing list, liuzhiqiang (I) On Wed, 2021-01-20 at 07:02 -0600, Roger Heflin wrote: > > > I don't know if this helps, or is exactly like what he is > duplicating: > > I debugged and verified a corruption issue a few years ago where this > was what happened: > > DiskA was presented at say sdh (via SAN) and a multipath device was > created on top of its paths, then diskA was unpresented and new disks > were put back in the same zone. > DiskB was now presented in the same slot (zone+lunid/sdh) and > inherited by the still in place multipath device/mapping. In this > case I don't believe there was ever a device level event for sdh. If there really was no event, what should multipathd have done? We can't poll SCSI devices for device ID changes. Even if we did, how often should we do it? A wrong disk in an array might lead to data corruption in the course of just a few ms. Anyway, if this was a few years ago, I'm confident that we do better in this area today than we used to, even if lixiaokeng's case shows that we still have dark areas. Regards Martin -- dm-devel mailing list dm-devel@redhat.com https://www.redhat.com/mailman/listinfo/dm-devel ^ permalink raw reply [flat|nested] 18+ messages in thread
end of thread, other threads:[~2021-02-05 11:49 UTC | newest]

Thread overview: 18+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page --
2021-01-18 11:08 [dm-devel] [QUESTION]: multipath device with wrong path lead to metadata err lixiaokeng
2021-01-19  9:41 ` Martin Wilck
2021-01-19 12:46 ` lixiaokeng
2021-01-19 21:57 ` Martin Wilck
2021-01-20  2:30 ` lixiaokeng
2021-01-20 14:07 ` Martin Wilck
2021-01-25  1:33 ` lixiaokeng
2021-01-25 12:28 ` Martin Wilck
2021-01-26  6:40 ` lixiaokeng
2021-01-26 11:14 ` lixiaokeng
2021-01-26 23:11 ` Martin Wilck
2021-01-28  8:27 ` lixiaokeng
2021-01-28 21:15 ` Martin Wilck
2021-02-04 11:25 ` lixiaokeng
2021-02-04 14:56 ` Martin Wilck
2021-02-05 11:49 ` lixiaokeng
2021-01-20 13:02 ` Roger Heflin
2021-01-20 20:45 ` Martin Wilck