linux-lvm.redhat.com archive mirror
* [linux-lvm]  system boot time regression when using lvm2-2.03.05
@ 2019-08-29 13:52 Heming Zhao
  2019-08-29 14:37 ` David Teigland
  0 siblings, 1 reply; 22+ messages in thread
From: Heming Zhao @ 2019-08-29 13:52 UTC (permalink / raw)
  To: linux-lvm, Martin Wilck

Hello List,

I found that lvm2-2.03 has a performance regression during system boot.

My env as below:
```
x86-64 qemu vm, 2 vcpus, 8G memory, 7 disks (1GB per disk)
each disk has 128 primary partitions, each partition is 6MB
total PVs: 896 (one PV per partition), VGs: 56 (one VG per 16 PVs),
LVs: 56 (one LV per VG)
```
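For reference, a layout like the one above could be generated with a script along these lines. This is a dry-run sketch that only prints the lvm commands; the disk names (/dev/vdb .. /dev/vdh) and the vgtst-N/lvN naming are assumptions, so adjust them for your VM before piping the output to sh:

```shell
#!/bin/sh
# Dry-run sketch of the test layout: 7 disks x 128 partitions = 896 PVs,
# grouped 16 PVs per VG (56 VGs), one 4MB LV per VG (56 LVs).
disks="b c d e f g h"        # assumed: /dev/vdb .. /dev/vdh

n=0; vg=0; devs=""
for d in $disks; do
    for p in $(seq 1 128); do
        echo "pvcreate /dev/vd${d}${p}"
        devs="$devs /dev/vd${d}${p}"
        n=$((n + 1))
        if [ $((n % 16)) -eq 0 ]; then
            vg=$((vg + 1))
            echo "vgcreate vgtst-$vg$devs"
            echo "lvcreate -n lv$vg -L 4m vgtst-$vg"
            devs=""
        fi
    done
done > lvm-setup-cmds.txt

echo "generated $(grep -c '^pvcreate' lvm-setup-cmds.txt) pvcreate commands"
```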

With lvm2-2.02 it only took a few seconds to reach the login
prompt, but lvm2-2.03 takes about 2 minutes.


## How to build lvm2-2.03.05

On a Fedora system:
1. Go to https://src.fedoraproject.org/rpms/lvm2/tree/master
2. git clone https://src.fedoraproject.org/rpms/lvm2.git
3. Use rpmbuild to build the rpm packages and install them.
4. Run mkinitrd after changing lvm.conf.

## Test results

The times below come from `systemd-analyze --no-pager blame`,
looking at the "lvm2-pvscan@major:minor.service" lines.
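To compare runs, the per-service times can be totaled with a small filter like the one below. The sample lines are illustrative, not the actual measurements from this report, and "Xmin Ys" entries are not handled:

```shell
# Sum the time of all lvm2-pvscan services in `systemd-analyze blame` output.
# Only plain "Ns" and "Nms" values are parsed in this sketch.
sample='2.341s lvm2-pvscan@252:3.service
1.102s lvm2-pvscan@252:4.service
701ms lvm2-pvscan@259:97.service
39.979s dracut-initqueue.service'

total=$(echo "$sample" | awk '
    /lvm2-pvscan/ {
        v = $1
        if (v ~ /ms$/)     { sub(/ms$/, "", v); sum += v / 1000 }
        else if (v ~ /s$/) { sub(/s$/, "", v);  sum += v }
    }
    END { printf "%.3f", sum }')

echo "total pvscan time: ${total}s"
```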

centos 7.6 (lvm2-2.02.180):
disable lvmetad: 2.341s
enable lvmetad: I waited more than two hours and the system still
could not reach the login phase.

fedora-server (kernel: 5.2.9-200, with the default installed
lvm2-2.02.183-3)
use_lvmetad=0: 187ms
use_lvmetad=1: (no test)

fedora-server (kernel: 5.2.9-200  with rpmbuild: lvm2-2.03.05)
event_activation=1: 2min 3.661s
event_activation=0: 1min 57.478s


Could you give me some advice on how to track down this issue?

Thank you.

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [linux-lvm] system boot time regression when using lvm2-2.03.05
  2019-08-29 13:52 [linux-lvm] system boot time regression when using lvm2-2.03.05 Heming Zhao
@ 2019-08-29 14:37 ` David Teigland
  2019-09-03  5:02   ` Heming Zhao
  0 siblings, 1 reply; 22+ messages in thread
From: David Teigland @ 2019-08-29 14:37 UTC (permalink / raw)
  To: LVM general discussion and development; +Cc: Martin Wilck

On Thu, Aug 29, 2019 at 01:52:48PM +0000, Heming Zhao wrote:
> fedora-server (kernel: 5.2.9-200  with rpmbuild: lvm2-2.03.05)
> event_activation=1: 2min 3.661s
> event_activation=0: 1min 57.478s

We've recently been looking at other similar cases, although not with the
unusual partitioning scheme you have.  Could you send the output of
systemctl status on each of the lvm2-pvscan services?  Also, try setting
obtain_device_list_from_udev=0 in lvm.conf to avoid some delays from udev.
Thanks, Dave


* Re: [linux-lvm] system boot time regression when using lvm2-2.03.05
  2019-08-29 14:37 ` David Teigland
@ 2019-09-03  5:02   ` Heming Zhao
  2019-09-03 15:17     ` David Teigland
  0 siblings, 1 reply; 22+ messages in thread
From: Heming Zhao @ 2019-09-03  5:02 UTC (permalink / raw)
  To: LVM general discussion and development, David Teigland; +Cc: Martin Wilck

Hello

I found that the mail I sent last Friday and yesterday did not appear
on the mailing list; it may have been blocked because of the
attachment. So I am resending it without the attachment.

--- below is the original mail ---
Hello David,

With event_activation=1 & obtain_device_list_from_udev=0 in lvm.conf,
the boot time became 1min 9.349s (before: 2min 3.661s).

With event_activation=0 & obtain_device_list_from_udev=0:
1min 7.219s (before: 1min 57.478s)
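For reference, the two settings tested above live in different lvm.conf sections. This is just the tested-workaround configuration, with section placement as in the stock lvm.conf, not a general recommendation:

```
devices {
    # Skip querying udev for the device list, to avoid udev-related delays.
    obtain_device_list_from_udev = 0
}
global {
    # 0 = activate via lvm2-activation services at fixed points during boot,
    # 1 = event-based activation via lvm2-pvscan services.
    event_activation = 0
}
```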

And the systemctl status for pvscan when time is 2min & 1min (obtain_xx=0)
```
# ls -lh *.txt
-rw-r--r--. 1 root root 640K Aug 30 19:17 systemctl-status-all-pvscan-1min.txt
-rw-r--r--. 1 root root 664K Aug 30 18:49 systemctl-status-all-pvscan-2min.txt
```
The script:
```
for i in `systemctl list-units | grep lvm2-pvscan | cut -d' ' -f 3`; do systemctl status $i; done > systemctl-status-all-pvscan-2min.txt
```

There are two test results from the above for loop (these files will
be deleted 3 months later):
1> systemctl-status-all-pvscan-1min.txt
    event_activation=1 & obtain_device_list_from_udev = 0

2> systemctl-status-all-pvscan-2min.txt
    event_activation=1 & obtain_device_list_from_udev = 1

Test result URL:
https://gist.github.com/zhaohem/f9951bb016962cdd07bf7c9d3d7fd525

Thanks.


* Re: [linux-lvm] system boot time regression when using lvm2-2.03.05
  2019-09-03  5:02   ` Heming Zhao
@ 2019-09-03 15:17     ` David Teigland
  2019-09-04  8:13       ` Heming Zhao
  0 siblings, 1 reply; 22+ messages in thread
From: David Teigland @ 2019-09-03 15:17 UTC (permalink / raw)
  To: Heming Zhao; +Cc: Martin Wilck, linux-lvm

On Tue, Sep 03, 2019 at 05:02:25AM +0000, Heming Zhao wrote:
> Test result URL:
> https://gist.github.com/zhaohem/f9951bb016962cdd07bf7c9d3d7fd525

At least part of the problem is caused by lvm waiting on udev, e.g.
WARNING: Device /dev/vdf76 not initialized in udev database even after waiting 10000000 microseconds.

I recently wrote this patch to stop that:
https://sourceware.org/git/?p=lvm2.git;a=commitdiff;h=0534cd9cd4066c88a7dd815f2f3206a177169334

With this older patch, obtain_device_list_from_udev=0 can also help avoid it:
https://sourceware.org/git/?p=lvm2.git;a=commitdiff;h=3ebce8dbd2d9afc031e0737f8feed796ec7a8df9

Also, I just pushed out this commit that makes the pvscan activations
faster when there are many PVs:
https://sourceware.org/git/?p=lvm2.git;a=commit;h=25b58310e3d606a85abc9bd50991ccb7ddcbfe25

Dave


* Re: [linux-lvm] system boot time regression when using lvm2-2.03.05
  2019-09-03 15:17     ` David Teigland
@ 2019-09-04  8:13       ` Heming Zhao
  2019-09-05 12:35         ` Heming Zhao
  0 siblings, 1 reply; 22+ messages in thread
From: Heming Zhao @ 2019-09-04  8:13 UTC (permalink / raw)
  To: David Teigland; +Cc: Martin Wilck, linux-lvm

Thanks for your reply.

I found that the latest lvm2 git source already contains your 3
commits, so I built lvm2 from today's git code. But there is no big
change compared with before.

ENV: fedora30 server edition, 896 PVs.

All results below are from today's work.

[with patch]
event_activation = 1 && obtain_device_list_from_udev = 1
boot time: 1min 44.295s

event_activation = 0 && obtain_device_list_from_udev = 0
boot time: 59.759s

[without patch]
event_activation = 1 && obtain_device_list_from_udev = 1
boot time: 1min 56.040s

event_activation = 0 && obtain_device_list_from_udev = 0
boot time: 1min 6.715s

Thanks.

On 9/3/19 11:17 PM, David Teigland wrote:
> On Tue, Sep 03, 2019 at 05:02:25AM +0000, Heming Zhao wrote:
>> Test result URL:
>> https://gist.github.com/zhaohem/f9951bb016962cdd07bf7c9d3d7fd525
> 
> At least part of the problem is caused by lvm waiting on udev, e.g.
> WARNING: Device /dev/vdf76 not initialized in udev database even after waiting 10000000 microseconds.
> 
> I recently wrote this patch to stop that:
> https://sourceware.org/git/?p=lvm2.git;a=commitdiff;h=0534cd9cd4066c88a7dd815f2f3206a177169334
> 
> With this older patch, obtain_device_list_from_udev=0 can also help avoid it:
> https://sourceware.org/git/?p=lvm2.git;a=commitdiff;h=3ebce8dbd2d9afc031e0737f8feed796ec7a8df9
> 
> Also, I just pushed out this commit that makes the pvscan activations
> faster when there are many PVs:
> https://sourceware.org/git/?p=lvm2.git;a=commit;h=25b58310e3d606a85abc9bd50991ccb7ddcbfe25
> 
> Dave
> 


* Re: [linux-lvm] system boot time regression when using lvm2-2.03.05
  2019-09-04  8:13       ` Heming Zhao
@ 2019-09-05 12:35         ` Heming Zhao
  2019-09-05 16:55           ` David Teigland
  0 siblings, 1 reply; 22+ messages in thread
From: Heming Zhao @ 2019-09-05 12:35 UTC (permalink / raw)
  To: David Teigland; +Cc: Martin Wilck, linux-lvm

Hello David,

Today I may have found the key to the regression.

In pvscan_cache_cmd, the code in the "#if 0 .. #endif" area below
takes a huge amount of time. When I booted with the modified code
below, the time dropped from 1min to 1.389s.

_online_pvscan_one corresponds to lvmetad_pvscan_single in the
lvm2-2.02 code, but lvmetad_pvscan_single was removed from lvm2 in
2.03. So the if() area below looks useless in lvm2-2.03.

pvscan_cache_cmd() //code for git latest version
{
    ...
         if (!dm_list_empty(&add_devs)) {
                 log_print("zhm %s %d", __func__, __LINE__);
                 label_scan_devs(cmd, cmd->filter, &add_devs);

                 dm_list_iterate_items(devl, &add_devs) {
                         dev = devl->dev;

                         if (dev->flags & DEV_FILTER_OUT_SCAN) {
                                 log_print("pvscan[%d] device %s excluded by filter.", getpid(), dev_name(dev));
                                 continue;
                         }

                         add_single_count++;
#if 0
                         //zhm: lvm2-2.02 func: lvmetad_pvscan_single()
                         if (!_online_pvscan_one(cmd, dev, NULL, complete_vgnames, saved_vgs, 0, &pvid_without_metadata))
                                 add_errors++;
#endif
                 }
         }
    ...
}

Thanks


On 9/4/19 4:13 PM, Heming Zhao wrote:
> Thanks for you reply.
> 
> I found the latest lvm2 git source code contains your 3 commits. So I
> built lvm2 with today's git code. But there is no big change as before.
> 
> ENV: fedora30 server edition, 896 PVs.
> 
> All below results from today's work.
> 
> [with patch]
> event_activation = 1 && obtain_device_list_from_udev = 1
> boot time: 1min 44.295s
> 
> event_activation = 0 && obtain_device_list_from_udev = 0
> boot time: 59.759s
> 
> [without patch]
> event_activation = 1 && obtain_device_list_from_udev = 1
> boot time: 1min 56.040s
> 
> event_activation = 0 && obtain_device_list_from_udev = 0
> boot time: 1min 6.715s
> 
> Thanks.
> 
> On 9/3/19 11:17 PM, David Teigland wrote:
>> On Tue, Sep 03, 2019 at 05:02:25AM +0000, Heming Zhao wrote:
>>> Test result URL:
>>> https://gist.github.com/zhaohem/f9951bb016962cdd07bf7c9d3d7fd525
>>
>> At least part of the problem is caused by lvm waiting on udev, e.g.
>> WARNING: Device /dev/vdf76 not initialized in udev database even after waiting 10000000 microseconds.
>>
>> I recently wrote this patch to stop that:
>> https://sourceware.org/git/?p=lvm2.git;a=commitdiff;h=0534cd9cd4066c88a7dd815f2f3206a177169334
>>
>> With this older patch, obtain_device_list_from_udev=0 can also help avoid it:
>> https://sourceware.org/git/?p=lvm2.git;a=commitdiff;h=3ebce8dbd2d9afc031e0737f8feed796ec7a8df9
>>
>> Also, I just pushed out this commit that makes the pvscan activations
>> faster when there are many PVs:
>> https://sourceware.org/git/?p=lvm2.git;a=commit;h=25b58310e3d606a85abc9bd50991ccb7ddcbfe25
>>
>> Dave
>>
> 
> _______________________________________________
> linux-lvm mailing list
> linux-lvm@redhat.com
> https://www.redhat.com/mailman/listinfo/linux-lvm
> read the LVM HOW-TO at http://tldp.org/HOWTO/LVM-HOWTO/
> 


* Re: [linux-lvm] system boot time regression when using lvm2-2.03.05
  2019-09-05 12:35         ` Heming Zhao
@ 2019-09-05 16:55           ` David Teigland
  2019-09-06  4:31             ` Heming Zhao
  0 siblings, 1 reply; 22+ messages in thread
From: David Teigland @ 2019-09-05 16:55 UTC (permalink / raw)
  To: Heming Zhao; +Cc: Martin Wilck, linux-lvm

On Thu, Sep 05, 2019 at 12:35:53PM +0000, Heming Zhao wrote:
> In pvscan_cache_cmd, the code in below area "#if 0 .. #endif take a huge 
> time. When I used below modified code to boot, the time reduced from 
> 1min to 1.389s.

That stops the command from doing any work.  I suspect that in your tests,
the "fast" case is not doing any activation, and the "slow" case is.
Please check where the LVs are being activated in the fast case.


* Re: [linux-lvm] system boot time regression when using lvm2-2.03.05
  2019-09-05 16:55           ` David Teigland
@ 2019-09-06  4:31             ` Heming Zhao
  2019-09-06  5:01               ` Heming Zhao
  0 siblings, 1 reply; 22+ messages in thread
From: Heming Zhao @ 2019-09-06  4:31 UTC (permalink / raw)
  To: David Teigland; +Cc: Martin Wilck, linux-lvm

The status:
```
[root@f30-lvmroot ~]# systemd-analyze blame | less
[root@f30-lvmroot ~]# pvs | tail -n 5
   /dev/vdh95  vgtst-54           lvm2 a--   4.00m 4.00m
   /dev/vdh96  vgtst-54           lvm2 a--   4.00m 4.00m
   /dev/vdh97  vgtst-55           lvm2 a--   4.00m    0
   /dev/vdh98  vgtst-55           lvm2 a--   4.00m 4.00m
   /dev/vdh99  vgtst-55           lvm2 a--   4.00m 4.00m
[root@f30-lvmroot ~]# vgs | tail -n 5
   vgtst-56            16   1   0 wz--n- 64.00m 60.00m
   vgtst-6             16   1   0 wz--n- 64.00m 60.00m
   vgtst-7             16   1   0 wz--n- 64.00m 60.00m
   vgtst-8             16   1   0 wz--n- 64.00m 60.00m
   vgtst-9             16   1   0 wz--n- 64.00m 60.00m
[root@f30-lvmroot ~]# lvs | tail -n 5
   vgtst-56-lv56 vgtst-56           -wi-a-----    4.00m
   vgtst-6-lv6   vgtst-6            -wi-a-----    4.00m
   vgtst-7-lv7   vgtst-7            -wi-a-----    4.00m
   vgtst-8-lv8   vgtst-8            -wi-a-----    4.00m
   vgtst-9-lv9   vgtst-9            -wi-a-----    4.00m
[root@f30-lvmroot ~]# pvs | wc -l
899
[root@f30-lvmroot ~]# vgs | wc -l
58
[root@f30-lvmroot ~]# lvs | wc -l
60
[root@f30-lvmroot ~]# rpm -qa | grep lvm2
lvm2-devel-2.03.06-3.fc30.x86_64
lvm2-dbusd-2.03.06-3.fc30.noarch
lvm2-2.03.06-3.fc30.x86_64
lvm2-lockd-2.03.06-3.fc30.x86_64
udisks2-lvm2-2.8.4-1.fc30.x86_64
lvm2-libs-2.03.06-3.fc30.x86_64
[root@f30-lvmroot ~]#
```

You can see the 'a' bit in the LV attrs.


Yesterday I only showed the key change of the modification; below is the complete patch.
1> Comment out the call to _online_pvscan_one in pvscan_cache_cmd.
2> Partly back out (with "#if 0") your commit 25b58310e3d606a85abc9bd50991ccb7ddcbfe25:
https://sourceware.org/git/?p=lvm2.git;a=commit;h=25b58310e3d606a85abc9bd50991ccb7ddcbfe25

```patch
diff --git a/tools/pvscan.c b/tools/pvscan.c
index b025ae3e6b..52a50af962 100644
--- a/tools/pvscan.c
+++ b/tools/pvscan.c
@@ -928,7 +928,7 @@ static int _online_vg_file_create(struct cmd_context *cmd, const char *vgname)
   * We end up with a list of struct devices that we need to
   * scan/read in order to process/activate the VG.
   */
-
+#if 0
  static int _get_devs_from_saved_vg(struct cmd_context *cmd, char *vgname,
  				   struct dm_list *saved_vgs,
  				   struct dm_list *devs)
@@ -1126,6 +1126,7 @@ out:
  	release_vg(vg);
  	return ret;
  }
+#endif
  
  static int _pvscan_aa(struct cmd_context *cmd, struct pvscan_aa_params *pp,
  		      struct dm_list *vgnames, struct dm_list *saved_vgs)
@@ -1166,7 +1167,9 @@ static int _pvscan_aa(struct cmd_context *cmd, struct pvscan_aa_params *pp,
  		destroy_processing_handle(cmd, handle);
  		return ECMD_PROCESSED;
  	}
-
+#if 1
+	ret = process_each_vg(cmd, 0, NULL, NULL, vgnames, READ_FOR_ACTIVATE, 0, handle, _pvscan_aa_single);
+#else
  	if (dm_list_size(vgnames) == 1) {
  		dm_list_iterate_items(sl, vgnames)
  			ret = _pvscan_aa_direct(cmd, pp, (char *)sl->str, saved_vgs);
@@ -1174,6 +1177,7 @@ static int _pvscan_aa(struct cmd_context *cmd, struct pvscan_aa_params *pp,
  		/* FIXME: suppress label scan in process_each if label scan already done? */
  		ret = process_each_vg(cmd, 0, NULL, NULL, vgnames, READ_FOR_ACTIVATE, 0, handle, _pvscan_aa_single);
  	}
+#endif
  
  	destroy_processing_handle(cmd, handle);
  
@@ -1418,9 +1422,10 @@ int pvscan_cache_cmd(struct cmd_context *cmd, int argc, char **argv)
  			}
  
  			add_single_count++;
-
+#if 0
  			if (!_online_pvscan_one(cmd, dev, NULL, complete_vgnames, saved_vgs, 0, &pvid_without_metadata))
  				add_errors++;
+#endif
  		}
  	}
```


On 9/6/19 12:55 AM, David Teigland wrote:
> On Thu, Sep 05, 2019 at 12:35:53PM +0000, Heming Zhao wrote:
>> In pvscan_cache_cmd, the code in below area "#if 0 .. #endif take a huge
>> time. When I used below modified code to boot, the time reduced from
>> 1min to 1.389s.
> 
> That stops the command from doing any work.  I suspect that in your tests,
> the "fast" case is not doing any activation, and the "slow" case is.
> Please check where the LVs are being activated in the fast case.
> 
> 


* Re: [linux-lvm] system boot time regression when using lvm2-2.03.05
  2019-09-06  4:31             ` Heming Zhao
@ 2019-09-06  5:01               ` Heming Zhao
  2019-09-06  6:51                 ` Martin Wilck
  0 siblings, 1 reply; 22+ messages in thread
From: Heming Zhao @ 2019-09-06  5:01 UTC (permalink / raw)
  To: David Teigland; +Cc: Martin Wilck, linux-lvm

I just tried applying only the patch below (without partly backing out
commit 25b58310e3). The attrs in the lvs output still have the 'a' bit.

```patch
+#if 0
    			if (!_online_pvscan_one(cmd, dev, NULL, complete_vgnames, saved_vgs, 0, &pvid_without_metadata))
    				add_errors++;
+#endif
```

The output of "systemd-analyze blame | head -n 10":
```
          59.279s systemd-udev-settle.service
          39.979s dracut-initqueue.service
           1.676s lvm2-activation-net.service
           1.605s initrd-switch-root.service
           1.330s NetworkManager-wait-online.service
           1.250s sssd.service
            958ms initrd-parse-etc.service
            931ms lvm2-activation-early.service
            701ms lvm2-pvscan@259:97.service
            700ms firewalld.service
```

On 9/6/19 12:31 PM, Heming Zhao wrote:
> The status:
> ```
> [root@f30-lvmroot ~]# systemd-analyze blame | less
> [root@f30-lvmroot ~]# pvs | tail -n 5
>     /dev/vdh95  vgtst-54           lvm2 a--   4.00m 4.00m
>     /dev/vdh96  vgtst-54           lvm2 a--   4.00m 4.00m
>     /dev/vdh97  vgtst-55           lvm2 a--   4.00m    0
>     /dev/vdh98  vgtst-55           lvm2 a--   4.00m 4.00m
>     /dev/vdh99  vgtst-55           lvm2 a--   4.00m 4.00m
> [root@f30-lvmroot ~]# vgs | tail -n 5
>     vgtst-56            16   1   0 wz--n- 64.00m 60.00m
>     vgtst-6             16   1   0 wz--n- 64.00m 60.00m
>     vgtst-7             16   1   0 wz--n- 64.00m 60.00m
>     vgtst-8             16   1   0 wz--n- 64.00m 60.00m
>     vgtst-9             16   1   0 wz--n- 64.00m 60.00m
> [root@f30-lvmroot ~]# lvs | tail -n 5
>     vgtst-56-lv56 vgtst-56           -wi-a-----    4.00m
>     vgtst-6-lv6   vgtst-6            -wi-a-----    4.00m
>     vgtst-7-lv7   vgtst-7            -wi-a-----    4.00m
>     vgtst-8-lv8   vgtst-8            -wi-a-----    4.00m
>     vgtst-9-lv9   vgtst-9            -wi-a-----    4.00m
> [root@f30-lvmroot ~]# pvs | wc -l
> 899
> [root@f30-lvmroot ~]# vgs | wc -l
> 58
> [root@f30-lvmroot ~]# lvs | wc -l
> 60
> [root@f30-lvmroot ~]# rpm -qa | grep lvm2
> lvm2-devel-2.03.06-3.fc30.x86_64
> lvm2-dbusd-2.03.06-3.fc30.noarch
> lvm2-2.03.06-3.fc30.x86_64
> lvm2-lockd-2.03.06-3.fc30.x86_64
> udisks2-lvm2-2.8.4-1.fc30.x86_64
> lvm2-libs-2.03.06-3.fc30.x86_64
> [root@f30-lvmroot ~]#
> ```
> 
> you can see the 'a' bit of lv attr.
> 
> 
> Yesterday I only showed the key change of the modification. below is the complete patch.
> 1>
> comment out calling _online_pvscan_one in pvscan_cache_cmd
> 2>
> partly backout (use "#if 0") your commit: 25b58310e3d606a85abc9bd50991ccb7ddcbfe25
> https://sourceware.org/git/?p=lvm2.git;a=commit;h=25b58310e3d606a85abc9bd50991ccb7ddcbfe25
> 
> ```patch
> diff --git a/tools/pvscan.c b/tools/pvscan.c
> index b025ae3e6b..52a50af962 100644
> --- a/tools/pvscan.c
> +++ b/tools/pvscan.c
> @@ -928,7 +928,7 @@ static int _online_vg_file_create(struct cmd_context *cmd, const char *vgname)
>     * We end up with a list of struct devices that we need to
>     * scan/read in order to process/activate the VG.
>     */
> -
> +#if 0
>    static int _get_devs_from_saved_vg(struct cmd_context *cmd, char *vgname,
>    				   struct dm_list *saved_vgs,
>    				   struct dm_list *devs)
> @@ -1126,6 +1126,7 @@ out:
>    	release_vg(vg);
>    	return ret;
>    }
> +#endif
>    
>    static int _pvscan_aa(struct cmd_context *cmd, struct pvscan_aa_params *pp,
>    		      struct dm_list *vgnames, struct dm_list *saved_vgs)
> @@ -1166,7 +1167,9 @@ static int _pvscan_aa(struct cmd_context *cmd, struct pvscan_aa_params *pp,
>    		destroy_processing_handle(cmd, handle);
>    		return ECMD_PROCESSED;
>    	}
> -
> +#if 1
> +	ret = process_each_vg(cmd, 0, NULL, NULL, vgnames, READ_FOR_ACTIVATE, 0, handle, _pvscan_aa_single);
> +#else
>    	if (dm_list_size(vgnames) == 1) {
>    		dm_list_iterate_items(sl, vgnames)
>    			ret = _pvscan_aa_direct(cmd, pp, (char *)sl->str, saved_vgs);
> @@ -1174,6 +1177,7 @@ static int _pvscan_aa(struct cmd_context *cmd, struct pvscan_aa_params *pp,
>    		/* FIXME: suppress label scan in process_each if label scan already done? */
>    		ret = process_each_vg(cmd, 0, NULL, NULL, vgnames, READ_FOR_ACTIVATE, 0, handle, _pvscan_aa_single);
>    	}
> +#endif
>    
>    	destroy_processing_handle(cmd, handle);
>    
> @@ -1418,9 +1422,10 @@ int pvscan_cache_cmd(struct cmd_context *cmd, int argc, char **argv)
>    			}
>    
>    			add_single_count++;
> -
> +#if 0
>    			if (!_online_pvscan_one(cmd, dev, NULL, complete_vgnames, saved_vgs, 0, &pvid_without_metadata))
>    				add_errors++;
> +#endif
>    		}
>    	}
> ```
> 
> 
> On 9/6/19 12:55 AM, David Teigland wrote:
>> On Thu, Sep 05, 2019 at 12:35:53PM +0000, Heming Zhao wrote:
>>> In pvscan_cache_cmd, the code in below area "#if 0 .. #endif take a huge
>>> time. When I used below modified code to boot, the time reduced from
>>> 1min to 1.389s.
>>
>> That stops the command from doing any work.  I suspect that in your tests,
>> the "fast" case is not doing any activation, and the "slow" case is.
>> Please check where the LVs are being activated in the fast case.
>>
>>
> 
> _______________________________________________
> linux-lvm mailing list
> linux-lvm@redhat.com
> https://www.redhat.com/mailman/listinfo/linux-lvm
> read the LVM HOW-TO at http://tldp.org/HOWTO/LVM-HOWTO/
> 


* Re: [linux-lvm] system boot time regression when using lvm2-2.03.05
  2019-09-06  5:01               ` Heming Zhao
@ 2019-09-06  6:51                 ` Martin Wilck
  2019-09-06  8:46                   ` Heming Zhao
  2019-09-06 14:03                   ` David Teigland
  0 siblings, 2 replies; 22+ messages in thread
From: Martin Wilck @ 2019-09-06  6:51 UTC (permalink / raw)
  To: LVM general discussion and development, David Teigland, Heming Zhao

On Fri, 2019-09-06 at 05:01 +0000, Heming Zhao wrote:
> I just tried to only apply below patch (didn't partly backout commit
> 25b58310e3).
> The attrs of lvs output still have 'a' bit.
> 
> ```patch
> +#if 0
>     			if (!_online_pvscan_one(cmd, dev, NULL,
> complete_vgnames, saved_vgs, 0, &pvid_without_metadata))
>     				add_errors++;
> +#endif


IIUC this would mean that you skip David's "pvs_online" file generation
entirely. How did the auto-activation happen, then?

> ```
> 
> the output of "systemd-analysis blame | head -n 10":
> ```
>           59.279s systemd-udev-settle.service
>           39.979s dracut-initqueue.service
>            1.676s lvm2-activation-net.service

Could it be that lvm2-activation-net.service activated the VGs? I can
imagine that that would be efficient, because when this service runs
late in the boot process, I'd expect all PVs to be online, so
everything can be activated in a single big swoop. Unfortunately, this
wouldn't work in general, as it would be too late for booting from LVM
volumes.

However, I thought all lvm2-activation... services were gone with LVM
2.03?

Regards
Martin


* Re: [linux-lvm] system boot time regression when using lvm2-2.03.05
  2019-09-06  6:51                 ` Martin Wilck
@ 2019-09-06  8:46                   ` Heming Zhao
  2019-09-06 14:15                     ` David Teigland
  2019-09-06 14:26                     ` David Teigland
  2019-09-06 14:03                   ` David Teigland
  1 sibling, 2 replies; 22+ messages in thread
From: Heming Zhao @ 2019-09-06  8:46 UTC (permalink / raw)
  To: Martin Wilck, LVM general discussion and development, David Teigland

_online_pvscan_one costs too much time during boot. Its main job is to
create files in /run/lvm/pvs_online, which take over the role of lvmetad.

When _online_pvscan_one is commented out, the following directories stay empty:
/run/lvm/pvs_online & /run/lvm/vgs_online

I am not familiar with the VG metadata layout. I GUESS the VG metadata is
recorded on the first PV of the VG's group of devices, so the
_online_pv_found job just uses the VG metadata to find/print whether this
VG is ready. This part of the code costs time and is of little use during
boot; in other words, it is not necessary for bringing VGs/LVs online.
```c
_online_pv_found()
{
     _online_pvid_file_create // call open() to create "/run/lvm/pvs_online/xx"

     ... ...

     //zhm: the code below just counts the PVs that are not yet online.
     dm_list_iterate_items(pvl, &vg->pvs) {
         if (!_online_pvid_file_exists((const char *)&pvl->pv->id.uuid))
             pvids_not_online++;

         /* Check if one of the devs on the command line is in this VG. */
         if (dev_args && dev_in_device_list(pvl->pv->dev, dev_args))
             dev_args_in_vg = 1;
     }
}
```
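The bookkeeping described above boils down to marker files named after PV IDs. A minimal standalone sketch of the idea, using a temp directory and made-up PV IDs instead of /run/lvm/pvs_online:

```shell
# Sketch of the pvs_online marker-file mechanism: one empty file per PV,
# created when the PV is seen; a VG is "complete" when no PV is missing.
online_dir=$(mktemp -d)

mark_online() { : > "$online_dir/$1"; }     # like _online_pvid_file_create
is_online()   { [ -e "$online_dir/$1" ]; }  # like _online_pvid_file_exists

# Pretend two of a VG's three PVs have been scanned so far.
mark_online pvid-0001
mark_online pvid-0002

not_online=0
for pvid in pvid-0001 pvid-0002 pvid-0003; do
    is_online "$pvid" || not_online=$((not_online + 1))
done
echo "$not_online PV(s) not yet online"

rm -rf "$online_dir"
```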

The core/key code for bringing LVs online is _pvscan_aa():
_pvscan_aa
  +-> _pvscan_aa_direct
  |    vgchange_activate
  |
  or
  |
  +-> process_each_vg   //this func can work without reading /run/lvm/pvs_online/xx
       _pvscan_aa_single
         vgchange_activate

So my first patch partly backs out commit 25b58310e3d6, using process_each_vg to activate the LVs; that function can work without reading /run/lvm/pvs_online/xx.

As for _pvscan_aa_direct(), I don't fully understand it yet; I need some time to dig into it.


On 9/6/19 2:51 PM, Martin Wilck wrote:
> On Fri, 2019-09-06 at 05:01 +0000, Heming Zhao wrote:
>> I just tried to only apply below patch (didn't partly backout commit
>> 25b58310e3).
>> The attrs of lvs output still have 'a' bit.
>>
>> ```patch
>> +#if 0
>>      			if (!_online_pvscan_one(cmd, dev, NULL,
>> complete_vgnames, saved_vgs, 0, &pvid_without_metadata))
>>      				add_errors++;
>> +#endif
> 
> 
> IIUC this would mean that you skip David's "pvs_online" file generation
> entirely. How did the auto-activation happen, then?
> 
>> ```
>>
>> the output of "systemd-analysis blame | head -n 10":
>> ```
>>            59.279s systemd-udev-settle.service
>>            39.979s dracut-initqueue.service
>>             1.676s lvm2-activation-net.service
> 
> Could it be that lvm2-activation-net.service activated the VGs? I can
> imagine that that would be efficient, because when this service runs
> late in the boot process, I'd expect all PVs to be online, so
> everything can be activated in a single big swoop. Unfortunately, this
> wouldn't work in general, as it would be too late for booting from LVM
> volumes.
> 
> However I thought all lvm2-acticvation... services were gone with LVM
> 2.03?
> 
> Regards
> Martin
> 
> 


* Re: [linux-lvm] system boot time regression when using lvm2-2.03.05
  2019-09-06  6:51                 ` Martin Wilck
  2019-09-06  8:46                   ` Heming Zhao
@ 2019-09-06 14:03                   ` David Teigland
  2019-09-09 11:42                     ` Heming Zhao
  1 sibling, 1 reply; 22+ messages in thread
From: David Teigland @ 2019-09-06 14:03 UTC (permalink / raw)
  To: Martin Wilck; +Cc: Heming Zhao, LVM general discussion and development

On Fri, Sep 06, 2019 at 08:51:47AM +0200, Martin Wilck wrote:
> IIUC this would mean that you skip David's "pvs_online" file generation
> entirely. How did the auto-activation happen, then?

I'd like to know which services/commands are activating the LVs.  In the
slow case it was clearly done by the lvm2-pvscan services, but in the fast
case it looked like it was not.

> Could it be that lvm2-activation-net.service activated the VGs?  I can
> imagine that that would be efficient, because when this service runs
> late in the boot process, I'd expect all PVs to be online, so
> everything can be activated in a single big swoop. Unfortunately, this
> wouldn't work in general, as it would be too late for booting from LVM
> volumes.
> 
> However I thought all lvm2-acticvation... services were gone with LVM
> 2.03?

They still exist.  In lvm 2.03, the lvm.conf event_activation setting
controls whether activation is event-based via lvm2-pvscan services, or
done by lvm2-activation services at fixed points during startup.  LVM
commands in initramfs could also be interfering and activating more than
the root LV.


* Re: [linux-lvm] system boot time regression when using lvm2-2.03.05
  2019-09-06  8:46                   ` Heming Zhao
@ 2019-09-06 14:15                     ` David Teigland
  2019-09-06 14:26                     ` David Teigland
  1 sibling, 0 replies; 22+ messages in thread
From: David Teigland @ 2019-09-06 14:15 UTC (permalink / raw)
  To: Heming Zhao; +Cc: Martin Wilck, LVM general discussion and development

On Fri, Sep 06, 2019 at 08:46:52AM +0000, Heming Zhao wrote:
> the core/key code for online lvs is _pvscan_aa():
> _pvscan_aa
>   +-> _pvscan_aa_direct
>   |    vgchange_activate
>   |
>   or
>   |
>   +-> process_each_vg   //this func can work without reading /run/lvm/pvs_online/xx
>        _pvscan_aa_single
>          vgchange_activate
> 
> So my first patch partly backout commit 25b58310e3d6. To use process_each_vg active lvs, this func can work without reading /run/lvm/pvs_online/xx.

That commit is a couple days old, so there could still be a bug in there,
but I think it's a distraction.  You reported this slowness prior to that
commit existing.  You could revert it if it's causing questions.

I don't see much use in testing modified code until we've determined that
a given command is indeed slower when doing the same thing as before.  If
you can do that, then collect the debug output from both old and new for
me to compare (-vvvv or lvm.conf file and verbose settings).


* Re: [linux-lvm] system boot time regression when using lvm2-2.03.05
  2019-09-06  8:46                   ` Heming Zhao
  2019-09-06 14:15                     ` David Teigland
@ 2019-09-06 14:26                     ` David Teigland
  1 sibling, 0 replies; 22+ messages in thread
From: David Teigland @ 2019-09-06 14:26 UTC (permalink / raw)
  To: Heming Zhao; +Cc: Martin Wilck, LVM general discussion and development

On Fri, Sep 06, 2019 at 08:46:52AM +0000, Heming Zhao wrote:
> the _online_pvscan_one cost too much time when booting.

The pvscan debug output has timestamps and should show any steps that are
slow or delayed somehow.


* Re: [linux-lvm] system boot time regression when using lvm2-2.03.05
  2019-09-06 14:03                   ` David Teigland
@ 2019-09-09 11:42                     ` Heming Zhao
  2019-09-09 14:09                       ` David Teigland
  0 siblings, 1 reply; 22+ messages in thread
From: Heming Zhao @ 2019-09-09 11:42 UTC (permalink / raw)
  To: David Teigland, Martin Wilck; +Cc: LVM general discussion and development

Hello David,

You are right.  Without calling _online_pvscan_one(), the PVs/VGs/LVs won't be activated.
The activation jobs are done later by systemd running the lvm2-activation-*.services.
With the current code, the boot process is mainly blocked by:
```
_pvscan_aa
  vgchange_activate
   _activate_lvs_in_vg
    sync_local_dev_names
     fs_unlock
      dm_udev_wait <=== this point!
```

For fix this boot time regression, it looks lvm2 should have a config item in lvm2.conf
i.e.: large_PV_boot_speedup.
When this item is 1, pvcan won't call _online_pvscan_one, then let lvm2-activation*.service
do the active jobs.
Is it a workable solution?
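Sketched as an lvm.conf fragment (the option name is a hypothetical proposal, not an existing lvm2 setting; event_activation=0 in released 2.03 is intended to behave similarly):

```
global {
	# Hypothetical: when 1, pvscan still records PVs under
	# /run/lvm/pvs_online but skips event-based autoactivation,
	# leaving activation to the lvm2-activation-*.services.
	large_PV_boot_speedup = 1
}
```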

Thanks

On 9/6/19 10:03 PM, David Teigland wrote:
> On Fri, Sep 06, 2019 at 08:51:47AM +0200, Martin Wilck wrote:
>> IIUC this would mean that you skip David's "pvs_online" file generation
>> entirely. How did the auto-activation happen, then?
> 
> I'd like to know which services/commands are activating the LVs.  In the
> slow case it was clearly done by the lvm2-pvscan services, but in the fast
> case it looked like it was not.
> 
>> Could it be that lvm2-activation-net.service activated the VGs?  I can
>> imagine that that would be efficient, because when this service runs
>> late in the boot process, I'd expect all PVs to be online, so
>> everything can be activated in a single big swoop. Unfortunately, this
>> wouldn't work in general, as it would be too late for booting from LVM
>> volumes.
>>
>> However I thought all lvm2-activation... services were gone with LVM
>> 2.03?
> 
> They still exist.  In lvm 2.03, the lvm.conf event_activation setting
> controls whether activation is event-based via lvm2-pvscan services, or
> done by lvm2-activation services at fixed points during startup.  LVM
> commands in initramfs could also be interfering and activating more than
> the root LV.
> 
> 


* Re: [linux-lvm] system boot time regression when using lvm2-2.03.05
  2019-09-09 11:42                     ` Heming Zhao
@ 2019-09-09 14:09                       ` David Teigland
  2019-09-10  8:01                         ` Martin Wilck
  0 siblings, 1 reply; 22+ messages in thread
From: David Teigland @ 2019-09-09 14:09 UTC (permalink / raw)
  To: Heming Zhao; +Cc: Martin Wilck, LVM general discussion and development

On Mon, Sep 09, 2019 at 11:42:17AM +0000, Heming Zhao wrote:
> Hello David,
> 
> You are right.  Without calling _online_pvscan_one(), the pv/vg/lv won't be activated.
> The activation jobs will be done later by systemd via the lvm2-activation-*.services.
> 
> Current code, the boot process is mainly blocked by:
> ```
> _pvscan_aa
>   vgchange_activate
>    _activate_lvs_in_vg
>     sync_local_dev_names
>      fs_unlock
>       dm_udev_wait <=== this point!
> ```

Thanks for debugging that.  With so many devices, one possibility that
comes to mind is this error you would probably have seen:
"Limit for the maximum number of semaphores reached"
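Each udev-sync cookie is backed by a SysV semaphore set, so a quick way to check whether that limit is in play is (a sketch; ipcs is from util-linux):

```shell
# Kernel-wide SysV semaphore limits
# (fields: SEMMSL  SEMMNS  SEMOPM  SEMMNI):
cat /proc/sys/kernel/sem

# Count currently allocated semaphore sets; leaked device-mapper udev
# cookies show up here with keys carrying libdevmapper's 0x0d4d prefix.
command -v ipcs >/dev/null && ipcs -s | wc -l || true
```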

> To fix this boot time regression, it looks like lvm2 should have a config item in lvm.conf,
> e.g. large_PV_boot_speedup.
> When this item is 1, pvscan won't call _online_pvscan_one, and instead lets the
> lvm2-activation-*.services do the activation jobs.
> Is that a workable solution?

We should look into fixing the udev problems.  I don't mind working around
udev when it won't do what we need; I'm not sure what the options are in
this case.


* Re: [linux-lvm] system boot time regression when using lvm2-2.03.05
  2019-09-09 14:09                       ` David Teigland
@ 2019-09-10  8:01                         ` Martin Wilck
  2019-09-10 15:20                           ` David Teigland
  0 siblings, 1 reply; 22+ messages in thread
From: Martin Wilck @ 2019-09-10  8:01 UTC (permalink / raw)
  To: Heming Zhao, David Teigland; +Cc: LVM general discussion and development

Hi David,

On Mon, 2019-09-09 at 09:09 -0500, David Teigland wrote:
> On Mon, Sep 09, 2019 at 11:42:17AM +0000, Heming Zhao wrote:
> > Hello David,
> > 
> > You are right.  Without calling _online_pvscan_one(), the pv/vg/lv
> > won't be activated.
> > The activation jobs will be done later by systemd via the
> > lvm2-activation-*.services.
> > 
> > Current code, the boot process is mainly blocked by:
> > ```
> > _pvscan_aa
> >   vgchange_activate
> >    _activate_lvs_in_vg
> >     sync_local_dev_names
> >      fs_unlock
> >       dm_udev_wait <=== this point!
> > ```
> 
> Thanks for debugging that.  With so many devices, one possibility
> that
> comes to mind is this error you would probably have seen:
> "Limit for the maximum number of semaphores reached"

Could you explain to us what's happening in this code? IIUC, an
incoming uevent triggers pvscan, which then possibly triggers VG
activation. That in turn would create more uevents. The pvscan process
then waits for uevents for the tree "root" of the activated LVs to be
processed.

Can't we move this waiting logic out of the uevent handling? It seems
weird to me that a process that acts on a uevent waits for the
completion of another, later uevent. This is almost guaranteed to cause
delays during "uevent storms". Is it really necessary?

Maybe we could create a separate service that would be responsible for
waiting for all these outstanding udev cookies?

Martin


* Re: [linux-lvm] system boot time regression when using lvm2-2.03.05
  2019-09-10  8:01                         ` Martin Wilck
@ 2019-09-10 15:20                           ` David Teigland
  2019-09-10 20:38                             ` Zdenek Kabelac
  0 siblings, 1 reply; 22+ messages in thread
From: David Teigland @ 2019-09-10 15:20 UTC (permalink / raw)
  To: Martin Wilck; +Cc: LVM general discussion and development, Heming Zhao

> > > _pvscan_aa
> > >   vgchange_activate
> > >    _activate_lvs_in_vg
> > >     sync_local_dev_names
> > >      fs_unlock
> > >       dm_udev_wait <=== this point!
> > > ```

> Could you explain to us what's happening in this code? IIUC, an
> incoming uevent triggers pvscan, which then possibly triggers VG
> activation. That in turn would create more uevents. The pvscan process
> then waits for uevents for the tree "root" of the activated LVs to be
> processed.
> 
> Can't we move this waiting logic out of the uevent handling? It seems
> weird to me that a process that acts on a uevent waits for the
> completion of another, later uevent. This is almost guaranteed to cause
> delays during "uevent storms". Is it really necessary?
> 
> Maybe we could create a separate service that would be responsible for
> waiting for all these outstanding udev cookies?

Peter Rajnoha walked me through the details of this, and explained that a
timeout as you describe looks quite possible given default timeouts, and
that lvm doesn't really require that udev wait.

So, I pushed out this patch to allow pvscan with --noudevsync:
https://sourceware.org/git/?p=lvm2.git;a=commitdiff;h=3e5e7fd6c93517278b2451a08f47e16d052babbb

You'll want to add that option to lvm2-pvscan.service; we can hopefully
update the service to use that if things look good from testing.
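For testing, a systemd drop-in keeps the packaged unit intact (the ExecStart shown is the stock 2.03 command line with --noudevsync appended; verify it against your installed lvm2-pvscan@.service before use):

```
# /etc/systemd/system/lvm2-pvscan@.service.d/noudevsync.conf
[Service]
ExecStart=
ExecStart=/usr/sbin/lvm pvscan --cache --activate ay --noudevsync %i
```

Follow with `systemctl daemon-reload`; if the unit is also copied into the initramfs, rebuild that too.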


* Re: [linux-lvm] system boot time regression when using lvm2-2.03.05
  2019-09-10 15:20                           ` David Teigland
@ 2019-09-10 20:38                             ` Zdenek Kabelac
  2019-09-11  7:17                               ` Martin Wilck
  0 siblings, 1 reply; 22+ messages in thread
From: Zdenek Kabelac @ 2019-09-10 20:38 UTC (permalink / raw)
  To: LVM general discussion and development, David Teigland, Martin Wilck
  Cc: Heming Zhao

Dne 10. 09. 19 v 17:20 David Teigland napsal(a):
>>>> _pvscan_aa
>>>>    vgchange_activate
>>>>     _activate_lvs_in_vg
>>>>      sync_local_dev_names
>>>>       fs_unlock
>>>>        dm_udev_wait <=== this point!
>>>> ```
> 
>> Could you explain to us what's happening in this code? IIUC, an
>> incoming uevent triggers pvscan, which then possibly triggers VG
>> activation. That in turn would create more uevents. The pvscan process
>> then waits for uevents for the tree "root" of the activated LVs to be
>> processed.
>>
>> Can't we move this waiting logic out of the uevent handling? It seems
>> weird to me that a process that acts on a uevent waits for the
>> completion of another, later uevent. This is almost guaranteed to cause
>> delays during "uevent storms". Is it really necessary?
>>
>> Maybe we could create a separate service that would be responsible for
>> waiting for all these outstanding udev cookies?
> 
> Peter Rajnoha walked me through the details of this, and explained that a
> timeout as you describe looks quite possible given default timeouts, and
> that lvm doesn't really require that udev wait.
> 
> So, I pushed out this patch to allow pvscan with --noudevsync:
> https://sourceware.org/git/?p=lvm2.git;a=commitdiff;h=3e5e7fd6c93517278b2451a08f47e16d052babbb
> 
> You'll want to add that option to lvm2-pvscan.service; we can hopefully
> update the service to use that if things look good from testing.

This is certainly a bug.

lvm2 surely does need to communicate with udev for any activation.

We can't run activation 'on-the-fly' without coordination on a system with
udev (so we do not issue a 'remove' while there is still an 'add' in progress).

Also, any more complex target like a thin pool needs to wait until the
metadata LV is ready for thin_check.

Regards

Zdenek


* Re: [linux-lvm] system boot time regression when using lvm2-2.03.05
  2019-09-10 20:38                             ` Zdenek Kabelac
@ 2019-09-11  7:17                               ` Martin Wilck
  2019-09-11  9:13                                 ` Zdenek Kabelac
  0 siblings, 1 reply; 22+ messages in thread
From: Martin Wilck @ 2019-09-11  7:17 UTC (permalink / raw)
  To: Zdenek Kabelac, LVM general discussion and development, David Teigland
  Cc: Heming Zhao

On Tue, 2019-09-10 at 22:38 +0200, Zdenek Kabelac wrote:
> Dne 10. 09. 19 v 17:20 David Teigland napsal(a):
> > > > > _pvscan_aa
> > > > >    vgchange_activate
> > > > >     _activate_lvs_in_vg
> > > > >      sync_local_dev_names
> > > > >       fs_unlock
> > > > >        dm_udev_wait <=== this point!
> > > > > ```
> > > Could you explain to us what's happening in this code? IIUC, an
> > > incoming uevent triggers pvscan, which then possibly triggers VG
> > > activation. That in turn would create more uevents. The pvscan
> > > process
> > > then waits for uevents for the tree "root" of the activated LVs
> > > to be
> > > processed.
> > > 
> > > Can't we move this waiting logic out of the uevent handling? It
> > > seems
> > > weird to me that a process that acts on a uevent waits for the
> > > completion of another, later uevent. This is almost guaranteed to
> > > cause
> > > delays during "uevent storms". Is it really necessary?
> > > 
> > > Maybe we could create a separate service that would be
> > > responsible for
> > > waiting for all these outstanding udev cookies?
> > 
> > Peter Rajnoha walked me through the details of this, and explained
> > that a
> > timeout as you describe looks quite possible given default
> > timeouts, and
> > that lvm doesn't really require that udev wait.
> > 
> > So, I pushed out this patch to allow pvscan with --noudevsync:
> > https://sourceware.org/git/?p=lvm2.git;a=commitdiff;h=3e5e7fd6c93517278b2451a08f47e16d052babbb
> > 
> > You'll want to add that option to lvm2-pvscan.service; we can
> > hopefully
> > update the service to use that if things look good from testing.
> 
> This is certainly a bug.
> 
> lvm2 surely does need to communicate with udev for any activation.
> 
> We can't run activation 'on-the-fly' without coordination on a
> system with udev (so we do not issue a 'remove' while there is
> still an 'add' in progress).
> 
> Also, any more complex target like a thin pool needs to wait until
> the metadata LV is ready for thin_check.

My idea was not to skip synchronization entirely, but to consider
moving it to a separate process / service. I surely don't want to re-
invent lvmetad, but Heming's findings show that it's more efficient to
do activation in a "single swoop" (like lvm2-activation.service) than
with many concurrent pvscan processes.

So instead of activating a VG immediately when it sees all necessary
PVs are detected, pvscan could simply spawn a new service which would
then take care of the activation, and sync with udev.

Just a thought, I lack in-depth knowledge of LVM2 internals to know if
it's possible.

Thanks
Martin


* Re: [linux-lvm] system boot time regression when using lvm2-2.03.05
  2019-09-11  7:17                               ` Martin Wilck
@ 2019-09-11  9:13                                 ` Zdenek Kabelac
  2019-09-12 13:58                                   ` Martin Wilck
  0 siblings, 1 reply; 22+ messages in thread
From: Zdenek Kabelac @ 2019-09-11  9:13 UTC (permalink / raw)
  To: Martin Wilck, LVM general discussion and development, David Teigland
  Cc: Heming Zhao

Dne 11. 09. 19 v 9:17 Martin Wilck napsal(a):
> On Tue, 2019-09-10 at 22:38 +0200, Zdenek Kabelac wrote:
>> Dne 10. 09. 19 v 17:20 David Teigland napsal(a):
>>>>>> _pvscan_aa
>>>>>>     vgchange_activate
>>>>>>      _activate_lvs_in_vg
>>>>>>       sync_local_dev_names
>>>>>>        fs_unlock
>>>>>>         dm_udev_wait <=== this point!
>>>>>> ```
>>>> Could you explain to us what's happening in this code? IIUC, an
>>>> incoming uevent triggers pvscan, which then possibly triggers VG
>>>> activation. That in turn would create more uevents. The pvscan
>>>> process
>>>> then waits for uevents for the tree "root" of the activated LVs
>>>> to be
>>>> processed.
>>>>
>>>> Can't we move this waiting logic out of the uevent handling? It
>>>> seems
>>>> weird to me that a process that acts on a uevent waits for the
>>>> completion of another, later uevent. This is almost guaranteed to
>>>> cause
>>>> delays during "uevent storms". Is it really necessary?
>>>>
>>>> Maybe we could create a separate service that would be
>>>> responsible for
>>>> waiting for all these outstanding udev cookies?
>>>
>>> Peter Rajnoha walked me through the details of this, and explained
>>> that a
>>> timeout as you describe looks quite possible given default
>>> timeouts, and
>>> that lvm doesn't really require that udev wait.
>>>
>>> So, I pushed out this patch to allow pvscan with --noudevsync:
>>> https://sourceware.org/git/?p=lvm2.git;a=commitdiff;h=3e5e7fd6c93517278b2451a08f47e16d052babbb
>>>
>>> You'll want to add that option to lvm2-pvscan.service; we can
>>> hopefully
>>> update the service to use that if things look good from testing.
>>
>> This is certainly a bug.
>>
>> lvm2 surely does need to communicate with udev for any activation.
>>
>> We can't run activation 'on-the-fly' without coordination on a
>> system with udev (so we do not issue a 'remove' while there is
>> still an 'add' in progress).
>>
>> Also, any more complex target like a thin pool needs to wait until
>> the metadata LV is ready for thin_check.
> 
> My idea was not to skip synchronization entirely, but to consider
> moving it to a separate process / service. I surely don't want to re-
> invent lvmetad, but Heming's findings show that it's more efficient to
> do activation in a "single swoop" (like lvm2-activation.service) than
> with many concurrent pvscan processes.
> 
> So instead of activating a VG immediately when it sees all necessary
> PVs are detected, pvscan could simply spawn a new service which would
> then take care of the activation, and sync with udev.
> 
> Just a thought, I lack in-depth knowledge of LVM2 internals to know if
> it's possible.


Well, for a relatively long time we have wanted to move 'pvscan' back to being
processed within the udev rules, with the activation service being really just
a service doing 'vgchange -ay'.
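Stripped down, that split is essentially what the static activation services already do; such a unit amounts to roughly this (a sketch modelled on the generated lvm2-activation.service, not a drop-in replacement for it):

```
[Unit]
Description=LVM2 one-shot activation
DefaultDependencies=no
After=systemd-udev-settle.service

[Service]
Type=oneshot
RemainAfterExit=yes
ExecStart=/usr/sbin/lvm vgchange -aay --sysinit
```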

Another floating idea is to move towards monitoring instead of using semaphores
(since those SysV resources are somewhat limited and a bit problematic when
they are left behind in the system).


Regards

Zdenek


* Re: [linux-lvm] system boot time regression when using lvm2-2.03.05
  2019-09-11  9:13                                 ` Zdenek Kabelac
@ 2019-09-12 13:58                                   ` Martin Wilck
  0 siblings, 0 replies; 22+ messages in thread
From: Martin Wilck @ 2019-09-12 13:58 UTC (permalink / raw)
  To: Zdenek Kabelac, LVM general discussion and development, David Teigland
  Cc: Heming Zhao

On Wed, 2019-09-11 at 11:13 +0200, Zdenek Kabelac wrote:
> Dne 11. 09. 19 v 9:17 Martin Wilck napsal(a):
> > 
> > My idea was not to skip synchronization entirely, but to consider
> > moving it to a separate process / service. I surely don't want to
> > re-
> > invent lvmetad, but Heming's findings show that it's more efficient
> > to
> > do activation in a "single swoop" (like lvm2-activation.service)
> > than
> > with many concurrent pvscan processes.
> > 
> > So instead of activating a VG immediately when it sees all
> > necessary
> > PVs are detected, pvscan could simply spawn a new service which
> > would
> > then take care of the activation, and sync with udev.
> > 
> > Just a thought, I lack in-depth knowledge of LVM2 internals to know
> > if
> > it's possible.
> 
> Well, for a relatively long time we have wanted to move 'pvscan' back
> to being processed within the udev rules, with the activation service
> being really just a service doing 'vgchange -ay'.

That sounds promising (I believe pvscan could well still be called via
'ENV{SYSTEMD_WANTS}+=' rather than being directly called from udev
rules, but that's just a detail). 
But it doesn't sound as if such a solution was imminent, right?

> Another floating idea is to move towards monitoring instead of using
> semaphores (since those SysV resources are somewhat limited and a bit
> problematic when they are left behind in the system).

I'm not sure I understand - are you talking about udev monitoring?

Thanks
Martin


end of thread, other threads:[~2019-09-12 13:58 UTC | newest]

Thread overview: 22+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-08-29 13:52 [linux-lvm] system boot time regression when using lvm2-2.03.05 Heming Zhao
2019-08-29 14:37 ` David Teigland
2019-09-03  5:02   ` Heming Zhao
2019-09-03 15:17     ` David Teigland
2019-09-04  8:13       ` Heming Zhao
2019-09-05 12:35         ` Heming Zhao
2019-09-05 16:55           ` David Teigland
2019-09-06  4:31             ` Heming Zhao
2019-09-06  5:01               ` Heming Zhao
2019-09-06  6:51                 ` Martin Wilck
2019-09-06  8:46                   ` Heming Zhao
2019-09-06 14:15                     ` David Teigland
2019-09-06 14:26                     ` David Teigland
2019-09-06 14:03                   ` David Teigland
2019-09-09 11:42                     ` Heming Zhao
2019-09-09 14:09                       ` David Teigland
2019-09-10  8:01                         ` Martin Wilck
2019-09-10 15:20                           ` David Teigland
2019-09-10 20:38                             ` Zdenek Kabelac
2019-09-11  7:17                               ` Martin Wilck
2019-09-11  9:13                                 ` Zdenek Kabelac
2019-09-12 13:58                                   ` Martin Wilck

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).