linux-lvm.redhat.com archive mirror
* [linux-lvm] [PATCH 1/1] pvscan: wait for udevd
@ 2021-02-11 11:16 Christian Hesse
       [not found] ` <f18dff7a-050a-d41d-c643-4616522ba4a0@suse.com>
  0 siblings, 1 reply; 15+ messages in thread
From: Christian Hesse @ 2021-02-11 11:16 UTC (permalink / raw)
  To: linux-lvm; +Cc: Christian Hesse

From: Christian Hesse <mail@eworm.de>

Running the scan before udevd has finished starting up may result in
failure. This has been reported for Arch Linux [0], and proper ordering
fixes the issue.

[0] https://bugs.archlinux.org/task/69611

Signed-off-by: Christian Hesse <mail@eworm.de>
---
 scripts/lvm2-pvscan.service.in | 1 +
 1 file changed, 1 insertion(+)

diff --git a/scripts/lvm2-pvscan.service.in b/scripts/lvm2-pvscan.service.in
index 09753e8c9..7b4ace551 100644
--- a/scripts/lvm2-pvscan.service.in
+++ b/scripts/lvm2-pvscan.service.in
@@ -4,6 +4,7 @@ Documentation=man:pvscan(8)
 DefaultDependencies=no
 StartLimitIntervalSec=0
 BindsTo=dev-block-%i.device
+After=systemd-udevd.service
 Before=shutdown.target
 Conflicts=shutdown.target
 

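For reference, the same ordering can be applied locally without rebuilding
the package, via a systemd drop-in (a sketch; the drop-in path is a
conventional choice, not part of this patch):

```
# hypothetical drop-in equivalent to the one-line change above
mkdir -p /etc/systemd/system/lvm2-pvscan@.service.d
cat > /etc/systemd/system/lvm2-pvscan@.service.d/after-udevd.conf <<'EOF'
[Unit]
After=systemd-udevd.service
EOF
systemctl daemon-reload
```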

_______________________________________________
linux-lvm mailing list
linux-lvm@redhat.com
https://www.redhat.com/mailman/listinfo/linux-lvm
read the LVM HOW-TO at http://tldp.org/HOWTO/LVM-HOWTO/



* Re: [linux-lvm] [PATCH 1/1] pvscan: wait for udevd
       [not found] ` <f18dff7a-050a-d41d-c643-4616522ba4a0@suse.com>
@ 2021-02-17  8:22   ` Martin Wilck
       [not found]     ` <20210217130329.7de41147@leda>
  0 siblings, 1 reply; 15+ messages in thread
From: Martin Wilck @ 2021-02-17  8:22 UTC (permalink / raw)
  To: LVM general discussion and development,
	LVM general discussion and development, Christian Hesse
  Cc: Heming Zhao

On Wed, 2021-02-17 at 09:49 +0800, heming.zhao@suse.com wrote:
> On 2/11/21 7:16 PM, Christian Hesse wrote:
> > From: Christian Hesse <mail@eworm.de>
> > 
> > Running the scan before udevd finished startup may result in
> > failure.
> > This has been reported for Arch Linux [0] and proper ordering fixes
> > the issue.
> > 
> > [0] https://bugs.archlinux.org/task/69611
> > 
> > Signed-off-by: Christian Hesse <mail@eworm.de>
> > ---
> >   scripts/lvm2-pvscan.service.in | 1 +
> >   1 file changed, 1 insertion(+)
> > 
> > diff --git a/scripts/lvm2-pvscan.service.in b/scripts/lvm2-
> > pvscan.service.in
> > index 09753e8c9..7b4ace551 100644
> > --- a/scripts/lvm2-pvscan.service.in
> > +++ b/scripts/lvm2-pvscan.service.in
> > @@ -4,6 +4,7 @@ Documentation=man:pvscan(8)
> >   DefaultDependencies=no
> >   StartLimitIntervalSec=0
> >   BindsTo=dev-block-%i.device
> > +After=systemd-udevd.service
> >   Before=shutdown.target
> >   Conflicts=shutdown.target
> >   
> > 
> 
> I watched a similar issue with lvm2-monitor.service.
> On a very old machine (i586), udevd took too long to finish; this
> triggered a lvm2-monitor timeout, which then reported:
> > WARNING: Device /dev/sda not initialized in udev database even
> > after waiting 10000000 microseconds.
> 
> One workable solution is to add "systemd-udev-settle.service"
> (obsolete) or "local-fs.target" to the "After" of lvm2-monitor.service.

We have to differentiate here. In our case we had to wait for
"systemd-udev-settle.service". In the Arch case, it was only necessary
to wait for systemd-udevd.service itself. "After=systemd-udevd.service"
just means that the daemon is up; it says nothing about any device
initialization being completed.
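
To illustrate the distinction with standard tools (a sketch; note that
systemd-udev-settle.service essentially just runs "udevadm settle"):

```
# "After=systemd-udevd.service" only guarantees roughly this much:
systemctl is-active systemd-udevd.service
# waiting for queued uevents to be fully processed is a much stronger condition:
udevadm settle --timeout=10
```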

But in general, I think this needs deeper analysis. Looking at
https://bugs.archlinux.org/task/69611, the workaround appears to have
been found simply by drawing an analogy to a previous similar case.
I'd like to understand what happened on the arch system when the error
occurred, and why this simple ordering directive avoided it.

1. How had the offending pvscan process been started? I'd expect that
"pvscan" (unlike "lvm monitor" in our case) was started by a udev
rule. If udevd hadn't started yet, how would that udev rule have been
executed? OTOH, if pvscan had not been started by udev but by another
systemd service, then *that* service would probably need to get the
After=systemd-udevd.service directive.

2. Even without the "After=" directive, I'd assume that pvscan wasn't
started "before" systemd-udevd, but rather "simultaneously" (i.e. in
the same systemd transaction). Thus systemd-udevd should have started
up while pvscan was running, and pvscan should have noticed that udevd
eventually became available. Why did pvscan time out? What was it
waiting for? We know that lvm checks for the existence of
"/run/udev/control", but that should have become avaiable after some
fractions of a second of waiting.
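
For illustration, that check is roughly of this kind (a shell sketch;
/dev/sda is just an example device, and lvm's internal code differs):

```
# udevd's control socket exists as soon as the daemon is reachable...
test -S /run/udev/control && echo "udevd reachable"
# ...but that says nothing about whether a given device was processed yet:
udevadm info /dev/sda >/dev/null 2>&1 || echo "not yet in the udev database"
```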

Regards,
Martin




* Re: [linux-lvm] [PATCH 1/1] pvscan: wait for udevd
       [not found]     ` <20210217130329.7de41147@leda>
@ 2021-02-17 13:38       ` Oleksandr Natalenko
  2021-02-18 15:19         ` Martin Wilck
  2021-02-17 13:49       ` Martin Wilck
  1 sibling, 1 reply; 15+ messages in thread
From: Oleksandr Natalenko @ 2021-02-17 13:38 UTC (permalink / raw)
  To: Christian Hesse
  Cc: Oleksandr Natalenko, LVM general discussion and development,
	Heming Zhao, Martin Wilck,
	LVM general discussion and development

Hi.

Please check my log excerpts and comments below.

On Wed, Feb 17, 2021 at 01:03:29PM +0100, Christian Hesse wrote:
> > But in general, I think this needs deeper analysis. Looking at
> > https://bugs.archlinux.org/task/69611, the workaround appears to have
> > been found simply by drawing an analogy to a previous similar case.
> > I'd like to understand what happened on the arch system when the error
> > occurred, and why this simple ordering directive avoided it.
> 
> As I said, I cannot reproduce it myself... Oleksandr, can you give more details?
> Possibly everything from journal regarding systemd-udevd.service (and
> systemd-udevd.socket) and lvm2-pvscan@*.service could help.

I'm not sure this issue is reproducible with any kind of LVM layout.
What I have is thin-LVM-on-LUKS-on-LVM:

```
[~]> lsblk
NAME                                   MAJ:MIN RM   SIZE RO TYPE  MOUNTPOINT
nvme0n1                                259:0    0 238,5G  0 disk
├─nvme0n1p1                            259:1    0     8M  0 part  /boot/EFI
└─nvme0n1p2                            259:2    0 238,5G  0 part
  ├─base-boot                          254:0    0   256M  0 lvm   /boot
  └─base-sys                           254:1    0 238,2G  0 lvm
    └─sys                              254:2    0 238,2G  0 crypt
      ├─sys-swap                       254:3    0     2G  0 lvm   [SWAP]
      ├─sys-pool_tmeta                 254:4    0   120M  0 lvm
      │ └─sys-pool-tpool               254:6    0   236G  0 lvm
      │   ├─sys-pool                   254:7    0   236G  1 lvm
      │   ├─sys-root                   254:8    0   235G  0 lvm   /
      │   ├─sys-home                   254:9    0   235G  0 lvm   /home
      │   ├─sys-containers             254:10   0   235G  0 lvm   /mnt/containers
      │   ├─sys-vms                    254:11   0   235G  0 lvm   /mnt/vms
      │   ├─sys-email                  254:12   0   235G  0 lvm   /mnt/email
      │   ├─sys-vm--01--rhel6--sda     254:13   0     8G  0 lvm
      │   ├─sys-vm--02--rhel7--sda     254:14   0    16G  0 lvm
      │   ├─sys-vm--03--rhel8--sda     254:15   0     8G  0 lvm
      │   ├─sys-vm--04--rhel6--32--sda 254:16   0     8G  0 lvm
      │   ├─sys-vm--05--rhel5--sda     254:17   0     8G  0 lvm
      │   ├─sys-vm--06--fedora--sda    254:18   0     8G  0 lvm
      │   └─sys-vm--02--rhel7--sdb     254:19   0     1G  0 lvm
      └─sys-pool_tdata                 254:5    0   236G  0 lvm
        └─sys-pool-tpool               254:6    0   236G  0 lvm
          ├─sys-pool                   254:7    0   236G  1 lvm
          ├─sys-root                   254:8    0   235G  0 lvm   /
          ├─sys-home                   254:9    0   235G  0 lvm   /home
          ├─sys-containers             254:10   0   235G  0 lvm   /mnt/containers
          ├─sys-vms                    254:11   0   235G  0 lvm   /mnt/vms
          ├─sys-email                  254:12   0   235G  0 lvm   /mnt/email
          ├─sys-vm--01--rhel6--sda     254:13   0     8G  0 lvm
          ├─sys-vm--02--rhel7--sda     254:14   0    16G  0 lvm
          ├─sys-vm--03--rhel8--sda     254:15   0     8G  0 lvm
          ├─sys-vm--04--rhel6--32--sda 254:16   0     8G  0 lvm
          ├─sys-vm--05--rhel5--sda     254:17   0     8G  0 lvm
          ├─sys-vm--06--fedora--sda    254:18   0     8G  0 lvm
          └─sys-vm--02--rhel7--sdb     254:19   0     1G  0 lvm

```

(You may ask why? Because full disk encryption is mandated by my
employer, which is, surprise-surprise, Red Hat ☺; I'll Cc myself at my
@rh address as well, just to be sure I do not miss anything in the future.)

Also, this doesn't happen if the number of thin volumes is small (say,
fewer than 5). So, in order to reproduce this locally you may need to
throw quite a few LVs into the system.

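Something along these lines should produce enough of them (a hypothetical
sketch against a scratch VG called "vgtest"; adjust names and sizes):

```
# create a thin pool and a batch of thin volumes on a test VG
lvcreate -L 1G -T vgtest/pool
for i in $(seq 1 10); do
    lvcreate -V 100M -T vgtest/pool -n "thin$i"
done
```
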
The issue is reproducible with this layout as well:

```
[~]> lsblk
NAME                                     MAJ:MIN RM   SIZE RO TYPE   MOUNTPOINT
nvme0n1                                  259:0    0 465,8G  0 disk
├─nvme0n1p1                              259:1    0     8M  0 part   /boot/EFI
└─nvme0n1p2                              259:2    0   465G  0 part
  └─md0                                    9:0    0 464,9G  0 raid10
    ├─base-boot                          253:0    0   384M  0 lvm    /boot
    └─base-sys                           253:1    0 464,5G  0 lvm
      └─sys                              253:2    0 464,5G  0 crypt
        ├─sys-swap                       253:3    0     2G  0 lvm    [SWAP]
        ├─sys-pool_tmeta                 253:4    0   116M  0 lvm
        │ └─sys-pool-tpool               253:6    0 462,3G  0 lvm
        │   ├─sys-pool                   253:7    0 462,3G  1 lvm
        │   ├─sys-root                   253:8    0   462G  0 lvm    /
        │   ├─sys-home                   253:9    0   462G  0 lvm    /home
        │   ├─sys-vms                    253:10   0   462G  0 lvm    /mnt/vms
        │   ├─sys-vm--01--archlinux--sda 253:11   0     8G  0 lvm
        │   ├─sys-vm--02--windows--sda   253:12   0    50G  0 lvm
        │   └─sys-mail                   253:13   0   462G  0 lvm    /mnt/mail
        └─sys-pool_tdata                 253:5    0 462,3G  0 lvm
          └─sys-pool-tpool               253:6    0 462,3G  0 lvm
            ├─sys-pool                   253:7    0 462,3G  1 lvm
            ├─sys-root                   253:8    0   462G  0 lvm    /
            ├─sys-home                   253:9    0   462G  0 lvm    /home
            ├─sys-vms                    253:10   0   462G  0 lvm    /mnt/vms
            ├─sys-vm--01--archlinux--sda 253:11   0     8G  0 lvm
            ├─sys-vm--02--windows--sda   253:12   0    50G  0 lvm
            └─sys-mail                   253:13   0   462G  0 lvm    /mnt/mail
nvme1n1                                  259:3    0 465,8G  0 disk
├─nvme1n1p1                              259:4    0     8M  0 part
└─nvme1n1p2                              259:5    0   465G  0 part
  └─md0                                    9:0    0 464,9G  0 raid10
    ├─base-boot                          253:0    0   384M  0 lvm    /boot
    └─base-sys                           253:1    0 464,5G  0 lvm
      └─sys                              253:2    0 464,5G  0 crypt
        ├─sys-swap                       253:3    0     2G  0 lvm    [SWAP]
        ├─sys-pool_tmeta                 253:4    0   116M  0 lvm
        │ └─sys-pool-tpool               253:6    0 462,3G  0 lvm
        │   ├─sys-pool                   253:7    0 462,3G  1 lvm
        │   ├─sys-root                   253:8    0   462G  0 lvm    /
        │   ├─sys-home                   253:9    0   462G  0 lvm    /home
        │   ├─sys-vms                    253:10   0   462G  0 lvm    /mnt/vms
        │   ├─sys-vm--01--archlinux--sda 253:11   0     8G  0 lvm
        │   ├─sys-vm--02--windows--sda   253:12   0    50G  0 lvm
        │   └─sys-mail                   253:13   0   462G  0 lvm    /mnt/mail
        └─sys-pool_tdata                 253:5    0 462,3G  0 lvm
          └─sys-pool-tpool               253:6    0 462,3G  0 lvm
            ├─sys-pool                   253:7    0 462,3G  1 lvm
            ├─sys-root                   253:8    0   462G  0 lvm    /
            ├─sys-home                   253:9    0   462G  0 lvm    /home
            ├─sys-vms                    253:10   0   462G  0 lvm    /mnt/vms
            ├─sys-vm--01--archlinux--sda 253:11   0     8G  0 lvm
            ├─sys-vm--02--windows--sda   253:12   0    50G  0 lvm
            └─sys-mail                   253:13   0   462G  0 lvm    /mnt/mail
```

(this is the one from the Arch bug report).

With regard to the journal, here it is (from the same machine as in the
Arch bug report; matches the second layout above):

```
-- Journal begins at Fri 2020-12-18 16:33:22 CET, ends at Wed 2021-02-17 14:25:57 CET. --
Feb 10 17:24:26 archlinux systemd[1]: Stopping Rule-based Manager for Device Events and Files...
Feb 10 17:24:26 archlinux systemd[1]: systemd-udevd.service: Succeeded.
Feb 10 17:24:26 archlinux systemd[1]: Stopped Rule-based Manager for Device Events and Files.
Feb 10 17:24:27 spock systemd[1]: Started Rule-based Manager for Device Events and Files.
Feb 10 17:24:27 spock mtp-probe[897]: checking bus 6, device 5: "/sys/devices/pci0000:00/0000:00:08.1/0000:0d:00.3/usb6/6-4/6-4.4"
Feb 10 17:24:27 spock mtp-probe[896]: checking bus 6, device 2: "/sys/devices/pci0000:00/0000:00:08.1/0000:0d:00.3/usb6/6-1"
Feb 10 17:24:27 spock mtp-probe[899]: checking bus 6, device 4: "/sys/devices/pci0000:00/0000:00:08.1/0000:0d:00.3/usb6/6-4/6-4.3"
Feb 10 17:24:27 spock mtp-probe[898]: checking bus 6, device 6: "/sys/devices/pci0000:00/0000:00:08.1/0000:0d:00.3/usb6/6-4/6-4.5"
Feb 10 17:24:27 spock mtp-probe[897]: bus: 6, device: 5 was not an MTP device
Feb 10 17:24:27 spock mtp-probe[898]: bus: 6, device: 6 was not an MTP device
Feb 10 17:24:27 spock mtp-probe[899]: bus: 6, device: 4 was not an MTP device
Feb 10 17:24:27 spock mtp-probe[896]: bus: 6, device: 2 was not an MTP device
Feb 10 17:24:27 spock systemd-udevd[881]: ethtool: autonegotiation is unset or enabled, the speed and duplex are not writable.
Feb 10 17:24:27 spock systemd-udevd[887]: Using default interface naming scheme 'v247'.
Feb 10 17:24:27 spock systemd-udevd[887]: ethtool: autonegotiation is unset or enabled, the speed and duplex are not writable.
Feb 10 17:24:27 spock systemd-udevd[879]: Using default interface naming scheme 'v247'.
Feb 10 17:24:27 spock systemd-udevd[879]: ethtool: autonegotiation is unset or enabled, the speed and duplex are not writable.
Feb 10 17:24:27 spock systemd-udevd[887]: ethtool: autonegotiation is unset or enabled, the speed and duplex are not writable.
Feb 10 17:24:27 spock systemd-udevd[887]: ethtool: autonegotiation is unset or enabled, the speed and duplex are not writable.
Feb 10 17:24:27 spock systemd-udevd[826]: Using default interface naming scheme 'v247'.
Feb 10 17:24:27 spock systemd-udevd[826]: ethtool: autonegotiation is unset or enabled, the speed and duplex are not writable.
Feb 10 17:24:27 spock systemd-udevd[826]: ethtool: autonegotiation is unset or enabled, the speed and duplex are not writable.

[~]> LC_TIME=C sudo journalctl -b -10 -u systemd-udevd.socket
-- Journal begins at Fri 2020-12-18 16:33:22 CET, ends at Wed 2021-02-17 14:27:58 CET. --
-- No entries --

[~]> LC_TIME=C sudo journalctl -b -10 -u lvm2-pvscan@\*
-- Journal begins at Fri 2020-12-18 16:33:22 CET, ends at Wed 2021-02-17 14:28:05 CET. --
Feb 10 17:24:17 archlinux systemd[1]: Starting LVM event activation on device 9:0...
Feb 10 17:24:17 archlinux lvm[463]:   pvscan[463] PV /dev/md0 online, VG base is complete.
Feb 10 17:24:17 archlinux lvm[463]:   pvscan[463] VG base run autoactivation.
Feb 10 17:24:17 archlinux lvm[463]:   2 logical volume(s) in volume group "base" now active
Feb 10 17:24:17 archlinux systemd[1]: Finished LVM event activation on device 9:0.
Feb 10 17:24:26 archlinux systemd[1]: Starting LVM event activation on device 253:2...
Feb 10 17:24:26 archlinux lvm[643]:   pvscan[643] PV /dev/mapper/sys online, VG sys is complete.
Feb 10 17:24:26 archlinux lvm[643]:   pvscan[643] VG sys run autoactivation.
Feb 10 17:24:26 archlinux lvm[643]:   /usr/bin/dmeventd: stat failed: No such file or directory
Feb 10 17:24:26 archlinux lvm[643]:   WARNING: Failed to monitor sys/pool.
Feb 10 17:24:26 archlinux lvm[643]:   /usr/bin/dmeventd: stat failed: No such file or directory
Feb 10 17:24:26 archlinux lvm[643]:   WARNING: Failed to monitor sys/pool.
Feb 10 17:24:26 archlinux lvm[643]:   /usr/bin/dmeventd: stat failed: No such file or directory
Feb 10 17:24:26 archlinux lvm[643]:   WARNING: Failed to monitor sys/pool.
Feb 10 17:24:26 archlinux lvm[643]:   /usr/bin/dmeventd: stat failed: No such file or directory
Feb 10 17:24:26 archlinux lvm[643]:   WARNING: Failed to monitor sys/pool.
Feb 10 17:24:26 archlinux lvm[643]:   /usr/bin/dmeventd: stat failed: No such file or directory
Feb 10 17:24:26 archlinux lvm[643]:   WARNING: Failed to monitor sys/pool.
Feb 10 17:24:26 archlinux lvm[643]:   /usr/bin/dmeventd: stat failed: No such file or directory
Feb 10 17:24:26 archlinux lvm[643]:   WARNING: Failed to monitor sys/pool.
Feb 10 17:24:26 archlinux systemd[1]: Stopping LVM event activation on device 9:0...
Feb 10 17:24:26 archlinux lvm[720]:   pvscan[720] PV /dev/md0 online.
Feb 10 17:24:26 archlinux lvm[643]:   /usr/bin/dmeventd: stat failed: No such file or directory
Feb 10 17:24:26 archlinux lvm[643]:   WARNING: Failed to monitor sys/pool.
Feb 10 17:24:56 spock systemd[1]: lvm2-pvscan@253:2.service: State 'stop-sigterm' timed out. Killing.
Feb 10 17:24:56 spock systemd[1]: lvm2-pvscan@253:2.service: Killing process 643 (lvm) with signal SIGKILL.
Feb 10 17:24:56 spock systemd[1]: lvm2-pvscan@253:2.service: Main process exited, code=killed, status=9/KILL
Feb 10 17:24:56 spock systemd[1]: lvm2-pvscan@253:2.service: Failed with result 'timeout'.
Feb 10 17:24:56 spock systemd[1]: Stopped LVM event activation on device 253:2.

[~]> LC_TIME=C sudo journalctl -b -10 --grep pvscan
-- Journal begins at Fri 2020-12-18 16:33:22 CET, ends at Wed 2021-02-17 14:31:27 CET. --
Feb 10 17:24:17 archlinux systemd[1]: Created slice system-lvm2\x2dpvscan.slice.
Feb 10 17:24:17 archlinux lvm[463]:   pvscan[463] PV /dev/md0 online, VG base is complete.
Feb 10 17:24:17 archlinux lvm[463]:   pvscan[463] VG base run autoactivation.
Feb 10 17:24:17 archlinux audit[1]: SERVICE_START pid=1 uid=0 auid=4294967295 ses=4294967295 msg='unit=lvm2-pvscan@9:0 comm="systemd" exe="/init" hostname=? addr=? terminal=? res=success'
Feb 10 17:24:17 archlinux kernel: audit: type=1130 audit(1612974257.986:6): pid=1 uid=0 auid=4294967295 ses=4294967295 msg='unit=lvm2-pvscan@9:0 comm="systemd" exe="/init" hostname=? addr=? terminal=? res=success'
Feb 10 17:24:26 archlinux lvm[643]:   pvscan[643] PV /dev/mapper/sys online, VG sys is complete.
Feb 10 17:24:26 archlinux lvm[643]:   pvscan[643] VG sys run autoactivation.
Feb 10 17:24:26 archlinux lvm[720]:   pvscan[720] PV /dev/md0 online.
Feb 10 17:24:27 spock systemd[1]: lvm2-pvscan@9:0.service: Control process exited, code=killed, status=15/TERM
Feb 10 17:24:27 spock systemd[1]: lvm2-pvscan@9:0.service: Failed with result 'signal'.
Feb 10 17:24:26 spock audit[1]: SERVICE_STOP pid=1 uid=0 auid=4294967295 ses=4294967295 msg='unit=lvm2-pvscan@9:0 comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=failed'
Feb 10 17:24:27 spock systemd[1]: Requested transaction contradicts existing jobs: Transaction for lvm2-pvscan@253:2.service/start is destructive (lvm2-pvscan@253:2.service has 'stop' job queued, but 'start' is included in transaction).
Feb 10 17:24:27 spock systemd[1]: sys-devices-virtual-block-dm\x2d2.device: Failed to enqueue SYSTEMD_WANTS= job, ignoring: Transaction for lvm2-pvscan@253:2.service/start is destructive (lvm2-pvscan@253:2.service has 'stop' job queued, but 'start' is included in transaction).
Feb 10 17:24:56 spock systemd[1]: lvm2-pvscan@253:2.service: State 'stop-sigterm' timed out. Killing.
Feb 10 17:24:56 spock systemd[1]: lvm2-pvscan@253:2.service: Killing process 643 (lvm) with signal SIGKILL.
Feb 10 17:24:56 spock systemd[1]: lvm2-pvscan@253:2.service: Main process exited, code=killed, status=9/KILL
Feb 10 17:24:56 spock systemd[1]: lvm2-pvscan@253:2.service: Failed with result 'timeout'.
Feb 10 17:24:56 spock audit[1]: SERVICE_START pid=1 uid=0 auid=4294967295 ses=4294967295 msg='unit=lvm2-pvscan@253:2 comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=failed'
Feb 10 17:24:56 spock kernel: audit: type=1130 audit(1612974296.950:93): pid=1 uid=0 auid=4294967295 ses=4294967295 msg='unit=lvm2-pvscan@253:2 comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=failed'
Feb 10 17:29:57 spock systemd[1]: Removed slice system-lvm2\x2dpvscan.slice.
```

FWIW, here's what it looks like now, after adding the "After=" fix:

```
[~]> LC_TIME=C sudo journalctl -b -u systemd-udevd.service
-- Journal begins at Fri 2020-12-18 16:33:22 CET, ends at Wed 2021-02-17 14:28:47 CET. --
Feb 16 22:13:37 archlinux systemd[1]: Stopping Rule-based Manager for Device Events and Files...
Feb 16 22:13:37 archlinux systemd[1]: systemd-udevd.service: Succeeded.
Feb 16 22:13:37 archlinux systemd[1]: Stopped Rule-based Manager for Device Events and Files.
Feb 16 22:13:38 spock systemd[1]: Starting Rule-based Manager for Device Events and Files...
Feb 16 22:13:38 spock systemd[1]: Started Rule-based Manager for Device Events and Files.
Feb 16 22:13:38 spock mtp-probe[895]: checking bus 6, device 2: "/sys/devices/pci0000:00/0000:00:08.1/0000:0d:00.3/usb6/6-1"
Feb 16 22:13:38 spock mtp-probe[897]: checking bus 6, device 6: "/sys/devices/pci0000:00/0000:00:08.1/0000:0d:00.3/usb6/6-4/6-4.5"
Feb 16 22:13:38 spock mtp-probe[898]: checking bus 6, device 5: "/sys/devices/pci0000:00/0000:00:08.1/0000:0d:00.3/usb6/6-4/6-4.4"
Feb 16 22:13:38 spock mtp-probe[896]: checking bus 6, device 4: "/sys/devices/pci0000:00/0000:00:08.1/0000:0d:00.3/usb6/6-4/6-4.3"
Feb 16 22:13:38 spock mtp-probe[897]: bus: 6, device: 6 was not an MTP device
Feb 16 22:13:38 spock mtp-probe[898]: bus: 6, device: 5 was not an MTP device
Feb 16 22:13:38 spock mtp-probe[896]: bus: 6, device: 4 was not an MTP device
Feb 16 22:13:38 spock mtp-probe[895]: bus: 6, device: 2 was not an MTP device
Feb 16 22:13:38 spock systemd-udevd[825]: ethtool: autonegotiation is unset or enabled, the speed and duplex are not writable.
Feb 16 22:13:38 spock systemd-udevd[866]: Using default interface naming scheme 'v247'.
Feb 16 22:13:38 spock systemd-udevd[860]: Using default interface naming scheme 'v247'.
Feb 16 22:13:38 spock systemd-udevd[866]: ethtool: autonegotiation is unset or enabled, the speed and duplex are not writable.
Feb 16 22:13:38 spock systemd-udevd[860]: ethtool: autonegotiation is unset or enabled, the speed and duplex are not writable.
Feb 16 22:13:38 spock systemd-udevd[833]: Using default interface naming scheme 'v247'.
Feb 16 22:13:38 spock systemd-udevd[833]: ethtool: autonegotiation is unset or enabled, the speed and duplex are not writable.
Feb 16 22:13:38 spock systemd-udevd[833]: ethtool: autonegotiation is unset or enabled, the speed and duplex are not writable.
Feb 16 22:13:39 spock systemd-udevd[892]: Using default interface naming scheme 'v247'.
Feb 16 22:13:39 spock systemd-udevd[892]: ethtool: autonegotiation is unset or enabled, the speed and duplex are not writable.
Feb 16 22:13:39 spock systemd-udevd[842]: Using default interface naming scheme 'v247'.
Feb 16 22:13:39 spock systemd-udevd[842]: ethtool: autonegotiation is unset or enabled, the speed and duplex are not writable.

[~]> LC_TIME=C sudo journalctl -b -u systemd-udevd.socket
-- Journal begins at Fri 2020-12-18 16:33:22 CET, ends at Wed 2021-02-17 14:29:04 CET. --
-- No entries --

[~]> LC_TIME=C sudo journalctl -b -u lvm2-pvscan@\*
-- Journal begins at Fri 2020-12-18 16:33:22 CET, ends at Wed 2021-02-17 14:29:15 CET. --
Feb 16 22:13:27 archlinux systemd[1]: Starting LVM event activation on device 9:0...
Feb 16 22:13:27 archlinux lvm[495]:   pvscan[495] PV /dev/md0 online, VG base is complete.
Feb 16 22:13:27 archlinux lvm[495]:   pvscan[495] VG base run autoactivation.
Feb 16 22:13:27 archlinux lvm[495]:   2 logical volume(s) in volume group "base" now active
Feb 16 22:13:28 archlinux systemd[1]: Finished LVM event activation on device 9:0.
Feb 16 22:13:37 archlinux systemd[1]: Starting LVM event activation on device 253:2...
Feb 16 22:13:37 archlinux lvm[665]:   pvscan[665] PV /dev/mapper/sys online, VG sys is complete.
Feb 16 22:13:37 archlinux lvm[665]:   pvscan[665] VG sys run autoactivation.
Feb 16 22:13:37 archlinux lvm[665]:   /usr/bin/dmeventd: stat failed: No such file or directory
Feb 16 22:13:37 archlinux lvm[665]:   WARNING: Failed to monitor sys/pool.
Feb 16 22:13:37 archlinux lvm[665]:   /usr/bin/dmeventd: stat failed: No such file or directory
Feb 16 22:13:37 archlinux lvm[665]:   WARNING: Failed to monitor sys/pool.
Feb 16 22:13:37 archlinux lvm[665]:   /usr/bin/dmeventd: stat failed: No such file or directory
Feb 16 22:13:37 archlinux lvm[665]:   WARNING: Failed to monitor sys/pool.
Feb 16 22:13:37 archlinux lvm[665]:   /usr/bin/dmeventd: stat failed: No such file or directory
Feb 16 22:13:37 archlinux lvm[665]:   WARNING: Failed to monitor sys/pool.
Feb 16 22:13:37 archlinux lvm[665]:   /usr/bin/dmeventd: stat failed: No such file or directory
Feb 16 22:13:37 archlinux lvm[665]:   WARNING: Failed to monitor sys/pool.
Feb 16 22:13:37 archlinux lvm[665]:   /usr/bin/dmeventd: stat failed: No such file or directory
Feb 16 22:13:37 archlinux lvm[665]:   WARNING: Failed to monitor sys/pool.
Feb 16 22:13:37 archlinux systemd[1]: Stopping LVM event activation on device 9:0...
Feb 16 22:13:37 archlinux lvm[743]:   pvscan[743] PV /dev/md0 online.
Feb 16 22:13:37 archlinux lvm[665]:   /usr/bin/dmeventd: stat failed: No such file or directory
Feb 16 22:13:37 archlinux lvm[665]:   WARNING: Failed to monitor sys/pool.
Feb 16 22:13:37 archlinux lvm[665]:   8 logical volume(s) in volume group "sys" now active
Feb 16 22:13:37 archlinux systemd[1]: lvm2-pvscan@9:0.service: Succeeded.
Feb 16 22:13:37 archlinux systemd[1]: Stopped LVM event activation on device 9:0.
Feb 16 22:13:37 archlinux systemd[1]: lvm2-pvscan@253:2.service: Main process exited, code=killed, status=15/TERM
Feb 16 22:13:37 archlinux systemd[1]: lvm2-pvscan@253:2.service: Failed with result 'signal'.
Feb 16 22:13:37 archlinux systemd[1]: Stopped LVM event activation on device 253:2.
Feb 16 22:13:38 spock systemd[1]: Starting LVM event activation on device 253:2...
Feb 16 22:13:38 spock lvm[929]:   pvscan[929] PV /dev/mapper/sys online, VG sys is complete.
Feb 16 22:13:38 spock lvm[929]:   pvscan[929] VG sys skip autoactivation.
Feb 16 22:13:38 spock systemd[1]: Finished LVM event activation on device 253:2.

[~]> LC_TIME=C sudo journalctl -b --grep pvscan | /bin/cat -
-- Journal begins at Fri 2020-12-18 16:33:22 CET, ends at Wed 2021-02-17 14:31:58 CET. --
Feb 16 22:13:27 archlinux systemd[1]: Created slice system-lvm2\x2dpvscan.slice.
Feb 16 22:13:27 archlinux lvm[495]:   pvscan[495] PV /dev/md0 online, VG base is complete.
Feb 16 22:13:27 archlinux lvm[495]:   pvscan[495] VG base run autoactivation.
Feb 16 22:13:28 archlinux audit[1]: SERVICE_START pid=1 uid=0 auid=4294967295 ses=4294967295 msg='unit=lvm2-pvscan@9:0 comm="systemd" exe="/init" hostname=? addr=? terminal=? res=success'
Feb 16 22:13:28 archlinux kernel: audit: type=1130 audit(1613510008.000:6): pid=1 uid=0 auid=4294967295 ses=4294967295 msg='unit=lvm2-pvscan@9:0 comm="systemd" exe="/init" hostname=? addr=? terminal=? res=success'
Feb 16 22:13:37 archlinux lvm[665]:   pvscan[665] PV /dev/mapper/sys online, VG sys is complete.
Feb 16 22:13:37 archlinux lvm[665]:   pvscan[665] VG sys run autoactivation.
Feb 16 22:13:37 archlinux lvm[743]:   pvscan[743] PV /dev/md0 online.
Feb 16 22:13:37 archlinux systemd[1]: lvm2-pvscan@9:0.service: Succeeded.
Feb 16 22:13:37 archlinux audit[1]: SERVICE_STOP pid=1 uid=0 auid=4294967295 ses=4294967295 msg='unit=lvm2-pvscan@9:0 comm="systemd" exe="/init" hostname=? addr=? terminal=? res=success'
Feb 16 22:13:37 archlinux systemd[1]: lvm2-pvscan@253:2.service: Main process exited, code=killed, status=15/TERM
Feb 16 22:13:37 archlinux systemd[1]: lvm2-pvscan@253:2.service: Failed with result 'signal'.
Feb 16 22:13:37 archlinux audit[1]: SERVICE_START pid=1 uid=0 auid=4294967295 ses=4294967295 msg='unit=lvm2-pvscan@253:2 comm="systemd" exe="/init" hostname=? addr=? terminal=? res=failed'
Feb 16 22:13:38 spock lvm[929]:   pvscan[929] PV /dev/mapper/sys online, VG sys is complete.
Feb 16 22:13:38 spock lvm[929]:   pvscan[929] VG sys skip autoactivation.
Feb 16 22:13:38 spock audit[1]: SERVICE_START pid=1 uid=0 auid=4294967295 ses=4294967295 msg='unit=lvm2-pvscan@253:2 comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'
```

Note that I do not see any trace of systemd-udevd.socket there. Am I
missing something?

Also, I see the following difference between these runs:

```
Feb 10 17:24:27 spock systemd[1]: Requested transaction contradicts existing jobs: Transaction for lvm2-pvscan@253:2.service/start is destructive (lvm2-pvscan@253:2.service has 'stop' job queued, but 'start' is included in transaction).
Feb 10 17:24:27 spock systemd[1]: sys-devices-virtual-block-dm\x2d2.device: Failed to enqueue SYSTEMD_WANTS= job, ignoring: Transaction for lvm2-pvscan@253:2.service/start is destructive (lvm2-pvscan@253:2.service has 'stop' job queued, but 'start' is included in transaction).
```

I'm not sure I understand what it means, though. Likely, this is just
because of the first pvscan invocation being stuck.

If needed, I can intentionally break one of the machines again, but to
do that please let me know what kind of data I should collect. I'm not
sure a plain journal will reveal more; maybe I should apply some extra
debug options?
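
For instance, something like this (assumptions on my side: standard kernel
command-line switches and lvm.conf knobs):

```
# boot once with: udev.log-priority=debug systemd.log_level=debug
# optionally raise lvm verbosity in /etc/lvm/lvm.conf:  log { level = 7 }
# then collect a merged, monotonic timeline:
journalctl -b -o short-monotonic -u systemd-udevd.service -u 'lvm2-pvscan@*' --no-pager
```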

Thanks.

-- 
  Oleksandr Natalenko (post-factum)



* Re: [linux-lvm] [PATCH 1/1] pvscan: wait for udevd
       [not found]     ` <20210217130329.7de41147@leda>
  2021-02-17 13:38       ` Oleksandr Natalenko
@ 2021-02-17 13:49       ` Martin Wilck
  2021-02-17 19:11         ` Oleksandr Natalenko
  1 sibling, 1 reply; 15+ messages in thread
From: Martin Wilck @ 2021-02-17 13:49 UTC (permalink / raw)
  To: Christian Hesse
  Cc: LVM general discussion and development, Heming Zhao,
	Oleksandr Natalenko, LVM general discussion and development

On Wed, 2021-02-17 at 13:03 +0100, Christian Hesse wrote:
> 
> Let's keep this in mind. Now let's have a look at udevd startup: it
> signals being ready by calling sd_notifyf(), but it loads rules and
> applies permissions before doing so [0]. Even before that, there is
> some code for handling events and monitoring.

It loads the rules, but events will only be processed after entering
sd_event_loop(), which happens after the sd_notify() call.

Anyway, booting the system with "udev.log-priority=debug" might provide
further insight. Oleksandr, could you try that (without the After=
directive)?

> So I guess pvscan is started in the initialization phase, before
> udevd signals being ready. And obviously there is some kind of race
> condition.

Right. Some uevent might arrive between the creation of the monitor
socket in monitor_new() and entering the event loop. Such an event would
be handled immediately, and possibly before systemd receives the
sd_notify message, so a race condition looks possible.

> 
> With the ordering "After=" in `lvm2-pvscan@.service`, the service
> start is queued in the initialization phase, but the actual start and
> pvscan execution are delayed until udevd has signaled being ready.
> 
> > But in general, I think this needs deeper analysis. Looking at
> > https://bugs.archlinux.org/task/69611, the workaround appears to
> > have
> > been found simply by drawing an analogy to a previous similar case.
> > I'd like to understand what happened on the arch system when the
> > error
> > occurred, and why this simple ordering directive avoided it.
> 
> As said I can not reproduce it myself... Oleksandr, can you give more
> details?
> Possibly everything from journal regarding systemd-udevd.service (and
> systemd-udevd.socket) and lvm2-pvscan@*.service could help.
> 
> > 1. How had the offending pvscan process been started? I'd expect
> > that "pvscan" (unlike "lvm monitor" in our case) was started by a
> > udev rule. If udevd hadn't started yet, how would that udev rule
> > have been executed? OTOH, if pvscan had not been started by udev but
> > by another systemd service, then *that* service would probably need
> > to get the After=systemd-udevd.service directive.
> 
> To my understanding it was started from udevd by a rule in
> `69-dm-lvm-metad.rules`.
> 
> (BTW, renaming that rule file may make sense now that lvm2-metad is
> gone...)
> 
> > 2. Even without the "After=" directive, I'd assume that pvscan
> > wasn't started "before" systemd-udevd, but rather "simultaneously"
> > (i.e. in the same systemd transaction). Thus systemd-udevd should
> > have started up while pvscan was running, and pvscan should have
> > noticed that udevd eventually became available. Why did pvscan time
> > out? What was it waiting for? We know that lvm checks for the
> > existence of "/run/udev/control", but that should have become
> > available after some fraction of a second of waiting.
> 
> I do not think there is anything starting pvscan before udevd.

I agree. The race described above looks at least possible.
I would go one step further and say that *every* systemd service that
might be started from a udev rule should have an
"After=systemd-udevd.service".

Martin



* Re: [linux-lvm] [PATCH 1/1] pvscan: wait for udevd
  2021-02-17 13:49       ` Martin Wilck
@ 2021-02-17 19:11         ` Oleksandr Natalenko
  0 siblings, 0 replies; 15+ messages in thread
From: Oleksandr Natalenko @ 2021-02-17 19:11 UTC (permalink / raw)
  To: Martin Wilck
  Cc: Oleksandr Natalenko, LVM general discussion and development,
	Christian Hesse, Heming Zhao,
	LVM general discussion and development

Hello.

On Wed, Feb 17, 2021 at 02:49:00PM +0100, Martin Wilck wrote:
> On Wed, 2021-02-17 at 13:03 +0100, Christian Hesse wrote:
> > 
> > Let's keep this in mind. Now let's have a look at udevd startup: It
> > signals
> > being ready by calling sd_notifyf(), but it loads rules and applies
> > permissions before doing so [0].
> > Even before we have some code about handling events and monitoring
> > stuff.
> 
> It loads the rules, but events will only be processed after entering
> sd_event_loop(), which happens after the sd_notify() call.
> 
> Anyway, booting the system with "udev.log-priority=debug" might provide
> further insight. Oleksandr, could you try that (without the After=
> directive)?

Yes.

pvscan: http://ix.io/2PLK
udev: http://ix.io/2PLL
lvm: http://ix.io/2PLM

Let me know if I can collect something else.

Thanks.

-- 
  Oleksandr Natalenko (post-factum)


* Re: [linux-lvm] [PATCH 1/1] pvscan: wait for udevd
  2021-02-17 13:38       ` Oleksandr Natalenko
@ 2021-02-18 15:19         ` Martin Wilck
  2021-02-18 15:30           ` Oleksandr Natalenko
  2021-02-19 16:37           ` David Teigland
  0 siblings, 2 replies; 15+ messages in thread
From: Martin Wilck @ 2021-02-18 15:19 UTC (permalink / raw)
  To: LVM general discussion and development, Christian Hesse
  Cc: Oleksandr Natalenko, LVM general discussion and development, Heming Zhao

On Wed, 2021-02-17 at 14:38 +0100, Oleksandr Natalenko wrote:
> Hi.
> 

Thanks for the logs!

> I'm not sure this issue is reproducible with any kind of LVM layout.
> What I have is thin-LVM-on-LUKS-on-LVM:

I saw MD in your other logs...?

More comments below.

> With regard to the journal, here it is (from the same machine in the
> Arch bugreport; matches the second layout above):
> 
> 
> [~]> LC_TIME=C sudo journalctl -b -10 -u lvm2-pvscan@\*
> -- Journal begins at Fri 2020-12-18 16:33:22 CET, ends at Wed 2021-
> 02-17 14:28:05 CET. --
> Feb 10 17:24:17 archlinux systemd[1]: Starting LVM event activation
> on device 9:0...
> Feb 10 17:24:17 archlinux lvm[463]:   pvscan[463] PV /dev/md0 online,
> VG base is complete.
> Feb 10 17:24:17 archlinux lvm[463]:   pvscan[463] VG base run
> autoactivation.
> Feb 10 17:24:17 archlinux lvm[463]:   2 logical volume(s) in volume
> group "base" now active
> Feb 10 17:24:17 archlinux systemd[1]: Finished LVM event activation
> on device 9:0.
> Feb 10 17:24:26 archlinux systemd[1]: Starting LVM event activation
> on device 253:2...
> Feb 10 17:24:26 archlinux lvm[643]:   pvscan[643] PV /dev/mapper/sys
> online, VG sys is complete.

All good up to here, but then...

> Feb 10 17:24:26 archlinux lvm[643]:   pvscan[643] VG sys run
> autoactivation.
> Feb 10 17:24:26 archlinux lvm[643]:   /usr/bin/dmeventd: stat failed:
> No such file or directory

What's going on here? pvscan trying to start dmeventd? Why? There's a
dedicated service for starting dmeventd (lvm2-monitor.service). I can
see that running dmeventd makes sense as you have thin pools, but I'm
at a loss why it has to be started at that early stage during boot
already.

This is a curious message, it looks as if pvscan was running from an
environment (initramfs??) where dmeventd wasn't available. The message
is repeated, and after that, pvscan appears to hang...


> Feb 10 17:24:26 archlinux lvm[643]:   /usr/bin/dmeventd: stat failed:
> No such file or directory
> Feb 10 17:24:26 archlinux lvm[643]:   WARNING: Failed to monitor
> sys/pool.
> Feb 10 17:24:26 archlinux systemd[1]: Stopping LVM event activation
> on device 9:0...

Here I suppose systemd is switching root, and trying to stop jobs,
including the pvscan job.


> Feb 10 17:24:26 archlinux lvm[720]:   pvscan[720] PV /dev/md0 online.
> Feb 10 17:24:26 archlinux lvm[643]:   /usr/bin/dmeventd: stat failed:
> No such file or directory
> Feb 10 17:24:26 archlinux lvm[643]:   WARNING: Failed to monitor
> sys/pool.
> Feb 10 17:24:56 spock systemd[1]: lvm2-pvscan@253:2.service: State
> 'stop-sigterm' timed out. Killing.
> Feb 10 17:24:56 spock systemd[1]: lvm2-pvscan@253:2.service: Killing
> process 643 (lvm) with signal SIGKILL.
> Feb 10 17:24:56 spock systemd[1]: lvm2-pvscan@253:2.service: Main
> process exited, code=killed, status=9/KILL
> Feb 10 17:24:56 spock systemd[1]: lvm2-pvscan@253:2.service: Failed
> with result 'timeout'.
> Feb 10 17:24:56 spock systemd[1]: Stopped LVM event activation on
> device 253:2.

So what's timing out here is the attempt to _stop_ pvscan. That's
curious. It looks like a problem in pvscan to me, not having reacted to
a TERM signal for 30s.

It's also worth noting that the parallel pvscan process for device 9:0
terminated correctly (didn't hang).

> 
> [~]> LC_TIME=C sudo journalctl -b -10 --grep pvscan
> -- Journal begins at Fri 2020-12-18 16:33:22 CET, ends at Wed 2021-
> 02-17 14:31:27 CET. --
> Feb 10 17:24:17 archlinux systemd[1]: Created slice system-
> lvm2\x2dpvscan.slice.
> Feb 10 17:24:17 archlinux lvm[463]:   pvscan[463] PV /dev/md0 online,
> VG base is complete.
> Feb 10 17:24:17 archlinux lvm[463]:   pvscan[463] VG base run
> autoactivation.
> Feb 10 17:24:17 archlinux audit[1]: SERVICE_START pid=1 uid=0
> auid=4294967295 ses=4294967295 msg='unit=lvm2-pvscan@9:0
> comm="systemd" exe="/init" hostname=? addr=? terminal=? res=success'
> Feb 10 17:24:17 archlinux kernel: audit: type=1130
> audit(1612974257.986:6): pid=1 uid=0 auid=4294967295 ses=4294967295 
> msg='unit=lvm2-pvscan@9:0 comm="systemd" exe="/init" hostname=?
> addr=? terminal=? res=success'
> Feb 10 17:24:26 archlinux lvm[643]:   pvscan[643] PV /dev/mapper/sys
> online, VG sys is complete.
> Feb 10 17:24:26 archlinux lvm[643]:   pvscan[643] VG sys run
> autoactivation.
> Feb 10 17:24:26 archlinux lvm[720]:   pvscan[720] PV /dev/md0 online.
> Feb 10 17:24:27 spock systemd[1]: lvm2-pvscan@9:0.service: Control
> process exited, code=killed, status=15/TERM
> Feb 10 17:24:27 spock systemd[1]: lvm2-pvscan@9:0.service: Failed
> with result 'signal'.
> Feb 10 17:24:26 spock audit[1]: SERVICE_STOP pid=1 uid=0
> auid=4294967295 ses=4294967295 msg='unit=lvm2-pvscan@9:0
> comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=?
> terminal=? res=failed'
> Feb 10 17:24:27 spock systemd[1]: Requested transaction contradicts
> existing jobs: Transaction for lvm2-pvscan@253:2.service/start is
> destructive (lvm2-pvscan@253:2.service has 'stop' job queued, but
> 'start' is included in transaction).

This one is curious, too. Apparently systemd is done switching root,
while the pvscan service from the initrd is still hanging, so the
"stop" job for it isn't finished. This looks like a problem in systemd
to me. It shouldn't have proceeded with switching root before
successfully terminating pvscan.


> Feb 10 17:24:27 spock systemd[1]: sys-devices-virtual-block-
> dm\x2d2.device: Failed to enqueue SYSTEMD_WANTS= job, ignoring:
> Transaction for lvm2-pvscan@253:2.service/start is destructive
> (lvm2-pvscan@253:2.service has 'stop' job queued, but 'start' is
> included in transaction).

So systemd doesn't enqueue another pvscan job for this device because 
the old one isn't stopped yet. Looks fishy, again.

> 
> FWIW, here's what it looks like now, after adding the "After=" fix:
> [~]> LC_TIME=C sudo journalctl -b -u lvm2-pvscan@\*
> -- Journal begins at Fri 2020-12-18 16:33:22 CET, ends at Wed 2021-
> 02-17 14:29:15 CET. --
> Feb 16 22:13:27 archlinux systemd[1]: Starting LVM event activation
> on device 9:0...
> Feb 16 22:13:27 archlinux lvm[495]:   pvscan[495] PV /dev/md0 online,
> VG base is complete.
> Feb 16 22:13:27 archlinux lvm[495]:   pvscan[495] VG base run
> autoactivation.
> Feb 16 22:13:27 archlinux lvm[495]:   2 logical volume(s) in volume
> group "base" now active
> Feb 16 22:13:28 archlinux systemd[1]: Finished LVM event activation
> on device 9:0.
> Feb 16 22:13:37 archlinux systemd[1]: Starting LVM event activation
> on device 253:2...
> Feb 16 22:13:37 archlinux lvm[665]:   pvscan[665] PV /dev/mapper/sys
> online, VG sys is complete.
> Feb 16 22:13:37 archlinux lvm[665]:   pvscan[665] VG sys run
> autoactivation.
> Feb 16 22:13:37 archlinux lvm[665]:   /usr/bin/dmeventd: stat failed:
> No such file or directory
> Feb 16 22:13:37 archlinux lvm[665]:   WARNING: Failed to monitor
> sys/pool.
> Feb 16 22:13:37 archlinux lvm[665]:   /usr/bin/dmeventd: stat failed:
> No such file or directory
> Feb 16 22:13:37 archlinux lvm[665]:   WARNING: Failed to monitor
> sys/pool.
> Feb 16 22:13:37 archlinux lvm[665]:   /usr/bin/dmeventd: stat failed:
> No such file or directory
> Feb 16 22:13:37 archlinux lvm[665]:   WARNING: Failed to monitor
> sys/pool.
> Feb 16 22:13:37 archlinux lvm[665]:   /usr/bin/dmeventd: stat failed:
> No such file or directory
> Feb 16 22:13:37 archlinux lvm[665]:   WARNING: Failed to monitor
> sys/pool.
> Feb 16 22:13:37 archlinux lvm[665]:   /usr/bin/dmeventd: stat failed:
> No such file or directory
> Feb 16 22:13:37 archlinux lvm[665]:   WARNING: Failed to monitor
> sys/pool.
> Feb 16 22:13:37 archlinux lvm[665]:   /usr/bin/dmeventd: stat failed:
> No such file or directory
> Feb 16 22:13:37 archlinux lvm[665]:   WARNING: Failed to monitor
> sys/pool.
> Feb 16 22:13:37 archlinux systemd[1]: Stopping LVM event activation
> on device 9:0...
> Feb 16 22:13:37 archlinux lvm[743]:   pvscan[743] PV /dev/md0 online.
> Feb 16 22:13:37 archlinux lvm[665]:   /usr/bin/dmeventd: stat failed:
> No such file or directory
> Feb 16 22:13:37 archlinux lvm[665]:   WARNING: Failed to monitor
> sys/pool.
> Feb 16 22:13:37 archlinux lvm[665]:   8 logical volume(s) in volume
> group "sys" now active
> Feb 16 22:13:37 archlinux systemd[1]: lvm2-pvscan@9:0.service:
> Succeeded.
> Feb 16 22:13:37 archlinux systemd[1]: Stopped LVM event activation on
> device 9:0.
> Feb 16 22:13:37 archlinux systemd[1]: lvm2-pvscan@253:2.service: Main
> process exited, code=killed, status=15/TERM
> Feb 16 22:13:37 archlinux systemd[1]: lvm2-pvscan@253:2.service:
> Failed with result 'signal'.
> Feb 16 22:13:37 archlinux systemd[1]: Stopped LVM event activation on
> device 253:2.

We have the same error messages about dmeventd as before. But unlike
before, pvscan was successfully terminated this time.

I think this is the explanation: with the "After=", systemd kills
pvscan first and then udevd. Without the "After=", it may kill them
both at once, and that might have caused pvscan to hang and not
terminate cleanly. Apparently it was still waiting for some udev-
related action to complete.

Thanks
Martin




* Re: [linux-lvm] [PATCH 1/1] pvscan: wait for udevd
  2021-02-18 15:19         ` Martin Wilck
@ 2021-02-18 15:30           ` Oleksandr Natalenko
  2021-02-19  9:22             ` Martin Wilck
  2021-02-19 16:37           ` David Teigland
  1 sibling, 1 reply; 15+ messages in thread
From: Oleksandr Natalenko @ 2021-02-18 15:30 UTC (permalink / raw)
  To: Martin Wilck
  Cc: LVM general discussion and development, Christian Hesse,
	Heming Zhao, Oleksandr Natalenko,
	LVM general discussion and development

Hi.

(I will comment only on what I can, dropping the rest)

On Thu, Feb 18, 2021 at 04:19:01PM +0100, Martin Wilck wrote:
> > I'm not sure this issue is reproducible with any kind of LVM layout.
> > What I have is thin-LVM-on-LUKS-on-LVM:
> 
> I saw MD in your other logs...?

FWIW, one of the machines has soft RAID and another does not; the issue
is reproducible regardless of whether there's soft RAID or not.

> > Feb 10 17:24:26 archlinux lvm[643]:   pvscan[643] VG sys run
> > autoactivation.
> > Feb 10 17:24:26 archlinux lvm[643]:   /usr/bin/dmeventd: stat failed:
> > No such file or directory
> 
> What's going on here? pvscan trying to start dmeventd? Why? There's a
> dedicated service for starting dmeventd (lvm2-monitor.service). I can
> see that running dmeventd makes sense as you have thin pools, but I'm
> at a loss why it has to be started at that early stage during boot
> already.
> 
> This is a curious message, it looks as if pvscan was running from an
> environment (initramfs??) where dmeventd wasn't available. The message
> is repeated, and after that, pvscan appears to hang...

Not sure either. FWIW, the real root is on a thin volume (everything is,
in fact, except /boot and swap).

> > Feb 10 17:24:26 archlinux lvm[720]:   pvscan[720] PV /dev/md0 online.
> > Feb 10 17:24:26 archlinux lvm[643]:   /usr/bin/dmeventd: stat failed:
> > No such file or directory
> > Feb 10 17:24:26 archlinux lvm[643]:   WARNING: Failed to monitor
> > sys/pool.
> > Feb 10 17:24:56 spock systemd[1]: lvm2-pvscan@253:2.service: State
> > 'stop-sigterm' timed out. Killing.
> > Feb 10 17:24:56 spock systemd[1]: lvm2-pvscan@253:2.service: Killing
> > process 643 (lvm) with signal SIGKILL.
> > Feb 10 17:24:56 spock systemd[1]: lvm2-pvscan@253:2.service: Main
> > process exited, code=killed, status=9/KILL
> > Feb 10 17:24:56 spock systemd[1]: lvm2-pvscan@253:2.service: Failed
> > with result 'timeout'.
> > Feb 10 17:24:56 spock systemd[1]: Stopped LVM event activation on
> > device 253:2.
> 
> So what's timing out here is the attempt to _stop_ pvscan. That's
> curious. It looks like a problem in pvscan to me, not having reacted to
> a TERM signal for 30s.
> 
> It's also worth noting that the parallel pvscan process for device 9:0
> terminated correctly (didn't hang).

Yes, pvscan seems not to react to SIGTERM. I have
DefaultTimeoutStopSec=30s; if I set this to 90s, pvscan hangs for 90s
instead.
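
The effective value can be checked per instance (a sketch; the instance
name is taken from the logs above):

```
systemctl show -p TimeoutStopUSec lvm2-pvscan@253:2.service
```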

-- 
  Best regards,
    Oleksandr Natalenko (post-factum)
    Principal Software Maintenance Engineer


* Re: [linux-lvm] [PATCH 1/1] pvscan: wait for udevd
  2021-02-18 15:30           ` Oleksandr Natalenko
@ 2021-02-19  9:22             ` Martin Wilck
  0 siblings, 0 replies; 15+ messages in thread
From: Martin Wilck @ 2021-02-19  9:22 UTC (permalink / raw)
  To: Oleksandr Natalenko
  Cc: LVM general discussion and development, Christian Hesse,
	Heming Zhao, Oleksandr Natalenko,
	LVM general discussion and development

On Thu, 2021-02-18 at 16:30 +0100, Oleksandr Natalenko wrote:
> > 
> > So what's timing out here is the attempt to _stop_ pvscan. That's
> > curious. It looks like a problem in pvscan to me, not having
> > reacted to
> > a TERM signal for 30s.
> > 
> > It's also worth noting that the parallel pvscan process for device
> > 9:0
> > terminated correctly (didn't hang).
> 
> Yes, pvscan seems not to react to SIGTERM. I have
> DefaultTimeoutStopSec=30s; if I set this to 90s, pvscan hangs for 90s
> instead.
> 

Good point. That allows us to conclude that pvscan may hang on exit
when udevd isn't available at the time (i.e. has already been stopped).
That positively looks like an lvm problem. The After= is a viable
workaround, nothing more and nothing less. We'd need to run pvscan with
an increased debug/log level to figure out why it doesn't stop. Given
that you have a workaround, I'm not sure it's worth the effort for you.
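
That would mean something like this (a sketch; 253:2 is the instance from
your logs, and the flags roughly mirror what the unit runs):

```
# manual, maximally verbose equivalent of lvm2-pvscan@253:2.service
pvscan --cache -aay -vvvv 253:2
```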

What strikes me more in your logs is the fact that systemd proceeds
with switching root even though the pvscan@253:2 service hasn't
terminated yet. That looks a bit fishy, really. systemd should have
KILLed pvscan before proceeding.

Martin



* Re: [linux-lvm] [PATCH 1/1] pvscan: wait for udevd
  2021-02-18 15:19         ` Martin Wilck
  2021-02-18 15:30           ` Oleksandr Natalenko
@ 2021-02-19 16:37           ` David Teigland
  2021-02-19 22:47             ` Zdenek Kabelac
  2021-02-21 20:26             ` Martin Wilck
  1 sibling, 2 replies; 15+ messages in thread
From: David Teigland @ 2021-02-19 16:37 UTC (permalink / raw)
  To: Martin Wilck
  Cc: Oleksandr Natalenko, linux-lvm, Christian Hesse, Heming Zhao, linux-lvm

On Thu, Feb 18, 2021 at 04:19:01PM +0100, Martin Wilck wrote:
> > Feb 10 17:24:26 archlinux lvm[643]:   pvscan[643] VG sys run
> > autoactivation.
> > Feb 10 17:24:26 archlinux lvm[643]:   /usr/bin/dmeventd: stat failed:
> > No such file or directory
> 
> What's going on here? pvscan trying to start dmeventd? Why? There's a
> dedicated service for starting dmeventd (lvm2-monitor.service). I can
> see that running dmeventd makes sense as you have thin pools, but I'm
> at a loss why it has to be started at that early stage during boot
> already.
> 
> This is a curious message, it looks as if pvscan was running from an
> environment (initramfs??) where dmeventd wasn't available. The message
> is repeated, and after that, pvscan appears to hang...

I've found that when pvscan activates a VG, there's a bit of code that
attempts to monitor any LVs that are already active in the VG.  Monitoring
means interacting with dmeventd.  I don't know why it's doing that; it
seems strange, but the logic around monitoring in lvm seems ad hoc and in
need of serious reworking.  In this case I'm guessing there's already an
LV active in "sys", perhaps from direct activation in initrd, and when
pvscan activates that VG it attempts to monitor the already active LV.

Another missing piece in lvm monitoring is that we don't have a way to
start lvm2-monitor/dmeventd at the right time (I'm not sure anyone even
knows when the right time is), so we get random behavior depending on if
it's running or not at a given point.  In this case, it looks like it
happens to not be running yet.  I sometimes suggest disabling lvm2-monitor
and starting it manually once the system is up, to avoid having it
interfere during startup.
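
In practice that suggestion amounts to (a sketch, assuming the usual unit
names):

```
systemctl disable --now lvm2-monitor.service
# ...later, once the system is fully up:
systemctl start lvm2-monitor.service
```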

> > Feb 10 17:24:26 archlinux lvm[643]:   /usr/bin/dmeventd: stat failed:
> > No such file or directory
> > Feb 10 17:24:26 archlinux lvm[643]:   WARNING: Failed to monitor
> > sys/pool.
> > Feb 10 17:24:26 archlinux systemd[1]: Stopping LVM event activation
> > on device 9:0...

The unwanted failed monitoring seems to have caused the pvscan command to
exit with an error, which then leads to further mess and confusion, with
systemd thinking it should stop or kill the pvscan service, whatever that
means.  pvscan should probably never exit with an error.

Dave


* Re: [linux-lvm] [PATCH 1/1] pvscan: wait for udevd
  2021-02-19 16:37           ` David Teigland
@ 2021-02-19 22:47             ` Zdenek Kabelac
  2021-02-21 20:23               ` Martin Wilck
  2021-02-21 20:26             ` Martin Wilck
  1 sibling, 1 reply; 15+ messages in thread
From: Zdenek Kabelac @ 2021-02-19 22:47 UTC (permalink / raw)
  To: LVM general discussion and development, David Teigland, Martin Wilck
  Cc: Oleksandr Natalenko, linux-lvm, Christian Hesse, Heming Zhao

On 19. 02. 21 at 17:37, David Teigland wrote:
> On Thu, Feb 18, 2021 at 04:19:01PM +0100, Martin Wilck wrote:
>>> Feb 10 17:24:26 archlinux lvm[643]:   pvscan[643] VG sys run
>>> autoactivation.
>>> Feb 10 17:24:26 archlinux lvm[643]:   /usr/bin/dmeventd: stat failed:
>>> No such file or directory
>>
>> What's going on here? pvscan trying to start dmeventd? Why? There's a
>> dedicated service for starting dmeventd (lvm2-monitor.service). I can
>> see that running dmeventd makes sense as you have thin pools, but I'm
>> at a loss why it has to be started at that early stage during boot
>> already.
>>
>> This is a curious message, it looks as if pvscan was running from an
>> environment (initramfs??) where dmeventd wasn't available. The message
>> is repeated, and after that, pvscan appears to hang...
> 
> I've found that when pvscan activates a VG, there's a bit of code that
> attempts to monitor any LVs that are already active in the VG.  Monitoring
> means interacting with dmeventd.  I don't know why it's doing that; it
> seems strange, but the logic around monitoring in lvm seems ad hoc and in
> need of serious reworking.  In this case I'm guessing there's already an
> LV active in "sys", perhaps from direct activation in initrd, and when
> pvscan activates that VG it attempts to monitor the already active LV.

The existing design for using lvm2 for the rootfs was like this:

Activate the LV within the ramdisk via dracut, which discovers the
rootfs VG/LV and activates it (by a rather 'brute-force', naive
approach).

Such activation is WITHOUT monitoring, as the ramdisk is without
'dmeventd' and we do not want to 'lock' the binary from the ramdisk
into memory.

So once the system switches to the rootfs, 'vgchange --monitor y'
enables monitoring for all LVs activated from the ramdisk, and the
process continues.

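In other words, the handover is essentially this (a sketch; the exact
flags used by lvm2-monitor.service vary by version):

```
# run once the real rootfs (and thus dmeventd) is available
vgchange --monitor y
```
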
Event-based activation within the ramdisk is a third-party initiative by
Arch Linux, and thus needs to be 'reinvented', with its own problems that
arise from this.

So far, in lvm2 the current dracut method is more maintainable.

> Another missing piece in lvm monitoring is that we don't have a way to
> start lvm2-monitor/dmeventd at the right time (I'm not sure anyone even
> knows when the right time is), so we get random behavior depending on if
> it's running or not at a given point.  In this case, it looks like it
> happens to not be running yet.  I sometimes suggest disabling lvm2-monitor
> and starting it manually once the system is up, to avoid having it
> interfere during startup.

The right time is when the switch is finished and we have the rootfs with /usr
available - this should be ensured by lvm2-monitor.service and its dependencies.
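
For reference, the upstream unit expresses that ordering roughly like this
(quoted from memory - exact contents vary between lvm2 versions):

  # lvm2-monitor.service (excerpt)
  [Unit]
  Requires=dm-event.socket
  After=dm-event.socket dm-event.service
  Before=local-fs-pre.target shutdown.target
  DefaultDependencies=no
  Conflicts=shutdown.target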

Zdenek


* Re: [linux-lvm] [PATCH 1/1] pvscan: wait for udevd
  2021-02-19 22:47             ` Zdenek Kabelac
@ 2021-02-21 20:23               ` Martin Wilck
  2021-02-22  9:57                 ` Zdenek Kabelac
  0 siblings, 1 reply; 15+ messages in thread
From: Martin Wilck @ 2021-02-21 20:23 UTC (permalink / raw)
  To: Zdenek Kabelac, LVM general discussion and development, David Teigland
  Cc: Oleksandr Natalenko, linux-lvm, Christian Hesse, Heming Zhao

On Fri, 2021-02-19 at 23:47 +0100, Zdenek Kabelac wrote:
> 
> Right time is when switch is finished and we have rootfs with /usr
> available - should be ensured by  lvm2-monitor.service and it
> dependencies.

While we're at it - I'm wondering why dmeventd is started so early. dm-
event.service on recent installations has only "Requires=dm-
event.socket", so it'll be started almost immediately after switching
root. In particular, it doesn't wait for any sort of device
initialization or udev initialization.

I've gone through the various tasks that dmeventd is responsible for,
and I couldn't see anything that'd be strictly necessary during early
boot. I may be overlooking something of course. Couldn't the monitoring
be delayed to after local-fs.target, for example?
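
One way to test that would be a drop-in along these lines (a sketch, not a
tested configuration - the file name is arbitrary). Note that with socket
activation the first lvm command connecting to dmeventd would then simply
block until the unit is allowed to start:

  # /etc/systemd/system/dm-event.service.d/delay-monitoring.conf
  [Unit]
  After=local-fs.target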

(This is also related to our previous discussion about
external_device_info_source=udev; we found that dmeventd was one of the
primary sources of strange errors with that setting).
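
For reference, that discussion was about this knob in lvm.conf:

  devices {
      # obtain device information from the udev database
      # instead of scanning devices directly
      external_device_info_source = "udev"
  }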

Regards
Martin




* Re: [linux-lvm] [PATCH 1/1] pvscan: wait for udevd
  2021-02-19 16:37           ` David Teigland
  2021-02-19 22:47             ` Zdenek Kabelac
@ 2021-02-21 20:26             ` Martin Wilck
  1 sibling, 0 replies; 15+ messages in thread
From: Martin Wilck @ 2021-02-21 20:26 UTC (permalink / raw)
  To: David Teigland
Cc: Oleksandr Natalenko, linux-lvm, Christian Hesse, Heming Zhao

On Fri, 2021-02-19 at 10:37 -0600, David Teigland wrote:
> On Thu, Feb 18, 2021 at 04:19:01PM +0100, Martin Wilck wrote:
> > > Feb 10 17:24:26 archlinux lvm[643]:   pvscan[643] VG sys run
> > > autoactivation.
> > > Feb 10 17:24:26 archlinux lvm[643]:   /usr/bin/dmeventd: stat failed:
> > > No such file or directory
> > 
> > What's going on here? pvscan trying to start dmeventd? Why? There's a
> > dedicated service for starting dmeventd (lvm2-monitor.service). I can
> > see that running dmeventd makes sense as you have thin pools, but I'm
> > at a loss why it has to be started at that early stage during boot
> > already.
> > 
> > This is a curious message, it looks as if pvscan was running from an
> > environment (initramfs??) where dmeventd wasn't available. The message
> > is repeated, and after that, pvscan appears to hang...
> 
> I've found that when pvscan activates a VG, there's a bit of code that
> attempts to monitor any LVs that are already active in the VG.
> Monitoring means interacting with dmeventd.  I don't know why it's
> doing that, it seems strange, but the logic around monitoring in lvm
> seems ad hoc and in need of serious reworking.  In this case I'm
> guessing there's already an LV active in "sys", perhaps from direct
> activation in initrd, and when pvscan activates that VG it attempts to
> monitor the already active LV.
> 
> Another missing piece in lvm monitoring is that we don't have a way to
> start lvm2-monitor/dmeventd at the right time (I'm not sure anyone even
> knows when the right time is), so we get random behavior depending on
> whether it's running or not at a given point.  In this case, it looks
> like it happens to not be running yet.  I sometimes suggest disabling
> lvm2-monitor and starting it manually once the system is up, to avoid
> having it interfere during startup.

That sounds familiar.

> 
> > > Feb 10 17:24:26 archlinux lvm[643]:   /usr/bin/dmeventd: stat failed:
> > > No such file or directory
> > > Feb 10 17:24:26 archlinux lvm[643]:   WARNING: Failed to monitor
> > > sys/pool.
> > > Feb 10 17:24:26 archlinux systemd[1]: Stopping LVM event activation
> > > on device 9:0...
> 
> The unwanted failed monitoring seems to have caused the pvscan command
> to exit with an error, which then leads to further mess and confusion:
> systemd thinks it should stop or kill the pvscan service, whatever
> that means.

The way I read Oleksandr's logs, systemd is killing all processes
because it wants to switch root, not because of errors in the pvscan
service. The weird thing is that this fails for one of the pvscan tasks
(253:2), and that that service continues to "run" (rather, "hang") long
after the root switch has happened.

Thanks,
Martin




* Re: [linux-lvm] [PATCH 1/1] pvscan: wait for udevd
  2021-02-21 20:23               ` Martin Wilck
@ 2021-02-22  9:57                 ` Zdenek Kabelac
  2021-02-22 13:04                   ` Christian Hesse
  0 siblings, 1 reply; 15+ messages in thread
From: Zdenek Kabelac @ 2021-02-22  9:57 UTC (permalink / raw)
  To: Martin Wilck, LVM general discussion and development, David Teigland
  Cc: Oleksandr Natalenko, linux-lvm, Christian Hesse, Heming Zhao

On 21. 02. 21 at 21:23, Martin Wilck wrote:
> On Fri, 2021-02-19 at 23:47 +0100, Zdenek Kabelac wrote:
>>
>> Right time is when switch is finished and we have rootfs with /usr
>> available - should be ensured by  lvm2-monitor.service and it
>> dependencies.
> 
> While we're at it - I'm wondering why dmeventd is started so early. dm-
> event.service on recent installations has only "Requires=dm-
> event.socket", so it'll be started almost immediately after switching
> root. In particular, it doesn't wait for any sort of device
> initialization or udev initialization.

Hi

Dmeventd itself does not depend on lvm2 in any way - it's the lvm2 plugin
which then does all the 'scanning' for VGs/LVs, and which gets loaded when
lvm2 connects to the monitoring socket. That's also why dmeventd belongs to
the dm subsystem.

Dmeventd is nothing more than a process that checks DM devices periodically -
and it can be used by e.g. dmraid or others...

So as such it doesn't need any devices - but it needs to be initialized early
so it can accept connections from tools like lvm2 and start to monitor a
device without delaying the command (as lvm2 waits for confirmation that the
device is monitored).
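
That early availability comes from systemd socket activation; dm-event.socket
looks roughly like this (taken from the lvm2 scripts, may differ slightly
between versions):

  [Unit]
  Description=Device-mapper event daemon FIFOs
  Documentation=man:dmeventd(8)
  DefaultDependencies=no

  [Socket]
  ListenFIFO=/run/dmeventd-server
  ListenFIFO=/run/dmeventd-client
  SocketMode=0600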

> I've gone through the various tasks that dmeventd is responsible for,
> and I couldn't see anything that'd be strictly necessary during early
> boot. I may be overlooking something of course. Couldn't the monitoring

As said - during ramdisk boot the monitor shall not be used (AFAIK dracut
is supposed to disable monitoring in its modified copy of lvm.conf within
the ramdisk).

But we want to switch to monitoring ASAP when we switch to the rootfs - so
that the 'unmonitored' window is as small as possible - there are still some
'grey' areas in the correct logic, though...

Zdenek


* Re: [linux-lvm] [PATCH 1/1] pvscan: wait for udevd
  2021-02-22  9:57                 ` Zdenek Kabelac
@ 2021-02-22 13:04                   ` Christian Hesse
  2021-02-25 16:51                     ` Oleksandr Natalenko
  0 siblings, 1 reply; 15+ messages in thread
From: Christian Hesse @ 2021-02-22 13:04 UTC (permalink / raw)
  To: Zdenek Kabelac
  Cc: Oleksandr Natalenko, linux-lvm, David Teigland,
	LVM general discussion and development, Martin Wilck, Heming Zhao


Zdenek Kabelac <zkabelac@redhat.com> on Mon, 2021/02/22 10:57:
> > I've gone through the various tasks that dmeventd is responsible for,
> > and I couldn't see anything that'd be strictly necessary during early
> > boot. I may be overlooking something of course. Couldn't the monitoring  
> 
> As said - during ramdisk boot - monitor shall not be used (AFAIK - dracut
> is supposed to use disabled monitoring in it's modified copy of lvm.conf
> within ramdisk)

I could not find anything in dracut that modifies lvm.conf, but it looks like
dracut calls the lvm commands with `--ignoremonitoring`.
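
I.e. something of this form (paraphrased, not the literal dracut code; the
VG/LV names are placeholders):

  lvm lvchange --yes -ay --ignoremonitoring "$VG/$LV"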

To date this is not handled in lvm2's mkinitcpio hook... I wonder whether
that would help. Oleksandr, could you undo the udev workaround, then apply
the following diff to /usr/lib/initcpio/install/lvm2, regenerate the
initramfs and retry?

--- lvm2_install	(revision 408582)
+++ lvm2_install	(working copy)
@@ -34,6 +34,7 @@
     add_file "/usr/lib/udev/rules.d/95-dm-notify.rules"
     add_file "/usr/lib/initcpio/udev/11-dm-initramfs.rules" "/usr/lib/udev/rules.d/11-dm-initramfs.rules"
     add_file "/etc/lvm/lvm.conf"
+    sed -i '/^\smonitoring =/s/1/0/' "${BUILDROOT}/etc/lvm/lvm.conf"
 
     # this udev rule is specific for systemd and non-systemd systems
     if command -v add_systemd_unit >/dev/null; then
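
After regenerating the initramfs the change can be verified by extracting the
image, e.g. (assuming the default image path; lsinitcpio ships with
mkinitcpio):

  mkdir /tmp/ini && cd /tmp/ini
  lsinitcpio -x /boot/initramfs-linux.img
  grep monitoring etc/lvm/lvm.conf    # should now include: monitoring = 0
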
-- 
main(a){char*c=/*    Schoene Gruesse                         */"B?IJj;MEH"
"CX:;",b;for(a/*    Best regards             my address:    */=0;b=c[a++];)
putchar(b-1/(/*    Chris            cc -ox -xc - && ./x    */b/42*2-3)*42);}


* Re: [linux-lvm] [PATCH 1/1] pvscan: wait for udevd
  2021-02-22 13:04                   ` Christian Hesse
@ 2021-02-25 16:51                     ` Oleksandr Natalenko
  0 siblings, 0 replies; 15+ messages in thread
From: Oleksandr Natalenko @ 2021-02-25 16:51 UTC (permalink / raw)
  To: Christian Hesse, Zdenek Kabelac
  Cc: Oleksandr Natalenko, linux-lvm, David Teigland,
	LVM general discussion and development, Martin Wilck,
	Heming Zhao

On 22 February 2021 14:04:17 CET, Christian Hesse <list@eworm.de> wrote:
>Zdenek Kabelac <zkabelac@redhat.com> on Mon, 2021/02/22 10:57:
>> > I've gone through the various tasks that dmeventd is responsible for,
>> > and I couldn't see anything that'd be strictly necessary during early
>> > boot. I may be overlooking something of course. Couldn't the monitoring
>> 
>> As said - during ramdisk boot the monitor shall not be used (AFAIK dracut
>> is supposed to disable monitoring in its modified copy of lvm.conf
>> within the ramdisk).
>
>I could not find anything in dracut that modifies lvm.conf, but it looks
>like dracut calls the lvm commands with `--ignoremonitoring`.
>
>To date this is not handled in lvm2's mkinitcpio hook... I wonder whether
>that would help. Oleksandr, could you undo the udev workaround, then apply
>the following diff to /usr/lib/initcpio/install/lvm2, regenerate the
>initramfs and retry?
>
>--- lvm2_install	(revision 408582)
>+++ lvm2_install	(working copy)
>@@ -34,6 +34,7 @@
>     add_file "/usr/lib/udev/rules.d/95-dm-notify.rules"
>     add_file "/usr/lib/initcpio/udev/11-dm-initramfs.rules" "/usr/lib/udev/rules.d/11-dm-initramfs.rules"
>     add_file "/etc/lvm/lvm.conf"
>+    sed -i '/^\smonitoring =/s/1/0/' "${BUILDROOT}/etc/lvm/lvm.conf"
> 
>     # this udev rule is specific for systemd and non-systemd systems
>     if command -v add_systemd_unit >/dev/null; then

Hi.

This changes nothing, sorry. The issue is still there.
-- 
  Best regards,
    Oleksandr Natalenko (post-factum)
    Principal Software Maintenance Engineer



end of thread, other threads:[~2021-02-26  8:43 UTC | newest]

Thread overview: 15+ messages
2021-02-11 11:16 [linux-lvm] [PATCH 1/1] pvscan: wait for udevd Christian Hesse
     [not found] ` <f18dff7a-050a-d41d-c643-4616522ba4a0@suse.com>
2021-02-17  8:22   ` Martin Wilck
     [not found]     ` <20210217130329.7de41147@leda>
2021-02-17 13:38       ` Oleksandr Natalenko
2021-02-18 15:19         ` Martin Wilck
2021-02-18 15:30           ` Oleksandr Natalenko
2021-02-19  9:22             ` Martin Wilck
2021-02-19 16:37           ` David Teigland
2021-02-19 22:47             ` Zdenek Kabelac
2021-02-21 20:23               ` Martin Wilck
2021-02-22  9:57                 ` Zdenek Kabelac
2021-02-22 13:04                   ` Christian Hesse
2021-02-25 16:51                     ` Oleksandr Natalenko
2021-02-21 20:26             ` Martin Wilck
2021-02-17 13:49       ` Martin Wilck
2021-02-17 19:11         ` Oleksandr Natalenko
