Thanks, Claudius. I really appreciate your responses. I'm not trying to be pedantic. Since I don't have your test setup, I was just trying to make sure I understood the context of the problem as I figure out how to deal with issues this is causing in our setup. I was also hoping one of the recipe maintainers of either systemd or rng-tools would comment on systemd-udev-settle. I'll take a look at the caam module to see if I can understand how it works. On Thu, Feb 3, 2022 at 3:35 AM Claudius Heine wrote: > On 2022-02-02 17:26, Kyle Russell wrote: > > Thanks, Claudius. > > > > On Wed, Feb 2, 2022 at 8:08 AM Claudius Heine > > wrote: > > > > Hi Kyle, > > > > On 2022-02-02 13:38, Kyle Russell wrote: > > > Is this the correct approach? Even the > > systemd-udev-settle.service man > > > pages recommends not using its service. Were the kernel modules > > really > > > not loaded when rngd started? Or is the original problem just a > > matter > > > of waiting for sufficient entropy? > > > > IIRC, the rngd could not find any random source device node > (/dev/hwrng > > in that case), so the service failed to start. > > > > > > If /dev/hwrng didn't exist, this feels like the original problem was a > > misconfigured > > kernel or module that wasn't being loaded properly. > > Yes, however it is a timing issue. The module was loaded properly at > bootup, however at the time rngd was started the module was not loaded > yet and thus the service fails to start. If it would be delayed until > the module is loaded everything would be fine. > > It does not happen if the module is compiled into the kernel or if a > initramfs is used which loads the module (I think). I our case it > happend with the caam module as an external module loaded on boot from > the real root file system. > > > The patch you are commenting on only adds `Wants` weak dependency to > > make sure `systemd-udev-settle.service` is pulled in to the job > queue, > > the `After` ordering rule was already there. > > > > > > Correct. Just because an `After` exists does not mean the service gets > > pulled into > > the job queue, so prior to this change no other service was causing the > > deprecated > > systemd-udev-settle.service to be run during boot. But now, every > > device including > > openssh (which has a default PACKAGECONFIG option for rng-tools) now > depends > > on this deprecated service, which may cause unexpected boot delays. > > > > So changing this service file to be triggered by a udev event or > maybe > > wrap it in a script, which makes sure the right modules are loaded > and > > device nodes are available, could be an improvement, but it would be > > out > > of scope of this patch IMO. > > > > > > I'm more curious whether this change should be reverted from upstream. > > It seems > > like a drop-in file could have been applied to your distro instead of > > adding a dependency > > on a deprecated service for all openssh users. > > This patch just adds a missing entry into the service file. If you have > solved the described issue in some way and can revert this patch and > remove the `Wants=systemd-udev-settle.service` then you can also remove > the `After=systemd-udev-settle.service` at the same time and at that > point you can just remove both of those entries directly in the patch > that solved the timing issue. > > I agree that `systemd-udev-settle.service` should probably not be used > anymore, however that file already used it in a non-functional way and > all this patch did was make it fulfill its intended function. > > In retrospect I probably should have tried to find a way to remove the > usage of `systemd-udev-settle.service` completely, when I looked into > the issue, however all this patch in essence does is revive dead code, > which was already in place. > > Also I think at that time I couldn't find a more precise instrument in > systemds massive toolbox to delay the start of rngd and services that > should be started in succession until the just the hardware random > generator device is ready and `After=systemd-udev-settle.service` was > already there. I guess some `ExecStartPre=` script which delays the > start until the conditions are met could be implemented, but that seems > a bit hackish. > > regards, > Claudius >