linux-kernel.vger.kernel.org archive mirror
* [RFC] cpufreq: Excessive CPUFreq driver loading
@ 2021-05-06 14:25 Meyer, Kyle
  2021-05-14 14:06 ` Meyer, Kyle
  0 siblings, 1 reply; 3+ messages in thread
From: Meyer, Kyle @ 2021-05-06 14:25 UTC (permalink / raw)
  To: LKML

Hello,

acpi-cpufreq is mutually exclusive with intel_pstate. However, systemd-udevd
attempts to load acpi-cpufreq many times during startup while intel_pstate is
enabled.

This issue was reported to the systemd maintainers and they indicated that it
should be fixed in the kernel: https://github.com/systemd/systemd/issues/19439

During startup, the kernel triggers one uevent for each device as a result of
systemd-udev-trigger.service executing "udevadm trigger --type=subsystems
--action=add" and "udevadm trigger --type=devices --action=add". The service
exists to retrigger all devices as uevents sent by the kernel, before
systemd-udevd is running, would have been missed. When systemd-udevd receives a
uevent it matches its configured rules against the device. If a uevent's
ACTION=="add", systemd-udevd will run "kmod load $env{MODALIAS}" from
80-drivers.rules. udev's builtin kmod will then attempt to load modules
matching the device's MODALIAS.

When systemd-udevd receives an "add" uevent from
/devices/LNXSYSTM:00/LNXSYBUS:00/ACPI0007:XXX it runs "kmod load
acpi:ACPI0007:" as "acpi:ACPI0007:" is that device's MODALIAS.

When systemd-udevd receives an "add" uevent from /devices/system/cpu/cpuXXX it
runs "kmod load cpu:type:x86,...,00E8,..." as "cpu:type:x86,...,00E8,..." is
that device's MODALIAS.

acpi-cpufreq is loaded as it matches both devices' MODALIASes:
# modinfo acpi-cpufreq | grep alias
alias:          acpi
alias:          cpu:type:x86,ven*fam*mod*:feature:*00E8*
alias:          cpu:type:x86,ven*fam*mod*:feature:*0016*
alias:          acpi*:ACPI0007:*
alias:          acpi*:LNXCPU:*
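
The alias matching described above can be sketched in userspace C. This is
only an illustration: it uses POSIX fnmatch() as a stand-in for kmod's alias
lookup, which is an assumption and not how kmod is implemented. The function
name modalias_matches() is hypothetical, and any concrete MODALIAS string
passed to it (beyond "acpi:ACPI0007:") is a made-up, shortened example.

```c
#include <assert.h>
#include <fnmatch.h>
#include <stddef.h>

/* Alias patterns copied from the modinfo output above. */
static const char *const acpi_cpufreq_aliases[] = {
	"acpi",
	"cpu:type:x86,ven*fam*mod*:feature:*00E8*",
	"cpu:type:x86,ven*fam*mod*:feature:*0016*",
	"acpi*:ACPI0007:*",
	"acpi*:LNXCPU:*",
};

/*
 * Hypothetical helper: return 1 if any alias pattern matches the given
 * MODALIAS string, 0 otherwise. fnmatch() returns 0 on a match.
 */
int modalias_matches(const char *modalias)
{
	size_t n = sizeof(acpi_cpufreq_aliases) / sizeof(acpi_cpufreq_aliases[0]);
	size_t i;

	assert(modalias != NULL);

	for (i = 0; i < n; i++) {
		if (fnmatch(acpi_cpufreq_aliases[i], modalias, 0) == 0)
			return 1;
	}
	return 0;
}
```

Calling modalias_matches("acpi:ACPI0007:") returns 1 because the pattern
"acpi*:ACPI0007:*" matches it, so a uevent carrying that MODALIAS selects
acpi-cpufreq. A string from an unrelated bus prefix matches none of the
patterns and returns 0.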

On a system with 1536 logical CPUs, systemd-udevd attempts to load acpi-cpufreq
3072 times.

1536 * /devices/LNXSYSTM:00/LNXSYBUS:00/ACPI0007:XXX
1536 * /devices/system/cpu/cpuXXX

The delay, caused by systemd-udevd attempting to load the driver, has a
significant impact on the startup time. It causes some devices to be
unavailable after reaching the root login prompt as it postpones the loading of
other drivers.

Each time the driver is loaded, acpi_cpufreq_init returns -EEXIST:
static int __init acpi_cpufreq_init(void)
{
        int ret;

        if (acpi_disabled)
                return -ENODEV;

        /* don't keep reloading if cpufreq_driver exists */
        if (cpufreq_get_current_driver())
                return -EEXIST;
...

Changing the return value from -EEXIST to 0 when another driver exists prevents
the driver from being loaded multiple times as kmod won't load a "live" module.
Alternatively, blacklisting the driver (or disabling intel_pstate) prevents the
issue as well. Below are the before and after startup times.

# systemd-analyze
Startup finished in 37.939s (kernel) + 10.909s (initrd) + 3min 55.004s (userspace) = 4min 43.852s

# systemd-analyze
Startup finished in 38.307s (kernel) + 10.205s (initrd) + 38.312s (userspace) = 1min 26.826s
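
The effect of the return value can be modeled with a small userspace sketch.
It is a simplified model, not kernel or kmod code: struct fake_module,
handle_uevent() and count_init_calls() are hypothetical names, and the only
behavior assumed is what is stated above, namely that a module whose init
fails is not kept, and that a "live" module is not loaded again.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical stand-in for loader state; not a kernel structure. */
struct fake_module {
	int live;       /* 1 once init has returned 0 */
	int init_calls; /* how many times init was run */
};

/*
 * Model one "add" uevent whose MODALIAS matches the module.
 * init_ret is what the module's init returns while another
 * cpufreq driver is already registered.
 */
void handle_uevent(struct fake_module *m, int init_ret)
{
	if (m->live)
		return;         /* a live module is not loaded again */

	m->init_calls++;
	if (init_ret == 0)
		m->live = 1;    /* init succeeded, module stays loaded */
}

/* Two matching devices per logical CPU, as listed above. */
int count_init_calls(int logical_cpus, int init_ret)
{
	struct fake_module m = { 0, 0 };
	int uevents;
	int i;

	assert(logical_cpus >= 0);

	uevents = 2 * logical_cpus;
	for (i = 0; i < uevents; i++)
		handle_uevent(&m, init_ret);

	return m.init_calls;
}
```

In this model, count_init_calls(1536, -EEXIST) gives 3072 init runs, while
count_init_calls(1536, 0) gives a single run because the module stays live
after the first attempt.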

Thank you,
Kyle Meyer

* Re: [RFC] cpufreq: Excessive CPUFreq driver loading
  2021-05-06 14:25 [RFC] cpufreq: Excessive CPUFreq driver loading Meyer, Kyle
@ 2021-05-14 14:06 ` Meyer, Kyle
  2021-05-14 16:15   ` Srinivas Pandruvada
  0 siblings, 1 reply; 3+ messages in thread
From: Meyer, Kyle @ 2021-05-14 14:06 UTC (permalink / raw)
  To: LKML; +Cc: rjw, viresh.kumar, srinivas.pandruvada, lenb

Adding maintainers to the CC list.

Thank you,
Kyle Meyer

* Re: [RFC] cpufreq: Excessive CPUFreq driver loading
  2021-05-14 14:06 ` Meyer, Kyle
@ 2021-05-14 16:15   ` Srinivas Pandruvada
  0 siblings, 0 replies; 3+ messages in thread
From: Srinivas Pandruvada @ 2021-05-14 16:15 UTC (permalink / raw)
  To: Meyer, Kyle, LKML; +Cc: rjw, viresh.kumar, lenb

On Fri, 2021-05-14 at 14:06 +0000, Meyer, Kyle wrote:
> # systemd-analyze
> Startup finished in 37.939s (kernel) + 10.909s (initrd) + 3min 55.004s (userspace) = 4min 43.852s
> 
> # systemd-analyze
> Startup finished in 38.307s (kernel) + 10.205s (initrd) + 38.312s (userspace) = 1min 26.826s

That is a big difference. I think when you return 0, lsmod will show
the module as loaded. But that shouldn't be a problem in my opinion.

Thanks,
Srinivas