linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Prarit Bhargava <prarit@redhat.com>
To: Petr Mladek <pmladek@suse.com>
Cc: Luis Chamberlain <mcgrof@kernel.org>,
	Petr Pavlu <petr.pavlu@suse.com>,
	linux-modules@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH v2 2/2] module: Merge same-name module load requests
Date: Mon, 24 Oct 2022 09:22:35 -0400	[thread overview]
Message-ID: <848276b0-f038-565a-cad1-30a0531cc851@redhat.com> (raw)
In-Reply-To: <Y1D2kFy+I8dgDkll@alley>

On 10/20/22 03:19, Petr Mladek wrote:
> On Tue 2022-10-18 15:53:03, Prarit Bhargava wrote:
>> On 10/18/22 14:33, Luis Chamberlain wrote:
>>> On Sat, Oct 15, 2022 at 11:27:10AM +0200, Petr Pavlu wrote:
>>>> The patch does address a regression observed after commit 6e6de3dee51a
>>>> ("kernel/module.c: Only return -EEXIST for modules that have finished
>>>> loading"). I guess it can have a Fixes tag added to the patch.
>>>>
>>>> I think it is hard to split this patch into parts because the implemented
>>>> "optimization" is the fix.
>>>
>>> git describe --contains 6e6de3dee51a
>>> v5.3-rc1~38^2~6
>>>
>>> I'm a bit torn about this situation. Reverting 6e6de3dee51a would be the
>>> right thing to do, but without it, it still leaves the issue reported
>>> by Prarit Bhargava. We need a way to resolve the issue on stable and
>>> then your optimizations can be applied on top.
>>>
>>> Prarit Bhargava, please review Petry's work and see if you can come up
>>> with a sensible way to address this for stable.
>>
>> I found the original thread surrounding these changes and I do not think
>> this solution should have been committed to the kernel.  It is not a good
>> solution to the problem being solved and adds complexity where none is
>> needed IMO.
>>
>> Quoting from the original thread,
>>
>>>
>>> Motivation for this patch is to fix an issue observed on larger machines with
>>> many CPUs where it can take a significant amount of time during boot to run
>>> systemd-udev-trigger.service. An x86-64 system can have already intel_pstate
>>> active but as its CPUs can match also acpi_cpufreq and pcc_cpufreq, udev will
>>> attempt to load these modules too. The operation will eventually fail in the
>>> init function of a respective module where it gets recognized that another
>>> cpufreq driver is already loaded and -EEXIST is returned. However, one uevent
>>> is triggered for each CPU and so multiple loads of these modules will be
>>> present. The current code then processes all such loads individually and
>>> serializes them with the barrier in add_unformed_module().
>>>
>>
>> The way to solve this is not in the module loading code, but in the udev
>> code by adding a new event or in the userspace which handles the loading
>> events.
>>
>> Option 1)
>>
>> Write/modify a udev rule to to use a flock userspace file lock to prevent
>> repeated loading.  The problem with this is that it is still racy and still
>> consumes CPU time repeated load the ELF header and, depending on the system
>> (ie a large number of cpus) would still cause a boot delay.  This would be
>> better than what we have and is worth looking at as a simple solution.  I'd
>> like to see boot times with this change, and I'll try to come up with a
>> measurement on a large CPU system.
> 
> This sounds like a ping-pong between projects where to put the
> complexity. Honestly, I think that it would be more reliable if
> we serialized/squashed parallel loads on the kernel side.
> 

Well, that's the world we live in.  Module loading ping pongs between 
udev and the kernel.

> 
>> Option 2)
>>
>> Create a new udev action, "add_once" to indicate to userspace that the
>> module only needs to be loaded one time, and to ignore further load
>> requests.  This is a bit tricky as both kernel space and userspace would
>> have be modified.  The udev rule would end up looking very similar to what
>> we now.
>>
>> The benefit of option 2 is that driver writers themselves can choose which
>> drivers should issue "add_once" instead of add.  Drivers that are known to
>> run on all devices at once would call "add_once" to only issue a single
>> load.
> 
> My concern is how to distinguish first attempts and later (fixed)
> attempts.
> 
> I mean that the module load might fail at boot. But the user might
> fix the root of the problem and try to load the module once again
> without reboot. The other load need not be direct. It might be part
> of another more complex operation. Reload of another module.
> Restart of a service.
> 

This is an interesting complication, and something to think about.

P.


> It might be problematic to do this an user-friendly way.
> And it might be much more complicated in the end.


> 
> Best Regards,
> Petr
> 


  reply	other threads:[~2022-10-24 19:12 UTC|newest]

Thread overview: 43+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-09-19 12:32 [PATCH v2 0/2] module: Merge same-name module load requests Petr Pavlu
2022-09-19 12:32 ` [PATCH v2 1/2] module: Correct wake up of module_wq Petr Pavlu
2022-09-30 20:22   ` Luis Chamberlain
2022-10-14  8:40     ` Petr Mladek
2022-09-19 12:32 ` [PATCH v2 2/2] module: Merge same-name module load requests Petr Pavlu
2022-09-30 20:30   ` Luis Chamberlain
2022-10-15  9:27     ` Petr Pavlu
2022-10-18 18:33       ` Luis Chamberlain
2022-10-18 19:19         ` Prarit Bhargava
2022-10-18 19:53         ` Prarit Bhargava
2022-10-20  7:19           ` Petr Mladek
2022-10-24 13:22             ` Prarit Bhargava [this message]
2022-10-24 17:08               ` Luis Chamberlain
2022-10-24 12:37           ` Petr Pavlu
2022-10-24 14:00             ` Prarit Bhargava
2022-11-13 16:44               ` Petr Pavlu
2022-10-19 12:00         ` Petr Pavlu
2022-10-20  7:03           ` Petr Mladek
2022-10-24 17:53             ` Luis Chamberlain
2022-11-12  1:47           ` Luis Chamberlain
2022-11-14  8:57             ` David Hildenbrand
2022-11-14 15:38               ` Luis Chamberlain
2022-11-14 15:45                 ` David Hildenbrand
2022-11-15 19:29                   ` Luis Chamberlain
2022-11-16 16:03                     ` Prarit Bhargava
2022-11-21 16:00                       ` Petr Pavlu
2022-11-21 19:03                         ` Luis Chamberlain
2022-11-21 19:50                           ` David Hildenbrand
2022-11-21 20:27                             ` Luis Chamberlain
2022-11-22 13:59                           ` Petr Pavlu
2022-11-22 17:58                             ` Luis Chamberlain
2022-11-16 16:04                     ` David Hildenbrand
2022-11-18 17:32                     ` David Hildenbrand
2022-11-28 16:29                   ` Prarit Bhargava
2022-11-29 13:13                     ` Petr Pavlu
2022-12-02 16:36                       ` Petr Mladek
2022-12-06 12:31                         ` Prarit Bhargava
2022-12-07 13:23                           ` Petr Pavlu
2022-12-04 19:58                       ` Prarit Bhargava
2022-10-14  7:54   ` David Hildenbrand
2022-10-15  9:49     ` Petr Pavlu
2022-10-14 13:52   ` Petr Mladek
2022-10-16 12:25     ` Petr Pavlu

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=848276b0-f038-565a-cad1-30a0531cc851@redhat.com \
    --to=prarit@redhat.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-modules@vger.kernel.org \
    --cc=mcgrof@kernel.org \
    --cc=petr.pavlu@suse.com \
    --cc=pmladek@suse.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).