All of lore.kernel.org
 help / color / mirror / Atom feed
From: Ming Lei <ming.lei@canonical.com>
To: "Luis R. Rodriguez" <mcgrof@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>,
	"Rafael J. Wysocki" <rjw@sisk.pl>, Takashi Iwai <tiwai@suse.de>,
	Frederic Weisbecker <fweisbec@gmail.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	NeilBrown <neilb@suse.com>, Oleg Nesterov <oleg@redhat.com>,
	Daniel Wagner <wagi@monom.org>,
	Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
	Daniel Wagner <daniel.wagner@bmw-carit.de>,
	Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	Dmitry Torokhov <dmitry.torokhov@gmail.com>,
	Daniel Vetter <daniel.vetter@ffwll.ch>,
	Arend Van Spriel <arend@broadcom.com>,
	Johannes Berg <johannes@sipsolutions.net>,
	Mimi Zohar <zohar@linux.vnet.ibm.com>,
	David Howells <dhowells@redhat.com>,
	David Woodhouse <dwmw2@infradead.org>,
	Bjorn Andersson <bjorn.andersson@linaro.org>,
	Jeff Mahoney <jeffm@suse.com>,
	Andy Lutomirski <luto@amacapital.net>,
	Wu Fengguang <fengguang.wu@intel.com>,
	Julia Lawall <Julia.Lawall@lip6.fr>
Subject: Re: [PATCH v4 1/4] firmware: Move umh locking code into fw_load_from_user_helper()
Date: Fri, 9 Sep 2016 09:22:41 +0800	[thread overview]
Message-ID: <CACVXFVOEXFk5kitng9huJkiNjw077jJtkuR++3fkonTJJCbyGg@mail.gmail.com> (raw)
In-Reply-To: <20160908201106.GI3296@wotan.suse.de>

On Fri, Sep 9, 2016 at 4:11 AM, Luis R. Rodriguez <mcgrof@kernel.org> wrote:
> On Thu, Sep 08, 2016 at 11:37:54PM +0800, Ming Lei wrote:
>> On Wed, Sep 7, 2016 at 4:45 PM, Daniel Wagner <wagi@monom.org> wrote:
>> > From: Daniel Wagner <daniel.wagner@bmw-carit.de>
>> >
>> > When we load the firmware directly we don't need to take the umh
>> > lock.
>>
>> I am wondering if it can be wrong.
>
> If you disable the firmware UMH why would we need to lock if the lock is being
> shown only used for the firmare UMH ?
>
>> Actually in case of firmware loading, the usermode helper lock doesn't
>> only mean the user helper is usable, and it also may serve to mark the
>> filesystem/block device is ready for firmware loading, and of couse direct
>> loading need fs/block to be ready too.
>
> Yes but that's a race I've identified a while ago present even if you use initramfs *and*
> use kernel_read_file_from_path() on any part of the kernel [0], I proposed a possible

Actualy I mean the situation of suspend vs. resume, and some drivers
still may not benefit from firmware loading cache when requesting loading
in .resume(), at that time it is still too early for direct loading.
With UMH lock,
we can get a warning or avoid the issue.


> solution recently [1], given tons of folks were *complaining* about this but despite there
> being one solution proposed [2] it was still incorrect, as only *userspace* can know

In case of initramfs and built-in drivers, it isn't a surprise to see this race,
and that shouldn't be a common case. But for suspend vs. resume, it can be
true.


> for sure when your critical filesystems are mounted. Since we now have a shared
> "read file fs" helper kernel_read_file_from_path() it implicates the race is possible
> elsewhere as well.
>
> The race is present given you simply cannot know when the root filesystem
> (consider a series of pivot_root() calls, hey -- anyone can do that, who are we to
> tell them they can't), or in this particular case importance, only considering firmware,
> when /lib/firmware/ is ready.  In short I am saying this whole race thing is a
> bigger issue, and as much as Linus hates my proposed solution I have not heard
> any proposal alternatives to address this properly, not even clarifications on
> what our expectations for drivers then are if they use kernel_read_file_from_path()
> early on their driver code.
>
> The original goal of the usermode helper lock came can be traced back in
> time via a144c6a ("PM: Print a warning if firmware is requested when task
> are frozen) when Rafael noticed drivers were calling to request firmware
> on *resume* calls! Why would that be an issue? It was because of the stupid
> delays incurred on resume when firmware *was not found* __due__ to the stupid
> user mode helper timeout delay and people believing this often meant users
> confusing resume *stalling* for resume was *broken! Later b298d289c7
> ("PM / Sleep: Fix freezer failures due to racy usermodehelper_is_disabled()")
> addressed races. That code in turn was later massaged into shape to better
> accept direct FS loading via commit 4e0c92d015235 ("firmware: Refactoring for
> splitting user-mode helper code").
>
> So for a while we've been nagging driver developers to not add request_firmware()
> calls on resume calls. In fact the drivers/base/firmware_class.c code *kills*
> firmware UMH calls when we go to suspend for the firmware cache, as such even
> on suspend its a stupid idea to use the UMH, not only because it can
> stall suspend but *also* because we kill these UMH calls. Its why I recently
> proposed removing the possibility to call the firmware UMH from the firmware
> cache [3]. If this patch is wrong please do chime in!
>
> So, as I see it the user mode helper lock should have *nothing* to do with
> the race between reading files from the kernel and having a path ready. If
> it was *used* for that, that was papering over the real issue. Its no
> different of a hack for instance as if a driver using request_firmware() tried
> to use async probe to avoid the same race. Yes it will help, but no, it does
> not mean it is deterministically safe.
>
> Reviewing commit 247bc03742545 ("PM / Sleep: Mitigate race between the freezer
> and request_firmware()") which originally extended umh state machine from
> just being enabled/disabled, with the concepts of UMH_ENABLED, UMH_FREEZING,
> UMH_DISABLED -- its goal was to prevent UMH uses during suspend. So -- the
> "UMH lock" on firmware was actually added to help avoid races between freezing
> and request_firmware(). We should not re-use UMH status notifiers when the
> firmware UMH is disabled for the same concepts -- if we needed such a concept
> then we should take this out from UMH code and generalize it.
>
> In fact this shows there's still an issue here for UMH code as well. The
> usermodehelper_enable() call
>
> rest_init() --> kernel_thread(kernel_init, NULL, CLONE_FS); -->
>         kernel_init --> kernel_init_freeable() --> wait_for_completion(&kthreadd_done);
>
> So after kernel_init_freeable() kthreadd_done should be done but note that
> only userspace will know what partition set up to use. Shortly after
> this though in kernel_init_freeable() we call prepare_namespace() so if
> initramfs was set up its set up after.
>
> Meanwhile driver's inits are called via do_initcalls() called from do_basic_setup()
> and that is called within kernel_init_freeable() *prior* to prepare_namespace().
>
> So calling usermodehelper_enable() doesn't necessarily ensure that the paths
> for the UMH used will actually work either. This path also has a race.
>
> We need to address both things then:
>
>   1) *freeze* races / stalls
>   2)  userspace paths ready for whatever the kernel needs to read off of the
>       filesystem
>
> These issues are not particular to firmware -- this applies to all
> kernel_read_file_from_path() and as is being revealed now the UMH code. It gets
> me wondering if we have any shared code possible between UMH and
> kernel_read_file_from_path().
>
> For 1) we could just generalize the code.
>
> For 2) I proposed a solution as RFC although some hate it, my latest preference
> would be either *declare* these uses early reads clearly as not supported or
> have a proper init level that does guarantee to be safe against the userspace
> paths.
>
> I'm all ears for alternatives.
>
> [0] https://marc.info/?t=147286207700002&r=1&w=2
> [1] http://lkml.kernel.org/r/CAB=NE6UBRa0K7=PomJzKxsoj4GzAqkYrkp=O+UfVvu2fwM25pA@mail.gmail.com
> [2] https://lkml.org/lkml/2014/6/15/47
> [3] https://lkml.kernel.org/r/1473208930-6835-6-git-send-email-mcgrof@kernel.org
>
>   Luis

  reply	other threads:[~2016-09-09  1:22 UTC|newest]

Thread overview: 24+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-09-07  8:45 [PATCH v4 0/4] firmware: encapsulate firmware loading status Daniel Wagner
2016-09-07  8:45 ` [PATCH v4 1/4] firmware: Move umh locking code into fw_load_from_user_helper() Daniel Wagner
2016-09-07 23:33   ` Luis R. Rodriguez
2016-09-08 12:41     ` Daniel Wagner
2016-09-08 14:55       ` Luis R. Rodriguez
2016-09-08 15:37   ` Ming Lei
2016-09-08 20:11     ` Luis R. Rodriguez
2016-09-09  1:22       ` Ming Lei [this message]
     [not found]         ` <CAB=NE6XK9g9muQ8p5+VaVGt2t0E_4K0wiavacQGqt_S4PyN28A@mail.gmail.com>
2016-09-09  4:19           ` Ming Lei
2016-09-09 22:10             ` Luis R. Rodriguez
     [not found]               ` <efba9730-d3bc-1fd4-2511-e509a4c60971@bmw-carit.de>
2016-09-15 15:43                 ` Luis R. Rodriguez
2016-09-07  8:45 ` [PATCH v4 2/4] firmware: encapsulate firmware loading status Daniel Wagner
2016-09-08  1:39   ` Luis R. Rodriguez
2016-09-08  8:05     ` Daniel Wagner
2016-09-08  9:45       ` Daniel Wagner
2016-09-08 11:26   ` Ming Lei
2016-09-08 12:26     ` Daniel Wagner
2016-09-08 15:49       ` Ming Lei
2016-09-09 11:43         ` Daniel Wagner
2016-09-08 12:32   ` Daniel Wagner
2016-09-07  8:45 ` [PATCH v4 3/4] firmware: Drop bit ops in favor of simple state machine Daniel Wagner
2016-09-08  1:45   ` Luis R. Rodriguez
2016-09-08 12:44     ` Daniel Wagner
2016-09-07  8:45 ` [PATCH v4 4/4] firmware: Do not use fw_lock for fw_status protection Daniel Wagner

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=CACVXFVOEXFk5kitng9huJkiNjw077jJtkuR++3fkonTJJCbyGg@mail.gmail.com \
    --to=ming.lei@canonical.com \
    --cc=Julia.Lawall@lip6.fr \
    --cc=akpm@linux-foundation.org \
    --cc=arend@broadcom.com \
    --cc=bjorn.andersson@linaro.org \
    --cc=daniel.vetter@ffwll.ch \
    --cc=daniel.wagner@bmw-carit.de \
    --cc=dhowells@redhat.com \
    --cc=dmitry.torokhov@gmail.com \
    --cc=dwmw2@infradead.org \
    --cc=fengguang.wu@intel.com \
    --cc=fweisbec@gmail.com \
    --cc=gregkh@linuxfoundation.org \
    --cc=jeffm@suse.com \
    --cc=johannes@sipsolutions.net \
    --cc=linux-kernel@vger.kernel.org \
    --cc=luto@amacapital.net \
    --cc=mcgrof@kernel.org \
    --cc=neilb@suse.com \
    --cc=oleg@redhat.com \
    --cc=rjw@sisk.pl \
    --cc=tiwai@suse.de \
    --cc=torvalds@linux-foundation.org \
    --cc=wagi@monom.org \
    --cc=zohar@linux.vnet.ibm.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.