linux-kernel.vger.kernel.org archive mirror
From: "Luis R. Rodriguez" <mcgrof@kernel.org>
To: Jakub Kicinski <jakub.kicinski@netronome.com>
Cc: "Luis R. Rodriguez" <mcgrof@kernel.org>,
	Chris Wilson <chris@chris-wilson.co.uk>,
	linux-kernel-dev@beckhoff.com,
	Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	Bjorn Andersson <bjorn.andersson@linaro.org>,
	Daniel Wagner <daniel.wagner@bmw-carit.de>,
	Ming Lei <ming.lei@canonical.com>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	oss-drivers@netronome.com
Subject: Re: [PATCHv2] firmware: Correct handling of fw_state_wait_timeout() return value
Date: Tue, 17 Jan 2017 21:53:27 +0100
Message-ID: <20170117205327.GF13946@wotan.suse.de>
In-Reply-To: <CAJpBn1zg7AX9v93dtMpQyvip9zwUk+aAKU8U6bAaYP7gu-+bdA@mail.gmail.com>

On Tue, Jan 17, 2017 at 10:04:20AM -0800, Jakub Kicinski wrote:
> On Tue, Jan 17, 2017 at 9:30 AM, Luis R. Rodriguez <mcgrof@kernel.org> wrote:
> > On Tue, Jan 17, 2017 at 08:30:37AM -0800, Jakub Kicinski wrote:
> >> On Tue, Jan 17, 2017 at 8:21 AM, Luis R. Rodriguez <mcgrof@kernel.org> wrote:
> >> >>>
> >> >>>       retval = fw_state_wait_timeout(&buf->fw_st, timeout);
> >> >>> -     if (retval < 0) {
> >> >>> +     if (retval == -ETIMEDOUT || retval == -ERESTARTSYS) {
> >> >>>               mutex_lock(&fw_lock);
> >> >>>               fw_load_abort(fw_priv);
> >> >>>               mutex_unlock(&fw_lock);
> >> >>
> >> >> This is a bit messy, two other similar issues were reported before
> >> >> and upon review I suggested Patrick Bruenn's fix with a better commit
> >> >> log seems best fit. Patrick sent a patch Jan 4, 2017 but never followed up
> >> >> despite my feedback on a small change on the commit log message [0]. Can you
> >> >> try that and if that fixes it can you adjust the commit log accordingly? Please
> >> >> note the preferred solution would be:
> >> >>
> >> >> diff --git a/drivers/base/firmware_class.c b/drivers/base/firmware_class.c
> >> >> index b9ac348e8d33..c530f8b4af01 100644
> >> >> --- a/drivers/base/firmware_class.c
> >> >> +++ b/drivers/base/firmware_class.c
> >> >> @@ -542,6 +542,8 @@ static struct firmware_priv *to_firmware_priv(struct device *dev)
> >> >>
> >> >>  static void __fw_load_abort(struct firmware_buf *buf)
> >> >>  {
> >> >> +       if (!buf)
> >> >> +               return;
> >>
> >> Allow me to try to persuade you one last time :)  My patch makes the
> >> code more logical and easier to follow.  The code says:
> >> in case no wake up happened - finish the wait (otherwise the waking
> >> thread finishes it).
> >
> > Your patch is still wrong: as Patrick's great commit log notes, a NULL
> > dereference can also happen on a race between -1 being sent and a
> > -ENOENT error, so we'd also have to adjust for the case where
> > __fw_state_wait_common() returns -ENOENT.
> 
> Sorry, I don't follow.  _Not_ calling abort on -ENOENT error is
> exactly what my patch does.

Yeah, I see now what you mean. Your approach avoids the buf issue as well.
It's still not addressing the real issue though, which is the sloppy
use of a status on the buf, which at one point gets set to NULL.
This latter practice makes it rather hard to use a stateful check
correctly.
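
To make that concrete, here is a minimal user-space sketch, not the
kernel code. Everything prefixed model_ is a hypothetical stand-in
(for firmware_priv, firmware_buf and fw_load_abort()), and locking is
left out. It only shows that once the abort path clears the buf
pointer, the status stored in the buf cannot be consulted any more, so
a second abort has nothing but the pointer itself to go on:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of the status kept on the buffer. */
enum model_status { MODEL_UNKNOWN, MODEL_LOADING, MODEL_DONE, MODEL_ABORTED };

struct model_buf {
	enum model_status status;
};

struct model_priv {
	struct model_buf *buf;	/* cleared once an abort has run */
};

/* Returns 1 if an abort was performed, 0 if there was nothing to abort. */
int model_load_abort(struct model_priv *priv)
{
	struct model_buf *buf = priv->buf;

	/* The only guard available: the status lives in buf, which may be gone. */
	if (!buf)
		return 0;

	buf->status = MODEL_ABORTED;
	priv->buf = NULL;	/* mirrors "avoid user action after loading abort" */
	return 1;
}

/* Models two abort requests in a row: a user abort, then the wait error path. */
int model_double_abort(void)
{
	struct model_buf buf = { .status = MODEL_LOADING };
	struct model_priv priv = { .buf = &buf };
	int first = model_load_abort(&priv);
	int second = model_load_abort(&priv);

	assert(priv.buf == NULL);
	return first * 10 + second;	/* 10: first aborted, second was a no-op */
}
```

In this model the second caller cannot tell "aborted" apart from "never
had a buffer"; all it sees is a NULL pointer.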

> >> Adding a NULL-check would just paper over the
> >> issue and can cause trouble down the line.
> >
> > We typically bail on errors and use similar code to bail out, and we
> > typically do these things. Here it's no different. The *real* issue
> > is the fact that we have a waiting timeout which can race against
> > a user imposed error out on the sysfs interface. There is one catch:
> >
> > We already lock with the big fw_lock and use this to be able to check
> > for the status of the fw, so once aborted we technically should not have
> > to abort again. A proper way to address this then would have been to check
> > for the status of the fw prior to aborting again given we also lock on the
> > big fw_lock. A problem with this though is the status is part of the buf
> > which is set to NULL after we are done aborting.
> 
> Yes, I've seen that too :\  This race seems to have been there prior
> to 4.9, though.  I guess we could fix both issues with the NULL-check
> although I would prefer if we had both patches.
> 
> FWIW I think the NULL-check could be put in the existing conditional:
> 
>          * There is a small window in which user can write to 'loading'
>          * between loading done and disappearance of 'loading'
>          */
> -       if (fw_state_is_done(&buf->fw_st))
> +       if (!buf || fw_state_is_done(&buf->fw_st))
>                 return;
> 
>         list_del_init(&buf->pending_list);
> 
> Note that the comment above seems to be mentioning the race we're
> trying to solve.

Right, I think another approach is to instead *enable* the state of the buf
to be used to avoid further use on the sysfs interface. Fortunately,
other sysfs interfaces already use fw_state_is_done() to bail out,
so I think all that would be needed is:

diff --git a/drivers/base/firmware_class.c b/drivers/base/firmware_class.c
index b9ac348e8d33..30ccf7aea3ca 100644
--- a/drivers/base/firmware_class.c
+++ b/drivers/base/firmware_class.c
@@ -558,9 +558,6 @@ static void fw_load_abort(struct firmware_priv *fw_priv)
 	struct firmware_buf *buf = fw_priv->buf;
 
 	__fw_load_abort(buf);
-
-	/* avoid user action after loading abort */
-	fw_priv->buf = NULL;
 }
 
 static LIST_HEAD(pending_fw_head);
@@ -713,7 +710,7 @@ static ssize_t firmware_loading_store(struct device *dev,
 
 	mutex_lock(&fw_lock);
 	fw_buf = fw_priv->buf;
-	if (!fw_buf)
+	if (!fw_buf || fw_state_is_aborted(&fw_buf->fw_st))
 		goto out;
 
 	switch (loading) {
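
The idea behind the diff above can be modelled in user space roughly as
follows. This is a sketch under assumptions, not the kernel
implementation: every name prefixed sketch_ is hypothetical, the state
is a plain enum rather than the real fw_state, and the fw_lock that the
kernel callers would hold is left out. The point is only that the buf
pointer stays valid after an abort, so later callers can bail out by
looking at the state:

```c
#include <assert.h>
#include <stddef.h>

enum sketch_status { SKETCH_LOADING, SKETCH_DONE, SKETCH_ABORTED };

struct sketch_buf {
	enum sketch_status status;
	int abort_count;	/* bookkeeping for the example only */
};

struct sketch_priv {
	struct sketch_buf *buf;	/* stays set after an abort in this model */
};

/* In the kernel the callers would hold fw_lock around this call. */
void sketch_load_abort(struct sketch_priv *priv)
{
	struct sketch_buf *buf;

	assert(priv != NULL);
	buf = priv->buf;

	/* Assumption of this sketch: a finished or aborted load is left alone. */
	if (!buf || buf->status != SKETCH_LOADING)
		return;

	buf->status = SKETCH_ABORTED;
	buf->abort_count++;
}

/* Models a write to 'loading'; returns 1 if acted upon, 0 if ignored. */
int sketch_loading_store(struct sketch_priv *priv, int loading)
{
	struct sketch_buf *buf = priv->buf;

	/* Same shape as the check in the diff: no buf, or already aborted. */
	if (!buf || buf->status == SKETCH_ABORTED)
		return 0;

	if (loading == -1) {
		sketch_load_abort(priv);
		return 1;
	}
	return 0;
}
```

With this shape, a user abort followed by the wait error path runs the
abort work once, and a later write to 'loading' is ignored because the
state, not a NULL pointer, records what happened.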


Thread overview: 26+ messages
2017-01-17 15:35 [PATCHv2] firmware: Correct handling of fw_state_wait_timeout() return value Jakub Kicinski
2017-01-17 16:15 ` Luis R. Rodriguez
2017-01-17 16:21   ` Luis R. Rodriguez
2017-01-17 16:30     ` Jakub Kicinski
2017-01-17 17:30       ` Luis R. Rodriguez
2017-01-17 18:04         ` Jakub Kicinski
2017-01-17 20:53           ` Luis R. Rodriguez [this message]
2017-01-17 21:17             ` Jakub Kicinski
2017-01-18  6:33               ` linux-kernel-dev
2017-01-18 20:01                 ` Luis R. Rodriguez
2017-01-23 16:11                   ` [PATCH 0/7] firmware: expand test units for fallback mechanism Luis R. Rodriguez
2017-01-23 16:11                     ` [PATCH 1/7] test_firmware: move misc_device down Luis R. Rodriguez
2017-01-23 16:11                     ` [PATCH 2/7] test_firmware: use device attribute groups Luis R. Rodriguez
2017-01-23 16:11                     ` [PATCH 3/7] tools: firmware: check for distro fallback udev cancel rule Luis R. Rodriguez
2017-01-23 16:11                     ` [PATCH 4/7] tools: firmware: rename fallback mechanism script Luis R. Rodriguez
2017-01-23 16:11                     ` [PATCH 5/7] tools: firmware: add fallback cancelation testing Luis R. Rodriguez
2017-01-23 16:11                     ` [PATCH 6/7] test_firmware: add test custom fallback trigger Luis R. Rodriguez
2017-01-23 16:11                     ` [PATCH 7/7] firmware: firmware: fix NULL pointer dereference in __fw_load_abort() Luis R. Rodriguez
2017-01-25 10:52                       ` Greg KH
2017-01-25 13:36                         ` Luis R. Rodriguez
2017-01-25 13:42                           ` Luis R. Rodriguez
2017-01-25 14:41                             ` Greg KH
2017-01-25 15:21                               ` [PATCH v2] " Luis R. Rodriguez
2017-01-25 15:47                                 ` Greg KH
2017-01-25 18:31                                   ` Luis R. Rodriguez
2017-01-25 18:31                                   ` [PATCH v3] " Luis R. Rodriguez
