From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <linux-kernel-owner@vger.kernel.org>
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
        id S1752283AbdARUBq (ORCPT <rfc822;w@1wt.eu>);
        Wed, 18 Jan 2017 15:01:46 -0500
Received: from mx2.suse.de ([195.135.220.15]:35881 "EHLO mx2.suse.de"
        rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
        id S1751121AbdARUBo (ORCPT <rfc822;linux-kernel@vger.kernel.org>);
        Wed, 18 Jan 2017 15:01:44 -0500
Date: Wed, 18 Jan 2017 21:01:41 +0100
From: "Luis R. Rodriguez" <mcgrof@kernel.org>
To: linux-kernel-dev <linux-kernel-dev@beckhoff.com>
Cc: Jakub Kicinski <jakub.kicinski@netronome.com>,
        "Luis R. Rodriguez" <mcgrof@kernel.org>,
        Chris Wilson <chris@chris-wilson.co.uk>,
        Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
        Bjorn Andersson <bjorn.andersson@linaro.org>,
        Daniel Wagner <daniel.wagner@bmw-carit.de>,
        Ming Lei <ming.lei@canonical.com>,
        "linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
        "oss-drivers@netronome.com" <oss-drivers@netronome.com>
Subject: Re: [PATCHv2] firmware: Correct handling of fw_state_wait_timeout()
 return value
Message-ID: <20170118200141.GH13946@wotan.suse.de>
References: <20170117153505.20308-1-jakub.kicinski@netronome.com>
 <20170117161512.GC13946@wotan.suse.de>
 <CAB=NE6Xj0TpwMVTDWtEaYAqSn8HdVapXqUf7j4a+i2f+zdkSZA@mail.gmail.com>
 <CAJpBn1wkUzNxQGy+d1Lq_7UCsgjvM65E+=cNZcP7NBSMyS157g@mail.gmail.com>
 <20170117173041.GE13946@wotan.suse.de>
 <CAJpBn1zg7AX9v93dtMpQyvip9zwUk+aAKU8U6bAaYP7gu-+bdA@mail.gmail.com>
 <20170117205327.GF13946@wotan.suse.de>
 <CAJpBn1yngh7hgRN0FKPY=Qgk3s85dUL1Xpjb9ud8_YB8pbL2PA@mail.gmail.com>
 <D30C732FB58879449E51C7A346E6B9718D43F658@NT-MAIL05.beckhoff.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <D30C732FB58879449E51C7A346E6B9718D43F658@NT-MAIL05.beckhoff.com>
User-Agent: Mutt/1.6.0 (2016-04-01)
Sender: linux-kernel-owner@vger.kernel.org
List-ID: <linux-kernel.vger.kernel.org>
X-Mailing-List: linux-kernel@vger.kernel.org

On Wed, Jan 18, 2017 at 06:33:56AM +0000, linux-kernel-dev wrote:
> >From: Jakub Kicinski [mailto:jakub.kicinski@netronome.com]
> >Sent: Tuesday, January 17, 2017 22:18
> >
> >On Tue, Jan 17, 2017 at 12:53 PM, Luis R. Rodriguez <mcgrof@kernel.org> wrote:
> >> On Tue, Jan 17, 2017 at 10:04:20AM -0800, Jakub Kicinski wrote:
> >>> On Tue, Jan 17, 2017 at 9:30 AM, Luis R. Rodriguez <mcgrof@kernel.org> wrote:
> >>> > On Tue, Jan 17, 2017 at 08:30:37AM -0800, Jakub Kicinski wrote:
> >>> >> Adding a NULL-check would just paper over the
> >>> >> issue and can cause trouble down the line.
> >>> >
> >>> > We typically bail on errors and use similar code to bail out. Here
> >>> > it's no different. The *real* issue is that the wait timeout can
> >>> > race against a user-imposed abort through the sysfs interface.
> >>> > There is one catch:
> >>> >
> >>> > We already take the big fw_lock and use it to check the status of
> >>> > the fw, so once aborted we technically should not have to abort
> >>> > again. A proper way to address this would have been to check the
> >>> > status of the fw prior to aborting again, given we also hold the big
> >>> > fw_lock. A problem with this, though, is that the status is part of
> >>> > the buf, which is set to NULL after we are done aborting.
> >>>
> >>> Yes, I've seen that too :\  This race seems to have been there prior
> >>> to 4.9, though.  I guess we could fix both issues with the NULL-check
> >>> although I would prefer if we had both patches.
> >>>
> >>> FWIW I think the NULL-check could be put in the existing conditional:
> >>>
> >>>          * There is a small window in which user can write to 'loading'
> >>>          * between loading done and disappearance of 'loading'
> >>>          */
> >>> -       if (fw_state_is_done(&buf->fw_st))
> >>> +       if (!buf || fw_state_is_done(&buf->fw_st))
> >>>                 return;
> >>>
> >>>         list_del_init(&buf->pending_list);
> >>>
> >>> Note that the comment above seems to be mentioning the race we're
> >>> trying to solve.
> >>
> >> Right, I think another approach is to *enable* the state of the buf
> >> to be used to avoid further use on the sysfs interface instead. Fortunately
> >> other sysfs interfaces already use fw_state_is_done() to bail out,
> >> so all that would be needed I think would be:
> >>
> >> diff --git a/drivers/base/firmware_class.c b/drivers/base/firmware_class.c
> >> index b9ac348e8d33..30ccf7aea3ca 100644
> >> --- a/drivers/base/firmware_class.c
> >> +++ b/drivers/base/firmware_class.c
> >> @@ -558,9 +558,6 @@ static void fw_load_abort(struct firmware_priv *fw_priv)
> >>         struct firmware_buf *buf = fw_priv->buf;
> >>
> >>         __fw_load_abort(buf);
> >> -
> >> -       /* avoid user action after loading abort */
> >> -       fw_priv->buf = NULL;
> >>  }
> >>
> >>  static LIST_HEAD(pending_fw_head);
> >> @@ -713,7 +710,7 @@ static ssize_t firmware_loading_store(struct device *dev,
> >>
> >>         mutex_lock(&fw_lock);
> >>         fw_buf = fw_priv->buf;
> >> -       if (!fw_buf)
> >> +       if (!fw_buf || fw_state_is_aborted(&fw_buf->fw_st))
> >>                 goto out;
> >>
> >>         switch (loading) {
> >
> >IMHO this one is nice!  I think you can even drop the !fw_buf check in
> >this case because AFAICS the only case where fw_buf is set to NULL is
> >in the abort function.
> >
> I can confirm that the patch looks nice and works on my setup, even without the !fw_buf check.
> Feel free to grab everything you need from my commit log, if it helps.
> Unfortunately there is a crazy spam filter between us, so you can't rely on me.

OK, I'll submit this version with a Reported-and-tested-by tag from each of you.
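
For anyone reading this in the archives, here is a rough userspace sketch of
the idea in the quoted diff: abort only records the state and leaves the buf
pointer alone, and the 'loading' store path bails out under the lock when it
sees the aborted state. All demo_* names are made up for illustration and are
not the kernel's; the real code is in drivers/base/firmware_class.c and uses
fw_lock, fw_state_is_done() and fw_state_is_aborted(). Treat it as a model of
the pattern under those assumptions, not as the patch itself.

```c
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins, not the kernel's definitions. */
enum demo_fw_status {
	DEMO_FW_UNKNOWN,
	DEMO_FW_LOADING,
	DEMO_FW_DONE,
	DEMO_FW_ABORTED,
};

struct demo_fw_buf {
	enum demo_fw_status status;
};

struct demo_fw_priv {
	struct demo_fw_buf *buf;
};

/* Plays the role of the big fw_lock. */
static pthread_mutex_t demo_fw_lock = PTHREAD_MUTEX_INITIALIZER;

/* Abort only records the state; priv->buf is deliberately left set. */
static void demo_fw_load_abort(struct demo_fw_priv *priv)
{
	pthread_mutex_lock(&demo_fw_lock);
	if (priv->buf && priv->buf->status != DEMO_FW_DONE)
		priv->buf->status = DEMO_FW_ABORTED;
	pthread_mutex_unlock(&demo_fw_lock);
}

/*
 * Models a write to 'loading'. Returns true if the write was acted on,
 * false if it was ignored (no buf, already aborted, or invalid value).
 */
static bool demo_loading_store(struct demo_fw_priv *priv, int loading)
{
	struct demo_fw_buf *buf;
	bool handled = false;

	pthread_mutex_lock(&demo_fw_lock);
	buf = priv->buf;
	if (!buf || buf->status == DEMO_FW_ABORTED)
		goto out;

	switch (loading) {
	case 1:		/* start loading */
		buf->status = DEMO_FW_LOADING;
		handled = true;
		break;
	case 0:		/* finish loading, only valid while loading */
		if (buf->status == DEMO_FW_LOADING) {
			buf->status = DEMO_FW_DONE;
			handled = true;
		}
		break;
	case -1:	/* user-requested abort */
		buf->status = DEMO_FW_ABORTED;
		handled = true;
		break;
	default:
		break;
	}
out:
	pthread_mutex_unlock(&demo_fw_lock);
	return handled;
}
```

The point of the sketch is that a late write arriving after the timeout path
has aborted finds a valid buf in the aborted state and is ignored, so nothing
dereferences a NULL pointer and nothing aborts twice.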

  Luis