From: linux-kernel-dev
To: Jakub Kicinski, "Luis R. Rodriguez"
CC: Chris Wilson, Greg Kroah-Hartman, Bjorn Andersson, Daniel Wagner,
    Ming Lei, "linux-kernel@vger.kernel.org", "oss-drivers@netronome.com"
Subject: RE: [PATCHv2] firmware: Correct handling of fw_state_wait_timeout() return value
Date: Wed, 18 Jan 2017 06:33:56 +0000
References: <20170117153505.20308-1-jakub.kicinski@netronome.com>
    <20170117161512.GC13946@wotan.suse.de>
    <20170117173041.GE13946@wotan.suse.de>
    <20170117205327.GF13946@wotan.suse.de>

>From: Jakub Kicinski [mailto:jakub.kicinski@netronome.com]
>Sent: Tuesday, 17 January 2017 22:18
>
>On Tue, Jan 17, 2017 at 12:53 PM, Luis R. Rodriguez wrote:
>> On Tue, Jan 17, 2017 at 10:04:20AM -0800, Jakub Kicinski wrote:
>>> On Tue, Jan 17, 2017 at 9:30 AM, Luis R. Rodriguez wrote:
>>> > On Tue, Jan 17, 2017 at 08:30:37AM -0800, Jakub Kicinski wrote:
>>> >> Adding a NULL-check would just paper over the
>>> >> issue and can cause trouble down the line.
>>> >
>>> > We typically bail on errors and use similar code to bail out, and we
>>> > typically do these things. Here it's no different. The *real* issue
>>> > is the fact that we have a waiting timeout which can fail race against
>>> > a user imposed error out on the sysfs interface. There is one catch:
>>> >
>>> > We already lock with the big fw_lock and use this to be able to check
>>> > for the status of the fw, so once aborted we technically should not have
>>> > to abort again. A proper way to address this then would have been to check
>>> > for the status of the fw prior to aborting again given we also lock on the
>>> > big fw_lock. A problem with this though is the status is part of the buf
>>> > which is set to NULL after we are done aborting.
>>>
>>> Yes, I've seen that too :\  This race seems to have been there prior
>>> to 4.9, though.  I guess we could fix both issues with the NULL-check
>>> although I would prefer if we had both patches.
>>>
>>> FWIW I think the NULL-check could be put in the existing conditional:
>>>
>>>          * There is a small window in which user can write to 'loading'
>>>          * between loading done and disappearance of 'loading'
>>>          */
>>> -       if (fw_state_is_done(&buf->fw_st))
>>> +       if (!buf || fw_state_is_done(&buf->fw_st))
>>>                 return;
>>>
>>>         list_del_init(&buf->pending_list);
>>>
>>> Note that the comment above seems to be mentioning the race we're
>>> trying to solve.
>>
>> Right, I think another approach is to *enable* the state of the buf
>> to be used to avoid further use on the sysfs interface instead.
>> Fortunately other sysfs interfaces already use fw_state_is_done() to bail out,
>> so all that would be needed I think would be:
>>
>> diff --git a/drivers/base/firmware_class.c b/drivers/base/firmware_class.c
>> index b9ac348e8d33..30ccf7aea3ca 100644
>> --- a/drivers/base/firmware_class.c
>> +++ b/drivers/base/firmware_class.c
>> @@ -558,9 +558,6 @@ static void fw_load_abort(struct firmware_priv *fw_priv)
>>         struct firmware_buf *buf = fw_priv->buf;
>>
>>         __fw_load_abort(buf);
>> -
>> -       /* avoid user action after loading abort */
>> -       fw_priv->buf = NULL;
>>  }
>>
>>  static LIST_HEAD(pending_fw_head);
>> @@ -713,7 +710,7 @@ static ssize_t firmware_loading_store(struct device *dev,
>>
>>         mutex_lock(&fw_lock);
>>         fw_buf = fw_priv->buf;
>> -       if (!fw_buf)
>> +       if (!fw_buf || fw_state_is_aborted(&fw_buf->fw_st))
>>                 goto out;
>>
>>         switch (loading) {
>
>IMHO this one is nice!  I think you can even drop the !fw_buf check in
>this case because AFAICS the only case where fw_buf is set to NULL is
>in the abort function.
>

I can confirm that the patch looks nice and works on my setup, even without
the !fw_buf check. Feel free to grab everything you need from my commit log,
if it helps. Unfortunately there is a crazy spam filter between us, so you
can't rely on me.
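
To make the idea in the quoted patch concrete, below is a minimal userspace sketch of it. It is not the kernel code. It assumes a pthread mutex as a stand-in for fw_lock, and every type and function in it (the *_model structs, fw_lock_model, load_abort_locked(), timeout_path(), loading_store_model()) is a hypothetical name invented for illustration. It shows only the control flow under discussion: the abort path leaves the buf pointer in place and records the aborted state, and the sysfs-style store path checks that state under the same lock instead of relying on a NULL pointer.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the firmware loading states. */
enum fw_status_model {
    FW_MODEL_UNKNOWN,
    FW_MODEL_LOADING,
    FW_MODEL_DONE,
    FW_MODEL_ABORTED
};

/* Hypothetical, heavily reduced models of firmware_buf / firmware_priv. */
struct fw_buf_model {
    enum fw_status_model status;
    int abort_count;            /* how many times an abort really ran */
};

struct fw_priv_model {
    struct fw_buf_model *buf;
};

/* Stand-in for the single big lock (fw_lock in the thread). */
static pthread_mutex_t fw_lock_model = PTHREAD_MUTEX_INITIALIZER;

static bool state_is_done(const struct fw_buf_model *buf)
{
    return buf->status == FW_MODEL_DONE;
}

static bool state_is_aborted(const struct fw_buf_model *buf)
{
    return buf->status == FW_MODEL_ABORTED;
}

/*
 * Abort a load. The caller must hold fw_lock_model.
 * priv->buf is deliberately NOT set to NULL afterwards, so later
 * callers can still inspect the state.
 */
static void load_abort_locked(struct fw_priv_model *priv)
{
    struct fw_buf_model *buf = priv->buf;

    if (state_is_done(buf) || state_is_aborted(buf))
        return;                 /* nothing left to abort */

    buf->status = FW_MODEL_ABORTED;
    buf->abort_count++;
}

/* Models the wait-timeout path aborting the load. */
static void timeout_path(struct fw_priv_model *priv)
{
    pthread_mutex_lock(&fw_lock_model);
    load_abort_locked(priv);
    pthread_mutex_unlock(&fw_lock_model);
}

/*
 * Models a user write to the 'loading' attribute.
 * Returns true if the write was acted on, false if it was ignored.
 */
static bool loading_store_model(struct fw_priv_model *priv, int loading)
{
    bool handled = false;
    struct fw_buf_model *buf;

    pthread_mutex_lock(&fw_lock_model);
    buf = priv->buf;
    if (state_is_aborted(buf))
        goto out;               /* already aborted: ignore the write */

    switch (loading) {
    case 1:
        buf->status = FW_MODEL_LOADING;
        handled = true;
        break;
    case -1:
        load_abort_locked(priv);
        handled = true;
        break;
    default:
        break;
    }
out:
    pthread_mutex_unlock(&fw_lock_model);
    return handled;
}
```

In this model, calling timeout_path() and then loading_store_model(&priv, -1) aborts exactly once: the second call sees the aborted state under the lock and returns without dereferencing anything that has been cleared.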