From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
        id S1751297AbdAQQau (ORCPT ); Tue, 17 Jan 2017 11:30:50 -0500
Received: from mail-qt0-f179.google.com ([209.85.216.179]:32822 "EHLO
        mail-qt0-f179.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
        with ESMTP id S1751246AbdAQQas (ORCPT );
        Tue, 17 Jan 2017 11:30:48 -0500
MIME-Version: 1.0
In-Reply-To:
References: <20170117153505.20308-1-jakub.kicinski@netronome.com>
        <20170117161512.GC13946@wotan.suse.de>
From: Jakub Kicinski
Date: Tue, 17 Jan 2017 08:30:37 -0800
Message-ID:
Subject: Re: [PATCHv2] firmware: Correct handling of fw_state_wait_timeout() return value
To: "Luis R. Rodriguez"
Cc: Chris Wilson , linux-kernel-dev@beckhoff.com, Greg Kroah-Hartman ,
        Bjorn Andersson , Daniel Wagner , Ming Lei ,
        "linux-kernel@vger.kernel.org" , oss-drivers@netronome.com
Content-Type: text/plain; charset=UTF-8
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

On Tue, Jan 17, 2017 at 8:21 AM, Luis R. Rodriguez wrote:
>>>
>>>         retval = fw_state_wait_timeout(&buf->fw_st, timeout);
>>> -       if (retval < 0) {
>>> +       if (retval == -ETIMEDOUT || retval == -ERESTARTSYS) {
>>>                 mutex_lock(&fw_lock);
>>>                 fw_load_abort(fw_priv);
>>>                 mutex_unlock(&fw_lock);
>>
>> This is a bit messy, two other similar issues were reported before
>> and upon review I suggested Patrick Bruenn's fix with a better commit
>> log seems best fit. Patrick sent a patch Jan 4, 2017 but never followed up
>> despite my feedback on a small change on the commit log message [0]. Can you
>> try that and if that fixes it can you adjust the commit log accordingly? Please
>> note the preferred solution would be:
>>
>> diff --git a/drivers/base/firmware_class.c b/drivers/base/firmware_class.c
>> index b9ac348e8d33..c530f8b4af01 100644
>> --- a/drivers/base/firmware_class.c
>> +++ b/drivers/base/firmware_class.c
>> @@ -542,6 +542,8 @@ static struct firmware_priv *to_firmware_priv(struct device *dev)
>>
>>  static void __fw_load_abort(struct firmware_buf *buf)
>>  {
>> +       if (!buf)
>> +               return;

Allow me to try to persuade you one last time :)

My patch makes the code more logical and easier to follow.  The code
says: in case no wake up happened - finish the wait (otherwise the
waking thread finishes it).

Adding a NULL-check would just paper over the issue and can cause
trouble down the line.  If fw_state_wait_timeout() returned because
someone woke it up - there is no reason to abort the wait.  The wait
is already finished.  The buggy commit mixed up return codes from
fw_state_wait_timeout() - mixed "nobody woke us up" with "we couldn't
find the FW", that's why we need to check for specific error codes.
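
To make the distinction concrete, here is a minimal userspace sketch of
the rule argued for above. It is not the kernel code: struct fake_buf,
nobody_woke_us() and handle_wait_result() are hypothetical names made up
for illustration, and -ENOENT is only assumed here as a stand-in for the
"we couldn't find the FW" case. The one thing it tries to show is that
the waiter aborts the wait only when nobody woke it up.

```c
#include <errno.h>
#include <stdbool.h>

/*
 * ERESTARTSYS is kernel-internal and is not exported by the userspace
 * <errno.h>. The value 512 matches the kernel's definition and is
 * defined here only so that this sketch compiles outside the kernel.
 */
#ifndef ERESTARTSYS
#define ERESTARTSYS 512
#endif

/* Hypothetical stand-in for the firmware buffer state. */
struct fake_buf {
	bool wait_finished;	/* set by whoever finishes the wait */
	int abort_calls;	/* number of aborts issued by the waiter */
};

/*
 * True only when the wait ended without anybody waking the waiter:
 * either the timeout expired or a signal interrupted the sleep.
 */
static bool nobody_woke_us(int retval)
{
	return retval == -ETIMEDOUT || retval == -ERESTARTSYS;
}

/*
 * Models the caller of the wait. If somebody woke us up (success, or
 * an error reported by the waking side), the wait is already finished
 * and must not be aborted a second time.
 */
static int handle_wait_result(struct fake_buf *buf, int retval)
{
	if (nobody_woke_us(retval)) {
		buf->abort_calls++;
		buf->wait_finished = true;
	}
	return retval;
}
```

With this shape, a return of -ETIMEDOUT or -ERESTARTSYS triggers exactly
one abort, while 0 or -ENOENT leaves the state alone, because in those
cases the waking side is assumed to have finished the wait already.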