From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1752316AbdAYSbg (ORCPT );
	Wed, 25 Jan 2017 13:31:36 -0500
Received: from mx2.suse.de ([195.135.220.15]:43102 "EHLO mx2.suse.de"
	rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
	id S1752024AbdAYSbf (ORCPT );
	Wed, 25 Jan 2017 13:31:35 -0500
Date: Wed, 25 Jan 2017 19:31:27 +0100
From: "Luis R. Rodriguez"
To: Greg KH
Cc: "Luis R. Rodriguez", ming.lei@canonical.com, keescook@chromium.org,
	linux-kernel-dev@beckhoff.com, jakub.kicinski@netronome.com,
	chris@chris-wilson.co.uk, oss-drivers@netronome.com,
	johannes@sipsolutions.net, j@w1.fi, teg@jklm.no, kay@vrfy.org,
	jwboyer@fedoraproject.org, dmitry.torokhov@gmail.com,
	seth.forshee@canonical.com, bjorn.andersson@linaro.org,
	linux-kernel@vger.kernel.org, wagi@monom.org,
	stephen.boyd@linaro.org, zohar@linux.vnet.ibm.com, tiwai@suse.de,
	dwmw2@infradead.org, fengguang.wu@intel.com, dhowells@redhat.com,
	arend.vanspriel@broadcom.com, kvalo@codeaurora.org,
	kimran@codeaurora.org, "[3.10+]"
Subject: Re: [PATCH v2] firmware: fix NULL pointer dereference in __fw_load_abort()
Message-ID: <20170125183127.GS13946@wotan.suse.de>
References: <20170125144121.GA15767@kroah.com>
	<20170125152118.27171-1-mcgrof@kernel.org>
	<20170125154725.GB21106@kroah.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20170125154725.GB21106@kroah.com>
User-Agent: Mutt/1.6.0 (2016-04-01)
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

On Wed, Jan 25, 2017 at 04:47:25PM +0100, Greg KH wrote:
> On Wed, Jan 25, 2017 at 07:21:18AM -0800, Luis R. Rodriguez wrote:
> > Since commit 5d47ec02c37ea632398cb251c884e3a488dff794
> > ("firmware: Correct handling of fw_state_wait() return value")
> > fw_load_abort(fw_priv) could be called twice and lead us to a
> > kernel crash.
> > This happens only when the firmware fallback mechanism
> > (regular or custom) is used. The fallback mechanism exposes a sysfs
> > interface for userspace to upload a file and notify the kernel when
> > the file is loaded and ready, or to cancel an upload by echo'ing -1
> > into the loading file:
> >
> >   echo -n "-1" > /sys/$DEVPATH/loading
> >
> > This will call fw_load_abort(). Some distributions (Debian) actually
> > have a udev rule in place to *always* immediately cancel all
> > firmware fallback mechanism requests:
> >
> >   $ cat /lib/udev/rules.d/50-firmware.rules
> >   # stub for immediately telling the kernel that userspace firmware loading
> >   # failed; necessary to avoid long timeouts with CONFIG_FW_LOADER_USER_HELPER=y
> >   SUBSYSTEM=="firmware", ACTION=="add", ATTR{loading}="-1"
> >
> > This was done because udev removed the firmware fallback mechanism
> > a while ago, and because of a long-standing, misunderstood issue
> > with the timeout (now corrected). Distributions with this udev rule
> > would run into this crash only if the fallback mechanism is used.
> > Since most distributions disable the fallback mechanism by default
> > (CONFIG_FW_LOADER_USER_HELPER_FALLBACK), this typically means only
> > the 2 drivers which *require* the fallback mechanism could incur a
> > crash: drivers/firmware/dell_rbu.c and the
> > drivers/leds/leds-lp55xx-common.c driver.
> >
> > The crash happens because after commit 5b029624948d ("firmware: do
> > not use fw_lock for fw_state protection") and subsequent fix commit
> > 5d47ec02c37ea6 ("firmware: Correct handling of fw_state_wait()
> > return value") a race can happen between this cancellation and
> > fw_state_wait_timeout() being woken up after a state change by
> > fw_load_abort(), as that calls swake_up(). Upon error
> > fw_state_wait_timeout() will again call fw_load_abort() and trigger
> > a NULL pointer dereference.
> >
> > At first glance we could just fix this with a !buf check on
> > fw_load_abort() before accessing buf->fw_st, however there is
> > a logical issue in having a state machine used for the fallback
> > mechanism and preventing access to it once we abort, as it's inside
> > the buf (buf->fw_st).
> >
> > The firmware_class.c code is setting the buf to NULL to annotate
> > that an abort has occurred. Replace this mechanism by simply using
> > the state check instead. All the other code in place already uses
> > similar checks for aborting as well so no further changes are
> > needed.
> >
> > An oops can be reproduced with the new fw_fallback.sh fallback
> > mechanism cancellation test. Either cancelling the fallback
> > mechanism or the custom fallback mechanism triggers a crash.
>
> You are still writing books here.

Alright trimmed.

> With crazy margins, pick one line width (72 columns), and stick with it
> please.
>
> Can you reformat this and resend please?

Sure.

  Luis
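
For illustration only, the idea described in the quoted commit message
(record the abort in the state machine instead of setting buf to NULL)
can be modelled in a few lines of userspace C. This is a sketch under
assumptions, not the actual patch: the struct layouts are simplified
stand-ins, fw_load_abort_sketch() and its return value are hypothetical,
and locking and the swake_up() wakeup are left out.

```c
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-ins for the firmware_class.c types (assumed layout). */
enum fw_status {
	FW_STATUS_UNKNOWN,
	FW_STATUS_LOADING,
	FW_STATUS_DONE,
	FW_STATUS_ABORTED,
};

struct fw_state {
	enum fw_status status;
};

struct firmware_buf {
	struct fw_state fw_st;
};

struct firmware_priv {
	struct firmware_buf *buf;
};

/* The abort is read back from the state machine, not from buf == NULL. */
static bool fw_state_is_aborted(const struct fw_state *fw_st)
{
	return fw_st->status == FW_STATUS_ABORTED;
}

/*
 * Hypothetical abort helper: buf is never cleared, so a second call
 * (for example from the error path after the wait returns) only sees
 * the ABORTED state and does nothing. Returns true when this call
 * performed the transition.
 */
static bool fw_load_abort_sketch(struct firmware_priv *fw_priv)
{
	struct firmware_buf *buf = fw_priv->buf;

	if (fw_state_is_aborted(&buf->fw_st))
		return false;

	buf->fw_st.status = FW_STATUS_ABORTED;
	return true;
}
```

In this model the second caller finds a valid buf and an ABORTED state,
so there is nothing left to dereference through a NULL pointer.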