From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <linux-kernel-owner@vger.kernel.org>
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
        id S1754712AbdEXUcx (ORCPT <rfc822;w@1wt.eu>);
        Wed, 24 May 2017 16:32:53 -0400
Received: from mx2.suse.de ([195.135.220.15]:44105 "EHLO mx1.suse.de"
        rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP
        id S1750774AbdEXUcu (ORCPT <rfc822;linux-kernel@vger.kernel.org>);
        Wed, 24 May 2017 16:32:50 -0400
Date: Wed, 24 May 2017 22:32:46 +0200
From: "Luis R. Rodriguez" <mcgrof@kernel.org>
To: "Luis R. Rodriguez" <mcgrof@kernel.org>,
        John Ewalt <jewalt@lgsinnovations.com>
Cc: yi1.li@linux.intel.com, atull@kernel.org, gregkh@linuxfoundation.org,
        wagi@monom.org, dwmw2@infradead.org, rafal@milecki.pl,
        arend.vanspriel@broadcom.com, rjw@rjwysocki.net,
        moritz.fischer@ettus.com, pmladek@suse.com, johannes.berg@intel.com,
        emmanuel.grumbach@intel.com, luciano.coelho@intel.com,
        kvalo@codeaurora.org, luto@kernel.org, takahiro.akashi@linaro.org,
        dhowells@redhat.com, pjones@redhat.com, linux-kernel@vger.kernel.org,
        linux-fpga@vger.kernel.org
Subject: Re: [PATCHv2 0/3] Enable no_cache flag to driver_data
Message-ID: <20170524203246.GD8951@wotan.suse.de>
References: <1495262819-981-1-git-send-email-yi1.li@linux.intel.com>
 <20170524190357.GB8951@wotan.suse.de>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20170524190357.GB8951@wotan.suse.de>
User-Agent: Mutt/1.6.0 (2016-04-01)
Sender: linux-kernel-owner@vger.kernel.org
List-ID: <linux-kernel.vger.kernel.org>
X-Mailing-List: linux-kernel@vger.kernel.org

On Wed, May 24, 2017 at 09:03:57PM +0200, Luis R. Rodriguez wrote:
> __fw_lookup_buf() really should be
> renamed to something that reflects this is a cache lookup.

Actually I take this back, other than the cache, note that when we
fw_lookup_and_allocate_buf() we first __fw_lookup_buf() but if the buf is not
there we allocate a new one and list_add() it to the cache regardless of
whether or not the cache thing has been enabled.

What this does then *also* other than caching for suspend/resume (which we
should document more formally) is to gather up contending lookups together
with one buf, and share the final status of just one lookup. If a buf is
found we fw_state_wait() on _request_firmware_prepare() until the buf clears.

John Ewalt recently reported some issues with loadng multiple files at the
same time. He also provided a patch. The swake_up() fix seems sensible
and would seem to have been caused by the swait transformation, but in
inspecting the other proposed changes it would seem we have had tons of
other lingering bugs which have probably existed for ages.

For instance, if there are pending requests for a leader request to send back
info, and one is about to complete but in the last moment on
assign_firmware_buf() fails, all the error paths lack a wake up call. As such
all pending requests may just wait and linger, and since none of these have
a timeout I would expect these to just linger forever. I'm not even sure we
kref_get() properly on the buf for pending requests when we are waiting for
a serialized request, ie, we might be able to take a buf underneath the nose
of a waiter.

Although some are new bugs, some seem to be really old bugs.

These are the sorts of issues I wish for a test driver to be able to uncover,
test and ensure we never regress again. This is also why I am being careful
about enabling a feature, we should *really* think things through well before
enabling on the new API.

[0] https://bugzilla.kernel.org/show_bug.cgi?id=195477

  Luis