From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 3C294C433EF for ; Tue, 12 Oct 2021 14:56:07 +0000 (UTC) Received: from alsa0.perex.cz (alsa0.perex.cz [77.48.224.243]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id E974561052 for ; Tue, 12 Oct 2021 14:56:05 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.4.1 mail.kernel.org E974561052 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=linux.intel.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=alsa-project.org Received: from alsa1.perex.cz (alsa1.perex.cz [207.180.221.201]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by alsa0.perex.cz (Postfix) with ESMTPS id 959E3167D; Tue, 12 Oct 2021 16:55:13 +0200 (CEST) DKIM-Filter: OpenDKIM Filter v2.11.0 alsa0.perex.cz 959E3167D DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=alsa-project.org; s=default; t=1634050563; bh=ThdmByDjDeUDIS+TR7RAO9cKl9cDDcDUXXfVAx7aSF8=; h=Subject:To:References:From:Date:In-Reply-To:Cc:List-Id: List-Unsubscribe:List-Archive:List-Post:List-Help:List-Subscribe: From; b=ZmTteSMauqRDBJnQFLQ54kr67qpDlx2aiqjfjn2e+ufTlSr7ldtncnInyVSH5LSzF 8u3IRGn1R0N8H0kVNvTwFbPr9j+8ViEghwMBCxFbl0PJNizwBR662gz6DPzLKjlRg1 q//W4RfZciYMDTiCTjKK87IrWoNuJzTYnPjg6rp0= Received: from alsa1.perex.cz (localhost.localdomain [127.0.0.1]) by alsa1.perex.cz (Postfix) with ESMTP id 1B0A9F80088; Tue, 12 Oct 2021 16:55:13 +0200 (CEST) Received: by alsa1.perex.cz (Postfix, from userid 50401) id 1BA36F80245; Tue, 12 Oct 2021 16:55:11 +0200 (CEST) Received: from mga07.intel.com (mga07.intel.com [134.134.136.100]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by alsa1.perex.cz (Postfix) with ESMTPS id D31E5F80088 for ; Tue, 12 Oct 2021 16:55:04 +0200 (CEST) DKIM-Filter: OpenDKIM Filter v2.11.0 alsa1.perex.cz D31E5F80088 X-IronPort-AV: E=McAfee;i="6200,9189,10135"; a="290662537" X-IronPort-AV: E=Sophos;i="5.85,367,1624345200"; d="scan'208";a="290662537" Received: from fmsmga007.fm.intel.com ([10.253.24.52]) by orsmga105.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 12 Oct 2021 07:54:55 -0700 X-IronPort-AV: E=Sophos;i="5.85,367,1624345200"; d="scan'208";a="486417507" Received: from csharp1-mobl.amr.corp.intel.com (HELO [10.213.183.127]) ([10.213.183.127]) by fmsmga007-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 12 Oct 2021 07:54:54 -0700 Subject: Re: [RFC PATCH v2 0/5] ASoC: soc-pcm: fix trigger race conditions with shared BE To: Takashi Iwai References: <20211004225441.233375-1-pierre-louis.bossart@linux.intel.com> <1efa1c31-7342-05f8-5f73-95e2462d4179@linux.intel.com> <3683cf39-632b-50df-c65d-63779c464850@nvidia.com> <11257d77-9975-3b00-94da-5dc1b5c95fc6@linux.intel.com> <80882fe6-ea30-43f6-8d83-8995dd28c748@linux.intel.com> <60c6a90b-290d-368c-ce61-4d86e70eaa78@linux.intel.com> <75894aba-ca1a-51d6-df7d-ad53fcd89f79@linux.intel.com> <29397354-dc5b-7837-c71b-df4bde707df2@linux.intel.com> <8c1353f0-e897-e7b0-c7b9-5712b05ac91f@linux.intel.com> From: Pierre-Louis Bossart Message-ID: <9d336461-b2fe-8cd4-0096-356620622f8d@linux.intel.com> Date: Tue, 12 Oct 2021 08:45:41 -0500 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:78.0) Gecko/20100101 Firefox/78.0 Thunderbird/78.13.0 MIME-Version: 1.0 In-Reply-To: Content-Type: text/plain; charset=utf-8 Content-Language: en-US Content-Transfer-Encoding: 7bit Cc: alsa-devel@alsa-project.org, Kuninori Morimoto , Sameer Pujar , vkoul@kernel.org, broonie@kernel.org, Gyeongtaek Lee , Peter Ujfalusi X-BeenThere: alsa-devel@alsa-project.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: "Alsa-devel mailing list for ALSA developers - http://www.alsa-project.org" List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: alsa-devel-bounces@alsa-project.org Sender: "Alsa-devel" >>> And indeed there's a deadlock! >>> >>> snd_pcm_period_elapsed() takes the FE pcm stream lock, and will call >>> snd_pcm_trigger. >> >> Indeed, this would deadlock. >> >>> So if we also take the pcm stream lock in the BE >>> trigger, there's a conceptual deadlock: we painted ourselves in a corner >>> by using the same lock twice. >>> >>> Takashi, are you really sure we should protect these for_each_dpcm_be() >>> loops with the same pcm lock? >> >> The call within the FE lock is done only in dpcm_dai_trigger_fe_be(), >> and this should call dpcm_be_dai_trigger() as is. In other places, >> the calls are without FE lock, hence they can take the lock, >> e.g. create a variant dpcm_dai_trigger_fe_be_lock(). > > Or rather it was the code path of snd_soc_dpcm_check_state()? > The function could be called from dpcm_be_dai_trigger(). > Currently dpcm_lock seems to be added at places casually covering some > of the for_each_dpcm_be() or whatever... The whole lock scheme has to > be revisited. > >>> it seems like asking for trouble to >>> revisit the ALSA core just to walking through a list of BEs? Would you >>> object to changing dpcm_lock as I suggested, but not interfering with >>> stream handling? >> >> That would work, too, it's just a pity to degrade the fine-grained >> locks that have been already taken into global locks... > > For the performance problem, at least, you can make it rwlock and > rwsem types (depending on nonatomic) so that the concurrent accesses > would work. The only exclusive lock is the case for adding and > removing the entries, which should be done with write lock / sem down, > while other link traversals can be executed in the read lock / sem. Thanks Takashi for your feedback. Let's first get the locking right. We can optimize performance later. I will continue with the idea of using existing fine-grained locks a bit more, I am with you that this dpcm_lock was not added in a consistent way and reusing the concept is really building on sand. We can remove the lock in snd_soc_dpcm_check_state, I already did the change in other versions. But what I'll need to check further is if indeed dpcm_be_dai_trigger() is called with the FE lock taken already. Adding a lockdep_assert_help() or something would help I guess. The part where I am still not clear is that snd_pcm_period_elapsed uses the irqsave/irqrestore version, but in the initial code you suggested the vanilla irq version is fine. I am not sure I get the nuance here, and btw in the case of SOF the snd_pcm_period_elapsed is called from a workqueue, not an interrupt handler, as a work-around to avoid an IPC deadlock, so we probably don't need the irqsave/irqrestore version anyways.