From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-8.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS,MAILING_LIST_MULTI, MENTIONS_GIT_HOSTING,SPF_HELO_NONE,SPF_PASS autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id C0CE3C4363A for ; Thu, 22 Oct 2020 09:50:49 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 4E78B2245F for ; Thu, 22 Oct 2020 09:50:49 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S2896130AbgJVJus (ORCPT ); Thu, 22 Oct 2020 05:50:48 -0400 Received: from out3-smtp.messagingengine.com ([66.111.4.27]:41459 "EHLO out3-smtp.messagingengine.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S2895990AbgJVJur (ORCPT ); Thu, 22 Oct 2020 05:50:47 -0400 Received: from compute6.internal (compute6.nyi.internal [10.202.2.46]) by mailout.nyi.internal (Postfix) with ESMTP id 5BD8B5C010D; Thu, 22 Oct 2020 05:50:45 -0400 (EDT) Received: from mailfrontend1 ([10.202.2.162]) by compute6.internal (MEProxy); Thu, 22 Oct 2020 05:50:45 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=cerno.tech; h= date:from:to:cc:subject:message-id:mime-version:content-type; s= fm1; bh=6BHBolasL6AWLQMU1Q79xtHYTVSi6GRqnV5yGU5u1bY=; b=G2SoQpON +S6bFzg7pDzYLNs9U8EN6+k9OZJKfHTiqTG1HIq/0mOI//xMX4D0/yovv7G+fecM ox1NVDe3qdN2NkEGXhPmNaj2ik4uiGxyQi1Nfhg8zetyVic7pFifRP2tFXEkMfho Bmxy+xkUqLhXL+tG1m6opWHwOkX9c/r43ta/h8R8KBNre8Cozs1L0ag76CWh/HEE myEOcfjWnTo29Bl2sFWMEo7gh7kMMltkT9vLa2lYTILdU3vWRm2FXnER+vQdYvTq ySLQinewqJWdbrtn1PU2rN8cgeFA3OEMjpidmRShQRm3h2FSMxJdRBwyQJaydk1i ojgDRyJe890fSw== DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d= messagingengine.com; h=cc:content-type:date:from:message-id :mime-version:subject:to:x-me-proxy:x-me-proxy:x-me-sender :x-me-sender:x-sasl-enc; s=fm1; bh=6BHBolasL6AWLQMU1Q79xtHYTVSi6 GRqnV5yGU5u1bY=; b=FlxBKD1VsefiOYh3FaBdulr3toW7MJoBWWvQOSeAdoYlX mdR6ePWJ+A/U0iiHR1Bglj9JO8GkWvBWGFSJ4LdwB3h3CogrpTqUXnO93oBE6P6t XT4yotgdgPAmVJSYYYdWgZWFZe4tgK4p8gL7KKJn8cchQ56LEsh2qPyy4burirOM B2X3CC6GQKxl8uOxmAUlclbPlws5jKK5r5zav/PCcQMKd/iSwLTnIAxBTTr774CS M4mGTDZ4QpYXvu2LJou1ty9mPtb+4P9VotGbbFO2hM6mIpu16ygGGremppy5Ef+v 1CDUdZFcrREDSVtaj5p9SG0Ypviwrw1ZDXYqogsgA== X-ME-Sender: X-ME-Proxy-Cause: gggruggvucftvghtrhhoucdtuddrgedujedrjeejgddvtdcutefuodetggdotefrodftvf curfhrohhfihhlvgemucfhrghsthforghilhdpqfgfvfdpuffrtefokffrpgfnqfghnecu uegrihhlohhuthemuceftddtnecusecvtfgvtghiphhivghnthhsucdlqddutddtmdenuc fjughrpeffhffvuffkgggtugesghdtreertddtvdenucfhrhhomhepofgrgihimhgvucft ihhprghrugcuoehmrgigihhmvgestggvrhhnohdrthgvtghhqeenucggtffrrghtthgvrh hnpefgvdeutddvvedvvddvtdehvedvleeghfeghfeluedvhfehudfhheffuefhjeffgeen ucffohhmrghinhepkhgvrhhnvghlrdhorhhgnecukfhppeeltddrkeelrdeikedrjeeine cuvehluhhsthgvrhfuihiivgeptdenucfrrghrrghmpehmrghilhhfrhhomhepmhgrgihi mhgvsegtvghrnhhordhtvggthh X-ME-Proxy: Received: from localhost (lfbn-tou-1-1502-76.w90-89.abo.wanadoo.fr [90.89.68.76]) by mail.messagingengine.com (Postfix) with ESMTPA id E04E13280064; Thu, 22 Oct 2020 05:50:43 -0400 (EDT) Date: Thu, 22 Oct 2020 11:50:41 +0200 From: Maxime Ripard To: Jaroslav Kysela , Takashi Iwai , Liam Girdwood , Mark Brown Cc: alsa-devel@alsa-project.org, linux-kernel@vger.kernel.org, Dom Cobley , Dave Stevenson , Nicolas Saenz Julienne Subject: Context expectations in ALSA Message-ID: <20201022095041.44jytaelnlako54w@gilmour.lan> MIME-Version: 1.0 Content-Type: multipart/signed; micalg=pgp-sha256; protocol="application/pgp-signature"; boundary="gvzixar4mrubl242" Content-Disposition: inline Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org --gvzixar4mrubl242 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline Content-Transfer-Encoding: quoted-printable Hi! We currently have an issue reported by lockdep on the RaspberryPi and its HDMI audio output where, at startup, we end up scheduling in atomic context. This is caused by the HDMI driver polling some status bit that reports that the infoframes have been properly sent, and calling usleep_range between each iteration[1], and that is done in our trigger callback that seems to be run with a spinlock taken and the interrupt disabled (snd_pcm_action_lock_irq) as part of snd_pcm_start_lock_irq. This is the entire stack trace: # aplay --channels=3D2 -D 'iec958:CARD=3Dvc4hdmi0,DEV=3D0' /root/test-tone.= wav Playing WAVE '/root/test-tone.wav' : Signed 16 bit Little Endian, Rate 4410= 0 Hz, Stereo [ 14.732730] BUG: sleeping function called from invalid context at driver= s/gpu/drm/vc4/vc4_hdmi.c:276 [ 14.742005] in_atomic(): 1, irqs_disabled(): 128, non_block: 0, pid: 140= , name: aplay [ 14.749955] CPU: 0 PID: 140 Comm: aplay Not tainted 5.9.0-rc5-v7l+ #66 [ 14.756578] Hardware name: BCM2711 [ 14.760026] Backtrace:=20 [ 14.762524] [] (dump_backtrace) from [] (show_stack+= 0x20/0x24) [ 14.770209] r7:c167c9e4 r6:600e0093 r5:00000000 r4:c167c9e4 [ 14.775960] [] (show_stack) from [] (dump_stack+0xb8= /0xd8) [ 14.783297] [] (dump_stack) from [] (___might_sleep+= 0x138/0x17c) [ 14.791159] r10:00000084 r9:f0979b00 r8:c10ffd50 r7:00000010 r6:0000000= 0 r5:00000114 [ 14.799105] r4:c10fe000 r3:600e0093 [ 14.802736] [] (___might_sleep) from [] (__might_sle= ep+0x70/0xb0) [ 14.810680] r4:c0e8f5f4 [ 14.813256] [] (__might_sleep) from [] (vc4_hdmi_sto= p_packet+0x120/0x330) [ 14.821907] r6:ef21f280 r5:00000003 r4:73e746f9 [ 14.826598] [] (vc4_hdmi_stop_packet) from [] (vc4_h= dmi_write_infoframe+0x140/0x5d0) [ 14.836220] r9:f0979b00 r8:c10ffd50 r7:ef219900 r6:ef248840 r5:00000000= r4:ef21f280 [ 14.844084] [] (vc4_hdmi_write_infoframe) from [] (v= c4_hdmi_set_audio_infoframe+0x5c/0x80) [ 14.854236] r10:efac7738 r9:ef3ea240 r8:00000001 r7:ef219900 r6:ef24884= 0 r5:ef21f040 [ 14.862180] r4:ef21f280 [ 14.864756] [] (vc4_hdmi_set_audio_infoframe) from [= ] (vc4_hdmi_audio_trigger+0x38/0x238) [ 14.874815] r4:00000001 [ 14.877392] [] (vc4_hdmi_audio_trigger) from [] (snd= _soc_pcm_dai_trigger+0x64/0xcc) [ 14.886923] r5:efac5a00 r4:00000000 [ 14.890555] [] (snd_soc_pcm_dai_trigger) from [] (so= c_pcm_trigger+0x70/0xac) [ 14.899472] r9:ef3ea240 r8:c1133540 r7:00000003 r6:ef219900 r5:ef219900= r4:00000001 [ 14.907336] [] (soc_pcm_trigger) from [] (snd_pcm_do= _start+0x3c/0x40) [ 14.915633] r5:c0c749dc r4:00000000 [ 14.919264] [] (snd_pcm_do_start) from [] (snd_pcm_a= ction_single+0x48/0x88) [ 14.928094] [] (snd_pcm_action_single) from [] (snd_= pcm_action+0x6c/0x78) [ 14.936746] r7:00000003 r6:c0c749dc r5:00000000 r4:ef219900 [ 14.942493] [] (snd_pcm_action) from [] (snd_pcm_act= ion_lock_irq+0x38/0x50) [ 14.951320] r7:00000030 r6:00000003 r5:c0c749dc r4:ef219900 [ 14.957069] [] (snd_pcm_action_lock_irq) from [] (sn= d_pcm_ioctl+0x930/0x14e4) [ 14.966073] r7:00000030 r6:00000000 r5:0085ee60 r4:ef219900 [ 14.971823] [] (snd_pcm_ioctl) from [] (sys_ioctl+0x= e4/0x9e4) [ 14.979420] r10:efac7738 r9:00000004 r8:c1133540 r7:0085ee60 r6:0000562= 4 r5:c1133540 [ 14.987364] r4:00004142 [ 14.989940] [] (sys_ioctl) from [] (ret_fast_syscall= +0x0/0x28) [ 14.997621] Exception stack(0xc10fffa8 to 0xc10ffff0) [ 15.002749] ffa0: 00869260 b6fa8000 00000004 00004142 = 0085ee60 00000001 [ 15.011050] ffc0: 00869260 b6fa8000 00005624 00000036 00000000 00000000 = 00001589 00001589 [ 15.019350] ffe0: b6fa8504 bedb59fc b6f2ff68 b6da515c [ 15.024479] r10:00000036 r9:c10fe000 r8:c0200204 r7:00000036 r6:0000562= 4 r5:b6fa8000 [ 15.032423] r4:00869260 We could switch to a udelay instead, but the idea of busy-waiting for up to a 100ms while having the interrupt disabled doesn't sound ideal either. It looks like the snd_soc_dai_link structure has a nonatomic flag that seems to be made to address more or less that issue, taking a mutex instead of a spinlock. However setting that flag results in another lockdep issue, since the dmaengine controller doing the DMA transfer would call snd_pcm_period_elapsed on completion, in a tasklet, this time taking a mutex in an atomic context which is just as bad as the initial issue. This is the stacktrace this time: # aplay --channels=3D2 -D 'iec958:CARD=3Dvc4hdmi0,DEV=3D0' /root/test-tone.= wav Playing WAVE '/root/test-tone.wav' : Signed 16 bit Little Endian, Rate 4410= 0 Hz, Stereo [ 43.078900] BUG: sleeping function called from invalid context at kernel= /locking/mutex.c:281 [ 43.087485] in_atomic(): 1, irqs_disabled(): 0, non_block: 0, pid: 0, na= me: swapper/0 [ 43.095452] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.9.0-rc5-v7l+ #67 [ 43.102252] Hardware name: BCM2711 [ 43.105701] Backtrace:=20 [ 43.108199] [] (dump_backtrace) from [] (show_stack+= 0x20/0x24) [ 43.115884] r7:c167c9e4 r6:600e0113 r5:00000000 r4:c167c9e4 [ 43.121636] [] (show_stack) from [] (dump_stack+0xb8= /0xd8) [ 43.128972] [] (dump_stack) from [] (___might_sleep+= 0x138/0x17c) [ 43.136833] r10:c1600000 r9:00000006 r8:00000000 r7:c1601d70 r6:0000000= 0 r5:00000119 [ 43.144777] r4:c1600000 r3:600e0113 [ 43.148409] [] (___might_sleep) from [] (__might_sle= ep+0x70/0xb0) [ 43.156353] r4:c0e32994 [ 43.158930] [] (__might_sleep) from [] (mutex_lock+0= x2c/0x74) [ 43.166524] r6:efbfaa84 r5:ef26f548 r4:ef219998 [ 43.171216] [] (mutex_lock) from [] (_snd_pcm_stream= _lock_irqsave+0x2c/0x40) [ 43.180130] r5:ef26f548 r4:ef219900 [ 43.183765] [] (_snd_pcm_stream_lock_irqsave) from [= ] (snd_pcm_period_elapsed+0x20/0xa4) [ 43.193743] [] (snd_pcm_period_elapsed) from [] (dma= engine_pcm_dma_complete+0x54/0x58) [ 43.203539] r7:c1601d70 r6:efbfaa84 r5:ef26f548 r4:00000000 [ 43.209290] [] (dmaengine_pcm_dma_complete) from [] = (vchan_complete+0x238/0x240) [ 43.218567] [] (vchan_complete) from [] (tasklet_act= ion_common.constprop.0+0x84/0x12c) [ 43.228365] r10:c1600000 r9:00000006 r8:00000000 r7:dac802e0 r6:efbfaa8= 4 r5:00000000 [ 43.236309] r4:efbfaa80 [ 43.238886] [] (tasklet_action_common.constprop.0) from [] (tasklet_action+0x28/0x30) [ 43.248596] r9:00000040 r8:ef80b800 r7:00000101 r6:00000006 r5:00000007= r4:c1603098 [ 43.256459] [] (tasklet_action) from [] (__do_softir= q+0x18c/0x47c) [ 43.264498] [] (__do_softirq) from [] (irq_exit+0xd0= /0xf8) [ 43.271829] r10:c1051f40 r9:c1600000 r8:ef80b800 r7:00000001 r6:0000000= 0 r5:00000000 [ 43.279774] r4:ffffe000 [ 43.282349] [] (irq_exit) from [] (__handle_domain_i= rq+0x70/0xc0) [ 43.290294] r5:00000000 r4:c148ee80 [ 43.293927] [] (__handle_domain_irq) from [] (gic_ha= ndle_irq+0x4c/0x88) [ 43.302403] r9:c1600000 r8:c1601ed0 r7:f0803000 r6:f0802000 r5:f080200c= r4:c16058d8 [ 43.310266] [] (gic_handle_irq) from [] (__irq_svc+0= x5c/0x7c) [ 43.317856] Exception stack(0xc1601ed0 to 0xc1601f18) [ 43.322983] 1ec0: c0208b84 00000000 = 19bf8000 600e0093 [ 43.331283] 1ee0: c1600000 00000000 c1604e34 c1604e7c c16d20e5 c0e30cac = c1051f40 c1601f2c [ 43.339582] 1f00: ffffe000 c1601f20 c1600000 c0208b88 600e0013 ffffffff [ 43.346297] r9:c1600000 r8:c16d20e5 r7:c1601f04 r6:ffffffff r5:600e0013= r4:c0208b88 [ 43.354164] [] (arch_cpu_idle) from [] (default_idle= _call+0x3c/0x184) [ 43.362469] [] (default_idle_call) from [] (do_idle+= 0x110/0x170) [ 43.370327] r5:00000000 r4:c1600000 [ 43.373958] [] (do_idle) from [] (cpu_startup_entry+= 0x28/0x2c) [ 43.381641] r9:00000193 r8:effff900 r7:c1604e00 r6:00000000 r5:c160ec50= r4:000000d7 [ 43.389503] [] (cpu_startup_entry) from [] (rest_ini= t+0x108/0x110) [ 43.397542] [] (rest_init) from [] (arch_call_rest_i= nit+0x18/0x1c) [ 43.405575] r5:c1051f40 r4:c16e7078 [ 43.409205] [] (arch_call_rest_init) from [] (start_= kernel+0x7f4/0x834) [ 43.417682] [] (start_kernel) from [<00000000>] (0x0) So, I'm not really sure what I'm supposed to do here. The drivers involved don't appear to be doing anything extraordinary, but the issues lockdep report are definitely valid too. What are the expectations in terms of context from ALSA when running the callbacks, and how can we fix it? Thanks! Maxime 1: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/= drivers/gpu/drm/vc4/vc4_hdmi.c#n234 --gvzixar4mrubl242 Content-Type: application/pgp-signature; name="signature.asc" -----BEGIN PGP SIGNATURE----- iHUEABYIAB0WIQRcEzekXsqa64kGDp7j7w1vZxhRxQUCX5FV8QAKCRDj7w1vZxhR xd+YAP4qhrOcrrRABpAqke0n1wCMBK8Tuns2MtNAWplJyBAUMgEAwG0fqoF3xPd8 CLTH9Y/74KG7s9Y9hIQtMjrtoEZebgk= =FjFM -----END PGP SIGNATURE----- --gvzixar4mrubl242-- From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-8.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,HEADER_FROM_DIFFERENT_DOMAINS,MAILING_LIST_MULTI, MENTIONS_GIT_HOSTING,SPF_HELO_NONE,SPF_PASS autolearn=unavailable autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 9A75FC4363A for ; Thu, 22 Oct 2020 09:51:51 +0000 (UTC) Received: from alsa0.perex.cz (alsa0.perex.cz [77.48.224.243]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 982CE2245F for ; Thu, 22 Oct 2020 09:51:49 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (1024-bit key) header.d=alsa-project.org header.i=@alsa-project.org header.b="jevOjKz9"; dkim=temperror (0-bit key) header.d=cerno.tech header.i=@cerno.tech header.b="G2SoQpON"; dkim=temperror (0-bit key) header.d=messagingengine.com header.i=@messagingengine.com header.b="FlxBKD1V" DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 982CE2245F Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=cerno.tech Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=alsa-devel-bounces@alsa-project.org Received: from alsa1.perex.cz (alsa1.perex.cz [207.180.221.201]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by alsa0.perex.cz (Postfix) with ESMTPS id 525571657; Thu, 22 Oct 2020 11:50:57 +0200 (CEST) DKIM-Filter: OpenDKIM Filter v2.11.0 alsa0.perex.cz 525571657 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=alsa-project.org; s=default; t=1603360307; bh=u8Ug0Sis7rt3w9Xz/Y9z0ONlTZx7KqpW8UmbP0ft6jo=; h=Date:From:To:Subject:Cc:List-Id:List-Unsubscribe:List-Archive: List-Post:List-Help:List-Subscribe:From; b=jevOjKz9lLZUj2i+Ai6zLXlEFc+JB3UyNJ4GG1/m3dbb1uvjS92M/QZgJgV1SvVGy oa5udAPePMWFXTejW+IZWV9k8J0hJSXbK+c96DMtgOjS6lbDIoo8N2mJayQkDppTyP pEYnU1b6b2qiefeZB7iQ0C/CkbaImqlqbhO6sj9o= Received: from alsa1.perex.cz (localhost.localdomain [127.0.0.1]) by alsa1.perex.cz (Postfix) with ESMTP id D33A2F8049C; Thu, 22 Oct 2020 11:50:56 +0200 (CEST) Received: by alsa1.perex.cz (Postfix, from userid 50401) id A8E9FF804A9; Thu, 22 Oct 2020 11:50:51 +0200 (CEST) Received: from out3-smtp.messagingengine.com (out3-smtp.messagingengine.com [66.111.4.27]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by alsa1.perex.cz (Postfix) with ESMTPS id 8DB7BF80247 for ; Thu, 22 Oct 2020 11:50:46 +0200 (CEST) DKIM-Filter: OpenDKIM Filter v2.11.0 alsa1.perex.cz 8DB7BF80247 Authentication-Results: alsa1.perex.cz; dkim=pass (2048-bit key) header.d=cerno.tech header.i=@cerno.tech header.b="G2SoQpON"; dkim=pass (2048-bit key) header.d=messagingengine.com header.i=@messagingengine.com header.b="FlxBKD1V" Received: from compute6.internal (compute6.nyi.internal [10.202.2.46]) by mailout.nyi.internal (Postfix) with ESMTP id 5BD8B5C010D; Thu, 22 Oct 2020 05:50:45 -0400 (EDT) Received: from mailfrontend1 ([10.202.2.162]) by compute6.internal (MEProxy); Thu, 22 Oct 2020 05:50:45 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=cerno.tech; h= date:from:to:cc:subject:message-id:mime-version:content-type; s= fm1; bh=6BHBolasL6AWLQMU1Q79xtHYTVSi6GRqnV5yGU5u1bY=; b=G2SoQpON +S6bFzg7pDzYLNs9U8EN6+k9OZJKfHTiqTG1HIq/0mOI//xMX4D0/yovv7G+fecM ox1NVDe3qdN2NkEGXhPmNaj2ik4uiGxyQi1Nfhg8zetyVic7pFifRP2tFXEkMfho Bmxy+xkUqLhXL+tG1m6opWHwOkX9c/r43ta/h8R8KBNre8Cozs1L0ag76CWh/HEE myEOcfjWnTo29Bl2sFWMEo7gh7kMMltkT9vLa2lYTILdU3vWRm2FXnER+vQdYvTq ySLQinewqJWdbrtn1PU2rN8cgeFA3OEMjpidmRShQRm3h2FSMxJdRBwyQJaydk1i ojgDRyJe890fSw== DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d= messagingengine.com; h=cc:content-type:date:from:message-id :mime-version:subject:to:x-me-proxy:x-me-proxy:x-me-sender :x-me-sender:x-sasl-enc; s=fm1; bh=6BHBolasL6AWLQMU1Q79xtHYTVSi6 GRqnV5yGU5u1bY=; b=FlxBKD1VsefiOYh3FaBdulr3toW7MJoBWWvQOSeAdoYlX mdR6ePWJ+A/U0iiHR1Bglj9JO8GkWvBWGFSJ4LdwB3h3CogrpTqUXnO93oBE6P6t XT4yotgdgPAmVJSYYYdWgZWFZe4tgK4p8gL7KKJn8cchQ56LEsh2qPyy4burirOM B2X3CC6GQKxl8uOxmAUlclbPlws5jKK5r5zav/PCcQMKd/iSwLTnIAxBTTr774CS M4mGTDZ4QpYXvu2LJou1ty9mPtb+4P9VotGbbFO2hM6mIpu16ygGGremppy5Ef+v 1CDUdZFcrREDSVtaj5p9SG0Ypviwrw1ZDXYqogsgA== X-ME-Sender: X-ME-Proxy-Cause: gggruggvucftvghtrhhoucdtuddrgedujedrjeejgddvtdcutefuodetggdotefrodftvf curfhrohhfihhlvgemucfhrghsthforghilhdpqfgfvfdpuffrtefokffrpgfnqfghnecu uegrihhlohhuthemuceftddtnecusecvtfgvtghiphhivghnthhsucdlqddutddtmdenuc fjughrpeffhffvuffkgggtugesghdtreertddtvdenucfhrhhomhepofgrgihimhgvucft ihhprghrugcuoehmrgigihhmvgestggvrhhnohdrthgvtghhqeenucggtffrrghtthgvrh hnpefgvdeutddvvedvvddvtdehvedvleeghfeghfeluedvhfehudfhheffuefhjeffgeen ucffohhmrghinhepkhgvrhhnvghlrdhorhhgnecukfhppeeltddrkeelrdeikedrjeeine cuvehluhhsthgvrhfuihiivgeptdenucfrrghrrghmpehmrghilhhfrhhomhepmhgrgihi mhgvsegtvghrnhhordhtvggthh X-ME-Proxy: Received: from localhost (lfbn-tou-1-1502-76.w90-89.abo.wanadoo.fr [90.89.68.76]) by mail.messagingengine.com (Postfix) with ESMTPA id E04E13280064; Thu, 22 Oct 2020 05:50:43 -0400 (EDT) Date: Thu, 22 Oct 2020 11:50:41 +0200 From: Maxime Ripard To: Jaroslav Kysela , Takashi Iwai , Liam Girdwood , Mark Brown Subject: Context expectations in ALSA Message-ID: <20201022095041.44jytaelnlako54w@gilmour.lan> MIME-Version: 1.0 Content-Type: multipart/signed; micalg=pgp-sha256; protocol="application/pgp-signature"; boundary="gvzixar4mrubl242" Content-Disposition: inline Cc: alsa-devel@alsa-project.org, Dave Stevenson , linux-kernel@vger.kernel.org, Nicolas Saenz Julienne , Dom Cobley X-BeenThere: alsa-devel@alsa-project.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: "Alsa-devel mailing list for ALSA developers - http://www.alsa-project.org" List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: alsa-devel-bounces@alsa-project.org Sender: "Alsa-devel" --gvzixar4mrubl242 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline Content-Transfer-Encoding: quoted-printable Hi! We currently have an issue reported by lockdep on the RaspberryPi and its HDMI audio output where, at startup, we end up scheduling in atomic context. This is caused by the HDMI driver polling some status bit that reports that the infoframes have been properly sent, and calling usleep_range between each iteration[1], and that is done in our trigger callback that seems to be run with a spinlock taken and the interrupt disabled (snd_pcm_action_lock_irq) as part of snd_pcm_start_lock_irq. This is the entire stack trace: # aplay --channels=3D2 -D 'iec958:CARD=3Dvc4hdmi0,DEV=3D0' /root/test-tone.= wav Playing WAVE '/root/test-tone.wav' : Signed 16 bit Little Endian, Rate 4410= 0 Hz, Stereo [ 14.732730] BUG: sleeping function called from invalid context at driver= s/gpu/drm/vc4/vc4_hdmi.c:276 [ 14.742005] in_atomic(): 1, irqs_disabled(): 128, non_block: 0, pid: 140= , name: aplay [ 14.749955] CPU: 0 PID: 140 Comm: aplay Not tainted 5.9.0-rc5-v7l+ #66 [ 14.756578] Hardware name: BCM2711 [ 14.760026] Backtrace:=20 [ 14.762524] [] (dump_backtrace) from [] (show_stack+= 0x20/0x24) [ 14.770209] r7:c167c9e4 r6:600e0093 r5:00000000 r4:c167c9e4 [ 14.775960] [] (show_stack) from [] (dump_stack+0xb8= /0xd8) [ 14.783297] [] (dump_stack) from [] (___might_sleep+= 0x138/0x17c) [ 14.791159] r10:00000084 r9:f0979b00 r8:c10ffd50 r7:00000010 r6:0000000= 0 r5:00000114 [ 14.799105] r4:c10fe000 r3:600e0093 [ 14.802736] [] (___might_sleep) from [] (__might_sle= ep+0x70/0xb0) [ 14.810680] r4:c0e8f5f4 [ 14.813256] [] (__might_sleep) from [] (vc4_hdmi_sto= p_packet+0x120/0x330) [ 14.821907] r6:ef21f280 r5:00000003 r4:73e746f9 [ 14.826598] [] (vc4_hdmi_stop_packet) from [] (vc4_h= dmi_write_infoframe+0x140/0x5d0) [ 14.836220] r9:f0979b00 r8:c10ffd50 r7:ef219900 r6:ef248840 r5:00000000= r4:ef21f280 [ 14.844084] [] (vc4_hdmi_write_infoframe) from [] (v= c4_hdmi_set_audio_infoframe+0x5c/0x80) [ 14.854236] r10:efac7738 r9:ef3ea240 r8:00000001 r7:ef219900 r6:ef24884= 0 r5:ef21f040 [ 14.862180] r4:ef21f280 [ 14.864756] [] (vc4_hdmi_set_audio_infoframe) from [= ] (vc4_hdmi_audio_trigger+0x38/0x238) [ 14.874815] r4:00000001 [ 14.877392] [] (vc4_hdmi_audio_trigger) from [] (snd= _soc_pcm_dai_trigger+0x64/0xcc) [ 14.886923] r5:efac5a00 r4:00000000 [ 14.890555] [] (snd_soc_pcm_dai_trigger) from [] (so= c_pcm_trigger+0x70/0xac) [ 14.899472] r9:ef3ea240 r8:c1133540 r7:00000003 r6:ef219900 r5:ef219900= r4:00000001 [ 14.907336] [] (soc_pcm_trigger) from [] (snd_pcm_do= _start+0x3c/0x40) [ 14.915633] r5:c0c749dc r4:00000000 [ 14.919264] [] (snd_pcm_do_start) from [] (snd_pcm_a= ction_single+0x48/0x88) [ 14.928094] [] (snd_pcm_action_single) from [] (snd_= pcm_action+0x6c/0x78) [ 14.936746] r7:00000003 r6:c0c749dc r5:00000000 r4:ef219900 [ 14.942493] [] (snd_pcm_action) from [] (snd_pcm_act= ion_lock_irq+0x38/0x50) [ 14.951320] r7:00000030 r6:00000003 r5:c0c749dc r4:ef219900 [ 14.957069] [] (snd_pcm_action_lock_irq) from [] (sn= d_pcm_ioctl+0x930/0x14e4) [ 14.966073] r7:00000030 r6:00000000 r5:0085ee60 r4:ef219900 [ 14.971823] [] (snd_pcm_ioctl) from [] (sys_ioctl+0x= e4/0x9e4) [ 14.979420] r10:efac7738 r9:00000004 r8:c1133540 r7:0085ee60 r6:0000562= 4 r5:c1133540 [ 14.987364] r4:00004142 [ 14.989940] [] (sys_ioctl) from [] (ret_fast_syscall= +0x0/0x28) [ 14.997621] Exception stack(0xc10fffa8 to 0xc10ffff0) [ 15.002749] ffa0: 00869260 b6fa8000 00000004 00004142 = 0085ee60 00000001 [ 15.011050] ffc0: 00869260 b6fa8000 00005624 00000036 00000000 00000000 = 00001589 00001589 [ 15.019350] ffe0: b6fa8504 bedb59fc b6f2ff68 b6da515c [ 15.024479] r10:00000036 r9:c10fe000 r8:c0200204 r7:00000036 r6:0000562= 4 r5:b6fa8000 [ 15.032423] r4:00869260 We could switch to a udelay instead, but the idea of busy-waiting for up to a 100ms while having the interrupt disabled doesn't sound ideal either. It looks like the snd_soc_dai_link structure has a nonatomic flag that seems to be made to address more or less that issue, taking a mutex instead of a spinlock. However setting that flag results in another lockdep issue, since the dmaengine controller doing the DMA transfer would call snd_pcm_period_elapsed on completion, in a tasklet, this time taking a mutex in an atomic context which is just as bad as the initial issue. This is the stacktrace this time: # aplay --channels=3D2 -D 'iec958:CARD=3Dvc4hdmi0,DEV=3D0' /root/test-tone.= wav Playing WAVE '/root/test-tone.wav' : Signed 16 bit Little Endian, Rate 4410= 0 Hz, Stereo [ 43.078900] BUG: sleeping function called from invalid context at kernel= /locking/mutex.c:281 [ 43.087485] in_atomic(): 1, irqs_disabled(): 0, non_block: 0, pid: 0, na= me: swapper/0 [ 43.095452] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.9.0-rc5-v7l+ #67 [ 43.102252] Hardware name: BCM2711 [ 43.105701] Backtrace:=20 [ 43.108199] [] (dump_backtrace) from [] (show_stack+= 0x20/0x24) [ 43.115884] r7:c167c9e4 r6:600e0113 r5:00000000 r4:c167c9e4 [ 43.121636] [] (show_stack) from [] (dump_stack+0xb8= /0xd8) [ 43.128972] [] (dump_stack) from [] (___might_sleep+= 0x138/0x17c) [ 43.136833] r10:c1600000 r9:00000006 r8:00000000 r7:c1601d70 r6:0000000= 0 r5:00000119 [ 43.144777] r4:c1600000 r3:600e0113 [ 43.148409] [] (___might_sleep) from [] (__might_sle= ep+0x70/0xb0) [ 43.156353] r4:c0e32994 [ 43.158930] [] (__might_sleep) from [] (mutex_lock+0= x2c/0x74) [ 43.166524] r6:efbfaa84 r5:ef26f548 r4:ef219998 [ 43.171216] [] (mutex_lock) from [] (_snd_pcm_stream= _lock_irqsave+0x2c/0x40) [ 43.180130] r5:ef26f548 r4:ef219900 [ 43.183765] [] (_snd_pcm_stream_lock_irqsave) from [= ] (snd_pcm_period_elapsed+0x20/0xa4) [ 43.193743] [] (snd_pcm_period_elapsed) from [] (dma= engine_pcm_dma_complete+0x54/0x58) [ 43.203539] r7:c1601d70 r6:efbfaa84 r5:ef26f548 r4:00000000 [ 43.209290] [] (dmaengine_pcm_dma_complete) from [] = (vchan_complete+0x238/0x240) [ 43.218567] [] (vchan_complete) from [] (tasklet_act= ion_common.constprop.0+0x84/0x12c) [ 43.228365] r10:c1600000 r9:00000006 r8:00000000 r7:dac802e0 r6:efbfaa8= 4 r5:00000000 [ 43.236309] r4:efbfaa80 [ 43.238886] [] (tasklet_action_common.constprop.0) from [] (tasklet_action+0x28/0x30) [ 43.248596] r9:00000040 r8:ef80b800 r7:00000101 r6:00000006 r5:00000007= r4:c1603098 [ 43.256459] [] (tasklet_action) from [] (__do_softir= q+0x18c/0x47c) [ 43.264498] [] (__do_softirq) from [] (irq_exit+0xd0= /0xf8) [ 43.271829] r10:c1051f40 r9:c1600000 r8:ef80b800 r7:00000001 r6:0000000= 0 r5:00000000 [ 43.279774] r4:ffffe000 [ 43.282349] [] (irq_exit) from [] (__handle_domain_i= rq+0x70/0xc0) [ 43.290294] r5:00000000 r4:c148ee80 [ 43.293927] [] (__handle_domain_irq) from [] (gic_ha= ndle_irq+0x4c/0x88) [ 43.302403] r9:c1600000 r8:c1601ed0 r7:f0803000 r6:f0802000 r5:f080200c= r4:c16058d8 [ 43.310266] [] (gic_handle_irq) from [] (__irq_svc+0= x5c/0x7c) [ 43.317856] Exception stack(0xc1601ed0 to 0xc1601f18) [ 43.322983] 1ec0: c0208b84 00000000 = 19bf8000 600e0093 [ 43.331283] 1ee0: c1600000 00000000 c1604e34 c1604e7c c16d20e5 c0e30cac = c1051f40 c1601f2c [ 43.339582] 1f00: ffffe000 c1601f20 c1600000 c0208b88 600e0013 ffffffff [ 43.346297] r9:c1600000 r8:c16d20e5 r7:c1601f04 r6:ffffffff r5:600e0013= r4:c0208b88 [ 43.354164] [] (arch_cpu_idle) from [] (default_idle= _call+0x3c/0x184) [ 43.362469] [] (default_idle_call) from [] (do_idle+= 0x110/0x170) [ 43.370327] r5:00000000 r4:c1600000 [ 43.373958] [] (do_idle) from [] (cpu_startup_entry+= 0x28/0x2c) [ 43.381641] r9:00000193 r8:effff900 r7:c1604e00 r6:00000000 r5:c160ec50= r4:000000d7 [ 43.389503] [] (cpu_startup_entry) from [] (rest_ini= t+0x108/0x110) [ 43.397542] [] (rest_init) from [] (arch_call_rest_i= nit+0x18/0x1c) [ 43.405575] r5:c1051f40 r4:c16e7078 [ 43.409205] [] (arch_call_rest_init) from [] (start_= kernel+0x7f4/0x834) [ 43.417682] [] (start_kernel) from [<00000000>] (0x0) So, I'm not really sure what I'm supposed to do here. The drivers involved don't appear to be doing anything extraordinary, but the issues lockdep report are definitely valid too. What are the expectations in terms of context from ALSA when running the callbacks, and how can we fix it? Thanks! Maxime 1: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/= drivers/gpu/drm/vc4/vc4_hdmi.c#n234 --gvzixar4mrubl242 Content-Type: application/pgp-signature; name="signature.asc" -----BEGIN PGP SIGNATURE----- iHUEABYIAB0WIQRcEzekXsqa64kGDp7j7w1vZxhRxQUCX5FV8QAKCRDj7w1vZxhR xd+YAP4qhrOcrrRABpAqke0n1wCMBK8Tuns2MtNAWplJyBAUMgEAwG0fqoF3xPd8 CLTH9Y/74KG7s9Y9hIQtMjrtoEZebgk= =FjFM -----END PGP SIGNATURE----- --gvzixar4mrubl242--