From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from bombadil.infradead.org (bombadil.infradead.org [198.137.202.133]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 15718C6FA82 for ; Thu, 22 Sep 2022 15:10:33 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=lists.infradead.org; s=bombadil.20210309; h=Sender: Content-Transfer-Encoding:Content-Type:List-Subscribe:List-Help:List-Post: List-Archive:List-Unsubscribe:List-Id:In-Reply-To:MIME-Version:References: Message-ID:Subject:Cc:To:From:Date:Reply-To:Content-ID:Content-Description: Resent-Date:Resent-From:Resent-Sender:Resent-To:Resent-Cc:Resent-Message-ID: List-Owner; bh=Zm1s3NyiRlo4+fiyRHUAkgDnjaEQ++O+d+YtPf0SLl8=; b=Tq5Lw65KwEHsHG LyzriciGdlcN9VbcREyKP9AbkdKMxb8zyidihfaafSp7fU6eI3nTNkdNOpwhAfvbOtb4mB5w9+2Dw lSUWaPwIPiZwjK+kBJrprx356v4xZ+X2GZl/P/TwO8Ju19MWi25TjDxxtciZrAruW6qC/DuZUOSIo BuCPU6ho++WCeiXKgZ9FwuJLKLSwrrOMyMrDMn/qOem0fR9BL8Vc9joIDNg8EaspRhB4RS4zMuVWQ zfb2duYNNgmBj1z4pcII85G43EKHcykS+5JKX0lnEXK2KnUCoKwNvH6lXGJGmsOB2C+8gO6o27AN6 P2FuFxOeMtrBRUkhA21w==; Received: from localhost ([::1] helo=bombadil.infradead.org) by bombadil.infradead.org with esmtp (Exim 4.94.2 #2 (Red Hat Linux)) id 1obNpB-00GJ8y-Kw; Thu, 22 Sep 2022 15:09:13 +0000 Received: from foss.arm.com ([217.140.110.172]) by bombadil.infradead.org with esmtp (Exim 4.94.2 #2 (Red Hat Linux)) id 1obNb2-00GCcA-2V for linux-arm-kernel@lists.infradead.org; Thu, 22 Sep 2022 14:54:37 +0000 Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id C8DB91042; Thu, 22 Sep 2022 07:54:37 -0700 (PDT) Received: from e120937-lin (unknown [172.31.20.19]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPSA id 565713F73D; Thu, 22 Sep 2022 07:54:30 -0700 (PDT) Date: Thu, 22 Sep 2022 15:54:24 +0100 From: Cristian Marussi To: Shivnandan Kumar Cc: sudeep.holla@arm.com, linux-arm-kernel@lists.infradead.org, linux-kernel@vger.kernel.org, quic_rgottimu@quicinc.com, quic_avajid@quicinc.com Subject: Re: Query regarding "firmware: arm_scmi: Free mailbox channels if probe fails" Message-ID: References: MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: X-CRM114-Version: 20100106-BlameMichelson ( TRE 0.8.0 (BSD) ) MR-646709E3 X-CRM114-CacheID: sfid-20220922_075436_206339_6C1CE6F7 X-CRM114-Status: GOOD ( 18.93 ) X-BeenThere: linux-arm-kernel@lists.infradead.org X-Mailman-Version: 2.1.34 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Content-Type: text/plain; charset="iso-8859-1" Content-Transfer-Encoding: quoted-printable Sender: "linux-arm-kernel" Errors-To: linux-arm-kernel-bounces+linux-arm-kernel=archiver.kernel.org@lists.infradead.org On Thu, Sep 22, 2022 at 10:31:47AM +0530, Shivnandan Kumar wrote: > Hi Christian, > = Hi Shivnandan, > = > Do you have any update or suggestion regarding thread https://lore.kernel= .org/lkml/20211105094310.GI6526@e120937-lin/T/#m07993053f6f238864acad4e9bad= 9f08d85aeb019. > = > We are still getting this issue and wanted to check if=A0 there is any fix > that I can try. > = Sorry this issue fell to the bottom of my list in these past months... ... but it stil on TODO :D So today I tried to get my head around this issue again (i.e. mainly re-reading the above thread to remind me what was the status and wth I had written... :P) In summary the racy thing seemed to be caused by the a delayed late SCMI Base reply happily served on one core by scmi_rx_callback operating on some well-defined SCMI channel, while on another core we are effectively shutting down the system and destroying such channels: now this should be clearly NOT be possible and it is what we have to synchronize. Looking at the transport layer that you use, mailbox, I see that while setup/free helpers are synchronized on an internal chan->lock, the RX path inside the mailbox core is not, so I tried this: diff --git a/drivers/mailbox/mailbox.c b/drivers/mailbox/mailbox.c index 4229b9b5da98..bb6173c0ad54 100644 --- a/drivers/mailbox/mailbox.c +++ b/drivers/mailbox/mailbox.c @@ -157,9 +157,13 @@ static enum hrtimer_restart txdone_hrtimer(struct hrti= mer *hrtimer) */ void mbox_chan_received_data(struct mbox_chan *chan, void *mssg) { + unsigned long flags; + + spin_lock_irqsave(&chan->lock, flags); /* No buffering the received data */ if (chan->cl->rx_callback) chan->cl->rx_callback(chan->cl, mssg); + spin_unlock_irqrestore(&chan->lock, flags); } EXPORT_SYMBOL_GPL(mbox_chan_received_data); = ... can you try on your setup ? I dont have a way to easily reproduce your race as of now... NOTE THAT, I am not still convinced that the above fix, if it works, will constitute the final solution to this issue, I could maybe move this same kind of sync up into the SCMI transport layer to avoid to impact all other users of the above mailbox interface (since, as of today, nobody has reported any issue like ours due to the missing spinlock..)..but it could be helpful to test the above to verify that this is really where the root issue is. Thanks, Cristian _______________________________________________ linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel