From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-6.8 required=3.0 tests=DKIMWL_WL_HIGH,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_PATCH, MAILING_LIST_MULTI,SIGNED_OFF_BY,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED autolearn=unavailable autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 61D26C2D0EC for ; Thu, 26 Mar 2020 18:07:26 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id 356C32073E for ; Thu, 26 Mar 2020 18:07:26 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (1024-bit key) header.d=chromium.org header.i=@chromium.org header.b="iajwo6oj" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1727803AbgCZSHZ (ORCPT ); Thu, 26 Mar 2020 14:07:25 -0400 Received: from mail-ua1-f68.google.com ([209.85.222.68]:42348 "EHLO mail-ua1-f68.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726267AbgCZSHY (ORCPT ); Thu, 26 Mar 2020 14:07:24 -0400 Received: by mail-ua1-f68.google.com with SMTP id m18so2479671uap.9 for ; Thu, 26 Mar 2020 11:07:23 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=chromium.org; s=google; h=mime-version:references:in-reply-to:from:date:message-id:subject:to :cc; bh=S+88eV5gK0LljhqKdZ1yGcysPszYlI5kB6q6EVwjYtM=; b=iajwo6ojaH6PQ0M+kFhP3CIwc4WeldtAB8KkVDaqEJ2XOiscCgtWqKrv9JgYGGrzCN 9HdQjVQDYrEB0Tpwu/BZBGrA+jNrpOZ5/jUEOE9t1h6DXRmYGPEbeHF4k7AwRlraXoPf nkYRKiOHMLvzeuAXL9m/nvB/Uo76yXnDU+Dw8= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:references:in-reply-to:from:date :message-id:subject:to:cc; bh=S+88eV5gK0LljhqKdZ1yGcysPszYlI5kB6q6EVwjYtM=; b=it/uKAWyIHkFAUE99rEFhT4jYVhirB7Zn4v6I6w1ujrZ7OtfK/pirBZEsKXirglbp6 s3z4p5Kiwftzd44KimN0uvJ9ZGhtMOl1Urnqr3myLCcCa7Q3uF7o9hwEn7W1z21JlCBr D+NwiGXqWXz5hso45oHO44whoBQ8t6xMvTW5pXJ6rCvR9tpCZZhzfwBgh2ALeYZIk281 ifPhaGsn0kO493bsXhloCAzXnzOIvEQRb9ol3BFfm8unrdQA7sme847qzwaOEndt+Y5c Z9+b+S0MUXwgPxHdwtmIRC4x16QQFMOgOAvtwczz2SqTr6Ub4JPbLvB2alPeqfCFuuTI mxdg== X-Gm-Message-State: ANhLgQ28MrNAz2GOu1bONz+Y3WdFHmZKfxEr2g+4svlJikbl22w4VfeT eTkjte+MidlI0FFqavyoO4BTNSlQI9U= X-Google-Smtp-Source: ADFU+vuEOKNuvrj8UH6Jx89Tl5inCvKoBrMUxACs6YFgR7l/gpsT9TolSe7eb1lDLEQnbW/guq9abQ== X-Received: by 2002:ab0:28c9:: with SMTP id g9mr7088355uaq.117.1585246040700; Thu, 26 Mar 2020 11:07:20 -0700 (PDT) Received: from mail-vs1-f41.google.com (mail-vs1-f41.google.com. [209.85.217.41]) by smtp.gmail.com with ESMTPSA id k10sm1514515vsr.31.2020.03.26.11.07.19 for (version=TLS1_3 cipher=TLS_AES_128_GCM_SHA256 bits=128/128); Thu, 26 Mar 2020 11:07:19 -0700 (PDT) Received: by mail-vs1-f41.google.com with SMTP id w185so4476457vsw.10 for ; Thu, 26 Mar 2020 11:07:19 -0700 (PDT) X-Received: by 2002:a67:1e46:: with SMTP id e67mr8011826vse.106.1585246038937; Thu, 26 Mar 2020 11:07:18 -0700 (PDT) MIME-Version: 1.0 References: <20190218140210.14631-1-rplsssn@codeaurora.org> <20190218140210.14631-2-rplsssn@codeaurora.org> In-Reply-To: <20190218140210.14631-2-rplsssn@codeaurora.org> From: Doug Anderson Date: Thu, 26 Mar 2020 11:07:07 -0700 X-Gmail-Original-Message-ID: Message-ID: Subject: Re: [PATCH RESEND v1 1/2] drivers: qcom: rpmh-rsc: clear active mode configuration for wake TCS To: Andy Gross , David Brown , linux-arm-msm , "open list:ARM/QUALCOMM SUPPORT" Cc: Rajendra Nayak , Bjorn Andersson , LKML , Linux PM , Stephen Boyd , Evan Green , Matthias Kaehlcke , Lina Iyer , "Raju P.L.S.S.S.N" , Maulik Shah Content-Type: text/plain; charset="UTF-8" Sender: linux-arm-msm-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-arm-msm@vger.kernel.org Hi On Mon, 18 Feb 2019 19:32:09 Raju P.L.S.S.S.N wrote: > > For RSCs that have sleep & wake TCS but no dedicated active TCS, wake > TCS can be re-purposed to send active requests. Once the active requests > are sent and response is received, the active mode configuration needs > to be cleared so that controller can use wake TCS for sending wake > requests. > > Signed-off-by: Raju P.L.S.S.S.N > Reviewed-by: Matthias Kaehlcke > --- > drivers/soc/qcom/rpmh-rsc.c | 77 ++++++++++++++++++++++++++----------- > 1 file changed, 54 insertions(+), 23 deletions(-) > > diff --git a/drivers/soc/qcom/rpmh-rsc.c b/drivers/soc/qcom/rpmh-rsc.c > index 75bd9a83aef0..6cc7f219ce48 100644 > --- a/drivers/soc/qcom/rpmh-rsc.c > +++ b/drivers/soc/qcom/rpmh-rsc.c > @@ -201,6 +201,42 @@ static const struct tcs_request *get_req_from_tcs(struct rsc_drv *drv, > return NULL; > } > > +static void __tcs_trigger(struct rsc_drv *drv, int tcs_id, bool trigger) nit: can you rename this to __tcs_set_trigger() so it's a little more obvious that the last value means trigger/untrigger? It'd also be nice to really understand why it has to be structured this way. It's weird that the two options are "untrigger + retrigger" and "untrigger" > +{ > + u32 enable; > + > + /* > + * HW req: Clear the DRV_CONTROL and enable TCS again > + * While clearing ensure that the AMC mode trigger is cleared > + * and then the mode enable is cleared. > + */ > + enable = read_tcs_reg(drv, RSC_DRV_CONTROL, tcs_id, 0); > + enable &= ~TCS_AMC_MODE_TRIGGER; > + write_tcs_reg_sync(drv, RSC_DRV_CONTROL, tcs_id, enable); > + enable &= ~TCS_AMC_MODE_ENABLE; > + write_tcs_reg_sync(drv, RSC_DRV_CONTROL, tcs_id, enable); > + > + if (trigger) { > + /* Enable the AMC mode on the TCS and then trigger the TCS */ > + enable = TCS_AMC_MODE_ENABLE; > + write_tcs_reg_sync(drv, RSC_DRV_CONTROL, tcs_id, enable); > + enable |= TCS_AMC_MODE_TRIGGER; > + write_tcs_reg_sync(drv, RSC_DRV_CONTROL, tcs_id, enable); > + } > +} > + > +static inline void enable_tcs_irq(struct rsc_drv *drv, int tcs_id, bool enable) > +{ > + u32 data; > + > + data = read_tcs_reg(drv, RSC_DRV_IRQ_ENABLE, 0, 0); > + if (enable) > + data |= BIT(tcs_id); > + else > + data &= ~BIT(tcs_id); > + write_tcs_reg(drv, RSC_DRV_IRQ_ENABLE, 0, data); > +} > + > /** > * tcs_tx_done: TX Done interrupt handler > */ > @@ -237,6 +273,21 @@ static irqreturn_t tcs_tx_done(int irq, void *p) > } > > trace_rpmh_tx_done(drv, i, req, err); > + > + /* > + * if wake tcs was re-purposed for sending active > + * votes, clear AMC trigger & enable modes and > + * disable interrupt for this TCS > + */ > + if (!drv->tcs[ACTIVE_TCS].num_tcs) { > + __tcs_trigger(drv, i, false); I assume that the reason that the code originally didn't try to "untrigger" in the interrupt handler is that it's slow (it uses write_tcs_reg_sync). If that's true then maybe you shouldn't do the untrigger here for the case when you're on a borrowed TCS. Can't you just do the untrigger later when you reprogram the TCS for someone else's use? > + /* > + * Disable interrupt for this TCS to avoid being > + * spammed with interrupts coming when the solver > + * sends its wake votes. > + */ > + enable_tcs_irq(drv, i, false); Should you be doing this under the spinlock? You're doing a read-modify-write of the RSC_DRV_IRQ_ENABLE register which seems like it could race with someone trying to enable an IRQ if the borrowed TCS type has more than one TCS (so you could be trying to start a transfer on one TCS while one is finishing on another). It would be somewhat hard for this to happen, but I don't _think_ it's impossible. Specifically: 1. Two threads can call rpmh_write() at the same time. 2. Both threads call into rpmh_rsc_send_data() w/out holding any locks. 3. Both threads call into tcs_write() w/out holding any locks. 4. Both threads call get_tcs_for_msg() w/out holding any locks. 5. Both threads notice they need to borrow the wake TCS. 6. Both threads call rpmh_rsc_invalidate(). There are locks in here, but nothing stops both threads from returning 0 (not -EAGAIN) since nobody has claimed the wake TCS by setting 'tcs_in_use' yet. Assuming that there are more than one wake TCSs it is possible that both transfers can be happening at the same time and I believe it's even possible (though you'd need crazy timing) for one thread to hit the interrupt handler and finish at the same time that the other thread starts. Assuming we care about the case of having zero-active TCS and more-than-one-wake TCS, it'd be nice to fix. If we don't care about this case, it should be documented in the code. Funny enough, most of the time having zero-active TCS and more-than-one-wake TCS doesn't buy us much with the current code because the 2nd thread will return -EAGAIN from rpmh_rsc_invalidate() assuming that the 1st thread manages to set "tcs_in_use" before the 2nd thread gets there. Overall the locking involved with borrowing a wake TCS is really tricky if you want to close all corner cases. I need to constantly refer to my series adding documentation to have any chance here. https://lore.kernel.org/r/20200311161104.RFT.v2.5.I52653eb85d7dc8981ee0dafcd0b6cc0f273e9425@changeid I'd love review feedback on that! Some of this stuff maybe becomes easier to understand if we don't have Maulik's flushing series and we can always assume that writing active TCSs and writing sleep/wake TCSs never happen at the same time (I think traditionally sleep/wake TCSs only get written from special PM code when we know nothing else is running). -Doug