linux-remoteproc.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Bjorn Andersson <bjorn.andersson@linaro.org>
To: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Alex Elder <elder@linaro.org>,
	ohad@wizery.com, linux-remoteproc@vger.kernel.org,
	linux-kernel@vger.kernel.org
Subject: Re: [PATCH 1/1] remoteproc: use freezable workqueue for crash notifications
Date: Mon, 31 May 2021 18:13:52 -0500	[thread overview]
Message-ID: <YLVtsPNCr6dk9X0h@yoga> (raw)
In-Reply-To: <20210531172153.GA1718330@xps15>

On Mon 31 May 12:21 CDT 2021, Mathieu Poirier wrote:

> On Thu, May 27, 2021 at 10:55:05PM -0500, Bjorn Andersson wrote:
> > On Wed 19 May 18:44 CDT 2021, Alex Elder wrote:
> > 
> > > When a remoteproc has crashed, rproc_report_crash() is called to
> > > handle whatever recovery is desired.  This can happen at almost any
> > > time, often triggered by an interrupt, though it can also be
> > > initiated by a write to debugfs file remoteproc/remoteproc*/crash.
> > > 
> > > When a crash is reported, the crash handler worker is scheduled to
> > > run (rproc_crash_handler_work()).  One thing that worker does is
> > > call rproc_trigger_recovery(), which calls rproc_stop().  That calls
> > > the ->stop method for any remoteproc subdevices before making the
> > > remote processor go offline.
> > > 
> > > The Q6V5 modem remoteproc driver implements an SSR subdevice that
> > > notifies registered drivers when the modem changes operational state
> > > (prepare, started, stop/crash, unprepared).  The IPA driver
> > > registers to receive these notifications.
> > > 
> > > With that as context, I'll now describe the problem.
> > > 
> > > There was a situation in which buggy modem firmware led to a modem
> > > crash very soon after system (AP) resume had begun.  The crash caused
> > > a remoteproc SSR crash notification to be sent to the IPA driver.
> > > The problem was that, although system resume had begun, it had not
> > > yet completed, and the IPA driver was still in a suspended state.
> 
> This is a very tight race condition - I agree with you that it is next to
> impossible to test.
> 

I certainly appreciate to see the upstream kernel be put through the
level of product testing necessary to find issues like this.

> > > 
> > > This scenario could happen to any driver that registers for these
> > > SSR notifications, because they are delivered without knowledge of
> > > the (suspend) state of registered recipient drivers.
> > > 
> > > This patch offers a simple fix for this, by having the crash
> > > handling worker function run on the system freezable workqueue.
> > > This workqueue does not operate if user space is frozen (for
> > > suspend).  As a result, the SSR subdevice only delivers its
> > > crash notification when the system is fully operational (i.e.,
> > > neither suspended nor in suspend/resume transition).
> > > 
> 
> I think the real fix for this problem should be in the platform driver where
> the remoteproc interrupt would be masked while suspending and re-enabled again
> when resuming.  The runtime PM API would work just fine for that...  But doing
> so wouldn't guarantee that other drivers, i.e IPA, would be operational.  Unless
> of one is a child of the other or using a bus like mechanic, and getting
> to that point will introduce a lot more churn than what this patch does. 
> 

Disabling the related interrupt(s) would mean that if the modem
remoteproc firmware crashes while Linux is suspended we would not know
about this until the next time Linux resumes. The expected outcome of
this would be that until something else happens to wake up Linux you
won't get any notifications from the network (i.e. no phone calls, text
messages or incoming notifications)

Regards,
Bjorn

  reply	other threads:[~2021-05-31 23:13 UTC|newest]

Thread overview: 12+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-05-19 23:44 [PATCH 0/1] remoteproc: avoid notification when suspended Alex Elder
2021-05-19 23:44 ` [PATCH 1/1] remoteproc: use freezable workqueue for crash notifications Alex Elder
2021-05-28  3:55   ` Bjorn Andersson
2021-05-28 15:09     ` Mathieu Poirier
2021-05-29  0:12     ` Siddharth Gupta
2021-06-04 20:46       ` Siddharth Gupta
     [not found]     ` <20210529024847.5164-1-hdanton@sina.com>
2021-05-29 17:28       ` Bjorn Andersson
     [not found]       ` <20210530030728.8340-1-hdanton@sina.com>
2021-05-31 23:25         ` Bjorn Andersson
2021-05-31 17:21     ` Mathieu Poirier
2021-05-31 23:13       ` Bjorn Andersson [this message]
2021-06-01 14:12       ` Alex Elder
2021-08-04 19:31 ` [PATCH 0/1] remoteproc: avoid notification when suspended patchwork-bot+linux-remoteproc

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=YLVtsPNCr6dk9X0h@yoga \
    --to=bjorn.andersson@linaro.org \
    --cc=elder@linaro.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-remoteproc@vger.kernel.org \
    --cc=mathieu.poirier@linaro.org \
    --cc=ohad@wizery.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).