All of lore.kernel.org
 help / color / mirror / Atom feed
From: Mathieu Poirier <mathieu.poirier@linaro.org>
To: Bjorn Andersson <bjorn.andersson@linaro.org>
Cc: Ohad Ben-Cohen <ohad@wizery.com>,
	Loic PALLARDY <loic.pallardy@st.com>,
	Arnaud POULIQUEN <arnaud.pouliquen@st.com>,
	Suman Anna <s-anna@ti.com>,
	linux-remoteproc <linux-remoteproc@vger.kernel.org>,
	Jon Corbet <corbet@lwn.net>,
	"open list:DOCUMENTATION" <linux-doc@vger.kernel.org>,
	Linux Kernel Mailing List <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH v3 03/14] remoteproc: Add new operation and flags for synchronistation
Date: Thu, 21 May 2020 15:55:35 -0600	[thread overview]
Message-ID: <CANLsYkyQJ30tyoKctsYyy8iSxwanampEmRX1Mkoc4RyUvhvQow@mail.gmail.com> (raw)
In-Reply-To: <20200521052155.GE11847@yoga>

On Wed, 20 May 2020 at 23:22, Bjorn Andersson
<bjorn.andersson@linaro.org> wrote:
>
> On Wed 20 May 15:06 PDT 2020, Mathieu Poirier wrote:
>
> > On Mon, May 18, 2020 at 05:55:00PM -0700, Bjorn Andersson wrote:
> > > On Fri 15 May 12:24 PDT 2020, Mathieu Poirier wrote:
> > >
> > > > Good day Bjorn,
> > > >
> > > > On Wed, May 13, 2020 at 06:32:24PM -0700, Bjorn Andersson wrote:
> > > > > On Fri 08 May 14:01 PDT 2020, Mathieu Poirier wrote:
> > > > >
> > > > > > On Tue, May 05, 2020 at 05:22:53PM -0700, Bjorn Andersson wrote:
> > > > > > > On Fri 24 Apr 13:01 PDT 2020, Mathieu Poirier wrote:
> [..]
> > > > > > > > + bool after_crash;
> > > > > > >
> > > > > > > Similarly what is the expected steps to be taken by the core when this
> > > > > > > is true? Should rproc_report_crash() simply stop/start the subdevices
> > > > > > > and upon one of the ops somehow tell the remote controller that it can
> > > > > > > proceed with the recovery?
> > > > > >
> > > > > > The exact same sequence of steps will be carried out as they are today, except
> > > > > > that if after_crash == true, the remoteproc core won't be switching the remote
> > > > > > processor on, exactly as it would do when on_init == true.
> > > > > >
> > > > >
> > > > > Just to make sure we're on the same page:
> > > > >
> > > > > after_crash = false is what we have today, and would mean:
> > > > >
> > > > > 1) stop subdevices
> > > > > 2) power off
> > > > > 3) unprepare subdevices
> > > > > 4) generate coredump
> > > > > 5) request firmware
> > > > > 6) load segments
> > > > > 7) find resource table
> > > > > 8) prepare subdevices
> > > > > 9) "boot"
> > > > > 10) start subdevices
> > > >
> > > > Exactly
> > > >
> > > > >
> > > > > after_crash = true would mean:
> > > > >
> > > > > 1) stop subdevices
> > > > > 2) "detach"
> > > > > 3) unprepare subdevices
> > > > > 4) prepare subdevices
> > > > > 5) "attach"
> > > > > 6) start subdevices
> > > > >
> > > >
> > > > Yes
> > > >
> > > > > State diagram wise both of these would represent the transition RUNNING
> > > > > -> CRASHED -> RUNNING, but somehow the platform driver needs to be able
> > > > > to specify which of these sequences to perform. Per your naming
> > > > > suggestion above, this does sound like a "autonomous_recovery" boolean
> > > > > to me.
> > > >
> > > > Right, semantically "rproc->autonomous" would apply quite well.
> > > >
> > > > In function rproc_crash_handler_work(), a call to rproc_set_sync_flag() has been
> > > > strategically placed to set the value of rproc->autonomous based on
> > > > "after_crash".  From there the core knows which rproc_ops to use.  Here too we
> > > > have to rely on the rproc_ops provided by the platform to do the right thing
> > > > based on the scenario to enact.
> > > >
> > >
> > > Do you think that autonomous_recovery would be something that changes
> > > for a given remoteproc instance? I envisioned it as something that you
> > > know at registration time, but perhaps I'm missing some details here.
> >
> > I don't envision any of the transision flags to change once they are set by the
> > platform.   The same applies to the new rproc_ops, it can be set only once.
> > Otherwise combination of possible scenarios becomes too hard to manage, leading
> > to situations where the core and MCU get out of sync and can't talk to each
> > other.
> >
>
> Sounds good, I share this expectation, just wanted to check with you.
>
> > >
> > > > >
> > > > > > These flags are there to indicate how to set rproc::sync_with_rproc after
> > > > > > different events, that is when the remoteproc core boots, when the remoteproc
> > > > > > has been stopped or when it has crashed.
> > > > > >
> > > > >
> > > > > Right, that was clear from your patches. Sorry that my reply didn't
> > > > > convey the information that I had understood this.
> > > > >
> > > > > > >
> > > > > > > > +};
> > > > > > > > +
> > > > > > > >  /**
> > > > > > > >   * struct rproc_ops - platform-specific device handlers
> > > > > > > >   * @start:       power on the device and boot it
> > > > > > > > @@ -459,6 +476,9 @@ struct rproc_dump_segment {
> > > > > > > >   * @firmware: name of firmware file to be loaded
> > > > > > > >   * @priv: private data which belongs to the platform-specific rproc module
> > > > > > > >   * @ops: platform-specific start/stop rproc handlers
> > > > > > > > + * @sync_ops: platform-specific start/stop rproc handlers when
> > > > > > > > + *             synchronising with a remote processor.
> > > > > > > > + * @sync_flags: Determine the rproc_ops to choose in specific states.
> > > > > > > >   * @dev: virtual device for refcounting and common remoteproc behavior
> > > > > > > >   * @power: refcount of users who need this rproc powered up
> > > > > > > >   * @state: state of the device
> > > > > > > > @@ -482,6 +502,7 @@ struct rproc_dump_segment {
> > > > > > > >   * @table_sz: size of @cached_table
> > > > > > > >   * @has_iommu: flag to indicate if remote processor is behind an MMU
> > > > > > > >   * @auto_boot: flag to indicate if remote processor should be auto-started
> > > > > > > > + * @sync_with_rproc: true if currently synchronising with the rproc
> > > > > > > >   * @dump_segments: list of segments in the firmware
> > > > > > > >   * @nb_vdev: number of vdev currently handled by rproc
> > > > > > > >   */
> > > > > > > > @@ -492,6 +513,8 @@ struct rproc {
> > > > > > > >   const char *firmware;
> > > > > > > >   void *priv;
> > > > > > > >   struct rproc_ops *ops;
> > > > > > > > + struct rproc_ops *sync_ops;
> > > > > > >
> > > > > > > Do we really need two rproc_ops, given that both are coming from the
> > > > > > > platform driver and the sync_flags will define which one to look at?
> > > > > > >
> > > > > > > Can't the platform driver just provide an ops table that works with the
> > > > > > > flags it passes?
> > > > > >
> > > > > > That is the approach Loic took in a previous patchset [1] and that was rejected.
> > > > > > It also lead to all of the platform drivers testing rproc->flag before carring
> > > > > > different actions, something you indicated could be done in the core.  This
> > > > > > patch does exactly that, i.e move the testing of rproc->flag to the core and
> > > > > > calls the right function based on that.
> > > > > >
> > > > >
> > > > > I think I see what you mean, as we use "start" for both syncing and
> > > > > starting the core, a { on_init = true, after_stop = false } setup either
> > > > > needs two tables or force conditionals on the platform driver.
> > > > >
> > > > > > The end result is the same and I'm happy with one or the other, I will need to
> > > > > > know which one.
> > > > > >
> > > > >
> > > > > How about adding a new ops named "attach" to rproc_ops, which the
> > > > > platform driver can specify if it supports attaching an already running
> > > > > processor?
> > > >
> > > > Using "attach_ops" works for me.  But would "autonomous_ops", to correlate with
> > > > rproc::autonomous, add clarity?  Either way work equally well for me.
> > > >
> > >
> > > What I meant was that we add a function "attach" to the existing
> > > rproc_ops. In the case of OFFLINE->RUNNING we continue to call
> > > rproc->ops->start() and DETACHED->RUNNING we call this
> > > rproc->ops->attach().
> >
> > If I read the above properly we'd end up with:
> >
> > struct rproc_ops {
> >       int (*start)(struct rproc *rproc);
> >       int (*stop)(struct rproc *rproc);
> >       int (*attach)(struct rproc *rproc);
> >       int (*detach)(struct rproc *rproc);
> >         ...
> >         ...
> > };
>
> Yes, that's what I meant.
>
> >
> > But wed'd have to deal with other operations that are common to both scenarios
> > such as parse_fw() and find_loaded_rsc_table().
> >
>
> I would prefer that we don't parse_fw(NULL), perhaps we can turn that
> upside down and have the platform_driver just provide the information to
> the core as it learns about it during probe?

I need to think about that...  We may have to discuss this again in a
not so distant future.

>
> > So far lot of improvement have already been suggested on this revision.  I
> > suggest to spin off a new patchset that only handles the DETACHED->RUNNING
> > scenario and split common functions such as rproc_fw_boot().  From there we can
> > see if other refinements (such as what you suggest above) are mandated.
> >
>
> As far as I can see, if we take the approach of introducing the DETACHED
> state we can add the various transitions piecemeal. So I'm definitely in
> favour of starting off with DETACHED->RUNNING, then figure out
> "autonomous recovery" and RUNNING->DETACHED in two subsequent series.
>
> > One last thing...  Upon reflecting on all this I think using "attach" is better
> > than "autonomous", the latter is heavy to drag around.
> >
>
> For the action of going from DETACHED->RUNNING I too find "attach" to
> better represent what's going on. The part where I think we need
> something more is to communicate if it's Linux that's in charge or not
> for taking the remote processor through RUNNING->CRASHED->RUNNING. For
> that the word "autonomous" makes sense to me, but let's bring that up
> again after landing this first piece(s).

I agree.

>
> Regards,
> Bjorn

  reply	other threads:[~2020-05-21 21:55 UTC|newest]

Thread overview: 64+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-04-24 20:01 [PATCH v3 00/14] remoteproc: Add support for synchronisaton with rproc Mathieu Poirier
2020-04-24 20:01 ` [PATCH v3 01/14] remoteproc: Make core operations optional Mathieu Poirier
2020-04-28 16:18   ` Arnaud POULIQUEN
2020-04-30 19:39     ` Mathieu Poirier
2020-05-05 22:16   ` Bjorn Andersson
2020-05-08 19:09     ` Mathieu Poirier
2020-04-24 20:01 ` [PATCH v3 02/14] remoteproc: Introduce function rproc_alloc_internals() Mathieu Poirier
2020-05-05 22:31   ` Bjorn Andersson
2020-05-08 19:37     ` Mathieu Poirier
2020-04-24 20:01 ` [PATCH v3 03/14] remoteproc: Add new operation and flags for synchronistation Mathieu Poirier
2020-04-28 16:38   ` Arnaud POULIQUEN
2020-04-30 19:49     ` Mathieu Poirier
2020-05-06  0:22   ` Bjorn Andersson
2020-05-08 21:01     ` Mathieu Poirier
2020-05-14  1:32       ` Bjorn Andersson
2020-05-15 19:24         ` Mathieu Poirier
2020-05-19  0:55           ` Bjorn Andersson
2020-05-20 22:06             ` Mathieu Poirier
2020-05-21  5:21               ` Bjorn Andersson
2020-05-21 21:55                 ` Mathieu Poirier [this message]
2020-04-24 20:01 ` [PATCH v3 04/14] remoteproc: Refactor function rproc_boot() Mathieu Poirier
2020-04-24 20:01 ` [PATCH v3 05/14] remoteproc: Refactor function rproc_fw_boot() Mathieu Poirier
2020-05-06  0:33   ` Bjorn Andersson
2020-05-08 21:27     ` Mathieu Poirier
2020-05-14  2:10       ` Bjorn Andersson
2020-05-15 19:46         ` Mathieu Poirier
2020-05-19  0:22           ` Bjorn Andersson
2020-04-24 20:01 ` [PATCH v3 06/14] remoteproc: Refactor function rproc_trigger_auto_boot() Mathieu Poirier
2020-04-28 17:00   ` Arnaud POULIQUEN
2020-04-24 20:01 ` [PATCH v3 07/14] remoteproc: Introducting new start and stop functions Mathieu Poirier
2020-05-06  0:42   ` Bjorn Andersson
2020-04-24 20:01 ` [PATCH v3 08/14] remoteproc: Call core functions based on synchronisation flag Mathieu Poirier
2020-04-28 17:27   ` Arnaud POULIQUEN
2020-04-30 19:57     ` Mathieu Poirier
2020-05-04 11:14       ` Arnaud POULIQUEN
2020-05-05 22:10         ` Mathieu Poirier
2020-04-24 20:01 ` [PATCH v3 09/14] remoteproc: Deal with synchronisation when crashing Mathieu Poirier
2020-04-29  7:44   ` Arnaud POULIQUEN
2020-04-30 20:11     ` Mathieu Poirier
2020-05-06  1:01   ` Bjorn Andersson
2020-05-08 21:47     ` Mathieu Poirier
2020-04-24 20:01 ` [PATCH v3 10/14] remoteproc: Deal with synchronisation when shutting down Mathieu Poirier
2020-04-29  8:19   ` Arnaud POULIQUEN
2020-04-30 20:23     ` Mathieu Poirier
2020-05-04 11:34       ` Arnaud POULIQUEN
2020-05-05 22:03         ` Mathieu Poirier
2020-05-06  7:51           ` Arnaud POULIQUEN
2020-05-06  1:10   ` Bjorn Andersson
2020-04-24 20:01 ` [PATCH v3 11/14] remoteproc: Deal with synchronisation when changing FW image Mathieu Poirier
2020-04-29  8:52   ` Arnaud POULIQUEN
2020-04-30 20:32     ` Mathieu Poirier
2020-05-06  1:27   ` Bjorn Andersson
2020-04-24 20:01 ` [PATCH v3 12/14] remoteproc: Introducing function rproc_set_state_machine() Mathieu Poirier
2020-04-29  9:22   ` Arnaud POULIQUEN
2020-04-29 14:38     ` Arnaud POULIQUEN
2020-04-30 20:51       ` Mathieu Poirier
2020-05-04 12:00         ` Arnaud POULIQUEN
2020-04-30 20:42     ` Mathieu Poirier
2020-05-04 11:57       ` Arnaud POULIQUEN
2020-05-05 21:43         ` Mathieu Poirier
2020-04-24 20:01 ` [PATCH v3 13/14] remoteproc: Document " Mathieu Poirier
2020-04-24 20:01 ` [PATCH v3 14/14] remoteproc: Expose synchronisation flags via debugfs Mathieu Poirier
2020-05-18 13:28 ` [PATCH v3 00/14] remoteproc: Add support for synchronisaton with rproc Peng Fan
2020-05-18 16:29   ` Mathieu Poirier

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=CANLsYkyQJ30tyoKctsYyy8iSxwanampEmRX1Mkoc4RyUvhvQow@mail.gmail.com \
    --to=mathieu.poirier@linaro.org \
    --cc=arnaud.pouliquen@st.com \
    --cc=bjorn.andersson@linaro.org \
    --cc=corbet@lwn.net \
    --cc=linux-doc@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-remoteproc@vger.kernel.org \
    --cc=loic.pallardy@st.com \
    --cc=ohad@wizery.com \
    --cc=s-anna@ti.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.