From: Sudeep Holla <sudeep.holla@arm.com>
To: Cristian Marussi <cristian.marussi@arm.com>
Cc: linux-kernel@vger.kernel.org,
	linux-arm-kernel@lists.infradead.org, james.quinlan@broadcom.com,
	Jonathan.Cameron@Huawei.com, f.fainelli@gmail.com,
	etienne.carriere@linaro.org, Sudeep Holla <sudeep.holla@arm.com>,
	vincent.guittot@linaro.org, souvik.chakravarty@arm.com
Subject: Re: [RFC PATCH 01/10] firmware: arm_scmi: Reset properly xfer SCMI status
Date: Tue, 8 Jun 2021 12:17:08 +0100
Message-ID: <20210608111708.lxgjkszrvq4au6bm@bogus>
In-Reply-To: <20210608101048.GD40811@e120937-lin>

On Tue, Jun 08, 2021 at 11:10:48AM +0100, Cristian Marussi wrote:
> Hi Sudeep,
> 
> On Mon, Jun 07, 2021 at 07:27:54PM +0100, Sudeep Holla wrote:
> > On Mon, Jun 07, 2021 at 07:01:37PM +0100, Cristian Marussi wrote:
> > > On Mon, Jun 07, 2021 at 06:38:09PM +0100, Sudeep Holla wrote:
> > > > On Sun, Jun 06, 2021 at 11:12:23PM +0100, Cristian Marussi wrote:
> > > > > When an SCMI command transfer fails due to some protocol issue an SCMI
> > > > > error code is reported inside the SCMI message payload itself and it is
> > > > > then retrieved and transcribed by the specific transport layer into the
> > > > > xfer.hdr.status field by transport specific .fetch_response().
> > > > >
> > > > > The core SCMI transport layer never explicitly resets xfer.hdr.status,
> > > > > so when an xfer is reused, if a transport misbehaves in handling this
> > > > > status field, we risk seeing an invalid ghost error code.
> > > > >
> > > > > Reset xfer.hdr.status to SCMI_SUCCESS right before each transfer is
> > > > > started.
> > > > >
> > > >
> > > > Any particular reason why it can't be part of xfer_get_init, which has the
> > > > other initialisations? If none, please move it there.
> > > >
> > >
> > > Well it was there initially then I moved it here.
> > >
> > > The reason is mostly the same as for the other patch in this series,
> > > which adds a reinit_completion() at this same point: the core does not
> > > forbid reusing an xfer multiple times once it has been obtained with
> > > xfer_get() or xfer_get_init(), and indeed some protocols do exactly
> > > that: they loop on do_xfer and bail out on error.
> > >
> > 
> > Makes sense. But is it okay to retain xfer->transfer_id for every transfer
> > in such a loop?
> > 
> No, you are right, and indeed I saw that anomaly, but I have not addressed
> it since, even if wrong, it is harmless: transfer_id is really used only
> for debugging/profiling, while the missing reinit_completion could
> actually break things.
>

No, agreed. I just wanted to make it clear that if do_xfer is used in loops
the transfer_id remains the same. I am fine with that.
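
To make the point concrete, here is a minimal userspace sketch, not the
real arm_scmi code. All names in it (loop_xfer, loop_xfer_get_init,
loop_do_xfer, loop_run) are hypothetical stand-ins, and it assumes only
what is described above: the id is picked once when the xfer is obtained,
and the loop bails out on the first error.

```c
#include <stdint.h>

/* Hypothetical, simplified stand-in; not the real struct scmi_xfer. */
struct loop_xfer {
	int transfer_id;	/* used only for debugging/profiling */
	int32_t status;		/* 0 means success */
	int sends;		/* how many times this xfer was sent */
};

static int loop_next_transfer_id;

/* Models xfer_get_init(): transfer_id is picked once, here. */
static void loop_xfer_get_init(struct loop_xfer *x)
{
	x->transfer_id = loop_next_transfer_id++;
	x->status = 0;
	x->sends = 0;
}

/* Models do_xfer(): nothing in the send path touches transfer_id. */
static int loop_do_xfer(struct loop_xfer *x, int32_t fw_status)
{
	x->sends++;
	x->status = fw_status;
	return fw_status ? -1 : 0;
}

/* Models a protocol looping on do_xfer and bailing out on first error. */
static int loop_run(struct loop_xfer *x, const int32_t *fw_status, int n)
{
	int i, ret = 0;

	for (i = 0; i < n; i++) {
		ret = loop_do_xfer(x, fw_status[i]);
		if (ret)
			break;
	}
	return ret;
}
```

Every send issued by loop_run() carries the same transfer_id, so a trace
would show several transfers under one id. That is harmless as long as the
id is only a debugging aid.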

> > > The way it is implemented now in the protocols indeed poses no problem,
> > > because the do_xfer loop bails out on error and the xfer is put; but as
> > > soon as some protocol is implemented that violates this common practice
> > > and just keeps reusing an xfer for other do_xfers after an error, this
> > > breaks... so it seemed more defensive to just reinit the completion and
> > > the status before each send.
> > 
> > Fair enough. But they use it to send the same message, I guess, maybe if it
> > gave an error or something? I would really like to know about such a
> > sequence instead of assisting that 😉.
> > 
> 
> So the current real 'looping do_xfer' behavior is safe, and the missing
> reinit could only break things in the future. We cannot really know in
> advance what some future protocol will need, but as of now it seems wrong
> to want to keep going and reuse an xfer for the same command after an
> error in your loop.
>

Fair enough.

> On the other hand we allow such behaviour, so I thought it was good to
> provide a safety net in case it is misused.
>

Agreed.
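
For reference, a minimal userspace sketch of the ghost error the safety net
guards against. It is an illustration under assumptions, not the driver
code: ghost_xfer, ghost_fetch_response and ghost_do_xfer are hypothetical
names, the "done" flag stands in for the xfer completion, and the transport
is deliberately written to misbehave.

```c
#include <stdbool.h>
#include <stdint.h>

#define GHOST_SCMI_SUCCESS 0	/* hypothetical stand-in for SCMI_SUCCESS */

struct ghost_xfer {
	int32_t status;
	bool done;		/* stands in for the xfer completion */
};

/*
 * A deliberately sloppy transport: it writes status only when the
 * firmware reports an error and leaves it untouched on success.
 */
static void ghost_fetch_response(struct ghost_xfer *x, int32_t fw_status)
{
	if (fw_status)
		x->status = fw_status;
	x->done = true;
}

/* Models do_xfer(), optionally resetting the xfer state before the send. */
static int32_t ghost_do_xfer(struct ghost_xfer *x, int32_t fw_status,
			     bool reset_first)
{
	if (reset_first) {
		x->status = GHOST_SCMI_SUCCESS;
		x->done = false;	/* models reinit_completion() */
	}
	ghost_fetch_response(x, fw_status);
	return x->status;
}
```

Without the reset, a failed send followed by a successful one on the same
xfer still reports the old error. With the reset, the second send reports
success even though the transport never wrote the status.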

> But, besides these patches, which, as said, are more defensive than strictly
> needed as of now, I think it's worth mentioning that this same 'issue'
> also affects, as an example, the new mechanism I introduce later in this
> same series to always use monotonically increasing sequence numbers for
> outgoing messages.
>

OK, I haven't seen that yet.

> In that case I stuck to the current behavior and assign such monotonically
> increasing sequence numbers to messages during xfer_get, but the potential
> issue is the same: if a do_xfer loop is used you end up reusing the same
> seq_num for multiple do_xfers (really defeating the mechanism itself,
> which aims not to immediately reuse the most recently used seq_num).
>

I assumed the do_xfer loop is meant to avoid those overheads, at the cost of
reusing the seq_num.
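
A rough sketch of that trade-off, going only by the description quoted
above since I have not seen the actual patch. The names (seq_xfer,
seq_next, seq_xfer_get, seq_send_keep, seq_send_fresh) are hypothetical,
and wrap-around and in-flight tracking of sequence numbers are left out.

```c
#include <stdint.h>

/* Hypothetical stand-in for an xfer carrying a sequence number. */
struct seq_xfer {
	uint16_t seq;
};

static uint16_t seq_last;	/* most recently handed-out seq_num */

/* Monotonic allocator; wrap-around and in-flight checks are omitted. */
static uint16_t seq_next(void)
{
	return ++seq_last;
}

/* Strategy A: pick the seq_num once, when the xfer is obtained. */
static void seq_xfer_get(struct seq_xfer *x)
{
	x->seq = seq_next();
}

/* Send under strategy A: cheap, but a loop reuses the same seq_num. */
static uint16_t seq_send_keep(const struct seq_xfer *x)
{
	return x->seq;
}

/* Strategy B: pick a fresh seq_num on every send, at extra tx-path cost. */
static uint16_t seq_send_fresh(struct seq_xfer *x)
{
	x->seq = seq_next();
	return x->seq;
}
```

Two seq_send_keep() calls on one xfer put the same number on the wire
twice, while two seq_send_fresh() calls never repeat the most recently used
one.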

> In that case I did this to keep it simple and to avoid placing more burden
> on the tx path by picking and assigning a seq_num upon each transfer... but,
> again, this behavior of picking a seq_num only at xfer_get is NOT really
> broken as of now, even for do_xfer loops, since we bail out on error and you
> won't really reuse that xfer.
>

OK.

> It's just that seq_num selection seems to add a lot of burden and complexity
> if moved to the do_xfer phase, while status/reinit seemed to me cheaper to
> move into do_xfer, so I tried to play defensive.
>

I assumed the same as mentioned above.

> In the end, in general I would say that all of these ops (status/reinit/
> seq_nums/transfer_id) DO logically belong to the do_xfer phase more than
> to xfer_get/xfer_get_init, but in reality we can cope with having them at
> xfer_get/get_init, and this keeps things simple and reduces the burden,
> especially in the monotonic seq_nums case. So I am no longer so sure it is
> fine to move reinit/status to do_xfer, as proposed here, while keeping
> seq_nums (for good reasons) in the xfer_get phase, because we'd be using two
> different strategies to address similar issues.
>

I had almost agreed with the change, only to read here that you think
otherwise now 😄.

> I would say: just keep reinit and status in the xfer_get phase instead, and
> maybe warn somehow if a failed xfer is detected being reused (but this
> would anyway need a check in every tx transaction to see if status != SUCCESS,
> so is it worth it?).

I have started wondering why we need to reset the status at all. Since it is
always read from the shmem, do we really have to?
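
Roughly what I have in mind, as a userspace sketch rather than the actual
transport code. shmem_area, shmem_xfer and shmem_fetch_response_sketch are
hypothetical names, and the sketch assumes the status is the first 32-bit
word of the response payload and ignores endianness and MMIO accessors.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the shared-memory response area. */
struct shmem_area {
	uint8_t payload[16];	/* assumed: first 4 bytes carry the status */
};

struct shmem_xfer {
	int32_t status;
};

/*
 * A well-behaved fetch_response: status is unconditionally overwritten
 * with what the platform wrote, so no stale value can survive a reuse.
 */
static void shmem_fetch_response_sketch(const struct shmem_area *shmem,
					struct shmem_xfer *x)
{
	int32_t st;

	memcpy(&st, shmem->payload, sizeof(st));
	x->status = st;
}
```

If every transport behaves like this, whatever was left in the status from
a previous transfer is replaced on each response, and the explicit reset
only matters for a transport that skips the write.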

--
Regards,
Sudeep
