linux-kernel.vger.kernel.org archive mirror
From: "Vaittinen, Matti" <Matti.Vaittinen@fi.rohmeurope.com>
To: "broonie@kernel.org" <broonie@kernel.org>
Cc: "lgirdwood@gmail.com" <lgirdwood@gmail.com>,
	"angelogioacchino.delregno@somainline.org" 
	<angelogioacchino.delregno@somainline.org>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>
Subject: Re: short-circuit and over-current IRQs
Date: Thu, 28 Jan 2021 12:49:39 +0000
Message-ID: <a399137345cebc850e5d38886a33f42af4a9c434.camel@fi.rohmeurope.com>
In-Reply-To: <20210128121019.GB4537@sirena.org.uk>


On Thu, 2021-01-28 at 12:10 +0000, Mark Brown wrote:
> On Thu, Jan 28, 2021 at 09:23:08AM +0000, Vaittinen, Matti wrote:
> > On Wed, 2021-01-27 at 16:32 +0000, Mark Brown wrote:
> > > Note that the events the API currently has are expected to be for the
> > > actual error conditions, not for the warning ones - indicating that the
> > > voltage is out of regulation for example.
> > I am unsure how to interpret this. What is the criteria of issue being
> > an error/warning. When I was talking about warning I meant that the
> > issue which is detected is unexpected and abnormal (error?) - but might
> > still be recoverable (warning?). I understand the regulator framework
> > must not signal same events for different purposes - but I don't really
> > know what the current events are used for - I am grateful for any
> > guidance!
> 
> What the majority of hardware interrupts on is situations where things
> have already gone out of spec and there are actual problems with the
> output - for example with current limiting there's often an actual
> limiter in there so the regulator simply won't supply any more current
> than is configured.  With a warning everything is still working fine but
> getting close to not doing so.

Sounds reasonable. Warning while things are still working - but are
getting to the boundary. Error when things are already pretty wrong.
Thanks.

> > > Well, if these things are kicking in the hardware is in serious trouble
> > > anyway so it's unclear what the system would be likely to do in
> > > software, and also unclear how safe it is to rely on software to be able
> > > to take that action given that it let things get into such a bad state
> > > in the first place.
> > Actually, bear with me but I am unsure why we have these notifications
> > if we don't expect SW to be able to do anything? Wouldn't the panic
> > print be all that is needed then? I think that setups which have dual
> 
> You'll notice that there aren't any actual users of this stuff in tree
> at the minute - people don't generally put much effort into software
> recovery as they're not expecting to be anywhere near limiting in normal
> operation.  What I'd expect people to do where they do implement
> handling is something like shutting down all other supplies on the
> device, possibly also trying to shut down the system as a whole.  Things
> more about preventing physical damage rather than being part of the
> normal operation of the system.

Again this makes sense. I will try to ask my HW colleagues what action
they expected SW to take (I hope they have some scenario in mind - let's
see). If they tell me that they expect SW to shut down the system
gracefully - then I keep the errors. If they tell me they think SW
will temporarily disable some HW blocks or do other "tricks" and later
resume normal operation - then I will see if I can add some new
'warning' indicators.

> For thermal issues systems generally try to apply software limits well
> before an individual component starts flagging things up with an
> interrupt, the limits that devices have are generally super high and
> often there'll be issues at a system level (eg, a case getting unusably
> hot) earlier and it can take a while for responses to have an impact.

I think this is also the case with the BD9576 - 140 C sounds pretty hot
to me - and I expect this is really where things are already badly wrong.
So I guess I can keep the 'error' here.

> 
> > limits (one for initiating potential SW recovery - other for HW to
> > forcing protection) actually make sense. So does implementing notifiers
> > / error statuses for events where SW recovery is potentially helpful.
> > But whether the existing event notifications / error flags are correct
> > for these is something I can't decide :) Here I ask guidance for Mark &
> > others who know what is the idea behind existing error-flags/events.
> 
> It's not that we shouldn't implement support for warnings, it's that
> they're not the common case for hardware and so won't line up with
> behaviour for other users.


Agreed. As I said, I understand we shouldn't send the same events for
different situations. If the current errors are used to indicate that
things are really "wrong" to the point where the safest thing is to shut
down the system - then we'd better add these "warnings" to indicate that
there would potentially still be time to change something - before things
are shut off.
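
To make the split concrete, here is a rough userspace sketch of the
two-level idea. It is only an illustration under my own assumptions: the
enum names, struct supply_limits and both helper functions are
hypothetical and are not part of the regulator API, and the "warning
below error" current limits are invented for the example.

```c
/* Hypothetical severity levels; these are NOT the kernel's event flags. */
enum supply_severity {
	SUPPLY_OK,
	SUPPLY_WARNING,	/* still in regulation, but approaching a limit */
	SUPPLY_ERROR,	/* output already out of spec / HW protection active */
};

/* Hypothetical responses a consumer might choose. */
enum supply_action {
	ACTION_NONE,
	ACTION_REDUCE_LOAD,	/* e.g. temporarily disable some HW blocks */
	ACTION_SHUTDOWN,	/* prevent physical damage */
};

/* Assumed dual limits: a soft one for SW recovery, a hard one for errors. */
struct supply_limits {
	int warn_ma;	/* warning threshold in milliamps */
	int error_ma;	/* error threshold in milliamps */
};

/* Map a measured load to a severity using the two thresholds. */
static enum supply_severity classify_current(const struct supply_limits *lim,
					     int load_ma)
{
	if (load_ma >= lim->error_ma)
		return SUPPLY_ERROR;
	if (load_ma >= lim->warn_ma)
		return SUPPLY_WARNING;
	return SUPPLY_OK;
}

/* Warnings leave time to change something; errors mean shut things off. */
static enum supply_action pick_action(enum supply_severity sev)
{
	switch (sev) {
	case SUPPLY_WARNING:
		return ACTION_REDUCE_LOAD;
	case SUPPLY_ERROR:
		return ACTION_SHUTDOWN;
	default:
		return ACTION_NONE;
	}
}
```

The point of keeping classification and action separate is that the
framework would only need to report the severity, while each consumer
decides what (if anything) to do about it.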

Thanks again!

Best regards
	Matti Vaittinen
