From: Matti Vaittinen <matti.vaittinen@fi.rohmeurope.com>
To: Kees Cook <keescook@chromium.org>,
Andy Shevchenko <andy.shevchenko@gmail.com>,
Zhang Rui <rui.zhang@intel.com>,
Guenter Roeck <linux@roeck-us.net>
Cc: "agross@kernel.org" <agross@kernel.org>,
"broonie@kernel.org" <broonie@kernel.org>,
"devicetree@vger.kernel.org" <devicetree@vger.kernel.org>,
linux-power <linux-power@fi.rohmeurope.com>,
"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
"linux-renesas-soc@vger.kernel.org"
<linux-renesas-soc@vger.kernel.org>,
"linux-arm-msm@vger.kernel.org" <linux-arm-msm@vger.kernel.org>,
"bjorn.andersson@linaro.org" <bjorn.andersson@linaro.org>,
"lgirdwood@gmail.com" <lgirdwood@gmail.com>,
"robh+dt@kernel.org" <robh+dt@kernel.org>
Subject: Re: [PATCH v4 3/7] regulator: IRQ based event/error notification helpers
Date: Mon, 12 Apr 2021 15:24:16 +0300 [thread overview]
Message-ID: <882c4561ebc20313098312bb9cfae60736d69475.camel@fi.rohmeurope.com> (raw)
In-Reply-To: <dbd6a71b1b907de004d23d2ea4b15045320f1ae1.camel@fi.rohmeurope.com>
On Fri, 2021-04-09 at 10:08 +0300, Matti Vaittinen wrote:
> On Thu, 2021-04-08 at 20:20 -0700, Kees Cook wrote:
> > On Wed, Apr 07, 2021 at 03:50:15PM +0300, Andy Shevchenko wrote:
> > > On Wed, Apr 7, 2021 at 12:49 PM Vaittinen, Matti
> > > <Matti.Vaittinen@fi.rohmeurope.com> wrote:
> > > > On Wed, 2021-04-07 at 12:10 +0300, Andy Shevchenko wrote:
> > > > > On Wed, Apr 7, 2021 at 8:02 AM Matti Vaittinen
> > > > > <matti.vaittinen@fi.rohmeurope.com> wrote:
> > > > > > On Wed, 2021-04-07 at 01:44 +0300, Andy Shevchenko wrote:
> > > > > > > On Tuesday, April 6, 2021, Matti Vaittinen <
> > > > > > > matti.vaittinen@fi.rohmeurope.com> wrote:
> > > > > > > > + BUG();
> > > > > > > > +}
> >
> > This, though, are you sure you want to use BUG()? Linus gets upset
> > about
> > such things:
> > https://www.kernel.org/doc/html/latest/process/deprecated.html#bug-and-bug-on
> >
>
> I see. I am unsure of what would be the best action in the regulator
> case we are handling here. To give the context, we assume here a
> situation where power has gone out of regulation and the hardware is
> probably failing. First countermeasure to protect what is left of HW
> is
> to shut-down the failing regulator. BUG() was called here as a last
> resort if shutting the power via regulator interface was not
> implemented or working.
>
> Eg, we try to take what ever last measure we can to minimize the HW
> damage - and BUG() was used for this in the qcom driver where I stole
> the idea. Judging the comment related to BUG() in asm-generic/bug.h
>
> /*
> * Don't use BUG() or BUG_ON() unless there's really no way out; one
>
> * example might be detecting data structure corruption in the middle
> *
> of an operation that can't be backed out of. If the (sub)system
> * can
> somehow continue operating, perhaps with reduced functionality,
> * it's
> probably not BUG-worthy.
> *
> * If you're tempted to BUG(), think
> again: is completely giving up
> * really the *only* solution? There
> are usually better options, where
> * users don't need to reboot ASAP and
> can mostly shut down cleanly.
> */
> https://elixir.bootlin.com/linux/v5.12-rc6/source/include/asm-generic/bug.h#L55
>
> this really might be valid use-case.
>
> To me the real question is what happens after the BUG() - and if
> there
> is any generic handling or if it is platform/board specific? Does it
> actually have any chance to save the HW?
>
> Mark already pointed that we might need to figure a way to punt a
> "failing event" to the user-space to initiate better "safety
> shutdown".
> Such event does not currently exist so I think the main use-case here
> is to do logging and potentially prevent enabling any further actions
> in the failing HW.
>
> So - any better suggestions?
>
Maybe we should take same approach as is taken in thermal_core? Quoting
the thermal documentation:
"On an event of critical trip temperature crossing. Thermal
framework
allows the system to shutdown gracefully by calling
orderly_poweroff().
In the event of a failure of orderly_poweroff() to shut down the
system
we are in danger of keeping the system alive at undesirably
high
temperatures. To mitigate this high risk scenario we program a
work
queue to fire after a pre-determined number of seconds to
start
an emergency shutdown of the device using the
kernel_power_off()
function. In case kernel_power_off() fails then
finally
emergency_restart() is called in the worst case."
Maybe this 'hardware protection, in-kernel, emergency HW saving
shutdown' - logic, should be pulled out of thermal_core.c (or at least
exported) for (other parts like) the regulators to use?
I don't like the idea relying in the user-space to be in shape it can
handle the situation. I may be mistaken but I think a quick action
might be required. Hence the in-kernel handling does not sound so bad
to me.
I am open to all education and suggestions. Meanwhile I am planning to
just convert the BUG() to WARN(). I don't claim I know how BUG() is
implemented on each platform - but my understanding is that it does not
guarantee any power to be cut but just halts the calling process(?). I
guess this does not guarantee what happens next - maybe it even keeps
the power enabled and end up just deadlocking the system by reserved
locks? I think thermal guys have been pondering this scenario for
severe temperature protection shutdown so I would like to hear your
opinions.
Best Regards
Matti Vaittinen
next prev parent reply other threads:[~2021-04-12 12:24 UTC|newest]
Thread overview: 19+ messages / expand[flat|nested] mbox.gz Atom feed top
2021-04-06 7:12 [PATCH v4 0/7] Extend regulator notification support Matti Vaittinen
2021-04-06 7:12 ` [PATCH v4 1/7] dt_bindings: Add protection limit properties Matti Vaittinen
2021-04-06 7:12 ` [PATCH v4 2/7] regulator: add warning flags Matti Vaittinen
2021-04-06 7:13 ` [PATCH v4 3/7] regulator: IRQ based event/error notification helpers Matti Vaittinen
2021-04-06 10:27 ` kernel test robot
2021-04-06 10:57 ` Matti Vaittinen
2021-04-06 12:08 ` kernel test robot
[not found] ` <CAHp75VeoTVNDemV0qRA4BTVqOVfyR9UKGWhHgfeat8zVVGcu_Q@mail.gmail.com>
2021-04-07 5:02 ` Matti Vaittinen
2021-04-07 9:10 ` Andy Shevchenko
2021-04-07 9:49 ` Vaittinen, Matti
2021-04-07 12:50 ` Andy Shevchenko
2021-04-09 3:20 ` Kees Cook
2021-04-09 7:08 ` Vaittinen, Matti
2021-04-12 12:24 ` Matti Vaittinen [this message]
2021-04-12 13:09 ` Mark Brown
2021-04-06 7:13 ` [PATCH v4 4/7] regulator: add property parsing and callbacks to set protection limits Matti Vaittinen
2021-04-06 7:14 ` [PATCH v4 5/7] dt-bindings: regulator: bd9576 add FET ON-resistance for OCW Matti Vaittinen
2021-04-06 7:14 ` [PATCH v4 6/7] regulator: bd9576: Support error reporting Matti Vaittinen
2021-04-06 7:16 ` [PATCH v4 7/7] regulator: bd9576: Fix the driver name in id table Matti Vaittinen
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=882c4561ebc20313098312bb9cfae60736d69475.camel@fi.rohmeurope.com \
--to=matti.vaittinen@fi.rohmeurope.com \
--cc=agross@kernel.org \
--cc=andy.shevchenko@gmail.com \
--cc=bjorn.andersson@linaro.org \
--cc=broonie@kernel.org \
--cc=devicetree@vger.kernel.org \
--cc=keescook@chromium.org \
--cc=lgirdwood@gmail.com \
--cc=linux-arm-msm@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-power@fi.rohmeurope.com \
--cc=linux-renesas-soc@vger.kernel.org \
--cc=linux@roeck-us.net \
--cc=robh+dt@kernel.org \
--cc=rui.zhang@intel.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).