linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Matti Vaittinen <matti.vaittinen@fi.rohmeurope.com>
To: Kees Cook <keescook@chromium.org>,
	Andy Shevchenko <andy.shevchenko@gmail.com>,
	Zhang Rui <rui.zhang@intel.com>,
	Guenter Roeck <linux@roeck-us.net>
Cc: "agross@kernel.org" <agross@kernel.org>,
	"broonie@kernel.org" <broonie@kernel.org>,
	"devicetree@vger.kernel.org" <devicetree@vger.kernel.org>,
	linux-power <linux-power@fi.rohmeurope.com>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"linux-renesas-soc@vger.kernel.org" 
	<linux-renesas-soc@vger.kernel.org>,
	"linux-arm-msm@vger.kernel.org" <linux-arm-msm@vger.kernel.org>,
	"bjorn.andersson@linaro.org" <bjorn.andersson@linaro.org>,
	"lgirdwood@gmail.com" <lgirdwood@gmail.com>,
	"robh+dt@kernel.org" <robh+dt@kernel.org>
Subject: Re: [PATCH v4 3/7] regulator: IRQ based event/error notification helpers
Date: Mon, 12 Apr 2021 15:24:16 +0300	[thread overview]
Message-ID: <882c4561ebc20313098312bb9cfae60736d69475.camel@fi.rohmeurope.com> (raw)
In-Reply-To: <dbd6a71b1b907de004d23d2ea4b15045320f1ae1.camel@fi.rohmeurope.com>


On Fri, 2021-04-09 at 10:08 +0300, Matti Vaittinen wrote:
> On Thu, 2021-04-08 at 20:20 -0700, Kees Cook wrote:
> > On Wed, Apr 07, 2021 at 03:50:15PM +0300, Andy Shevchenko wrote:
> > > On Wed, Apr 7, 2021 at 12:49 PM Vaittinen, Matti
> > > <Matti.Vaittinen@fi.rohmeurope.com> wrote:
> > > > On Wed, 2021-04-07 at 12:10 +0300, Andy Shevchenko wrote:
> > > > > On Wed, Apr 7, 2021 at 8:02 AM Matti Vaittinen
> > > > > <matti.vaittinen@fi.rohmeurope.com> wrote:
> > > > > > On Wed, 2021-04-07 at 01:44 +0300, Andy Shevchenko wrote:
> > > > > > > On Tuesday, April 6, 2021, Matti Vaittinen <
> > > > > > > matti.vaittinen@fi.rohmeurope.com> wrote:
> > > > > > > > +       BUG();
> > > > > > > > +}
> > 
> > This, though, are you sure you want to use BUG()? Linus gets upset
> > about
> > such things:
> > https://www.kernel.org/doc/html/latest/process/deprecated.html#bug-and-bug-on
> > 
> 
> I see. I am unsure of what would be the best action in the regulator
> case we are handling here. To give the context, we assume here a
> situation where power has gone out of regulation and the hardware is
> probably failing. First countermeasure to protect what is left of HW
> is
> to shut-down the failing regulator. BUG() was called here as a last
> resort if shutting the power via regulator interface was not
> implemented or working.
> 
> Eg, we try to take what ever last measure we can to minimize the HW
> damage - and BUG() was used for this in the qcom driver where I stole
> the idea. Judging the comment related to BUG() in asm-generic/bug.h
> 
> /*
>  * Don't use BUG() or BUG_ON() unless there's really no way out; one
>  
> * example might be detecting data structure corruption in the middle
>  *
> of an operation that can't be backed out of.  If the (sub)system
>  * can
> somehow continue operating, perhaps with reduced functionality,
>  * it's
> probably not BUG-worthy.
>  *
>  * If you're tempted to BUG(), think
> again:  is completely giving up
>  * really the *only* solution?  There
> are usually better options, where
>  * users don't need to reboot ASAP and
> can mostly shut down cleanly.
>  */
> https://elixir.bootlin.com/linux/v5.12-rc6/source/include/asm-generic/bug.h#L55
> 
> this really might be valid use-case.
> 
> To me the real question is what happens after the BUG() - and if
> there
> is any generic handling or if it is platform/board specific? Does it
> actually have any chance to save the HW?
> 
> Mark already pointed that we might need to figure a way to punt a
> "failing event" to the user-space to initiate better "safety
> shutdown".
> Such event does not currently exist so I think the main use-case here
> is to do logging and potentially prevent enabling any further actions
> in the failing HW.
> 
> So - any better suggestions?
> 

Maybe we should take same approach as is taken in thermal_core? Quoting
the thermal documentation:

"On an event of critical trip temperature crossing. Thermal
framework             
allows the system to shutdown gracefully by calling
orderly_poweroff().          
In the event of a failure of orderly_poweroff() to shut down the
system          
we are in danger of keeping the system alive at undesirably
high                 
temperatures. To mitigate this high risk scenario we program a
work              
queue to fire after a pre-determined number of seconds to
start                  
an emergency shutdown of the device using the
kernel_power_off()                 
function. In case kernel_power_off() fails then
finally                          
emergency_restart() is called in the worst case."

Maybe this 'hardware protection, in-kernel, emergency HW saving
shutdown' - logic, should be pulled out of thermal_core.c (or at least
exported) for (other parts like) the regulators to use?

I don't like the idea relying in the user-space to be in shape it can
handle the situation. I may be mistaken but I think a quick action
might be required. Hence the in-kernel handling does not sound so bad
to me.

I am open to all education and suggestions. Meanwhile I am planning to
just convert the BUG() to WARN(). I don't claim I know how BUG() is
implemented on each platform - but my understanding is that it does not
guarantee any power to be cut but just halts the calling process(?). I
guess this does not guarantee what happens next - maybe it even keeps
the power enabled and end up just deadlocking the system by reserved
locks? I think thermal guys have been pondering this scenario for
severe temperature protection shutdown so I would like to hear your
opinions.


Best Regards
Matti Vaittinen


  reply	other threads:[~2021-04-12 12:24 UTC|newest]

Thread overview: 19+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-04-06  7:12 [PATCH v4 0/7] Extend regulator notification support Matti Vaittinen
2021-04-06  7:12 ` [PATCH v4 1/7] dt_bindings: Add protection limit properties Matti Vaittinen
2021-04-06  7:12 ` [PATCH v4 2/7] regulator: add warning flags Matti Vaittinen
2021-04-06  7:13 ` [PATCH v4 3/7] regulator: IRQ based event/error notification helpers Matti Vaittinen
2021-04-06 10:27   ` kernel test robot
2021-04-06 10:57     ` Matti Vaittinen
2021-04-06 12:08   ` kernel test robot
     [not found]   ` <CAHp75VeoTVNDemV0qRA4BTVqOVfyR9UKGWhHgfeat8zVVGcu_Q@mail.gmail.com>
2021-04-07  5:02     ` Matti Vaittinen
2021-04-07  9:10       ` Andy Shevchenko
2021-04-07  9:49         ` Vaittinen, Matti
2021-04-07 12:50           ` Andy Shevchenko
2021-04-09  3:20             ` Kees Cook
2021-04-09  7:08               ` Vaittinen, Matti
2021-04-12 12:24                 ` Matti Vaittinen [this message]
2021-04-12 13:09                   ` Mark Brown
2021-04-06  7:13 ` [PATCH v4 4/7] regulator: add property parsing and callbacks to set protection limits Matti Vaittinen
2021-04-06  7:14 ` [PATCH v4 5/7] dt-bindings: regulator: bd9576 add FET ON-resistance for OCW Matti Vaittinen
2021-04-06  7:14 ` [PATCH v4 6/7] regulator: bd9576: Support error reporting Matti Vaittinen
2021-04-06  7:16 ` [PATCH v4 7/7] regulator: bd9576: Fix the driver name in id table Matti Vaittinen

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=882c4561ebc20313098312bb9cfae60736d69475.camel@fi.rohmeurope.com \
    --to=matti.vaittinen@fi.rohmeurope.com \
    --cc=agross@kernel.org \
    --cc=andy.shevchenko@gmail.com \
    --cc=bjorn.andersson@linaro.org \
    --cc=broonie@kernel.org \
    --cc=devicetree@vger.kernel.org \
    --cc=keescook@chromium.org \
    --cc=lgirdwood@gmail.com \
    --cc=linux-arm-msm@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-power@fi.rohmeurope.com \
    --cc=linux-renesas-soc@vger.kernel.org \
    --cc=linux@roeck-us.net \
    --cc=robh+dt@kernel.org \
    --cc=rui.zhang@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).