All of lore.kernel.org
 help / color / mirror / Atom feed
From: Matti Vaittinen <matti.vaittinen@fi.rohmeurope.com>
To: Kees Cook <keescook@chromium.org>,
	Andy Shevchenko <andy.shevchenko@gmail.com>,
	Zhang Rui <rui.zhang@intel.com>,
	Guenter Roeck <linux@roeck-us.net>
Cc: "agross@kernel.org" <agross@kernel.org>,
	"broonie@kernel.org" <broonie@kernel.org>,
	"devicetree@vger.kernel.org" <devicetree@vger.kernel.org>,
	linux-power <linux-power@fi.rohmeurope.com>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"linux-renesas-soc@vger.kernel.org" 
	<linux-renesas-soc@vger.kernel.org>,
	"linux-arm-msm@vger.kernel.org" <linux-arm-msm@vger.kernel.org>,
	"bjorn.andersson@linaro.org" <bjorn.andersson@linaro.org>,
	"lgirdwood@gmail.com" <lgirdwood@gmail.com>,
	"robh+dt@kernel.org" <robh+dt@kernel.org>
Subject: Re: [PATCH v4 3/7] regulator: IRQ based event/error notification helpers
Date: Mon, 12 Apr 2021 15:24:16 +0300	[thread overview]
Message-ID: <882c4561ebc20313098312bb9cfae60736d69475.camel@fi.rohmeurope.com> (raw)
In-Reply-To: <dbd6a71b1b907de004d23d2ea4b15045320f1ae1.camel@fi.rohmeurope.com>


On Fri, 2021-04-09 at 10:08 +0300, Matti Vaittinen wrote:
> On Thu, 2021-04-08 at 20:20 -0700, Kees Cook wrote:
> > On Wed, Apr 07, 2021 at 03:50:15PM +0300, Andy Shevchenko wrote:
> > > On Wed, Apr 7, 2021 at 12:49 PM Vaittinen, Matti
> > > <Matti.Vaittinen@fi.rohmeurope.com> wrote:
> > > > On Wed, 2021-04-07 at 12:10 +0300, Andy Shevchenko wrote:
> > > > > On Wed, Apr 7, 2021 at 8:02 AM Matti Vaittinen
> > > > > <matti.vaittinen@fi.rohmeurope.com> wrote:
> > > > > > On Wed, 2021-04-07 at 01:44 +0300, Andy Shevchenko wrote:
> > > > > > > On Tuesday, April 6, 2021, Matti Vaittinen <
> > > > > > > matti.vaittinen@fi.rohmeurope.com> wrote:
> > > > > > > > +       BUG();
> > > > > > > > +}
> > 
> > This, though, are you sure you want to use BUG()? Linus gets upset
> > about
> > such things:
> > https://www.kernel.org/doc/html/latest/process/deprecated.html#bug-and-bug-on
> > 
> 
> I see. I am unsure of what would be the best action in the regulator
> case we are handling here. To give the context, we assume here a
> situation where power has gone out of regulation and the hardware is
> probably failing. First countermeasure to protect what is left of HW
> is
> to shut-down the failing regulator. BUG() was called here as a last
> resort if shutting the power via regulator interface was not
> implemented or working.
> 
> Eg, we try to take what ever last measure we can to minimize the HW
> damage - and BUG() was used for this in the qcom driver where I stole
> the idea. Judging the comment related to BUG() in asm-generic/bug.h
> 
> /*
>  * Don't use BUG() or BUG_ON() unless there's really no way out; one
>  
> * example might be detecting data structure corruption in the middle
>  *
> of an operation that can't be backed out of.  If the (sub)system
>  * can
> somehow continue operating, perhaps with reduced functionality,
>  * it's
> probably not BUG-worthy.
>  *
>  * If you're tempted to BUG(), think
> again:  is completely giving up
>  * really the *only* solution?  There
> are usually better options, where
>  * users don't need to reboot ASAP and
> can mostly shut down cleanly.
>  */
> https://elixir.bootlin.com/linux/v5.12-rc6/source/include/asm-generic/bug.h#L55
> 
> this really might be valid use-case.
> 
> To me the real question is what happens after the BUG() - and if
> there
> is any generic handling or if it is platform/board specific? Does it
> actually have any chance to save the HW?
> 
> Mark already pointed that we might need to figure a way to punt a
> "failing event" to the user-space to initiate better "safety
> shutdown".
> Such event does not currently exist so I think the main use-case here
> is to do logging and potentially prevent enabling any further actions
> in the failing HW.
> 
> So - any better suggestions?
> 

Maybe we should take same approach as is taken in thermal_core? Quoting
the thermal documentation:

"On an event of critical trip temperature crossing. Thermal
framework             
allows the system to shutdown gracefully by calling
orderly_poweroff().          
In the event of a failure of orderly_poweroff() to shut down the
system          
we are in danger of keeping the system alive at undesirably
high                 
temperatures. To mitigate this high risk scenario we program a
work              
queue to fire after a pre-determined number of seconds to
start                  
an emergency shutdown of the device using the
kernel_power_off()                 
function. In case kernel_power_off() fails then
finally                          
emergency_restart() is called in the worst case."

Maybe this 'hardware protection, in-kernel, emergency HW saving
shutdown' - logic, should be pulled out of thermal_core.c (or at least
exported) for (other parts like) the regulators to use?

I don't like the idea relying in the user-space to be in shape it can
handle the situation. I may be mistaken but I think a quick action
might be required. Hence the in-kernel handling does not sound so bad
to me.

I am open to all education and suggestions. Meanwhile I am planning to
just convert the BUG() to WARN(). I don't claim I know how BUG() is
implemented on each platform - but my understanding is that it does not
guarantee any power to be cut but just halts the calling process(?). I
guess this does not guarantee what happens next - maybe it even keeps
the power enabled and end up just deadlocking the system by reserved
locks? I think thermal guys have been pondering this scenario for
severe temperature protection shutdown so I would like to hear your
opinions.


Best Regards
Matti Vaittinen


  reply	other threads:[~2021-04-12 12:24 UTC|newest]

Thread overview: 22+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-04-06  7:12 [PATCH v4 0/7] Extend regulator notification support Matti Vaittinen
2021-04-06  7:12 ` [PATCH v4 1/7] dt_bindings: Add protection limit properties Matti Vaittinen
2021-04-06  7:12 ` [PATCH v4 2/7] regulator: add warning flags Matti Vaittinen
2021-04-06  7:13 ` [PATCH v4 3/7] regulator: IRQ based event/error notification helpers Matti Vaittinen
2021-04-06 10:27   ` kernel test robot
2021-04-06 10:27     ` kernel test robot
2021-04-06 10:57     ` Matti Vaittinen
2021-04-06 10:57       ` Matti Vaittinen
2021-04-06 12:08   ` kernel test robot
2021-04-06 12:08     ` kernel test robot
     [not found]   ` <CAHp75VeoTVNDemV0qRA4BTVqOVfyR9UKGWhHgfeat8zVVGcu_Q@mail.gmail.com>
2021-04-07  5:02     ` Matti Vaittinen
2021-04-07  9:10       ` Andy Shevchenko
2021-04-07  9:49         ` Vaittinen, Matti
2021-04-07 12:50           ` Andy Shevchenko
2021-04-09  3:20             ` Kees Cook
2021-04-09  7:08               ` Vaittinen, Matti
2021-04-12 12:24                 ` Matti Vaittinen [this message]
2021-04-12 13:09                   ` Mark Brown
2021-04-06  7:13 ` [PATCH v4 4/7] regulator: add property parsing and callbacks to set protection limits Matti Vaittinen
2021-04-06  7:14 ` [PATCH v4 5/7] dt-bindings: regulator: bd9576 add FET ON-resistance for OCW Matti Vaittinen
2021-04-06  7:14 ` [PATCH v4 6/7] regulator: bd9576: Support error reporting Matti Vaittinen
2021-04-06  7:16 ` [PATCH v4 7/7] regulator: bd9576: Fix the driver name in id table Matti Vaittinen

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=882c4561ebc20313098312bb9cfae60736d69475.camel@fi.rohmeurope.com \
    --to=matti.vaittinen@fi.rohmeurope.com \
    --cc=agross@kernel.org \
    --cc=andy.shevchenko@gmail.com \
    --cc=bjorn.andersson@linaro.org \
    --cc=broonie@kernel.org \
    --cc=devicetree@vger.kernel.org \
    --cc=keescook@chromium.org \
    --cc=lgirdwood@gmail.com \
    --cc=linux-arm-msm@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-power@fi.rohmeurope.com \
    --cc=linux-renesas-soc@vger.kernel.org \
    --cc=linux@roeck-us.net \
    --cc=robh+dt@kernel.org \
    --cc=rui.zhang@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.