All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH] cpufreq: Stop BUGing the system
@ 2014-12-17 15:51 ` Nishanth Menon
  0 siblings, 0 replies; 10+ messages in thread
From: Nishanth Menon @ 2014-12-17 15:51 UTC (permalink / raw)
  To: Viresh Kumar, Rafael J. Wysocki; +Cc: linux-kernel, linux-pm, Nishanth Menon

CPUFRreq subsystem is not a system catastrophic failure point.
Failures in these cases DONOT need complete system shutdown with BUG.
just refuse to let cpufreq function should be good enough.

Signed-off-by: Nishanth Menon <nm@ti.com>
---
 drivers/cpufreq/cpufreq.c |   17 +++++++++++++----
 1 file changed, 13 insertions(+), 4 deletions(-)

diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
index a09a29c..a5aa2fa 100644
--- a/drivers/cpufreq/cpufreq.c
+++ b/drivers/cpufreq/cpufreq.c
@@ -281,7 +281,10 @@ static inline void adjust_jiffies(unsigned long val, struct cpufreq_freqs *ci)
 static void __cpufreq_notify_transition(struct cpufreq_policy *policy,
 		struct cpufreq_freqs *freqs, unsigned int state)
 {
-	BUG_ON(irqs_disabled());
+	if (irqs_disabled()) {
+		WARN(1, "IRQs disabled!\n");
+		return;
+	}
 
 	if (cpufreq_disabled())
 		return;
@@ -1253,9 +1256,12 @@ static int __cpufreq_add_dev(struct device *dev, struct subsys_interface *sif)
 			/*
 			 * Reaching here after boot in a few seconds may not
 			 * mean that system will remain stable at "unknown"
-			 * frequency for longer duration. Hence, a BUG_ON().
+			 * frequency for longer duration. Hence, a WARN().
 			 */
-			BUG_ON(ret);
+			if (ret) {
+				WARN(1, "SYSTEM operating at invalid freq %u", policy->cur);
+				goto err_out_unregister;
+			}
 			pr_warn("%s: CPU%d: Unlisted initial frequency changed to: %u KHz\n",
 				__func__, policy->cpu, policy->cur);
 		}
@@ -2556,7 +2562,10 @@ static int __init cpufreq_core_init(void)
 		return -ENODEV;
 
 	cpufreq_global_kobject = kobject_create();
-	BUG_ON(!cpufreq_global_kobject);
+	if (!cpufreq_global_kobject) {
+		WARN(1, "No memory for cpufreq_global_kobject\n");
+		return -ENOMEM;
+	}
 
 	return 0;
 }
-- 
1.7.9.5


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [PATCH] cpufreq: Stop BUGing the system
@ 2014-12-17 15:51 ` Nishanth Menon
  0 siblings, 0 replies; 10+ messages in thread
From: Nishanth Menon @ 2014-12-17 15:51 UTC (permalink / raw)
  To: Viresh Kumar, Rafael J. Wysocki; +Cc: linux-kernel, linux-pm, Nishanth Menon

CPUFRreq subsystem is not a system catastrophic failure point.
Failures in these cases DONOT need complete system shutdown with BUG.
just refuse to let cpufreq function should be good enough.

Signed-off-by: Nishanth Menon <nm@ti.com>
---
 drivers/cpufreq/cpufreq.c |   17 +++++++++++++----
 1 file changed, 13 insertions(+), 4 deletions(-)

diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
index a09a29c..a5aa2fa 100644
--- a/drivers/cpufreq/cpufreq.c
+++ b/drivers/cpufreq/cpufreq.c
@@ -281,7 +281,10 @@ static inline void adjust_jiffies(unsigned long val, struct cpufreq_freqs *ci)
 static void __cpufreq_notify_transition(struct cpufreq_policy *policy,
 		struct cpufreq_freqs *freqs, unsigned int state)
 {
-	BUG_ON(irqs_disabled());
+	if (irqs_disabled()) {
+		WARN(1, "IRQs disabled!\n");
+		return;
+	}
 
 	if (cpufreq_disabled())
 		return;
@@ -1253,9 +1256,12 @@ static int __cpufreq_add_dev(struct device *dev, struct subsys_interface *sif)
 			/*
 			 * Reaching here after boot in a few seconds may not
 			 * mean that system will remain stable at "unknown"
-			 * frequency for longer duration. Hence, a BUG_ON().
+			 * frequency for longer duration. Hence, a WARN().
 			 */
-			BUG_ON(ret);
+			if (ret) {
+				WARN(1, "SYSTEM operating at invalid freq %u", policy->cur);
+				goto err_out_unregister;
+			}
 			pr_warn("%s: CPU%d: Unlisted initial frequency changed to: %u KHz\n",
 				__func__, policy->cpu, policy->cur);
 		}
@@ -2556,7 +2562,10 @@ static int __init cpufreq_core_init(void)
 		return -ENODEV;
 
 	cpufreq_global_kobject = kobject_create();
-	BUG_ON(!cpufreq_global_kobject);
+	if (!cpufreq_global_kobject) {
+		WARN(1, "No memory for cpufreq_global_kobject\n");
+		return -ENOMEM;
+	}
 
 	return 0;
 }
-- 
1.7.9.5

^ permalink raw reply related	[flat|nested] 10+ messages in thread

* Re: [PATCH] cpufreq: Stop BUGing the system
  2014-12-17 15:51 ` Nishanth Menon
@ 2014-12-17 19:16   ` Kevin Hilman
  -1 siblings, 0 replies; 10+ messages in thread
From: Kevin Hilman @ 2014-12-17 19:16 UTC (permalink / raw)
  To: Nishanth Menon; +Cc: Viresh Kumar, Rafael J. Wysocki, linux-kernel, linux-pm

Nishanth Menon <nm@ti.com> writes:

> CPUFRreq subsystem is not a system catastrophic failure point.
> Failures in these cases DONOT need complete system shutdown with BUG.
> just refuse to let cpufreq function should be good enough.
>
> Signed-off-by: Nishanth Menon <nm@ti.com>

Acked-by: Kevin Hilman <khilman@linaro.org>

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] cpufreq: Stop BUGing the system
@ 2014-12-17 19:16   ` Kevin Hilman
  0 siblings, 0 replies; 10+ messages in thread
From: Kevin Hilman @ 2014-12-17 19:16 UTC (permalink / raw)
  To: Nishanth Menon; +Cc: Viresh Kumar, Rafael J. Wysocki, linux-kernel, linux-pm

Nishanth Menon <nm@ti.com> writes:

> CPUFRreq subsystem is not a system catastrophic failure point.
> Failures in these cases DONOT need complete system shutdown with BUG.
> just refuse to let cpufreq function should be good enough.
>
> Signed-off-by: Nishanth Menon <nm@ti.com>

Acked-by: Kevin Hilman <khilman@linaro.org>

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] cpufreq: Stop BUGing the system
  2014-12-17 15:51 ` Nishanth Menon
  (?)
  (?)
@ 2014-12-18  2:08 ` Viresh Kumar
  2014-12-18 14:49   ` Nishanth Menon
  -1 siblings, 1 reply; 10+ messages in thread
From: Viresh Kumar @ 2014-12-18  2:08 UTC (permalink / raw)
  To: Nishanth Menon, Kevin Hilman
  Cc: Rafael J. Wysocki, Linux Kernel Mailing List, linux-pm

On 17 December 2014 at 21:21, Nishanth Menon <nm@ti.com> wrote:
> CPUFRreq subsystem is not a system catastrophic failure point.
> Failures in these cases DONOT need complete system shutdown with BUG.
> just refuse to let cpufreq function should be good enough.
>
> Signed-off-by: Nishanth Menon <nm@ti.com>
> ---
>  drivers/cpufreq/cpufreq.c |   17 +++++++++++++----
>  1 file changed, 13 insertions(+), 4 deletions(-)
>
> diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
> index a09a29c..a5aa2fa 100644
> --- a/drivers/cpufreq/cpufreq.c
> +++ b/drivers/cpufreq/cpufreq.c
> @@ -281,7 +281,10 @@ static inline void adjust_jiffies(unsigned long val, struct cpufreq_freqs *ci)
>  static void __cpufreq_notify_transition(struct cpufreq_policy *policy,
>                 struct cpufreq_freqs *freqs, unsigned int state)
>  {
> -       BUG_ON(irqs_disabled());
> +       if (irqs_disabled()) {
> +               WARN(1, "IRQs disabled!\n");
> +               return;
> +       }

What about:

> +               if (WARN(irqs_disabled(), "IRQs disabled!\n")
> +                       return;

Same for the last change as well..

>
>         if (cpufreq_disabled())
>                 return;
> @@ -1253,9 +1256,12 @@ static int __cpufreq_add_dev(struct device *dev, struct subsys_interface *sif)
>                         /*
>                          * Reaching here after boot in a few seconds may not
>                          * mean that system will remain stable at "unknown"
> -                        * frequency for longer duration. Hence, a BUG_ON().
> +                        * frequency for longer duration. Hence, a WARN().
>                          */
> -                       BUG_ON(ret);
> +                       if (ret) {
> +                               WARN(1, "SYSTEM operating at invalid freq %u", policy->cur);
> +                               goto err_out_unregister;
> +                       }

And I still don't agree for this one. We shouldn't keep on working on a
potentially unstable frequency.

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] cpufreq: Stop BUGing the system
  2014-12-18  2:08 ` Viresh Kumar
@ 2014-12-18 14:49   ` Nishanth Menon
  2014-12-19  1:41     ` Viresh Kumar
  0 siblings, 1 reply; 10+ messages in thread
From: Nishanth Menon @ 2014-12-18 14:49 UTC (permalink / raw)
  To: Viresh Kumar
  Cc: Kevin Hilman, Rafael J. Wysocki, Linux Kernel Mailing List, linux-pm

On 07:38-20141218, Viresh Kumar wrote:
> On 17 December 2014 at 21:21, Nishanth Menon <nm@ti.com> wrote:
> > CPUFRreq subsystem is not a system catastrophic failure point.
> > Failures in these cases DONOT need complete system shutdown with BUG.
> > just refuse to let cpufreq function should be good enough.
> >
> > Signed-off-by: Nishanth Menon <nm@ti.com>
> > ---
> >  drivers/cpufreq/cpufreq.c |   17 +++++++++++++----
> >  1 file changed, 13 insertions(+), 4 deletions(-)
> >
> > diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
> > index a09a29c..a5aa2fa 100644
> > --- a/drivers/cpufreq/cpufreq.c
> > +++ b/drivers/cpufreq/cpufreq.c
> > @@ -281,7 +281,10 @@ static inline void adjust_jiffies(unsigned long val, struct cpufreq_freqs *ci)
> >  static void __cpufreq_notify_transition(struct cpufreq_policy *policy,
> >                 struct cpufreq_freqs *freqs, unsigned int state)
> >  {
> > -       BUG_ON(irqs_disabled());
> > +       if (irqs_disabled()) {
> > +               WARN(1, "IRQs disabled!\n");
> > +               return;
> > +       }
> 
> What about:
> 
> > +               if (WARN(irqs_disabled(), "IRQs disabled!\n")
> > +                       return;
> 
> Same for the last change as well..

k.
> 
> >
> >         if (cpufreq_disabled())
> >                 return;
> > @@ -1253,9 +1256,12 @@ static int __cpufreq_add_dev(struct device *dev, struct subsys_interface *sif)
> >                         /*
> >                          * Reaching here after boot in a few seconds may not
> >                          * mean that system will remain stable at "unknown"
> > -                        * frequency for longer duration. Hence, a BUG_ON().
> > +                        * frequency for longer duration. Hence, a WARN().
> >                          */
> > -                       BUG_ON(ret);
> > +                       if (ret) {
> > +                               WARN(1, "SYSTEM operating at invalid freq %u", policy->cur);
> > +                               goto err_out_unregister;
> > +                       }
> 
> And I still don't agree for this one. We shouldn't keep on working on a
> potentially unstable frequency.

I can add "could be unstable" -> the point being there can be psuedo
errors reported in the system - example - clock framework bugs. Dont
just stop the boot. example: what if cpufreq was a driver module - it
would not have rescued the system because cpufreq had'nt detected the
logic - if we are going to force this on the system, we should probably
not do this in cpufreq code, instead should be somewhere generic.

While I do empathise (and had infact advocated in the past) of not
favouring system attempting to continue at an invalid configuration and
our attempt to rescue has failed - given that we cannot provide a
consistent behavior (it is not a core system behavior) and potential of
a false-postive (example clk framework or underlying bug), it should be
good enough to "enhance" WARN to be "severe sounding enough" to
flag it for developer and continue while keeping the system alive as
much as possible.


-- 
Regards,
Nishanth Menon

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] cpufreq: Stop BUGing the system
  2014-12-18 14:49   ` Nishanth Menon
@ 2014-12-19  1:41     ` Viresh Kumar
  2014-12-19  2:08       ` Rafael J. Wysocki
  0 siblings, 1 reply; 10+ messages in thread
From: Viresh Kumar @ 2014-12-19  1:41 UTC (permalink / raw)
  To: Nishanth Menon
  Cc: Kevin Hilman, Rafael J. Wysocki, Linux Kernel Mailing List, linux-pm

On 18 December 2014 at 20:19, Nishanth Menon <nm@ti.com> wrote:
> I can add "could be unstable" -> the point being there can be psuedo
> errors reported in the system - example - clock framework bugs. Dont
> just stop the boot. example: what if cpufreq was a driver module - it
> would not have rescued the system because cpufreq had'nt detected the
> logic - if we are going to force this on the system, we should probably
> not do this in cpufreq code, instead should be somewhere generic.
>
> While I do empathise (and had infact advocated in the past) of not
> favouring system attempting to continue at an invalid configuration and
> our attempt to rescue has failed - given that we cannot provide a
> consistent behavior (it is not a core system behavior) and potential of
> a false-postive (example clk framework or underlying bug), it should be
> good enough to "enhance" WARN to be "severe sounding enough" to
> flag it for developer and continue while keeping the system alive as
> much as possible.

There is no way out for the kernel to know if its a false positive or a real
bug. And in the worst case, it can screw up a platform completely.

I am still not sure if changing it to a WARN would be good idea.

@Rafael: Thoughts ?

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] cpufreq: Stop BUGing the system
  2014-12-19  1:41     ` Viresh Kumar
@ 2014-12-19  2:08       ` Rafael J. Wysocki
  2014-12-24 15:06         ` Nishanth Menon
  0 siblings, 1 reply; 10+ messages in thread
From: Rafael J. Wysocki @ 2014-12-19  2:08 UTC (permalink / raw)
  To: Viresh Kumar
  Cc: Nishanth Menon, Kevin Hilman, Linux Kernel Mailing List, linux-pm

On Friday, December 19, 2014 07:11:19 AM Viresh Kumar wrote:
> On 18 December 2014 at 20:19, Nishanth Menon <nm@ti.com> wrote:
> > I can add "could be unstable" -> the point being there can be psuedo
> > errors reported in the system - example - clock framework bugs. Dont
> > just stop the boot. example: what if cpufreq was a driver module - it
> > would not have rescued the system because cpufreq had'nt detected the
> > logic - if we are going to force this on the system, we should probably
> > not do this in cpufreq code, instead should be somewhere generic.
> >
> > While I do empathise (and had infact advocated in the past) of not
> > favouring system attempting to continue at an invalid configuration and
> > our attempt to rescue has failed - given that we cannot provide a
> > consistent behavior (it is not a core system behavior) and potential of
> > a false-postive (example clk framework or underlying bug), it should be
> > good enough to "enhance" WARN to be "severe sounding enough" to
> > flag it for developer and continue while keeping the system alive as
> > much as possible.
> 
> There is no way out for the kernel to know if its a false positive or a real
> bug. And in the worst case, it can screw up a platform completely.
> 
> I am still not sure if changing it to a WARN would be good idea.
> 
> @Rafael: Thoughts ?

I'm a bit divided here.  On the one hand I don't like BUG_ON() as a rule and it
is used in too many places where it doesn't have to be used.

On the other hand, in this particular case, I'm not sure if allowing the system
to run without cpufreq when it might rely on it for CPU cooling, for one example,
is a good idea.


-- 
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] cpufreq: Stop BUGing the system
  2014-12-19  2:08       ` Rafael J. Wysocki
@ 2014-12-24 15:06         ` Nishanth Menon
  2014-12-27 20:37           ` Rafael J. Wysocki
  0 siblings, 1 reply; 10+ messages in thread
From: Nishanth Menon @ 2014-12-24 15:06 UTC (permalink / raw)
  To: Rafael J. Wysocki, Viresh Kumar
  Cc: Kevin Hilman, Linux Kernel Mailing List, linux-pm

On 12/18/2014 08:08 PM, Rafael J. Wysocki wrote:
> On Friday, December 19, 2014 07:11:19 AM Viresh Kumar wrote:
>> On 18 December 2014 at 20:19, Nishanth Menon <nm@ti.com> wrote:
>>> I can add "could be unstable" -> the point being there can be psuedo
>>> errors reported in the system - example - clock framework bugs. Dont
>>> just stop the boot. example: what if cpufreq was a driver module - it
>>> would not have rescued the system because cpufreq had'nt detected the
>>> logic - if we are going to force this on the system, we should probably
>>> not do this in cpufreq code, instead should be somewhere generic.
>>>
>>> While I do empathise (and had infact advocated in the past) of not
>>> favouring system attempting to continue at an invalid configuration and
>>> our attempt to rescue has failed - given that we cannot provide a
>>> consistent behavior (it is not a core system behavior) and potential of
>>> a false-postive (example clk framework or underlying bug), it should be
>>> good enough to "enhance" WARN to be "severe sounding enough" to
>>> flag it for developer and continue while keeping the system alive as
>>> much as possible.
>>
>> There is no way out for the kernel to know if its a false positive or a real
>> bug. And in the worst case, it can screw up a platform completely.
>>
>> I am still not sure if changing it to a WARN would be good idea.
>>
>> @Rafael: Thoughts ?
> 
> I'm a bit divided here.  On the one hand I don't like BUG_ON() as a rule and it
> is used in too many places where it doesn't have to be used.
> 
> On the other hand, in this particular case, I'm not sure if allowing the system
> to run without cpufreq when it might rely on it for CPU cooling, for one example,
> is a good idea.

but then, CPUFReq is not a mandatory feature - we could as well do the
same with CPU_FREQ disabled.

-- 
Regards,
Nishanth Menon

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] cpufreq: Stop BUGing the system
  2014-12-24 15:06         ` Nishanth Menon
@ 2014-12-27 20:37           ` Rafael J. Wysocki
  0 siblings, 0 replies; 10+ messages in thread
From: Rafael J. Wysocki @ 2014-12-27 20:37 UTC (permalink / raw)
  To: Nishanth Menon
  Cc: Viresh Kumar, Kevin Hilman, Linux Kernel Mailing List, linux-pm

On Wednesday, December 24, 2014 09:06:05 AM Nishanth Menon wrote:
> On 12/18/2014 08:08 PM, Rafael J. Wysocki wrote:
> > On Friday, December 19, 2014 07:11:19 AM Viresh Kumar wrote:
> >> On 18 December 2014 at 20:19, Nishanth Menon <nm@ti.com> wrote:
> >>> I can add "could be unstable" -> the point being there can be psuedo
> >>> errors reported in the system - example - clock framework bugs. Dont
> >>> just stop the boot. example: what if cpufreq was a driver module - it
> >>> would not have rescued the system because cpufreq had'nt detected the
> >>> logic - if we are going to force this on the system, we should probably
> >>> not do this in cpufreq code, instead should be somewhere generic.
> >>>
> >>> While I do empathise (and had infact advocated in the past) of not
> >>> favouring system attempting to continue at an invalid configuration and
> >>> our attempt to rescue has failed - given that we cannot provide a
> >>> consistent behavior (it is not a core system behavior) and potential of
> >>> a false-postive (example clk framework or underlying bug), it should be
> >>> good enough to "enhance" WARN to be "severe sounding enough" to
> >>> flag it for developer and continue while keeping the system alive as
> >>> much as possible.
> >>
> >> There is no way out for the kernel to know if its a false positive or a real
> >> bug. And in the worst case, it can screw up a platform completely.
> >>
> >> I am still not sure if changing it to a WARN would be good idea.
> >>
> >> @Rafael: Thoughts ?
> > 
> > I'm a bit divided here.  On the one hand I don't like BUG_ON() as a rule and it
> > is used in too many places where it doesn't have to be used.
> > 
> > On the other hand, in this particular case, I'm not sure if allowing the system
> > to run without cpufreq when it might rely on it for CPU cooling, for one example,
> > is a good idea.
> 
> but then, CPUFReq is not a mandatory feature - we could as well do the
> same with CPU_FREQ disabled.

Some platforms pretty much require CPU_FREQ and will always have it set, but
with the $subject patch they may end up not using it.

So this isn't a valid argument to me, sorry.


-- 
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.

^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2014-12-27 20:15 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2014-12-17 15:51 [PATCH] cpufreq: Stop BUGing the system Nishanth Menon
2014-12-17 15:51 ` Nishanth Menon
2014-12-17 19:16 ` Kevin Hilman
2014-12-17 19:16   ` Kevin Hilman
2014-12-18  2:08 ` Viresh Kumar
2014-12-18 14:49   ` Nishanth Menon
2014-12-19  1:41     ` Viresh Kumar
2014-12-19  2:08       ` Rafael J. Wysocki
2014-12-24 15:06         ` Nishanth Menon
2014-12-27 20:37           ` Rafael J. Wysocki

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.