linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] clockevents/drivers/cs5535: improve resilience to spurious interrupts
@ 2017-10-19 20:57 David Kozub
  2017-10-19 22:04 ` Daniel Lezcano
  2017-10-19 22:21 ` Thomas Gleixner
  0 siblings, 2 replies; 12+ messages in thread
From: David Kozub @ 2017-10-19 20:57 UTC (permalink / raw)
  To: Daniel Lezcano, Thomas Gleixner; +Cc: linux-kernel

This solves a BUG on ALIX 2c3 where mfgpt_tick is called before
clockevents_config_and_register returns. This caused mfgpt_tick to call a
null function pointer.

Thanks to Daniel Lezcano and Thomas Gleixner for helping me analyze this
and suggesting a solution.

Suggested-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: David Kozub <zub@linux.fjfi.cvut.cz>
---
 drivers/clocksource/cs5535-clockevt.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/clocksource/cs5535-clockevt.c b/drivers/clocksource/cs5535-clockevt.c
index a1df588343f2..56100506b933 100644
--- a/drivers/clocksource/cs5535-clockevt.c
+++ b/drivers/clocksource/cs5535-clockevt.c
@@ -117,7 +117,8 @@ static irqreturn_t mfgpt_tick(int irq, void *dev_id)
 	/* Turn off the clock (and clear the event) */
 	disable_timer(cs5535_event_clock);
 
-	if (clockevent_state_shutdown(&cs5535_clockevent))
+	if (clockevent_state_detached(&cs5535_clockevent) ||
+			clockevent_state_shutdown(&cs5535_clockevent))
 		return IRQ_HANDLED;
 
 	/* Clear the counter */
-- 
2.14.2

^ permalink raw reply related	[flat|nested] 12+ messages in thread

* Re: [PATCH] clockevents/drivers/cs5535: improve resilience to spurious interrupts
  2017-10-19 20:57 [PATCH] clockevents/drivers/cs5535: improve resilience to spurious interrupts David Kozub
@ 2017-10-19 22:04 ` Daniel Lezcano
  2017-10-19 22:25   ` Thomas Gleixner
  2017-10-19 22:21 ` Thomas Gleixner
  1 sibling, 1 reply; 12+ messages in thread
From: Daniel Lezcano @ 2017-10-19 22:04 UTC (permalink / raw)
  To: David Kozub, Thomas Gleixner; +Cc: linux-kernel

On 19/10/2017 22:57, David Kozub wrote:
> This solves a BUG on ALIX 2c3 where mfgpt_tick is called before
> clockevents_config_and_register returns. This caused mfgpt_tick to call a
> null function pointer.
> 
> Thanks to Daniel Lezcano and Thomas Gleixner for helping me analyze this
> and suggesting a solution.
> 
> Suggested-by: Thomas Gleixner <tglx@linutronix.de>
> Signed-off-by: David Kozub <zub@linux.fjfi.cvut.cz>
> ---

Thank for sending this fix.

Can you check if the commit 8f9327cbb is the one introducing the
regression ? So we can add the proper tags and propagate the fix to stable.

Thanks.

 -- Daniel


-- 
 <http://www.linaro.org/> Linaro.org │ Open source software for ARM SoCs

Follow Linaro:  <http://www.facebook.com/pages/Linaro> Facebook |
<http://twitter.com/#!/linaroorg> Twitter |
<http://www.linaro.org/linaro-blog/> Blog

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] clockevents/drivers/cs5535: improve resilience to spurious interrupts
  2017-10-19 20:57 [PATCH] clockevents/drivers/cs5535: improve resilience to spurious interrupts David Kozub
  2017-10-19 22:04 ` Daniel Lezcano
@ 2017-10-19 22:21 ` Thomas Gleixner
  2017-10-20  7:38   ` David Kozub
  1 sibling, 1 reply; 12+ messages in thread
From: Thomas Gleixner @ 2017-10-19 22:21 UTC (permalink / raw)
  To: David Kozub; +Cc: Daniel Lezcano, linux-kernel

On Thu, 19 Oct 2017, David Kozub wrote:

> This solves a BUG on ALIX 2c3 where mfgpt_tick is called before
> clockevents_config_and_register returns. This caused mfgpt_tick to call a
> null function pointer.
> 
> Thanks to Daniel Lezcano and Thomas Gleixner for helping me analyze this
> and suggesting a solution.

Thanks for sending this! Though I have two minor issues with this:

1) The changelog.

We try to structure changelogs by explaining the context, the problem and
its consequences and the fix. If the fix is non obvious it wants an
elaborate explanation. Let me give you an example:

   The interrupt handler mfgpt_tick() is not robust versus spurious
   interrupts which happen before the clock event device is registered and
   fully initialized.

   The reason is that the safe guard against spurious interrupts solely
   checks for the clockevents shutdown state, but lacks a check for
   detached state. If the interrupt hits while the device is in detached
   state it passes the safe guard and dereferences the event handler call
   back which is NULL.

   Add the missing state check.

See?


> Suggested-by: Thomas Gleixner <tglx@linutronix.de>
> Signed-off-by: David Kozub <zub@linux.fjfi.cvut.cz>
> ---
>  drivers/clocksource/cs5535-clockevt.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/clocksource/cs5535-clockevt.c b/drivers/clocksource/cs5535-clockevt.c
> index a1df588343f2..56100506b933 100644
> --- a/drivers/clocksource/cs5535-clockevt.c
> +++ b/drivers/clocksource/cs5535-clockevt.c
> @@ -117,7 +117,8 @@ static irqreturn_t mfgpt_tick(int irq, void *dev_id)
>  	/* Turn off the clock (and clear the event) */
>  	disable_timer(cs5535_event_clock);
>  
> -	if (clockevent_state_shutdown(&cs5535_clockevent))
> +	if (clockevent_state_detached(&cs5535_clockevent) ||
> +			clockevent_state_shutdown(&cs5535_clockevent))

Please do not use random indentation for multiline conditionals

	if (clockevent_state_detached(&cs5535_clockevent) ||
	    clockevent_state_shutdown(&cs5535_clockevent))

is the style we use through out the code.

Thanks,

	tglx

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] clockevents/drivers/cs5535: improve resilience to spurious interrupts
  2017-10-19 22:04 ` Daniel Lezcano
@ 2017-10-19 22:25   ` Thomas Gleixner
  2017-10-20  6:46     ` Daniel Lezcano
  0 siblings, 1 reply; 12+ messages in thread
From: Thomas Gleixner @ 2017-10-19 22:25 UTC (permalink / raw)
  To: Daniel Lezcano; +Cc: David Kozub, linux-kernel

On Fri, 20 Oct 2017, Daniel Lezcano wrote:

> On 19/10/2017 22:57, David Kozub wrote:
> > This solves a BUG on ALIX 2c3 where mfgpt_tick is called before
> > clockevents_config_and_register returns. This caused mfgpt_tick to call a
> > null function pointer.
> > 
> > Thanks to Daniel Lezcano and Thomas Gleixner for helping me analyze this
> > and suggesting a solution.
> > 
> > Suggested-by: Thomas Gleixner <tglx@linutronix.de>
> > Signed-off-by: David Kozub <zub@linux.fjfi.cvut.cz>
> > ---
> 
> Thank for sending this fix.
> 
> Can you check if the commit 8f9327cbb is the one introducing the
> regression ? So we can add the proper tags and propagate the fix to stable.

No it's not.

-       if (cs5535_tick_mode == CLOCK_EVT_MODE_SHUTDOWN)
+       if (clockevent_state_shutdown(&cs5535_clockevent))

This particular problem of the missing detached state check has been there
forever and went unnoticed for whatever reason.

Thanks,

	tglx

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] clockevents/drivers/cs5535: improve resilience to spurious interrupts
  2017-10-19 22:25   ` Thomas Gleixner
@ 2017-10-20  6:46     ` Daniel Lezcano
  2017-10-20  7:49       ` David Kozub
  0 siblings, 1 reply; 12+ messages in thread
From: Daniel Lezcano @ 2017-10-20  6:46 UTC (permalink / raw)
  To: Thomas Gleixner; +Cc: David Kozub, linux-kernel

On 20/10/2017 00:25, Thomas Gleixner wrote:
> On Fri, 20 Oct 2017, Daniel Lezcano wrote:
> 
>> On 19/10/2017 22:57, David Kozub wrote:
>>> This solves a BUG on ALIX 2c3 where mfgpt_tick is called before
>>> clockevents_config_and_register returns. This caused mfgpt_tick to call a
>>> null function pointer.
>>>
>>> Thanks to Daniel Lezcano and Thomas Gleixner for helping me analyze this
>>> and suggesting a solution.
>>>
>>> Suggested-by: Thomas Gleixner <tglx@linutronix.de>
>>> Signed-off-by: David Kozub <zub@linux.fjfi.cvut.cz>
>>> ---
>>
>> Thank for sending this fix.
>>
>> Can you check if the commit 8f9327cbb is the one introducing the
>> regression ? So we can add the proper tags and propagate the fix to stable.
> 
> No it's not.
> 
> -       if (cs5535_tick_mode == CLOCK_EVT_MODE_SHUTDOWN)
> +       if (clockevent_state_shutdown(&cs5535_clockevent))
> 
> This particular problem of the missing detached state check has been there
> forever and went unnoticed for whatever reason.

The detached condition was artificially caught by the initialized variable:

-static unsigned int cs5535_tick_mode = CLOCK_EVT_MODE_SHUTDOWN;

The patch 8f9327cbb removes the variable, so very likely this is where
the problem appeared.


-- 
 <http://www.linaro.org/> Linaro.org │ Open source software for ARM SoCs

Follow Linaro:  <http://www.facebook.com/pages/Linaro> Facebook |
<http://twitter.com/#!/linaroorg> Twitter |
<http://www.linaro.org/linaro-blog/> Blog

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] clockevents/drivers/cs5535: improve resilience to spurious interrupts
  2017-10-19 22:21 ` Thomas Gleixner
@ 2017-10-20  7:38   ` David Kozub
  2017-10-20  8:18     ` Thomas Gleixner
  0 siblings, 1 reply; 12+ messages in thread
From: David Kozub @ 2017-10-20  7:38 UTC (permalink / raw)
  To: Thomas Gleixner; +Cc: Daniel Lezcano, linux-kernel

On Fri, 20 Oct 2017, Thomas Gleixner wrote:

> On Thu, 19 Oct 2017, David Kozub wrote:
>
>> This solves a BUG on ALIX 2c3 where mfgpt_tick is called before
>> clockevents_config_and_register returns. This caused mfgpt_tick to call a
>> null function pointer.
>>
>> Thanks to Daniel Lezcano and Thomas Gleixner for helping me analyze this
>> and suggesting a solution.
>
> Thanks for sending this! Though I have two minor issues with this:
>
> 1) The changelog.
>
> We try to structure changelogs by explaining the context, the problem and
> its consequences and the fix. If the fix is non obvious it wants an
> elaborate explanation. Let me give you an example:
>
>   The interrupt handler mfgpt_tick() is not robust versus spurious
>   interrupts which happen before the clock event device is registered and
>   fully initialized.
>
>   The reason is that the safe guard against spurious interrupts solely
>   checks for the clockevents shutdown state, but lacks a check for
>   detached state. If the interrupt hits while the device is in detached
>   state it passes the safe guard and dereferences the event handler call
>   back which is NULL.
>
>   Add the missing state check.
>
> See?

Thanks for the feedback. Yes, that is better.

Now that I'm neither the author of the code nor the author of the commit 
message it doesn't seem right that I should be submitting this.

>> -	if (clockevent_state_shutdown(&cs5535_clockevent))
>> +	if (clockevent_state_detached(&cs5535_clockevent) ||
>> +			clockevent_state_shutdown(&cs5535_clockevent))
>
> Please do not use random indentation for multiline conditionals
>
> 	if (clockevent_state_detached(&cs5535_clockevent) ||
> 	    clockevent_state_shutdown(&cs5535_clockevent))
>
> is the style we use through out the code.

OK. I noticed the four spaces. I was swayed by process/coding-style.rst:

Outside of comments, documentation and except in Kconfig, spaces are never 
used for indentation,

But now I think I just misunderstood that. The spaces on the second line 
of the "if" are not regarded as ident.


Best regards,
David

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] clockevents/drivers/cs5535: improve resilience to spurious interrupts
  2017-10-20  6:46     ` Daniel Lezcano
@ 2017-10-20  7:49       ` David Kozub
  2017-10-20  9:35         ` Daniel Lezcano
  0 siblings, 1 reply; 12+ messages in thread
From: David Kozub @ 2017-10-20  7:49 UTC (permalink / raw)
  To: Daniel Lezcano; +Cc: Thomas Gleixner, linux-kernel

On Fri, 20 Oct 2017, Daniel Lezcano wrote:

> On 20/10/2017 00:25, Thomas Gleixner wrote:
>> On Fri, 20 Oct 2017, Daniel Lezcano wrote:
>>
>>> On 19/10/2017 22:57, David Kozub wrote:
>>>> This solves a BUG on ALIX 2c3 where mfgpt_tick is called before
>>>> clockevents_config_and_register returns. This caused mfgpt_tick to call a
>>>> null function pointer.
>>>>
>>>> Thanks to Daniel Lezcano and Thomas Gleixner for helping me analyze this
>>>> and suggesting a solution.
>>>>
>>>> Suggested-by: Thomas Gleixner <tglx@linutronix.de>
>>>> Signed-off-by: David Kozub <zub@linux.fjfi.cvut.cz>
>>>> ---
>>>
>>> Thank for sending this fix.
>>>
>>> Can you check if the commit 8f9327cbb is the one introducing the
>>> regression ? So we can add the proper tags and propagate the fix to stable.
>>
>> No it's not.
>>
>> -       if (cs5535_tick_mode == CLOCK_EVT_MODE_SHUTDOWN)
>> +       if (clockevent_state_shutdown(&cs5535_clockevent))
>>
>> This particular problem of the missing detached state check has been there
>> forever and went unnoticed for whatever reason.
>
> The detached condition was artificially caught by the initialized variable:
>
> -static unsigned int cs5535_tick_mode = CLOCK_EVT_MODE_SHUTDOWN;
>
> The patch 8f9327cbb removes the variable, so very likely this is where
> the problem appeared.

I will try to test that. But I won't have access to the device till Sunday 
evening. I've had big trouble trying to run kernels > 4.1-rc5 on the 
device and if I'm looking correctly the commit was introduced in 4.3-rc1. 
But I'll try to figure something out.

Best regards,
David

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] clockevents/drivers/cs5535: improve resilience to spurious interrupts
  2017-10-20  7:38   ` David Kozub
@ 2017-10-20  8:18     ` Thomas Gleixner
  0 siblings, 0 replies; 12+ messages in thread
From: Thomas Gleixner @ 2017-10-20  8:18 UTC (permalink / raw)
  To: David Kozub; +Cc: Daniel Lezcano, linux-kernel

On Fri, 20 Oct 2017, David Kozub wrote:
> 
> Now that I'm neither the author of the code nor the author of the commit
> message it doesn't seem right that I should be submitting this.

Don't worry. Just resubmit it for exercise so Daniel or I can pick it up
for merging. You did the hard work of analyzing the problem in the first
place.

Thanks,

	tglx

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] clockevents/drivers/cs5535: improve resilience to spurious interrupts
  2017-10-20  7:49       ` David Kozub
@ 2017-10-20  9:35         ` Daniel Lezcano
  2017-10-20  9:40           ` Thomas Gleixner
  2017-10-20 14:59           ` David Kozub
  0 siblings, 2 replies; 12+ messages in thread
From: Daniel Lezcano @ 2017-10-20  9:35 UTC (permalink / raw)
  To: David Kozub; +Cc: Thomas Gleixner, linux-kernel

On 20/10/2017 09:49, David Kozub wrote:
> On Fri, 20 Oct 2017, Daniel Lezcano wrote:
> 
>> On 20/10/2017 00:25, Thomas Gleixner wrote:
>>> On Fri, 20 Oct 2017, Daniel Lezcano wrote:
>>>
>>>> On 19/10/2017 22:57, David Kozub wrote:
>>>>> This solves a BUG on ALIX 2c3 where mfgpt_tick is called before
>>>>> clockevents_config_and_register returns. This caused mfgpt_tick to
>>>>> call a
>>>>> null function pointer.
>>>>>
>>>>> Thanks to Daniel Lezcano and Thomas Gleixner for helping me analyze
>>>>> this
>>>>> and suggesting a solution.
>>>>>
>>>>> Suggested-by: Thomas Gleixner <tglx@linutronix.de>
>>>>> Signed-off-by: David Kozub <zub@linux.fjfi.cvut.cz>
>>>>> ---
>>>>
>>>> Thank for sending this fix.
>>>>
>>>> Can you check if the commit 8f9327cbb is the one introducing the
>>>> regression ? So we can add the proper tags and propagate the fix to
>>>> stable.
>>>
>>> No it's not.
>>>
>>> -       if (cs5535_tick_mode == CLOCK_EVT_MODE_SHUTDOWN)
>>> +       if (clockevent_state_shutdown(&cs5535_clockevent))
>>>
>>> This particular problem of the missing detached state check has been
>>> there
>>> forever and went unnoticed for whatever reason.
>>
>> The detached condition was artificially caught by the initialized
>> variable:
>>
>> -static unsigned int cs5535_tick_mode = CLOCK_EVT_MODE_SHUTDOWN;
>>
>> The patch 8f9327cbb removes the variable, so very likely this is where
>> the problem appeared.
> 
> I will try to test that. But I won't have access to the device till
> Sunday evening. I've had big trouble trying to run kernels > 4.1-rc5 on
> the device and if I'm looking correctly the commit was introduced in
> 4.3-rc1. But I'll try to figure something out.

David,

thanks again for taking the time to report and investigate this issue.
Usually people just give up and drop the legacy hardware without letting
us know the kernel is broken with it. So don't spend too much time with
it, just check if the commit before works, if not, just add in the log
the kernel version you noticed the breakage.

In case you are interested in doing some debugging to narrow down the
offending commit and you know the versions working and not working, you
can try the command git-bisect. It will use a dichotomy approach to find
out the culprit.

For example:
Let's say you have v4.1-rc5 working and v4.1-rc6 not working and in
between there are 1024 changes. The dichotomy approach will find out the
patch introducing the regression in ln(1024)/ln(2) = 10 iterations.

It's usage is:

git bisect start v4.1-rc6 v4.1-rc5 (bad goes always first)
make && test (the test is ok)
git bisect good
make && test (the test is ok)
git bisect good
make && test (the test fails)
git bisect bad
...

...

And it ends up to the bad commit (assuming there are no compilation
broken patches in the process).

Once it is finished, use git bisect reset

  -- Daniel

-- 
 <http://www.linaro.org/> Linaro.org │ Open source software for ARM SoCs

Follow Linaro:  <http://www.facebook.com/pages/Linaro> Facebook |
<http://twitter.com/#!/linaroorg> Twitter |
<http://www.linaro.org/linaro-blog/> Blog

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] clockevents/drivers/cs5535: improve resilience to spurious interrupts
  2017-10-20  9:35         ` Daniel Lezcano
@ 2017-10-20  9:40           ` Thomas Gleixner
  2017-10-20  9:46             ` Daniel Lezcano
  2017-10-20 14:59           ` David Kozub
  1 sibling, 1 reply; 12+ messages in thread
From: Thomas Gleixner @ 2017-10-20  9:40 UTC (permalink / raw)
  To: Daniel Lezcano; +Cc: David Kozub, linux-kernel

On Fri, 20 Oct 2017, Daniel Lezcano wrote:
> On 20/10/2017 09:49, David Kozub wrote:
> 
> And it ends up to the bad commit (assuming there are no compilation
> broken patches in the process).

I don't think it's worth the trouble for this particular issue. We already
know that the commit which made the spurious detection weaker. So we just
add that as Fixes tag and be done with it.

Thanks,

	tglx


    

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] clockevents/drivers/cs5535: improve resilience to spurious interrupts
  2017-10-20  9:40           ` Thomas Gleixner
@ 2017-10-20  9:46             ` Daniel Lezcano
  0 siblings, 0 replies; 12+ messages in thread
From: Daniel Lezcano @ 2017-10-20  9:46 UTC (permalink / raw)
  To: Thomas Gleixner; +Cc: David Kozub, linux-kernel

On 20/10/2017 11:40, Thomas Gleixner wrote:
> On Fri, 20 Oct 2017, Daniel Lezcano wrote:
>> On 20/10/2017 09:49, David Kozub wrote:
>>
>> And it ends up to the bad commit (assuming there are no compilation
>> broken patches in the process).
> 
> I don't think it's worth the trouble for this particular issue. We already
> know that the commit which made the spurious detection weaker. So we just
> add that as Fixes tag and be done with it.

Yes, I'm fine with that.



-- 
 <http://www.linaro.org/> Linaro.org │ Open source software for ARM SoCs

Follow Linaro:  <http://www.facebook.com/pages/Linaro> Facebook |
<http://twitter.com/#!/linaroorg> Twitter |
<http://www.linaro.org/linaro-blog/> Blog

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] clockevents/drivers/cs5535: improve resilience to spurious interrupts
  2017-10-20  9:35         ` Daniel Lezcano
  2017-10-20  9:40           ` Thomas Gleixner
@ 2017-10-20 14:59           ` David Kozub
  1 sibling, 0 replies; 12+ messages in thread
From: David Kozub @ 2017-10-20 14:59 UTC (permalink / raw)
  To: Daniel Lezcano; +Cc: Thomas Gleixner, linux-kernel

[-- Attachment #1: Type: text/plain, Size: 3455 bytes --]

On Fri, 20 Oct 2017, Daniel Lezcano wrote:

> On 20/10/2017 09:49, David Kozub wrote:
>> On Fri, 20 Oct 2017, Daniel Lezcano wrote:
>>
>>> On 20/10/2017 00:25, Thomas Gleixner wrote:
>>>> On Fri, 20 Oct 2017, Daniel Lezcano wrote:
>>>>
>>>>> On 19/10/2017 22:57, David Kozub wrote:
>>>>>> This solves a BUG on ALIX 2c3 where mfgpt_tick is called before
>>>>>> clockevents_config_and_register returns. This caused mfgpt_tick to
>>>>>> call a
>>>>>> null function pointer.
>>>>>>
>>>>>> Thanks to Daniel Lezcano and Thomas Gleixner for helping me analyze
>>>>>> this
>>>>>> and suggesting a solution.
>>>>>>
>>>>>> Suggested-by: Thomas Gleixner <tglx@linutronix.de>
>>>>>> Signed-off-by: David Kozub <zub@linux.fjfi.cvut.cz>
>>>>>> ---
>>>>>
>>>>> Thank for sending this fix.
>>>>>
>>>>> Can you check if the commit 8f9327cbb is the one introducing 
the
>>>>> regression ? So we can add the proper tags and propagate the fix to
>>>>> stable.
>>>>
>>>> No it's not.
>>>>
>>>> -       if (cs5535_tick_mode == CLOCK_EVT_MODE_SHUTDOWN)
>>>> +       if (clockevent_state_shutdown(&cs5535_clockevent))
>>>>
>>>> This particular problem of the missing detached state check has been
>>>> there
>>>> forever and went unnoticed for whatever reason.
>>>
>>> The detached condition was artificially caught by the initialized
>>> variable:
>>>
>>> -static unsigned int cs5535_tick_mode = CLOCK_EVT_MODE_SHUTDOWN;
>>>
>>> The patch 8f9327cbb removes the variable, so very likely this is where
>>> the problem appeared.
>>
>> I will try to test that. But I won't have access to the device till
>> Sunday evening. I've had big trouble trying to run kernels > 4.1-rc5 on
>> the device and if I'm looking correctly the commit was introduced in
>> 4.3-rc1. But I'll try to figure something out.
>
> David,
>
> thanks again for taking the time to report and investigate this issue.
> Usually people just give up and drop the legacy hardware without letting
> us know the kernel is broken with it. So don't spend too much time with
> it, just check if the commit before works, if not, just add in the log
> the kernel version you noticed the breakage.
>
> In case you are interested in doing some debugging to narrow down the
> offending commit and you know the versions working and not working, you
> can try the command git-bisect. It will use a dichotomy approach to find
> out the culprit.

Hi Daniel,

I originally didn't see this issue simply because 4.1-rc7 would not output 
anything on the serial on the device. For kernels >= 4.1-rc7 the kernel 
either reboots immediatelly or freezes, in both cases without giving any 
output in the serial. I tried to find the cause of this with git bisect 
but in the end I was none the wiser. I also suspected my build 
environment.

Eventually I found out the following:
* 4.1-rc6 boots OK
* 4.1-rc7 restarts immediatelly, but if I revert f18c34e48 it works but I 
have no idea why would this commit break anything, the commit looks OK
* 4.1 - works if I revert f18c34e48
* e75c73ad6 feeezes even after reverting f18c34e48
* 4.2-rc1 - reboots, even after reverting f18c34e48

Only recently I tried a current kernel and I was more lucky: it gave me 
some useful output on the serial.

To verify 8f9327cbb I was thinking I could take it and apply it onto 
4.1-rc6 (if that was possible), but as Thomas suggested it might not be 
worth it, then I forget about that.

Thanks for all the help to you and to Thomas.

Best regards,
David

^ permalink raw reply	[flat|nested] 12+ messages in thread

end of thread, other threads:[~2017-10-20 14:59 UTC | newest]

Thread overview: 12+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-10-19 20:57 [PATCH] clockevents/drivers/cs5535: improve resilience to spurious interrupts David Kozub
2017-10-19 22:04 ` Daniel Lezcano
2017-10-19 22:25   ` Thomas Gleixner
2017-10-20  6:46     ` Daniel Lezcano
2017-10-20  7:49       ` David Kozub
2017-10-20  9:35         ` Daniel Lezcano
2017-10-20  9:40           ` Thomas Gleixner
2017-10-20  9:46             ` Daniel Lezcano
2017-10-20 14:59           ` David Kozub
2017-10-19 22:21 ` Thomas Gleixner
2017-10-20  7:38   ` David Kozub
2017-10-20  8:18     ` Thomas Gleixner

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).