linux-kernel.vger.kernel.org archive mirror
* [PATCH] arm64: arch_timer: XGene-1 has 31 bit, not 32 bit, arch timer.
@ 2022-10-21 15:34 Joe Korty
  2022-10-21 15:44 ` Greg KH
  2022-10-21 18:08 ` Marc Zyngier
  0 siblings, 2 replies; 6+ messages in thread
From: Joe Korty @ 2022-10-21 15:34 UTC (permalink / raw)
  To: Marc Zyngier; +Cc: linux-kernel, stable

arm64: XGene-1 has a 31 bit, not a 32 bit, arch timer.

Fixes: 012f188504528b8cb32f441ac3bd9ea2eba39c9e ("clocksource/drivers/arm_arch_timer:
  Work around broken CVAL implementations")

Testing:
  On an 8-cpu Mustang, the following sequence no longer locks up the system:

     echo 0 >/proc/sys/kernel/watchdog
     for i in {0..7}; do taskset -c $i echo hi there $i; done

Stable:
  To be applied to 5.16 and above, once accepted by mainline.

Signed-off-by: Joe Korty <joe.korty@concurrent-rt.com>

Index: b/drivers/clocksource/arm_arch_timer.c
===================================================================
--- a/drivers/clocksource/arm_arch_timer.c
+++ b/drivers/clocksource/arm_arch_timer.c
@@ -805,7 +805,7 @@ static u64 __arch_timer_check_delta(void
 	const struct midr_range broken_cval_midrs[] = {
 		/*
 		 * XGene-1 implements CVAL in terms of TVAL, meaning
-		 * that the maximum timer range is 32bit. Shame on them.
+		 * that the maximum timer range is 31bit. Shame on them.
 		 */
 		MIDR_ALL_VERSIONS(MIDR_CPU_MODEL(ARM_CPU_IMP_APM,
 						 APM_CPU_PART_POTENZA)),
@@ -813,8 +813,8 @@ static u64 __arch_timer_check_delta(void
 	};
 
 	if (is_midr_in_range_list(read_cpuid_id(), broken_cval_midrs)) {
-		pr_warn_once("Broken CNTx_CVAL_EL1, limiting width to 32bits");
-		return CLOCKSOURCE_MASK(32);
+		pr_warn_once("Broken CNTx_CVAL_EL1, limiting width to 31bits");
+		return CLOCKSOURCE_MASK(31);
 	}
 #endif
 	return CLOCKSOURCE_MASK(arch_counter_get_width());


* Re: [PATCH] arm64: arch_timer: XGene-1 has 31 bit, not 32 bit, arch timer.
  2022-10-21 15:34 [PATCH] arm64: arch_timer: XGene-1 has 31 bit, not 32 bit, arch timer Joe Korty
@ 2022-10-21 15:44 ` Greg KH
  2022-10-21 18:08 ` Marc Zyngier
  1 sibling, 0 replies; 6+ messages in thread
From: Greg KH @ 2022-10-21 15:44 UTC (permalink / raw)
  To: Joe Korty; +Cc: Marc Zyngier, linux-kernel, stable

On Fri, Oct 21, 2022 at 11:34:24AM -0400, Joe Korty wrote:
> arm64: XGene-1 has a 31 bit, not a 32 bit, arch timer.
> 
> Fixes: 012f188504528b8cb32f441ac3bd9ea2eba39c9e ("clocksource/drivers/arm_arch_timer:
>   Work around broken CVAL implementations")
> 
> Testing:
>   On an 8-cpu Mustang, the following sequence no longer locks up the system:
> 
>      echo 0 >/proc/sys/kernel/watchdog
>      for i in {0..7}; do taskset -c $i echo hi there $i; done
> 
> Stable:
>   To be applied to 5.16 and above, once accepted by mainline.
> 
> Signed-off-by: Joe Korty <joe.korty@concurrent-rt.com>
> 
> Index: b/drivers/clocksource/arm_arch_timer.c
> ===================================================================
> --- a/drivers/clocksource/arm_arch_timer.c
> +++ b/drivers/clocksource/arm_arch_timer.c
> @@ -805,7 +805,7 @@ static u64 __arch_timer_check_delta(void
>  	const struct midr_range broken_cval_midrs[] = {
>  		/*
>  		 * XGene-1 implements CVAL in terms of TVAL, meaning
> -		 * that the maximum timer range is 32bit. Shame on them.
> +		 * that the maximum timer range is 31bit. Shame on them.
>  		 */
>  		MIDR_ALL_VERSIONS(MIDR_CPU_MODEL(ARM_CPU_IMP_APM,
>  						 APM_CPU_PART_POTENZA)),
> @@ -813,8 +813,8 @@ static u64 __arch_timer_check_delta(void
>  	};
>  
>  	if (is_midr_in_range_list(read_cpuid_id(), broken_cval_midrs)) {
> -		pr_warn_once("Broken CNTx_CVAL_EL1, limiting width to 32bits");
> -		return CLOCKSOURCE_MASK(32);
> +		pr_warn_once("Broken CNTx_CVAL_EL1, limiting width to 31bits");
> +		return CLOCKSOURCE_MASK(31);
>  	}
>  #endif
>  	return CLOCKSOURCE_MASK(arch_counter_get_width());

<formletter>

This is not the correct way to submit patches for inclusion in the
stable kernel tree.  Please read:
    https://www.kernel.org/doc/html/latest/process/stable-kernel-rules.html
for how to do this properly.

</formletter>


* Re: [PATCH] arm64: arch_timer: XGene-1 has 31 bit, not 32 bit, arch timer.
  2022-10-21 15:34 [PATCH] arm64: arch_timer: XGene-1 has 31 bit, not 32 bit, arch timer Joe Korty
  2022-10-21 15:44 ` Greg KH
@ 2022-10-21 18:08 ` Marc Zyngier
  2022-10-21 19:47   ` Joe Korty
  1 sibling, 1 reply; 6+ messages in thread
From: Marc Zyngier @ 2022-10-21 18:08 UTC (permalink / raw)
  To: Joe Korty; +Cc: linux-kernel, stable

On Fri, 21 Oct 2022 16:34:24 +0100,
Joe Korty <joe.korty@concurrent-rt.com> wrote:
> 
> arm64: XGene-1 has a 31 bit, not a 32 bit, arch timer.
> 
> Fixes: 012f188504528b8cb32f441ac3bd9ea2eba39c9e ("clocksource/drivers/arm_arch_timer:
>   Work around broken CVAL implementations")

Sorry, but you'll have to provide a bit more of an analysis here. As
far as I can tell, you're just changing a parameter without properly
describing what breaks and how.

>
> Testing:
>   On an 8-cpu Mustang, the following sequence no longer locks up the system:
> 
>      echo 0 >/proc/sys/kernel/watchdog
>      for i in {0..7}; do taskset -c $i echo hi there $i; done
> 
> Stable:
>   To be applied to 5.16 and above, once accepted by mainline.
> 
> Signed-off-by: Joe Korty <joe.korty@concurrent-rt.com>
> 
> Index: b/drivers/clocksource/arm_arch_timer.c
> ===================================================================
> --- a/drivers/clocksource/arm_arch_timer.c
> +++ b/drivers/clocksource/arm_arch_timer.c
> @@ -805,7 +805,7 @@ static u64 __arch_timer_check_delta(void
>  	const struct midr_range broken_cval_midrs[] = {
>  		/*
>  		 * XGene-1 implements CVAL in terms of TVAL, meaning
> -		 * that the maximum timer range is 32bit. Shame on them.
> +		 * that the maximum timer range is 31bit. Shame on them.
>  		 */
>  		MIDR_ALL_VERSIONS(MIDR_CPU_MODEL(ARM_CPU_IMP_APM,
>  						 APM_CPU_PART_POTENZA)),
> @@ -813,8 +813,8 @@ static u64 __arch_timer_check_delta(void
>  	};
>  
>  	if (is_midr_in_range_list(read_cpuid_id(), broken_cval_midrs)) {
> -		pr_warn_once("Broken CNTx_CVAL_EL1, limiting width to 32bits");
> -		return CLOCKSOURCE_MASK(32);
> +		pr_warn_once("Broken CNTx_CVAL_EL1, limiting width to 31bits");
> +		return CLOCKSOURCE_MASK(31);
>  	}
>  #endif
>  	return CLOCKSOURCE_MASK(arch_counter_get_width());
> 

Also, this isn't much of a patch. Please see the documentation on how
to properly submit one.

Thanks,

	M.

-- 
Without deviation from the norm, progress is not possible.


* Re: [PATCH] arm64: arch_timer: XGene-1 has 31 bit, not 32 bit, arch timer.
  2022-10-21 18:08 ` Marc Zyngier
@ 2022-10-21 19:47   ` Joe Korty
  2022-10-22  5:40     ` Greg KH
  2022-10-22  9:58     ` Marc Zyngier
  0 siblings, 2 replies; 6+ messages in thread
From: Joe Korty @ 2022-10-21 19:47 UTC (permalink / raw)
  To: Marc Zyngier; +Cc: linux-kernel, stable

Hi Marc,

On Fri, Oct 21, 2022 at 07:08:50PM +0100, Marc Zyngier wrote:
> Sorry, but you'll have to provide a bit more of an analysis here. As
> far as I can tell, you're just changing a parameter without properly
> describing what breaks and how.

There isn't much to analyse.  For ages, 0x7fffffff (31 bits) was the
declared width of 'arch timer' for all arm architectures, and that worked.
Your patch series made the declared width vary according to which chipset
was in use, which is good, but that rewrite changed the above mask for
the XGene-1 from 0x7fffffff to 0xffffffff.  That change broke timers
for the XGene-1 since it seems that, in actuality, it has only a 31 bit
wide arch timer.  Thus declaring that arch timer has 32-bits is wrong.
This mismatch between the actual and declared sizes would cause arithmetic
errors in the calculation of timer deltas which more than accounts for
the hrtimer failures I am seeing when running 5.16+ on my Mustang XGene1.

Only one line need change, the rest are fluff:

-             return CLOCKSOURCE_MASK(32);
+             return CLOCKSOURCE_MASK(31);

> Also, this isn't much of a patch.

I don't know what this means.  The patch contains all that is needed for
the fix, no more.  I could add more comments as to _why_ it is 31 bits
not 32, but I don't know why.  I only know that the motherboard behaves
as if 31 bits is all that is available in the hardware.

> Please see the documentation on how to properly submit one.

AFAICS, the only submission mistake is that the 'Cc: stable@vger.kernel.org'
line is missing.

Regards,
Joe


* Re: [PATCH] arm64: arch_timer: XGene-1 has 31 bit, not 32 bit, arch timer.
  2022-10-21 19:47   ` Joe Korty
@ 2022-10-22  5:40     ` Greg KH
  2022-10-22  9:58     ` Marc Zyngier
  1 sibling, 0 replies; 6+ messages in thread
From: Greg KH @ 2022-10-22  5:40 UTC (permalink / raw)
  To: Joe Korty; +Cc: Marc Zyngier, linux-kernel, stable

On Fri, Oct 21, 2022 at 03:47:46PM -0400, Joe Korty wrote:
> Hi Marc,
> 
> On Fri, Oct 21, 2022 at 07:08:50PM +0100, Marc Zyngier wrote:
> > Sorry, but you'll have to provide a bit more of an analysis here. As
> > far as I can tell, you're just changing a parameter without properly
> > describing what breaks and how.
> 
> There isn't much to analyse.  For ages, 0x7fffffff (31 bits) was the
> declared width of 'arch timer' for all arm architectures, and that worked.
> Your patch series made the declared width vary according to which chipset
> was in use, which is good, but that rewrite changed the above mask for
> the XGene-1 from 0x7fffffff to 0xffffffff.  That change broke timers
> for the XGene-1 since it seems that, in actuality, it has only a 31 bit
> wide arch timer.  Thus declaring that arch timer has 32-bits is wrong.
> This mismatch between the actual and declared sizes would cause arithmetic
> errors in the calculation of timer deltas which more than accounts for
> the hrtimer failures I am seeing when running 5.16+ on my Mustang XGene1.
> 
> Only one line need change, the rest are fluff:
> 
> -             return CLOCKSOURCE_MASK(32);
> +             return CLOCKSOURCE_MASK(31);
> 
> > Also, this isn't much of a patch.
> 
> I don't know what this means.  The patch contains all that is needed for
> the fix, no more.  I could add more comments as to _why_ it is 31 bits
> not 32, but I don't know why.  I only know that the motherboard behaves
> as if 31 bits is all that is available in the hardware.
> 
> > Please see the documentation on how to properly submit one.
> 
> AFAICS, the only submission mistake is that the 'Cc: stable@vger.kernel.org'
> line is missing.

No, you need a much better changelog text and probably subject line, and
to properly cc: the correct maintainers and developers.  As my bot would
say:

- Kernel development is done in public, please always cc: a public
  mailing list with a patch submission.  Using the tool,
  scripts/get_maintainer.pl on the patch will tell you what mailing list
  to cc.

- You did not specify a description of why the patch is needed, or
  possibly, any description at all, in the email body.  Please read the
  section entitled "The canonical patch format" in the kernel file,
  Documentation/SubmittingPatches for what is needed in order to
  properly describe the change.

- You did not write a descriptive Subject: for the patch, allowing Greg,
  and everyone else, to know what this patch is all about.  Please read
  the section entitled "The canonical patch format" in the kernel file,
  Documentation/SubmittingPatches for what a proper Subject: line should
  look like.


Thanks,

greg k-h


* Re: [PATCH] arm64: arch_timer: XGene-1 has 31 bit, not 32 bit, arch timer.
  2022-10-21 19:47   ` Joe Korty
  2022-10-22  5:40     ` Greg KH
@ 2022-10-22  9:58     ` Marc Zyngier
  1 sibling, 0 replies; 6+ messages in thread
From: Marc Zyngier @ 2022-10-22  9:58 UTC (permalink / raw)
  To: Joe Korty; +Cc: linux-kernel, stable

Hi Joe,

On Fri, 21 Oct 2022 20:47:46 +0100,
Joe Korty <joe.korty@concurrent-rt.com> wrote:
> 
> Hi Marc,
> 
> On Fri, Oct 21, 2022 at 07:08:50PM +0100, Marc Zyngier wrote:
> > Sorry, but you'll have to provide a bit more of an analysis here. As
> > far as I can tell, you're just changing a parameter without properly
> > describing what breaks and how.
> 
> There isn't much to analyse.

Actually, there is plenty to analyse. Starting with *why* 31 is the
correct value (it actually is, see below) other than "hey, I reverted
this and it's all good, just merge it".

> For ages, 0x7fffffff (31 bits) was the
> declared width of 'arch timer' for all arm architectures, and that worked.
> Your patch series made the declared width vary according to which chipset
> was in use, which is good, but that rewrite changed the above mask for
> the XGene-1 from 0x7fffffff to 0xffffffff.

This isn't quite what my changes did, but hey, let's not get derailed.

> That change broke timers
> for the XGene-1 since it seems that, in actuality, it has only a 31 bit
> wide arch timer.  Thus declaring that arch timer has 32-bits is wrong.
> This mismatch between the actual and declared sizes would cause arithmetic
> errors in the calculation of timer deltas which more than accounts for
> the hrtimer failures I am seeing when running 5.16+ on my Mustang XGene1.

This is the important point, and the reason why it breaks:

XGene implements CVAL (a 64bit comparator) in terms of TVAL (a
countdown register) instead of the other way around. TVAL being a
32bit register, the width of the counter should equally be 32.
However, TVAL is a *signed* value, and keeps counting down in the
negative range once the timer fires.

It means that any TVAL value with bit 31 set will fire immediately, as
it cannot be distinguished from an already expired timer. Reducing the
timer range back to a paltry 31 bits papers over the issue.

Another problem cannot be fixed though, which is that the timer
interrupt *must* be handled within the negative countdown period, or
the interrupt will be lost (TVAL will rollover to a positive value,
indicative of a new timer deadline).
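
To make the bit-31 argument concrete, here is a minimal userspace
sketch, not kernel code. SKETCH_MASK() is a stand-in written for this
illustration that mirrors what CLOCKSOURCE_MASK() yields for 31 and 32
bits, and tval_fires_immediately() is a hypothetical model of the
behaviour described above: the programmed delta is truncated to 32
bits, and a zero value or one with bit 31 set is indistinguishable
from an expired timer.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Illustrative stand-in for the kernel's CLOCKSOURCE_MASK(); this
 * simplified form is only meant for widths of 1..63 bits.
 */
#define SKETCH_MASK(bits) ((uint64_t)((1ULL << (bits)) - 1))

/*
 * Hypothetical model of a timer programmed through a signed 32-bit
 * TVAL: a zero value, or any value with bit 31 set (negative when
 * read as signed), looks like a deadline that has already passed.
 */
static bool tval_fires_immediately(uint64_t delta)
{
	uint32_t tval = (uint32_t)delta;	/* TVAL is only 32 bits wide */

	return tval == 0 || (tval & UINT32_C(0x80000000)) != 0;
}
```

Under this model the largest delta allowed by a 32-bit mask
(0xffffffff) is reported as firing immediately, while the largest
delta allowed by a 31-bit mask (0x7fffffff) is not, which is why
capping the width at 31 bits avoids the spurious expiry.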

> Only one line need change, the rest are fluff:
> 
> -             return CLOCKSOURCE_MASK(32);
> +             return CLOCKSOURCE_MASK(31);

Yes, and all you need is to send a proper patch, see below.

> 
> > Also, this isn't much of a patch.
> 
> I don't know what this means.  The patch contains all that is needed for
> the fix, no more.  I could add more comments as to _why_ it is 31 bits
> not 32, but I don't know why.  I only know that the motherboard behaves
> as if 31 bits is all that is available in the hardware.
> 
> > Please see the documentation on how to properly submit one.
> 
> AFAICS, the only submission mistake is that the 'Cc: stable@vger.kernel.org'
> line is missing.

What you have done here is to write an email with a diff appended to
it, which isn't a proper kernel patch. I expect a patch to be
formatted with "git format-patch" instead of "git diff"
(i.e. something that is an actual commit instead of a local diff),
with a proper commit message (feel free to nick some of the
description above), with a Cc: stable@ and a Fixes: tag at the right
spot, Cc'ing all the relevant maintainers.

All of this is eloquently explained in the kernel documentation
(Documentation/process/submitting-patches.rst), and I would definitely
encourage you to read the sections titled "Describe your changes" and
"The canonical patch format". You can also look at the previous
commits to the same file to get a sense of the formatting that people
use.

Thanks,

	M.

-- 
Without deviation from the norm, progress is not possible.

