All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH] pwm: bcm2835: Improve period and duty cycle calculation
@ 2020-12-21 16:55 Uwe Kleine-König
  2020-12-21 23:04 ` Florian Fainelli
  2020-12-22 16:34 ` Lino Sanfilippo
  0 siblings, 2 replies; 5+ messages in thread
From: Uwe Kleine-König @ 2020-12-21 16:55 UTC (permalink / raw)
  To: Thierry Reding, Lee Jones, Nicolas Saenz Julienne,
	Florian Fainelli, Ray Jui, Scott Branden, Lino Sanfilippo,
	Sean Young
  Cc: bcm-kernel-feedback-list, linux-pwm, linux-rpi-kernel, kernel

With an input clk rate bigger than 2000000000, scaler would have been
zero which then would have resulted in a division by zero.

Also the originally implemented algorithm divided by the result of a
division. This nearly always looses precision. Consider a requested period
of 1000000 ns. With an input clock frequency of 32786885 Hz the hardware
was configured with an actual period of 983869.007 ns (PERIOD = 32258)
while the hardware can provide 1000003.508 ns (PERIOD = 32787).
And note if the input clock frequency was 32786886 Hz instead, the hardware
was configured to 1016656.477 ns (PERIOD = 33333) while the optimal
setting results in 1000003.477 ns (PERIOD = 32787).

This patch implements proper range checking and only divides once for
the calculation of period (and similar for duty_cycle).

Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
---
Hello,

during review of the bcm2835 driver I noticed this double division.

I think the practical relevance is low however because the clock rate is
fixed to 10 MHz on this platform which doesn't result in these
deviations. (Is this right, what is the actual rate?)

Having said that this patch is only compile tested.

Best regards
Uwe

 drivers/pwm/pwm-bcm2835.c | 33 +++++++++++++++++++++++++--------
 1 file changed, 25 insertions(+), 8 deletions(-)

diff --git a/drivers/pwm/pwm-bcm2835.c b/drivers/pwm/pwm-bcm2835.c
index 6ff5f04b3e07..f937db32fa54 100644
--- a/drivers/pwm/pwm-bcm2835.c
+++ b/drivers/pwm/pwm-bcm2835.c
@@ -64,8 +64,9 @@ static int bcm2835_pwm_apply(struct pwm_chip *chip, struct pwm_device *pwm,
 
 	struct bcm2835_pwm *pc = to_bcm2835_pwm(chip);
 	unsigned long rate = clk_get_rate(pc->clk);
-	unsigned long long period;
-	unsigned long scaler;
+	unsigned long long period_cycles;
+	u64 max_period, period, duty_cycle;
+
 	u32 val;
 
 	if (!rate) {
@@ -73,18 +74,34 @@ static int bcm2835_pwm_apply(struct pwm_chip *chip, struct pwm_device *pwm,
 		return -EINVAL;
 	}
 
-	scaler = DIV_ROUND_CLOSEST(NSEC_PER_SEC, rate);
+	/*
+	 * period_cycles must be a 32 bit value, so period * rate / NSEC_PER_SEC
+	 * must be <= U32_MAX. As U32_MAX * NSEC_PER_SEC < U64_MAX the
+	 * multiplication period * rate doesn't overflow.
+	 */
+	max_period = DIV_ROUND_DOWN_ULL((u64)U32_MAX * NSEC_PER_SEC, rate);
+
+	if (state->period > max_period)
+		period = max_period;
+	else
+		period = state->period;
+
+	if (state->duty_cycle > period)
+		duty_cycle = period;
+	else
+		duty_cycle = state->duty_cycle;
+
 	/* set period */
-	period = DIV_ROUND_CLOSEST_ULL(state->period, scaler);
+	period_cycles = DIV_ROUND_CLOSEST_ULL(period * rate, NSEC_PER_SEC);
 
-	/* dont accept a period that is too small or has been truncated */
-	if ((period < PERIOD_MIN) || (period > U32_MAX))
+	/* don't accept a period that is too small */
+	if (period_cycles < PERIOD_MIN)
 		return -EINVAL;
 
-	writel(period, pc->base + PERIOD(pwm->hwpwm));
+	writel(period_cycles, pc->base + PERIOD(pwm->hwpwm));
 
 	/* set duty cycle */
-	val = DIV_ROUND_CLOSEST_ULL(state->duty_cycle, scaler);
+	val = DIV_ROUND_CLOSEST_ULL(duty_cycle * rate, NSEC_PER_SEC);
 	writel(val, pc->base + DUTY(pwm->hwpwm));
 
 	/* set polarity */
-- 
2.29.2


^ permalink raw reply related	[flat|nested] 5+ messages in thread

* Re: [PATCH] pwm: bcm2835: Improve period and duty cycle calculation
  2020-12-21 16:55 [PATCH] pwm: bcm2835: Improve period and duty cycle calculation Uwe Kleine-König
@ 2020-12-21 23:04 ` Florian Fainelli
  2020-12-22  6:30   ` Uwe Kleine-König
  2020-12-22 16:34 ` Lino Sanfilippo
  1 sibling, 1 reply; 5+ messages in thread
From: Florian Fainelli @ 2020-12-21 23:04 UTC (permalink / raw)
  To: Uwe Kleine-König, Thierry Reding, Lee Jones,
	Nicolas Saenz Julienne, Florian Fainelli, Ray Jui, Scott Branden,
	Lino Sanfilippo, Sean Young
  Cc: bcm-kernel-feedback-list, linux-pwm, linux-rpi-kernel, kernel



On 12/21/2020 8:55 AM, Uwe Kleine-König wrote:
> With an input clk rate bigger than 2000000000, scaler would have been
> zero which then would have resulted in a division by zero.
> 
> Also the originally implemented algorithm divided by the result of a
> division. This nearly always looses precision. Consider a requested period
> of 1000000 ns. With an input clock frequency of 32786885 Hz the hardware
> was configured with an actual period of 983869.007 ns (PERIOD = 32258)
> while the hardware can provide 1000003.508 ns (PERIOD = 32787).
> And note if the input clock frequency was 32786886 Hz instead, the hardware
> was configured to 1016656.477 ns (PERIOD = 33333) while the optimal
> setting results in 1000003.477 ns (PERIOD = 32787).
> 
> This patch implements proper range checking and only divides once for
> the calculation of period (and similar for duty_cycle).
> 
> Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
> ---
> Hello,
> 
> during review of the bcm2835 driver I noticed this double division.
> 
> I think the practical relevance is low however because the clock rate is
> fixed to 10 MHz on this platform which doesn't result in these
> deviations. (Is this right, what is the actual rate?)

Currently this is correct but the PWM input clock can be configured from
the divider of a PLL that runs at 500MHz so this change is potentially
useful in that regard.
-- 
Florian

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH] pwm: bcm2835: Improve period and duty cycle calculation
  2020-12-21 23:04 ` Florian Fainelli
@ 2020-12-22  6:30   ` Uwe Kleine-König
  0 siblings, 0 replies; 5+ messages in thread
From: Uwe Kleine-König @ 2020-12-22  6:30 UTC (permalink / raw)
  To: Florian Fainelli
  Cc: Thierry Reding, Lee Jones, Nicolas Saenz Julienne, Ray Jui,
	Scott Branden, Lino Sanfilippo, Sean Young, linux-pwm,
	bcm-kernel-feedback-list, linux-rpi-kernel, kernel

[-- Attachment #1: Type: text/plain, Size: 1802 bytes --]

Hello Florian,

On Mon, Dec 21, 2020 at 03:04:25PM -0800, Florian Fainelli wrote:
> On 12/21/2020 8:55 AM, Uwe Kleine-König wrote:
> > With an input clk rate bigger than 2000000000, scaler would have been
> > zero which then would have resulted in a division by zero.
> > 
> > Also the originally implemented algorithm divided by the result of a
> > division. This nearly always looses precision. Consider a requested period
> > of 1000000 ns. With an input clock frequency of 32786885 Hz the hardware
> > was configured with an actual period of 983869.007 ns (PERIOD = 32258)
> > while the hardware can provide 1000003.508 ns (PERIOD = 32787).
> > And note if the input clock frequency was 32786886 Hz instead, the hardware
> > was configured to 1016656.477 ns (PERIOD = 33333) while the optimal
> > setting results in 1000003.477 ns (PERIOD = 32787).
> > 
> > This patch implements proper range checking and only divides once for
> > the calculation of period (and similar for duty_cycle).
> > 
> > Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
> > ---
> > Hello,
> > 
> > during review of the bcm2835 driver I noticed this double division.
> > 
> > I think the practical relevance is low however because the clock rate is
> > fixed to 10 MHz on this platform which doesn't result in these
> > deviations. (Is this right, what is the actual rate?)
> 
> Currently this is correct but the PWM input clock can be configured from
> the divider of a PLL that runs at 500MHz so this change is potentially
> useful in that regard.

Thanks for your feedback; is this an Ack?

Best regards
Uwe

-- 
Pengutronix e.K.                           | Uwe Kleine-König            |
Industrial Linux Solutions                 | https://www.pengutronix.de/ |

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH] pwm: bcm2835: Improve period and duty cycle calculation
  2020-12-21 16:55 [PATCH] pwm: bcm2835: Improve period and duty cycle calculation Uwe Kleine-König
  2020-12-21 23:04 ` Florian Fainelli
@ 2020-12-22 16:34 ` Lino Sanfilippo
  2020-12-22 22:11   ` Uwe Kleine-König
  1 sibling, 1 reply; 5+ messages in thread
From: Lino Sanfilippo @ 2020-12-22 16:34 UTC (permalink / raw)
  To: Uwe Kleine-König, Thierry Reding, Lee Jones,
	Nicolas Saenz Julienne, Florian Fainelli, Ray Jui, Scott Branden,
	Sean Young
  Cc: bcm-kernel-feedback-list, linux-pwm, linux-rpi-kernel, kernel

Hi,

On 21.12.20 at 17:55, Uwe Kleine-König wrote:
> With an input clk rate bigger than 2000000000, scaler would have been
> zero which then would have resulted in a division by zero.
>
> Also the originally implemented algorithm divided by the result of a
> division. This nearly always looses precision. Consider a requested period
> of 1000000 ns. With an input clock frequency of 32786885 Hz the hardware
> was configured with an actual period of 983869.007 ns (PERIOD = 32258)
> while the hardware can provide 1000003.508 ns (PERIOD = 32787).
> And note if the input clock frequency was 32786886 Hz instead, the hardware
> was configured to 1016656.477 ns (PERIOD = 33333) while the optimal
> setting results in 1000003.477 ns (PERIOD = 32787).
>
> This patch implements proper range checking and only divides once for
> the calculation of period (and similar for duty_cycle).
>
> Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
> ---
> Hello,
>
> during review of the bcm2835 driver I noticed this double division.
>
> I think the practical relevance is low however because the clock rate is
> fixed to 10 MHz on this platform which doesn't result in these
> deviations. (Is this right, what is the actual rate?)

clk_get_rate() on my Raspberry Pi 4 returns 9999999.

>
> Having said that this patch is only compile tested.

I tested this on the RP and everything still works so far, so FWIW
Tested-by: Lino Sanfilippo <LinoSanfilippo@gmx.de>


> +
> +	if (state->period > max_period)
> +		period = max_period;
> +	else
> +		period = state->period;
> +

We return EINVAL if the value for period is too small, so should we not do the
same for the case that the value is too high?

Regards,
Lino

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH] pwm: bcm2835: Improve period and duty cycle calculation
  2020-12-22 16:34 ` Lino Sanfilippo
@ 2020-12-22 22:11   ` Uwe Kleine-König
  0 siblings, 0 replies; 5+ messages in thread
From: Uwe Kleine-König @ 2020-12-22 22:11 UTC (permalink / raw)
  To: Lino Sanfilippo
  Cc: Thierry Reding, Lee Jones, Nicolas Saenz Julienne,
	Florian Fainelli, Ray Jui, Scott Branden, Sean Young, linux-pwm,
	bcm-kernel-feedback-list, linux-rpi-kernel, kernel

[-- Attachment #1: Type: text/plain, Size: 2528 bytes --]

Hello Lino,

On Tue, Dec 22, 2020 at 05:34:00PM +0100, Lino Sanfilippo wrote:
> On 21.12.20 at 17:55, Uwe Kleine-König wrote:
> > With an input clk rate bigger than 2000000000, scaler would have been
> > zero which then would have resulted in a division by zero.
> >
> > Also the originally implemented algorithm divided by the result of a
> > division. This nearly always looses precision. Consider a requested period
> > of 1000000 ns. With an input clock frequency of 32786885 Hz the hardware
> > was configured with an actual period of 983869.007 ns (PERIOD = 32258)
> > while the hardware can provide 1000003.508 ns (PERIOD = 32787).
> > And note if the input clock frequency was 32786886 Hz instead, the hardware
> > was configured to 1016656.477 ns (PERIOD = 33333) while the optimal
> > setting results in 1000003.477 ns (PERIOD = 32787).
> >
> > This patch implements proper range checking and only divides once for
> > the calculation of period (and similar for duty_cycle).
> >
> > Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
> > ---
> > Hello,
> >
> > during review of the bcm2835 driver I noticed this double division.
> >
> > I think the practical relevance is low however because the clock rate is
> > fixed to 10 MHz on this platform which doesn't result in these
> > deviations. (Is this right, what is the actual rate?)
> 
> clk_get_rate() on my Raspberry Pi 4 returns 9999999.

oh, then my patch is a real improvement, but the problems are not as bad
as my examples above. So when a period of 350ns was requested, the old
algorithm provided 400.000040000004 ns and the improved yields
300.000030000003 ns.

> I tested this on the RP and everything still works so far, so FWIW
> Tested-by: Lino Sanfilippo <LinoSanfilippo@gmx.de>

Thanks.

> > +	if (state->period > max_period)
> > +		period = max_period;
> > +	else
> > +		period = state->period;
> > +
> 
> We return EINVAL if the value for period is too small, so should we not do the
> same for the case that the value is too high?

Ah, I didn't notice I changed the behaviour here, but indeed this is
what I think is the right behaviour. (i.e. configure the biggest
possible period not bigger than the requested period.)

But this change doesn't belong into this patch, so I will fixup and
resend.

Best regards
Uwe

-- 
Pengutronix e.K.                           | Uwe Kleine-König            |
Industrial Linux Solutions                 | https://www.pengutronix.de/ |

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2020-12-22 22:12 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-12-21 16:55 [PATCH] pwm: bcm2835: Improve period and duty cycle calculation Uwe Kleine-König
2020-12-21 23:04 ` Florian Fainelli
2020-12-22  6:30   ` Uwe Kleine-König
2020-12-22 16:34 ` Lino Sanfilippo
2020-12-22 22:11   ` Uwe Kleine-König

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.