linux-pwm.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v16 0/2] Microchip Soft IP corePWM driver
@ 2023-04-11  8:56 Conor Dooley
  2023-04-11  8:56 ` [PATCH v16 1/2] pwm: add microchip soft ip " Conor Dooley
  2023-04-11  8:56 ` [PATCH v16 2/2] MAINTAINERS: add pwm to PolarFire SoC entry Conor Dooley
  0 siblings, 2 replies; 10+ messages in thread
From: Conor Dooley @ 2023-04-11  8:56 UTC (permalink / raw)
  To: Thierry Reding, Uwe Kleine-König
  Cc: Daire McNamara, linux-kernel, linux-pwm, linux-riscv, Conor Dooley

Hey all,

Uwe & I had a long back and forth about period calculations on v13,
my ultimate conclusion being that, after some testing of the "corrected"
calculation in hardware, the original calculation was correct.
I think we had gotten sucked into discussion the calculation of the
period itself, when we were in fact trying to calculate a bound on the
period instead. That discussion is here:
https://lore.kernel.org/linux-pwm/Y+ow8tfAHo1yv1XL@wendy/

There was, of course, another chat about the niche bits of this
calculation on v14 too:
https://lore.kernel.org/linux-pwm/20230322105536.kgt3ffowefqlg6eu@pengutronix.de/

Thanks,
Conor.

Changes since v15:
- calculate prescale modulus without using %

Changes since v14:
- change period_steps calculation logic to correctly handle the cases
  where tmp % (254 + 1) == 0, by swapping implicit truncation for
  explicit rounding upwards and subtracting zero
- special case periods < 1/clk_rate & add a note in limitations about
  this, although I think this issue wasn't present prior to v15's
  changes
- check for smaller suitable values of prescale, which picks the "more
  correct" value in about half of all cases, particularly those where
  tmp is large.
- explain what I mean by the "optimal" values for prescale/period steps
  re-fix use of defines
- add a comment about how sync_upd mode works
- make the use of period_steps and prescale consistently refer to the
  register values rather than, in comments, using these to mean the
  resulting values after 1 has been added
- drop the PREG_TO_VAL() macro, as most of its users are now gone & it
  only added to the register value versus "real" value problem
- report pwmchip_add() failures

Changes since v13:
- couple bits of cleanup to apply_locked(), suggested by Uwe
- move the overhead waiting for a change to be applied, for channels
  with shadow registers, to subsequent calls to apply(). This has the
  benefit of only waiting when two calls to apply() are close in time
  rather than eating the delay in every call.

Changes since v11:
- swap a "bare" multiply & divide for the corresponding helper to
  prevent overflow
- factor out duplicate clk rate acquisition & period calculation
- make the period calculation return void by checking the validity of
  the clock rate in the caller
- drop the binding & dt patch, they're on-track for v6.2 via my tree

Changes since v10:
- reword some comments
- try to assign the period if a disable is requested
- drop a cast around a u8 -> u16 conversion
- fix a check on period_steps that should be on the hw_ variant
- split up the period calculation in get_state() to fix the result on
  32 bit
- add a rate variable in get_state() to only call get_rate() once
- redo the locking as suggested to make it more straightforward.
- stop checking for enablement in get_state() that was working around
 intended behaviour of the sysfs interface

Changes since v9:
- fixed the missing unlock that Dan reported

Changes since v8:
- fixed a(nother) raw 64 bit division (& built it for riscv32!)
- added a check to make sure we don't try to sleep for 0 us

Changes since v7:
- rebased on 6.0-rc1
- reworded comments you highlighted in v7
- fixed the overkill sleeping
- removed the unused variables in calc_duty
- added some extra comments to explain behaviours you questioned in v7
- make the mutexes un-interruptible
- fixed added the 1s you suggested for the if(period_locked) logic
- added setup of the channel_enabled shadowing
- fixed the period reporting for the negedge == posedge case in
  get_state() I had to add the enabled check, as otherwise it broke
  setting the period for the first time out of reset.
- added a test for invalid PERIOD_STEPS values, in which case we abort
  if we cannot fix the period

Changes from v6:
- Dropped an unused variable that I'd missed
- Actually check the return values of the mutex lock()s
- Re-rebased on -next for the MAINTAINERS patch (again...)

Changes from v5:
- switched to a mutex b/c we must sleep with the lock taken
- simplified the locking in apply() and added locking to get_state()
- reworked apply() as requested
- removed the loop in the period calculation (thanks Uwe!)
- add a copy of the enable registers in the driver to save on reads.
- remove the second (useless) write to sync_update
- added some missing rounding in get_state()
- couple other minor cleanups as requested in:
https://lore.kernel.org/linux-riscv/20220709160206.cw5luo7kxdshoiua@pengutronix.de/

Changes from v4:
- dropped some accidentally added files

Changes before v4:
https://lore.kernel.org/linux-pwm/20220721172109.941900-1-mail@conchuod.ie

Conor Dooley (2):
  pwm: add microchip soft ip corePWM driver
  MAINTAINERS: add pwm to PolarFire SoC entry

 MAINTAINERS                      |   1 +
 drivers/pwm/Kconfig              |  10 +
 drivers/pwm/Makefile             |   1 +
 drivers/pwm/pwm-microchip-core.c | 509 +++++++++++++++++++++++++++++++
 4 files changed, 521 insertions(+)
 create mode 100644 drivers/pwm/pwm-microchip-core.c

-- 
2.39.2


^ permalink raw reply	[flat|nested] 10+ messages in thread

* [PATCH v16 1/2] pwm: add microchip soft ip corePWM driver
  2023-04-11  8:56 [PATCH v16 0/2] Microchip Soft IP corePWM driver Conor Dooley
@ 2023-04-11  8:56 ` Conor Dooley
  2023-04-11 10:55   ` Uwe Kleine-König
  2023-04-11  8:56 ` [PATCH v16 2/2] MAINTAINERS: add pwm to PolarFire SoC entry Conor Dooley
  1 sibling, 1 reply; 10+ messages in thread
From: Conor Dooley @ 2023-04-11  8:56 UTC (permalink / raw)
  To: Thierry Reding, Uwe Kleine-König
  Cc: Daire McNamara, linux-kernel, linux-pwm, linux-riscv, Conor Dooley

Add a driver that supports the Microchip FPGA "soft" PWM IP core.

Signed-off-by: Conor Dooley <conor.dooley@microchip.com>
---
 drivers/pwm/Kconfig              |  10 +
 drivers/pwm/Makefile             |   1 +
 drivers/pwm/pwm-microchip-core.c | 509 +++++++++++++++++++++++++++++++
 3 files changed, 520 insertions(+)
 create mode 100644 drivers/pwm/pwm-microchip-core.c

diff --git a/drivers/pwm/Kconfig b/drivers/pwm/Kconfig
index dae023d783a2..f42756a014ed 100644
--- a/drivers/pwm/Kconfig
+++ b/drivers/pwm/Kconfig
@@ -393,6 +393,16 @@ config PWM_MEDIATEK
 	  To compile this driver as a module, choose M here: the module
 	  will be called pwm-mediatek.
 
+config PWM_MICROCHIP_CORE
+	tristate "Microchip corePWM PWM support"
+	depends on SOC_MICROCHIP_POLARFIRE || COMPILE_TEST
+	depends on HAS_IOMEM && OF
+	help
+	  PWM driver for Microchip FPGA soft IP core.
+
+	  To compile this driver as a module, choose M here: the module
+	  will be called pwm-microchip-core.
+
 config PWM_MXS
 	tristate "Freescale MXS PWM support"
 	depends on ARCH_MXS || COMPILE_TEST
diff --git a/drivers/pwm/Makefile b/drivers/pwm/Makefile
index 7bf1a29f02b8..a65625359ece 100644
--- a/drivers/pwm/Makefile
+++ b/drivers/pwm/Makefile
@@ -34,6 +34,7 @@ obj-$(CONFIG_PWM_LPSS_PCI)	+= pwm-lpss-pci.o
 obj-$(CONFIG_PWM_LPSS_PLATFORM)	+= pwm-lpss-platform.o
 obj-$(CONFIG_PWM_MESON)		+= pwm-meson.o
 obj-$(CONFIG_PWM_MEDIATEK)	+= pwm-mediatek.o
+obj-$(CONFIG_PWM_MICROCHIP_CORE)	+= pwm-microchip-core.o
 obj-$(CONFIG_PWM_MTK_DISP)	+= pwm-mtk-disp.o
 obj-$(CONFIG_PWM_MXS)		+= pwm-mxs.o
 obj-$(CONFIG_PWM_NTXEC)		+= pwm-ntxec.o
diff --git a/drivers/pwm/pwm-microchip-core.c b/drivers/pwm/pwm-microchip-core.c
new file mode 100644
index 000000000000..0a69ec376c51
--- /dev/null
+++ b/drivers/pwm/pwm-microchip-core.c
@@ -0,0 +1,509 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * corePWM driver for Microchip "soft" FPGA IP cores.
+ *
+ * Copyright (c) 2021-2023 Microchip Corporation. All rights reserved.
+ * Author: Conor Dooley <conor.dooley@microchip.com>
+ * Documentation:
+ * https://www.microsemi.com/document-portal/doc_download/1245275-corepwm-hb
+ *
+ * Limitations:
+ * - If the IP block is configured without "shadow registers", all register
+ *   writes will take effect immediately, causing glitches on the output.
+ *   If shadow registers *are* enabled, a write to the "SYNC_UPDATE" register
+ *   notifies the core that it needs to update the registers defining the
+ *   waveform from the contents of the "shadow registers".
+ * - The IP block has no concept of a duty cycle, only rising/falling edges of
+ *   the waveform. Unfortunately, if the rising & falling edges registers have
+ *   the same value written to them the IP block will do whichever of a rising
+ *   or a falling edge is possible. I.E. a 50% waveform at twice the requested
+ *   period. Therefore to get a 0% waveform, the output is set the max high/low
+ *   time depending on polarity.
+ *   If the duty cycle is 0%, and the requested period is less than the
+ *   available period resolution, this will manifest as a ~100% waveform (with
+ *   some output glitches) rather than 50%.
+ * - The PWM period is set for the whole IP block not per channel. The driver
+ *   will only change the period if no other PWM output is enabled.
+ */
+
+#include <linux/clk.h>
+#include <linux/delay.h>
+#include <linux/err.h>
+#include <linux/io.h>
+#include <linux/ktime.h>
+#include <linux/math.h>
+#include <linux/module.h>
+#include <linux/mutex.h>
+#include <linux/of_device.h>
+#include <linux/platform_device.h>
+#include <linux/pwm.h>
+
+#define MCHPCOREPWM_PRESCALE_MAX	0xff
+#define MCHPCOREPWM_PERIOD_STEPS_MAX	0xfe
+#define MCHPCOREPWM_PERIOD_MAX		0xff00
+
+#define MCHPCOREPWM_PRESCALE	0x00
+#define MCHPCOREPWM_PERIOD	0x04
+#define MCHPCOREPWM_EN(i)	(0x08 + 0x04 * (i)) /* 0x08, 0x0c */
+#define MCHPCOREPWM_POSEDGE(i)	(0x10 + 0x08 * (i)) /* 0x10, 0x18, ..., 0x88 */
+#define MCHPCOREPWM_NEGEDGE(i)	(0x14 + 0x08 * (i)) /* 0x14, 0x1c, ..., 0x8c */
+#define MCHPCOREPWM_SYNC_UPD	0xe4
+#define MCHPCOREPWM_TIMEOUT_MS	100u
+
+struct mchp_core_pwm_chip {
+	struct pwm_chip chip;
+	struct clk *clk;
+	void __iomem *base;
+	struct mutex lock; /* protects the shared period */
+	ktime_t update_timestamp;
+	u32 sync_update_mask;
+	u16 channel_enabled;
+};
+
+static inline struct mchp_core_pwm_chip *to_mchp_core_pwm(struct pwm_chip *chip)
+{
+	return container_of(chip, struct mchp_core_pwm_chip, chip);
+}
+
+static void mchp_core_pwm_enable(struct pwm_chip *chip, struct pwm_device *pwm,
+				 bool enable, u64 period)
+{
+	struct mchp_core_pwm_chip *mchp_core_pwm = to_mchp_core_pwm(chip);
+	u8 channel_enable, reg_offset, shift;
+
+	/*
+	 * There are two adjacent 8 bit control regs, the lower reg controls
+	 * 0-7 and the upper reg 8-15. Check if the pwm is in the upper reg
+	 * and if so, offset by the bus width.
+	 */
+	reg_offset = MCHPCOREPWM_EN(pwm->hwpwm >> 3);
+	shift = pwm->hwpwm & 7;
+
+	channel_enable = readb_relaxed(mchp_core_pwm->base + reg_offset);
+	channel_enable &= ~(1 << shift);
+	channel_enable |= (enable << shift);
+
+	writel_relaxed(channel_enable, mchp_core_pwm->base + reg_offset);
+	mchp_core_pwm->channel_enabled &= ~BIT(pwm->hwpwm);
+	mchp_core_pwm->channel_enabled |= enable << pwm->hwpwm;
+
+	/*
+	 * Notify the block to update the waveform from the shadow registers.
+	 * The updated values will not appear on the bus until they have been
+	 * applied to the waveform at the beginning of the next period.
+	 * This is a NO-OP if the channel does not have shadow registers.
+	 */
+	if (mchp_core_pwm->sync_update_mask & (1 << pwm->hwpwm))
+		mchp_core_pwm->update_timestamp = ktime_add_ns(ktime_get(), period);
+}
+
+static void mchp_core_pwm_wait_for_sync_update(struct mchp_core_pwm_chip *mchp_core_pwm,
+					       unsigned int channel)
+{
+	/*
+	 * If a shadow register is used for this PWM channel, and iff there is
+	 * a pending update to the waveform, we must wait for it to be applied
+	 * before attempting to read its state. Reading the registers yields
+	 * the currently implemented settings & the new ones are only readable
+	 * once the current period has ended.
+	 */
+
+	if (mchp_core_pwm->sync_update_mask & (1 << channel)) {
+		ktime_t current_time = ktime_get();
+		s64 remaining_ns;
+		u32 delay_us;
+
+		remaining_ns = ktime_to_ns(ktime_sub(mchp_core_pwm->update_timestamp,
+						     current_time));
+
+		/*
+		 * If the update has gone through, don't bother waiting for
+		 * obvious reasons. Otherwise wait around for an appropriate
+		 * amount of time for the update to go through.
+		 */
+		if (remaining_ns <= 0)
+			return;
+
+		delay_us = DIV_ROUND_UP_ULL(remaining_ns, NSEC_PER_USEC);
+		fsleep(delay_us);
+	}
+}
+
+static u64 mchp_core_pwm_calc_duty(const struct pwm_state *state, u64 clk_rate,
+				   u8 prescale, u8 period_steps)
+{
+	u64 duty_steps, tmp;
+
+	/*
+	 * Calculate the duty cycle in multiples of the prescaled period:
+	 * duty_steps = duty_in_ns / step_in_ns
+	 * step_in_ns = (prescale * NSEC_PER_SEC) / clk_rate
+	 * The code below is rearranged slightly to only divide once.
+	 */
+	tmp = (prescale + 1) * NSEC_PER_SEC;
+	duty_steps = mul_u64_u64_div_u64(state->duty_cycle, clk_rate, tmp);
+
+	return duty_steps;
+}
+
+static void mchp_core_pwm_apply_duty(struct pwm_chip *chip, struct pwm_device *pwm,
+				     const struct pwm_state *state, u64 duty_steps,
+				     u16 period_steps)
+{
+	struct mchp_core_pwm_chip *mchp_core_pwm = to_mchp_core_pwm(chip);
+	u8 posedge, negedge;
+	u8 first_edge = 0, second_edge = duty_steps;
+
+	/*
+	 * Setting posedge == negedge doesn't yield a constant output,
+	 * so that's an unsuitable setting to model duty_steps = 0.
+	 * In that case set the unwanted edge to a value that never
+	 * triggers.
+	 */
+	if (duty_steps == 0)
+		first_edge = period_steps + 1;
+
+	if (state->polarity == PWM_POLARITY_INVERSED) {
+		negedge = first_edge;
+		posedge = second_edge;
+	} else {
+		posedge = first_edge;
+		negedge = second_edge;
+	}
+
+	writel_relaxed(posedge, mchp_core_pwm->base + MCHPCOREPWM_POSEDGE(pwm->hwpwm));
+	writel_relaxed(negedge, mchp_core_pwm->base + MCHPCOREPWM_NEGEDGE(pwm->hwpwm));
+}
+
+static int mchp_core_pwm_calc_period(const struct pwm_state *state, unsigned long clk_rate,
+				     u16 *prescale, u16 *period_steps)
+{
+	u64 tmp;
+	u32 remainder;
+
+	/*
+	 * Calculate the period cycles and prescale values.
+	 * The registers are each 8 bits wide & multiplied to compute the period
+	 * using the formula:
+	 *           (prescale + 1) * (period_steps + 1)
+	 * period = -------------------------------------
+	 *                      clk_rate
+	 * so the maximum period that can be generated is 0x10000 times the
+	 * period of the input clock.
+	 * However, due to the design of the "hardware", it is not possible to
+	 * attain a 100% duty cycle if the full range of period_steps is used.
+	 * Therefore period_steps is restricted to 0xfe and the maximum multiple
+	 * of the clock period attainable is (0xff + 1) * (0xfe + 1) = 0xff00
+	 *
+	 * The prescale and period_steps registers operate similarly to
+	 * CLK_DIVIDER_ONE_BASED, where the value used by the hardware is that
+	 * in the register plus one.
+	 * It's therefore not possible to set a period lower than 1/clk_rate, so
+	 * if tmp is 0, abort. Without aborting, we will set a period that is
+	 * greater than that requested and, more importantly, will trigger the
+	 * neg-/pos-edge issue described in the limitations.
+	 */
+	tmp = mul_u64_u64_div_u64(state->period, clk_rate, NSEC_PER_SEC);
+	if (!tmp)
+		return -EINVAL;
+
+	if (tmp >= MCHPCOREPWM_PERIOD_MAX) {
+		*prescale = MCHPCOREPWM_PRESCALE_MAX;
+		*period_steps = MCHPCOREPWM_PERIOD_STEPS_MAX;
+
+		return 0;
+	}
+
+	/*
+	 * There are multiple strategies that could be used to choose the
+	 * prescale & period_steps values.
+	 * Here the idea is to pick values so that the selection of duty cycles
+	 * is as finegrain as possible.
+	 * This "optimal" value for prescale can be calculated using the maximum
+	 * permitted value of period_steps, 0xfe.
+	 *
+	 *                period * clk_rate
+	 * prescale = ------------------------- - 1
+	 *            NSEC_PER_SEC * (0xfe + 1)
+	 *
+	 * However, we are purely interested in the integer upper bound of this
+	 * calculation, so this division should be rounded up before subtracting
+	 * 1
+	 *
+	 *  period * clk_rate
+	 * ------------------- was precomputed as `tmp`
+	 *    NSEC_PER_SEC
+	 */
+	*prescale = DIV64_U64_ROUND_UP(tmp, MCHPCOREPWM_PERIOD_STEPS_MAX + 1) - 1;
+
+	/*
+	 * Because 0xff is not a permitted value some error will seep into the
+	 * calculation of prescale as prescale grows. Specifically, this error
+	 * occurs where the remainder of the prescale calculation is less than
+	 * prescale.
+	 * For small values of prescale, only a handful of values will need
+	 * correction, but overall this applies to almost half of the valid
+	 * values for tmp.
+	 *
+	 * To keep the algorithm's decision making consistent, this case is
+	 * checked for and the simple solution is to, in these cases,
+	 * decrement prescale and check that the resulting value of period_steps
+	 * is valid.
+	 *
+	 * period_steps can be computed from prescale:
+	 *                      period * clk_rate
+	 * period_steps = ----------------------------- - 1
+	 *                NSEC_PER_SEC * (prescale + 1)
+	 *
+	 */
+	div_u64_rem(tmp, (MCHPCOREPWM_PERIOD_STEPS_MAX + 1), &remainder);
+	if (remainder < *prescale) {
+		u16 smaller_prescale = *prescale - 1;
+
+		*period_steps = div_u64(tmp, smaller_prescale + 1) - 1;
+		if (*period_steps < 255) {
+			*prescale = smaller_prescale;
+
+			return 0;
+		}
+	}
+
+	*period_steps = div_u64(tmp, *prescale + 1);
+	if (*period_steps)
+		*period_steps -= 1;
+
+	return 0;
+}
+
+static int mchp_core_pwm_apply_locked(struct pwm_chip *chip, struct pwm_device *pwm,
+				      const struct pwm_state *state)
+{
+	struct mchp_core_pwm_chip *mchp_core_pwm = to_mchp_core_pwm(chip);
+	bool period_locked;
+	unsigned long clk_rate;
+	u64 duty_steps;
+	u16 prescale, period_steps;
+	int ret;
+
+	if (!state->enabled) {
+		mchp_core_pwm_enable(chip, pwm, false, pwm->state.period);
+		return 0;
+	}
+
+	/*
+	 * If clk_rate is too big, the following multiplication might overflow.
+	 * However this is implausible, as the fabric of current FPGAs cannot
+	 * provide clocks at a rate high enough.
+	 */
+	clk_rate = clk_get_rate(mchp_core_pwm->clk);
+	if (clk_rate >= NSEC_PER_SEC)
+		return -EINVAL;
+
+	ret = mchp_core_pwm_calc_period(state, clk_rate, &prescale, &period_steps);
+	if (ret)
+		return ret;
+
+	/*
+	 * If the only thing that has changed is the duty cycle or the polarity,
+	 * we can shortcut the calculations and just compute/apply the new duty
+	 * cycle pos & neg edges
+	 * As all the channels share the same period, do not allow it to be
+	 * changed if any other channels are enabled.
+	 * If the period is locked, it may not be possible to use a period
+	 * less than that requested. In that case, we just abort.
+	 */
+	period_locked = mchp_core_pwm->channel_enabled & ~(1 << pwm->hwpwm);
+
+	if (period_locked) {
+		u16 hw_prescale;
+		u16 hw_period_steps;
+
+		hw_prescale = readb_relaxed(mchp_core_pwm->base + MCHPCOREPWM_PRESCALE);
+		hw_period_steps = readb_relaxed(mchp_core_pwm->base + MCHPCOREPWM_PERIOD);
+
+		if ((period_steps + 1) * (prescale + 1) <
+		    (hw_period_steps + 1) * (hw_prescale + 1))
+			return -EINVAL;
+
+		/*
+		 * It is possible that something could have set the period_steps
+		 * register to 0xff, which would prevent us from setting a 100%
+		 * or 0% relative duty cycle, as explained above in
+		 * mchp_core_pwm_calc_period().
+		 * The period is locked and we cannot change this, so we abort.
+		 */
+		if (hw_period_steps == MCHPCOREPWM_PERIOD_STEPS_MAX)
+			return -EINVAL;
+
+		prescale = hw_prescale;
+		period_steps = hw_period_steps;
+	} else {
+		writel_relaxed(prescale, mchp_core_pwm->base + MCHPCOREPWM_PRESCALE);
+		writel_relaxed(period_steps, mchp_core_pwm->base + MCHPCOREPWM_PERIOD);
+	}
+
+	duty_steps = mchp_core_pwm_calc_duty(state, clk_rate, prescale, period_steps);
+
+	/*
+	 * Because the period is not per channel, it is possible that the
+	 * requested duty cycle is longer than the period, in which case cap it
+	 * to the period, IOW a 100% duty cycle.
+	 */
+	if (duty_steps > period_steps)
+		duty_steps = period_steps + 1;
+
+	mchp_core_pwm_apply_duty(chip, pwm, state, duty_steps, period_steps);
+
+	mchp_core_pwm_enable(chip, pwm, true, pwm->state.period);
+
+	return 0;
+}
+
+static int mchp_core_pwm_apply(struct pwm_chip *chip, struct pwm_device *pwm,
+			       const struct pwm_state *state)
+{
+	struct mchp_core_pwm_chip *mchp_core_pwm = to_mchp_core_pwm(chip);
+	int ret;
+
+	mutex_lock(&mchp_core_pwm->lock);
+
+	mchp_core_pwm_wait_for_sync_update(mchp_core_pwm, pwm->hwpwm);
+
+	ret = mchp_core_pwm_apply_locked(chip, pwm, state);
+
+	mutex_unlock(&mchp_core_pwm->lock);
+
+	return ret;
+}
+
+static int mchp_core_pwm_get_state(struct pwm_chip *chip, struct pwm_device *pwm,
+				   struct pwm_state *state)
+{
+	struct mchp_core_pwm_chip *mchp_core_pwm = to_mchp_core_pwm(chip);
+	u64 rate;
+	u16 prescale, period_steps;
+	u8 duty_steps, posedge, negedge;
+
+	mutex_lock(&mchp_core_pwm->lock);
+
+	mchp_core_pwm_wait_for_sync_update(mchp_core_pwm, pwm->hwpwm);
+
+	if (mchp_core_pwm->channel_enabled & (1 << pwm->hwpwm))
+		state->enabled = true;
+	else
+		state->enabled = false;
+
+	rate = clk_get_rate(mchp_core_pwm->clk);
+
+	/*
+	 * Calculating the period:
+	 * The registers are each 8 bits wide & multiplied to compute the period
+	 * using the formula:
+	 *           (prescale + 1) * (period_steps + 1)
+	 * period = -------------------------------------
+	 *                      clk_rate
+	 *
+	 * Note:
+	 * The prescale and period_steps registers operate similarly to
+	 * CLK_DIVIDER_ONE_BASED, where the value used by the hardware is that
+	 * in the register plus one.
+	 */
+	prescale = readb_relaxed(mchp_core_pwm->base + MCHPCOREPWM_PRESCALE);
+	period_steps = readb_relaxed(mchp_core_pwm->base + MCHPCOREPWM_PERIOD);
+
+	state->period = (period_steps + 1) * (prescale + 1);
+	state->period *= NSEC_PER_SEC;
+	state->period = DIV64_U64_ROUND_UP(state->period, rate);
+
+	posedge = readb_relaxed(mchp_core_pwm->base + MCHPCOREPWM_POSEDGE(pwm->hwpwm));
+	negedge = readb_relaxed(mchp_core_pwm->base + MCHPCOREPWM_NEGEDGE(pwm->hwpwm));
+
+	mutex_unlock(&mchp_core_pwm->lock);
+
+	if (negedge == posedge) {
+		state->duty_cycle = state->period;
+		state->period *= 2;
+	} else {
+		duty_steps = abs((s16)posedge - (s16)negedge);
+		state->duty_cycle = duty_steps * (prescale + 1) * NSEC_PER_SEC;
+		state->duty_cycle = DIV64_U64_ROUND_UP(state->duty_cycle, rate);
+	}
+
+	state->polarity = negedge < posedge ? PWM_POLARITY_INVERSED : PWM_POLARITY_NORMAL;
+
+	return 0;
+}
+
+static const struct pwm_ops mchp_core_pwm_ops = {
+	.apply = mchp_core_pwm_apply,
+	.get_state = mchp_core_pwm_get_state,
+	.owner = THIS_MODULE,
+};
+
+static const struct of_device_id mchp_core_of_match[] = {
+	{
+		.compatible = "microchip,corepwm-rtl-v4",
+	},
+	{ /* sentinel */ }
+};
+MODULE_DEVICE_TABLE(of, mchp_core_of_match);
+
+static int mchp_core_pwm_probe(struct platform_device *pdev)
+{
+	struct mchp_core_pwm_chip *mchp_core_pwm;
+	struct resource *regs;
+	int ret;
+
+	mchp_core_pwm = devm_kzalloc(&pdev->dev, sizeof(*mchp_core_pwm), GFP_KERNEL);
+	if (!mchp_core_pwm)
+		return -ENOMEM;
+
+	mchp_core_pwm->base = devm_platform_get_and_ioremap_resource(pdev, 0, &regs);
+	if (IS_ERR(mchp_core_pwm->base))
+		return PTR_ERR(mchp_core_pwm->base);
+
+	mchp_core_pwm->clk = devm_clk_get_enabled(&pdev->dev, NULL);
+	if (IS_ERR(mchp_core_pwm->clk))
+		return dev_err_probe(&pdev->dev, PTR_ERR(mchp_core_pwm->clk),
+				     "failed to get PWM clock\n");
+
+	if (of_property_read_u32(pdev->dev.of_node, "microchip,sync-update-mask",
+				 &mchp_core_pwm->sync_update_mask))
+		mchp_core_pwm->sync_update_mask = 0;
+
+	mutex_init(&mchp_core_pwm->lock);
+
+	mchp_core_pwm->chip.dev = &pdev->dev;
+	mchp_core_pwm->chip.ops = &mchp_core_pwm_ops;
+	mchp_core_pwm->chip.npwm = 16;
+
+	mchp_core_pwm->channel_enabled = readb_relaxed(mchp_core_pwm->base + MCHPCOREPWM_EN(0));
+	mchp_core_pwm->channel_enabled |=
+		readb_relaxed(mchp_core_pwm->base + MCHPCOREPWM_EN(1)) << 8;
+
+	/*
+	 * Enable synchronous update mode for all channels for which shadow
+	 * registers have been synthesised.
+	 */
+	writel_relaxed(1U, mchp_core_pwm->base + MCHPCOREPWM_SYNC_UPD);
+	mchp_core_pwm->update_timestamp = ktime_get();
+
+	ret = devm_pwmchip_add(&pdev->dev, &mchp_core_pwm->chip);
+	if (ret)
+		return dev_err_probe(&pdev->dev, ret, "Failed to add pwmchip\n");
+
+	return 0;
+}
+
+static struct platform_driver mchp_core_pwm_driver = {
+	.driver = {
+		.name = "mchp-core-pwm",
+		.of_match_table = mchp_core_of_match,
+	},
+	.probe = mchp_core_pwm_probe,
+};
+module_platform_driver(mchp_core_pwm_driver);
+
+MODULE_LICENSE("GPL");
+MODULE_AUTHOR("Conor Dooley <conor.dooley@microchip.com>");
+MODULE_DESCRIPTION("corePWM driver for Microchip FPGAs");
-- 
2.39.2


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [PATCH v16 2/2] MAINTAINERS: add pwm to PolarFire SoC entry
  2023-04-11  8:56 [PATCH v16 0/2] Microchip Soft IP corePWM driver Conor Dooley
  2023-04-11  8:56 ` [PATCH v16 1/2] pwm: add microchip soft ip " Conor Dooley
@ 2023-04-11  8:56 ` Conor Dooley
  1 sibling, 0 replies; 10+ messages in thread
From: Conor Dooley @ 2023-04-11  8:56 UTC (permalink / raw)
  To: Thierry Reding, Uwe Kleine-König
  Cc: Daire McNamara, linux-kernel, linux-pwm, linux-riscv, Conor Dooley

Add the newly introduced pwm driver to the existing PolarFire SoC entry.

Signed-off-by: Conor Dooley <conor.dooley@microchip.com>
---
 MAINTAINERS | 1 +
 1 file changed, 1 insertion(+)

diff --git a/MAINTAINERS b/MAINTAINERS
index 8d5bc223f305..128cc89a47d8 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -17993,6 +17993,7 @@ F:	drivers/clk/microchip/clk-mpfs.c
 F:	drivers/i2c/busses/i2c-microchip-corei2c.c
 F:	drivers/mailbox/mailbox-mpfs.c
 F:	drivers/pci/controller/pcie-microchip-host.c
+F:	drivers/pwm/pwm-microchip-core.c
 F:	drivers/reset/reset-mpfs.c
 F:	drivers/rtc/rtc-mpfs.c
 F:	drivers/soc/microchip/mpfs-sys-controller.c
-- 
2.39.2


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* Re: [PATCH v16 1/2] pwm: add microchip soft ip corePWM driver
  2023-04-11  8:56 ` [PATCH v16 1/2] pwm: add microchip soft ip " Conor Dooley
@ 2023-04-11 10:55   ` Uwe Kleine-König
  2023-04-11 13:56     ` Conor Dooley
  0 siblings, 1 reply; 10+ messages in thread
From: Uwe Kleine-König @ 2023-04-11 10:55 UTC (permalink / raw)
  To: Conor Dooley
  Cc: Thierry Reding, Daire McNamara, linux-kernel, linux-pwm, linux-riscv

[-- Attachment #1: Type: text/plain, Size: 13395 bytes --]

Hello Conor,

On Tue, Apr 11, 2023 at 09:56:34AM +0100, Conor Dooley wrote:
> Add a driver that supports the Microchip FPGA "soft" PWM IP core.
> 
> Signed-off-by: Conor Dooley <conor.dooley@microchip.com>
> ---
>  drivers/pwm/Kconfig              |  10 +
>  drivers/pwm/Makefile             |   1 +
>  drivers/pwm/pwm-microchip-core.c | 509 +++++++++++++++++++++++++++++++
>  3 files changed, 520 insertions(+)
>  create mode 100644 drivers/pwm/pwm-microchip-core.c
> 
> diff --git a/drivers/pwm/Kconfig b/drivers/pwm/Kconfig
> index dae023d783a2..f42756a014ed 100644
> --- a/drivers/pwm/Kconfig
> +++ b/drivers/pwm/Kconfig
> @@ -393,6 +393,16 @@ config PWM_MEDIATEK
>  	  To compile this driver as a module, choose M here: the module
>  	  will be called pwm-mediatek.
>  
> +config PWM_MICROCHIP_CORE
> +	tristate "Microchip corePWM PWM support"
> +	depends on SOC_MICROCHIP_POLARFIRE || COMPILE_TEST
> +	depends on HAS_IOMEM && OF
> +	help
> +	  PWM driver for Microchip FPGA soft IP core.
> +
> +	  To compile this driver as a module, choose M here: the module
> +	  will be called pwm-microchip-core.
> +
>  config PWM_MXS
>  	tristate "Freescale MXS PWM support"
>  	depends on ARCH_MXS || COMPILE_TEST
> diff --git a/drivers/pwm/Makefile b/drivers/pwm/Makefile
> index 7bf1a29f02b8..a65625359ece 100644
> --- a/drivers/pwm/Makefile
> +++ b/drivers/pwm/Makefile
> @@ -34,6 +34,7 @@ obj-$(CONFIG_PWM_LPSS_PCI)	+= pwm-lpss-pci.o
>  obj-$(CONFIG_PWM_LPSS_PLATFORM)	+= pwm-lpss-platform.o
>  obj-$(CONFIG_PWM_MESON)		+= pwm-meson.o
>  obj-$(CONFIG_PWM_MEDIATEK)	+= pwm-mediatek.o
> +obj-$(CONFIG_PWM_MICROCHIP_CORE)	+= pwm-microchip-core.o
>  obj-$(CONFIG_PWM_MTK_DISP)	+= pwm-mtk-disp.o
>  obj-$(CONFIG_PWM_MXS)		+= pwm-mxs.o
>  obj-$(CONFIG_PWM_NTXEC)		+= pwm-ntxec.o
> diff --git a/drivers/pwm/pwm-microchip-core.c b/drivers/pwm/pwm-microchip-core.c
> new file mode 100644
> index 000000000000..0a69ec376c51
> --- /dev/null
> +++ b/drivers/pwm/pwm-microchip-core.c
> @@ -0,0 +1,509 @@
> +// SPDX-License-Identifier: GPL-2.0
> +/*
> + * corePWM driver for Microchip "soft" FPGA IP cores.
> + *
> + * Copyright (c) 2021-2023 Microchip Corporation. All rights reserved.
> + * Author: Conor Dooley <conor.dooley@microchip.com>
> + * Documentation:
> + * https://www.microsemi.com/document-portal/doc_download/1245275-corepwm-hb
> + *
> + * Limitations:
> + * - If the IP block is configured without "shadow registers", all register
> + *   writes will take effect immediately, causing glitches on the output.
> + *   If shadow registers *are* enabled, a write to the "SYNC_UPDATE" register
> + *   notifies the core that it needs to update the registers defining the
> + *   waveform from the contents of the "shadow registers".

You only write once to the sync update register (i.e. during probe). So
that register specifies that a period should be completed before a new
setting becomes active, right? Even with sync update this is still racy,
right?

> + * - The IP block has no concept of a duty cycle, only rising/falling edges of
> + *   the waveform. Unfortunately, if the rising & falling edges registers have
> + *   the same value written to them the IP block will do whichever of a rising
> + *   or a falling edge is possible. I.E. a 50% waveform at twice the requested
> + *   period. Therefore to get a 0% waveform, the output is set the max high/low
> + *   time depending on polarity.
> + *   If the duty cycle is 0%, and the requested period is less than the
> + *   available period resolution, this will manifest as a ~100% waveform (with
> + *   some output glitches) rather than 50%.

The last paragraph refers to negedge = 0, posedge = 0 and period_steps =
0?

> + * - The PWM period is set for the whole IP block not per channel. The driver
> + *   will only change the period if no other PWM output is enabled.
> + */

> +static void mchp_core_pwm_enable(struct pwm_chip *chip, struct pwm_device *pwm,
> +				 bool enable, u64 period)
> +{
> +	struct mchp_core_pwm_chip *mchp_core_pwm = to_mchp_core_pwm(chip);
> +	u8 channel_enable, reg_offset, shift;
> +
> +	/*
> +	 * There are two adjacent 8 bit control regs, the lower reg controls
> +	 * 0-7 and the upper reg 8-15. Check if the pwm is in the upper reg
> +	 * and if so, offset by the bus width.
> +	 */
> +	reg_offset = MCHPCOREPWM_EN(pwm->hwpwm >> 3);
> +	shift = pwm->hwpwm & 7;
> +
> +	channel_enable = readb_relaxed(mchp_core_pwm->base + reg_offset);
> +	channel_enable &= ~(1 << shift);
> +	channel_enable |= (enable << shift);
> +
> +	writel_relaxed(channel_enable, mchp_core_pwm->base + reg_offset);
> +	mchp_core_pwm->channel_enabled &= ~BIT(pwm->hwpwm);
> +	mchp_core_pwm->channel_enabled |= enable << pwm->hwpwm;
> +
> +	/*
> +	 * Notify the block to update the waveform from the shadow registers.
> +	 * The updated values will not appear on the bus until they have been
> +	 * applied to the waveform at the beginning of the next period.
> +	 * This is a NO-OP if the channel does not have shadow registers.
> +	 */

The code doesn't match the comment. I think that is a relict from the
times when we thought that a trigger was necessary to update the
operating settings from the shadow registers?

> +	if (mchp_core_pwm->sync_update_mask & (1 << pwm->hwpwm))
> +		mchp_core_pwm->update_timestamp = ktime_add_ns(ktime_get(), period);
> +}
> +
> +static void mchp_core_pwm_wait_for_sync_update(struct mchp_core_pwm_chip *mchp_core_pwm,
> +					       unsigned int channel)
> +{
> +	/*
> +	 * If a shadow register is used for this PWM channel, and iff there is
> +	 * a pending update to the waveform, we must wait for it to be applied
> +	 * before attempting to read its state. Reading the registers yields
> +	 * the currently implemented settings & the new ones are only readable
> +	 * once the current period has ended.
> +	 */
> +
> +	if (mchp_core_pwm->sync_update_mask & (1 << channel)) {
> +		ktime_t current_time = ktime_get();
> +		s64 remaining_ns;
> +		u32 delay_us;
> +
> +		remaining_ns = ktime_to_ns(ktime_sub(mchp_core_pwm->update_timestamp,
> +						     current_time));
> +
> +		/*
> +		 * If the update has gone through, don't bother waiting for
> +		 * obvious reasons. Otherwise wait around for an appropriate
> +		 * amount of time for the update to go through.
> +		 */
> +		if (remaining_ns <= 0)
> +			return;
> +
> +		delay_us = DIV_ROUND_UP_ULL(remaining_ns, NSEC_PER_USEC);
> +		fsleep(delay_us);
> +	}

There is no way to query the hardware if there is still an update
pending, right? Maybe that's possible implicitly by memoizing the
expected read value? For me the current approach is fine enough though.
This can be addressed in the future if needed.

> +static u64 mchp_core_pwm_calc_duty(const struct pwm_state *state, u64 clk_rate,
> +				   u8 prescale, u8 period_steps)
> +{
> +	u64 duty_steps, tmp;
> +
> +	/*
> +	 * Calculate the duty cycle in multiples of the prescaled period:
> +	 * duty_steps = duty_in_ns / step_in_ns
> +	 * step_in_ns = (prescale * NSEC_PER_SEC) / clk_rate
> +	 * The code below is rearranged slightly to only divide once.
> +	 */
> +	tmp = (prescale + 1) * NSEC_PER_SEC;
> +	duty_steps = mul_u64_u64_div_u64(state->duty_cycle, clk_rate, tmp);
> +
> +	return duty_steps;
> +}
> +
> +static void mchp_core_pwm_apply_duty(struct pwm_chip *chip, struct pwm_device *pwm,
> +				     const struct pwm_state *state, u64 duty_steps,
> +				     u16 period_steps)
> +{
> +	struct mchp_core_pwm_chip *mchp_core_pwm = to_mchp_core_pwm(chip);
> +	u8 posedge, negedge;
> +	u8 first_edge = 0, second_edge = duty_steps;
> +
> +	/*
> +	 * Setting posedge == negedge doesn't yield a constant output,
> +	 * so that's an unsuitable setting to model duty_steps = 0.
> +	 * In that case set the unwanted edge to a value that never
> +	 * triggers.
> +	 */
> +	if (duty_steps == 0)
> +		first_edge = period_steps + 1;
> +
> +	if (state->polarity == PWM_POLARITY_INVERSED) {
> +		negedge = first_edge;
> +		posedge = second_edge;
> +	} else {
> +		posedge = first_edge;
> +		negedge = second_edge;
> +	}
> +
> +	writel_relaxed(posedge, mchp_core_pwm->base + MCHPCOREPWM_POSEDGE(pwm->hwpwm));
> +	writel_relaxed(negedge, mchp_core_pwm->base + MCHPCOREPWM_NEGEDGE(pwm->hwpwm));

Is this racy with sync update implemented in the firmware? A comment
about how the sync update is implemented would be good.

> +}
> +
> +static int mchp_core_pwm_calc_period(const struct pwm_state *state, unsigned long clk_rate,
> +				     u16 *prescale, u16 *period_steps)
> +{
> +	u64 tmp;
> +	u32 remainder;
> +
> +	/*
> +	 * Calculate the period cycles and prescale values.
> +	 * The registers are each 8 bits wide & multiplied to compute the period
> +	 * using the formula:
> +	 *           (prescale + 1) * (period_steps + 1)
> +	 * period = -------------------------------------
> +	 *                      clk_rate
> +	 * so the maximum period that can be generated is 0x10000 times the
> +	 * period of the input clock.
> +	 * However, due to the design of the "hardware", it is not possible to
> +	 * attain a 100% duty cycle if the full range of period_steps is used.
> +	 * Therefore period_steps is restricted to 0xfe and the maximum multiple
> +	 * of the clock period attainable is (0xff + 1) * (0xfe + 1) = 0xff00
> +	 *
> +	 * The prescale and period_steps registers operate similarly to
> +	 * CLK_DIVIDER_ONE_BASED, where the value used by the hardware is that
> +	 * in the register plus one.
> +	 * It's therefore not possible to set a period lower than 1/clk_rate, so
> +	 * if tmp is 0, abort. Without aborting, we will set a period that is
> +	 * greater than that requested and, more importantly, will trigger the
> +	 * neg-/pos-edge issue described in the limitations.
> +	 */
> +	tmp = mul_u64_u64_div_u64(state->period, clk_rate, NSEC_PER_SEC);
> +	if (!tmp)
> +		return -EINVAL;
> +
> +	if (tmp >= MCHPCOREPWM_PERIOD_MAX) {
> +		*prescale = MCHPCOREPWM_PRESCALE_MAX;
> +		*period_steps = MCHPCOREPWM_PERIOD_STEPS_MAX;
> +
> +		return 0;
> +	}
> +
> +	/*
> +	 * There are multiple strategies that could be used to choose the
> +	 * prescale & period_steps values.
> +	 * Here the idea is to pick values so that the selection of duty cycles
> +	 * is as finegrain as possible.
> +	 * This "optimal" value for prescale can be calculated using the maximum
> +	 * permitted value of period_steps, 0xfe.
> +	 *
> +	 *                period * clk_rate
> +	 * prescale = ------------------------- - 1
> +	 *            NSEC_PER_SEC * (0xfe + 1)
> +	 *
> +	 * However, we are purely interested in the integer upper bound of this
> +	 * calculation, so this division should be rounded up before subtracting
> +	 * 1
> +	 *
> +	 *  period * clk_rate
> +	 * ------------------- was precomputed as `tmp`
> +	 *    NSEC_PER_SEC
> +	 */
> +	*prescale = DIV64_U64_ROUND_UP(tmp, MCHPCOREPWM_PERIOD_STEPS_MAX + 1) - 1;

If state->period * clk_rate is 765000000001 you get tmp = 765 and then
*prescale = 2. However roundup(765000000001 / (1000000000 * 255)) - 1 is
3. The problem here is that you're rounding down in the calculation of
tmp. Of course this is constructed because 765000000001 is prime, but
I'm sure you get the point :-)

Also we know that tmp is < 0xff00, so we don't need a 64 bit division
here.

> +	/*
> +	 * Because 0xff is not a permitted value some error will seep into the
> +	 * calculation of prescale as prescale grows. Specifically, this error
> +	 * occurs where the remainder of the prescale calculation is less than
> +	 * prescale.
> +	 * For small values of prescale, only a handful of values will need
> +	 * correction, but overall this applies to almost half of the valid
> +	 * values for tmp.
> +	 *
> +	 * To keep the algorithm's decision making consistent, this case is
> +	 * checked for and the simple solution is to, in these cases,
> +	 * decrement prescale and check that the resulting value of period_steps
> +	 * is valid.
> +	 *
> +	 * period_steps can be computed from prescale:
> +	 *                      period * clk_rate
> +	 * period_steps = ----------------------------- - 1
> +	 *                NSEC_PER_SEC * (prescale + 1)
> +	 *
> +	 */
> +	div_u64_rem(tmp, (MCHPCOREPWM_PERIOD_STEPS_MAX + 1), &remainder);
> +	if (remainder < *prescale) {
> +		u16 smaller_prescale = *prescale - 1;
> +
> +		*period_steps = div_u64(tmp, smaller_prescale + 1) - 1;
> +		if (*period_steps < 255) {
> +			*prescale = smaller_prescale;
> +
> +			return 0;
> +		}
> +	}

I don't understand that part. It triggers for tmp = 511. So you prefer

	prescale = 1
	period_steps = 254

yielding period = 510 / clkrate over

	prescale = 2
	period_steps = 170

yielding 513 / clkrate. I wonder why. Alsot tmp = 511 is the only value
where this triggers. There is a mistake somewhere (maybe on my side).

> +	*period_steps = div_u64(tmp, *prescale + 1);
> +	if (*period_steps)
> +		*period_steps -= 1;
> +
> +	return 0;
> +}

Best regards
Uwe

-- 
Pengutronix e.K.                           | Uwe Kleine-König            |
Industrial Linux Solutions                 | https://www.pengutronix.de/ |

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH v16 1/2] pwm: add microchip soft ip corePWM driver
  2023-04-11 10:55   ` Uwe Kleine-König
@ 2023-04-11 13:56     ` Conor Dooley
  2023-04-11 16:25       ` Uwe Kleine-König
  0 siblings, 1 reply; 10+ messages in thread
From: Conor Dooley @ 2023-04-11 13:56 UTC (permalink / raw)
  To: Uwe Kleine-König
  Cc: Thierry Reding, Daire McNamara, linux-kernel, linux-pwm, linux-riscv

[-- Attachment #1: Type: text/plain, Size: 16105 bytes --]

Hey Uwe,

On Tue, Apr 11, 2023 at 12:55:47PM +0200, Uwe Kleine-König wrote:
> On Tue, Apr 11, 2023 at 09:56:34AM +0100, Conor Dooley wrote:
> > Add a driver that supports the Microchip FPGA "soft" PWM IP core.
> > 
> > Signed-off-by: Conor Dooley <conor.dooley@microchip.com>
> > ---
> >  drivers/pwm/Kconfig              |  10 +
> >  drivers/pwm/Makefile             |   1 +
> >  drivers/pwm/pwm-microchip-core.c | 509 +++++++++++++++++++++++++++++++
> >  3 files changed, 520 insertions(+)
> >  create mode 100644 drivers/pwm/pwm-microchip-core.c
> > 
> > diff --git a/drivers/pwm/Kconfig b/drivers/pwm/Kconfig
> > index dae023d783a2..f42756a014ed 100644
> > --- a/drivers/pwm/Kconfig
> > +++ b/drivers/pwm/Kconfig
> > @@ -393,6 +393,16 @@ config PWM_MEDIATEK
> >  	  To compile this driver as a module, choose M here: the module
> >  	  will be called pwm-mediatek.
> >  
> > +config PWM_MICROCHIP_CORE
> > +	tristate "Microchip corePWM PWM support"
> > +	depends on SOC_MICROCHIP_POLARFIRE || COMPILE_TEST
> > +	depends on HAS_IOMEM && OF
> > +	help
> > +	  PWM driver for Microchip FPGA soft IP core.
> > +
> > +	  To compile this driver as a module, choose M here: the module
> > +	  will be called pwm-microchip-core.
> > +
> >  config PWM_MXS
> >  	tristate "Freescale MXS PWM support"
> >  	depends on ARCH_MXS || COMPILE_TEST
> > diff --git a/drivers/pwm/Makefile b/drivers/pwm/Makefile
> > index 7bf1a29f02b8..a65625359ece 100644
> > --- a/drivers/pwm/Makefile
> > +++ b/drivers/pwm/Makefile
> > @@ -34,6 +34,7 @@ obj-$(CONFIG_PWM_LPSS_PCI)	+= pwm-lpss-pci.o
> >  obj-$(CONFIG_PWM_LPSS_PLATFORM)	+= pwm-lpss-platform.o
> >  obj-$(CONFIG_PWM_MESON)		+= pwm-meson.o
> >  obj-$(CONFIG_PWM_MEDIATEK)	+= pwm-mediatek.o
> > +obj-$(CONFIG_PWM_MICROCHIP_CORE)	+= pwm-microchip-core.o
> >  obj-$(CONFIG_PWM_MTK_DISP)	+= pwm-mtk-disp.o
> >  obj-$(CONFIG_PWM_MXS)		+= pwm-mxs.o
> >  obj-$(CONFIG_PWM_NTXEC)		+= pwm-ntxec.o
> > diff --git a/drivers/pwm/pwm-microchip-core.c b/drivers/pwm/pwm-microchip-core.c
> > new file mode 100644
> > index 000000000000..0a69ec376c51
> > --- /dev/null
> > +++ b/drivers/pwm/pwm-microchip-core.c
> > @@ -0,0 +1,509 @@
> > +// SPDX-License-Identifier: GPL-2.0
> > +/*
> > + * corePWM driver for Microchip "soft" FPGA IP cores.
> > + *
> > + * Copyright (c) 2021-2023 Microchip Corporation. All rights reserved.
> > + * Author: Conor Dooley <conor.dooley@microchip.com>
> > + * Documentation:
> > + * https://www.microsemi.com/document-portal/doc_download/1245275-corepwm-hb
> > + *
> > + * Limitations:
> > + * - If the IP block is configured without "shadow registers", all register
> > + *   writes will take effect immediately, causing glitches on the output.
> > + *   If shadow registers *are* enabled, a write to the "SYNC_UPDATE" register
> > + *   notifies the core that it needs to update the registers defining the
> > + *   waveform from the contents of the "shadow registers".
> 
> You only write once to the sync update register (i.e. during probe). So
> that register specifies that a period should be completed before a new
> setting becomes active, right?

Correct.

> Even with sync update this is still racy, right?

I assume the period ticking over as we are updating the values is your
concern here. I'm not sure that there's all that much we can do about
that, so I guess I shall update the comment.
Perhaps writing out period_steps and prescale should be done after
computing the new duty cycle to reduce the possible window since that
may require an expensive division on a 32-bit arch?

> > + * - The IP block has no concept of a duty cycle, only rising/falling edges of
> > + *   the waveform. Unfortunately, if the rising & falling edges registers have
> > + *   the same value written to them the IP block will do whichever of a rising
> > + *   or a falling edge is possible. I.E. a 50% waveform at twice the requested
> > + *   period. Therefore to get a 0% waveform, the output is set the max high/low
> > + *   time depending on polarity.
> > + *   If the duty cycle is 0%, and the requested period is less than the
> > + *   available period resolution, this will manifest as a ~100% waveform (with
> > + *   some output glitches) rather than 50%.
> 
> The last paragraph refers to negedge = 0, posedge = 0 and period_steps =
> 0?

Yes. Although, I did some poking around with it just now & that actually
only happens if prescale is also 0.
If it is non-zero, get to see some other "interesting behaviour" where
the period becomes gigantic - for example @ prescale = 0x3, the period
becomes about a quarter of a second w/ a 50% duty cycle. clk_rate is
62.5 MHz. I'd need to dig out the RTL to justify that one!

I've just gone and made apply() return -EINVAL for this, which the
subsystem does for requests of zero periods.

> > + * - The PWM period is set for the whole IP block not per channel. The driver
> > + *   will only change the period if no other PWM output is enabled.
> > + */
> 
> > +static void mchp_core_pwm_enable(struct pwm_chip *chip, struct pwm_device *pwm,
> > +				 bool enable, u64 period)
> > +{
> > +	struct mchp_core_pwm_chip *mchp_core_pwm = to_mchp_core_pwm(chip);
> > +	u8 channel_enable, reg_offset, shift;
> > +
> > +	/*
> > +	 * There are two adjacent 8 bit control regs, the lower reg controls
> > +	 * 0-7 and the upper reg 8-15. Check if the pwm is in the upper reg
> > +	 * and if so, offset by the bus width.
> > +	 */
> > +	reg_offset = MCHPCOREPWM_EN(pwm->hwpwm >> 3);
> > +	shift = pwm->hwpwm & 7;
> > +
> > +	channel_enable = readb_relaxed(mchp_core_pwm->base + reg_offset);
> > +	channel_enable &= ~(1 << shift);
> > +	channel_enable |= (enable << shift);
> > +
> > +	writel_relaxed(channel_enable, mchp_core_pwm->base + reg_offset);
> > +	mchp_core_pwm->channel_enabled &= ~BIT(pwm->hwpwm);
> > +	mchp_core_pwm->channel_enabled |= enable << pwm->hwpwm;
> > +
> > +	/*
> > +	 * Notify the block to update the waveform from the shadow registers.
> > +	 * The updated values will not appear on the bus until they have been
> > +	 * applied to the waveform at the beginning of the next period.
> > +	 * This is a NO-OP if the channel does not have shadow registers.
> > +	 */
> 
> The code doesn't match the comment. I think that is a relict from the
> times when we thought that a trigger was necessary to update the
> operating settings from the shadow registers?

Yeah, I read this back to myself before sending v15 & thought that it
didn't need to be changed. I think removing the first line should go.

> 
> > +	if (mchp_core_pwm->sync_update_mask & (1 << pwm->hwpwm))
> > +		mchp_core_pwm->update_timestamp = ktime_add_ns(ktime_get(), period);
> > +}
> > +
> > +static void mchp_core_pwm_wait_for_sync_update(struct mchp_core_pwm_chip *mchp_core_pwm,
> > +					       unsigned int channel)
> > +{
> > +	/*
> > +	 * If a shadow register is used for this PWM channel, and iff there is
> > +	 * a pending update to the waveform, we must wait for it to be applied
> > +	 * before attempting to read its state. Reading the registers yields
> > +	 * the currently implemented settings & the new ones are only readable
> > +	 * once the current period has ended.
> > +	 */
> > +
> > +	if (mchp_core_pwm->sync_update_mask & (1 << channel)) {
> > +		ktime_t current_time = ktime_get();
> > +		s64 remaining_ns;
> > +		u32 delay_us;
> > +
> > +		remaining_ns = ktime_to_ns(ktime_sub(mchp_core_pwm->update_timestamp,
> > +						     current_time));
> > +
> > +		/*
> > +		 * If the update has gone through, don't bother waiting for
> > +		 * obvious reasons. Otherwise wait around for an appropriate
> > +		 * amount of time for the update to go through.
> > +		 */
> > +		if (remaining_ns <= 0)
> > +			return;
> > +
> > +		delay_us = DIV_ROUND_UP_ULL(remaining_ns, NSEC_PER_USEC);
> > +		fsleep(delay_us);
> > +	}
> 
> There is no way to query the hardware if there is still an update
> pending, right?

Hah, no. This IP is about as old as I am & appears to have been written
with keeping the FPGA utilisation % to a minimum. No such luxuries!

> Maybe that's possible implicitly by memoizing the
> expected read value? For me the current approach is fine enough though.
> This can be addressed in the future if needed.
> 
> > +static u64 mchp_core_pwm_calc_duty(const struct pwm_state *state, u64 clk_rate,
> > +				   u8 prescale, u8 period_steps)
> > +{
> > +	u64 duty_steps, tmp;
> > +
> > +	/*
> > +	 * Calculate the duty cycle in multiples of the prescaled period:
> > +	 * duty_steps = duty_in_ns / step_in_ns
> > +	 * step_in_ns = (prescale * NSEC_PER_SEC) / clk_rate
> > +	 * The code below is rearranged slightly to only divide once.
> > +	 */
> > +	tmp = (prescale + 1) * NSEC_PER_SEC;
> > +	duty_steps = mul_u64_u64_div_u64(state->duty_cycle, clk_rate, tmp);
> > +
> > +	return duty_steps;
> > +}
> > +
> > +static void mchp_core_pwm_apply_duty(struct pwm_chip *chip, struct pwm_device *pwm,
> > +				     const struct pwm_state *state, u64 duty_steps,
> > +				     u16 period_steps)
> > +{
> > +	struct mchp_core_pwm_chip *mchp_core_pwm = to_mchp_core_pwm(chip);
> > +	u8 posedge, negedge;
> > +	u8 first_edge = 0, second_edge = duty_steps;
> > +
> > +	/*
> > +	 * Setting posedge == negedge doesn't yield a constant output,
> > +	 * so that's an unsuitable setting to model duty_steps = 0.
> > +	 * In that case set the unwanted edge to a value that never
> > +	 * triggers.
> > +	 */
> > +	if (duty_steps == 0)
> > +		first_edge = period_steps + 1;
> > +
> > +	if (state->polarity == PWM_POLARITY_INVERSED) {
> > +		negedge = first_edge;
> > +		posedge = second_edge;
> > +	} else {
> > +		posedge = first_edge;
> > +		negedge = second_edge;
> > +	}
> > +
> > +	writel_relaxed(posedge, mchp_core_pwm->base + MCHPCOREPWM_POSEDGE(pwm->hwpwm));
> > +	writel_relaxed(negedge, mchp_core_pwm->base + MCHPCOREPWM_NEGEDGE(pwm->hwpwm));
> 
> Is this racy with sync update implemented in the firmware? A comment
> about how the sync update is implemented would be good.

Unless this is a different fear of racing, see above.

> > +}
> > +
> > +static int mchp_core_pwm_calc_period(const struct pwm_state *state, unsigned long clk_rate,
> > +				     u16 *prescale, u16 *period_steps)
> > +{
> > +	u64 tmp;
> > +	u32 remainder;
> > +
> > +	/*
> > +	 * Calculate the period cycles and prescale values.
> > +	 * The registers are each 8 bits wide & multiplied to compute the period
> > +	 * using the formula:
> > +	 *           (prescale + 1) * (period_steps + 1)
> > +	 * period = -------------------------------------
> > +	 *                      clk_rate
> > +	 * so the maximum period that can be generated is 0x10000 times the
> > +	 * period of the input clock.
> > +	 * However, due to the design of the "hardware", it is not possible to
> > +	 * attain a 100% duty cycle if the full range of period_steps is used.
> > +	 * Therefore period_steps is restricted to 0xfe and the maximum multiple
> > +	 * of the clock period attainable is (0xff + 1) * (0xfe + 1) = 0xff00
> > +	 *
> > +	 * The prescale and period_steps registers operate similarly to
> > +	 * CLK_DIVIDER_ONE_BASED, where the value used by the hardware is that
> > +	 * in the register plus one.
> > +	 * It's therefore not possible to set a period lower than 1/clk_rate, so
> > +	 * if tmp is 0, abort. Without aborting, we will set a period that is
> > +	 * greater than that requested and, more importantly, will trigger the
> > +	 * neg-/pos-edge issue described in the limitations.
> > +	 */
> > +	tmp = mul_u64_u64_div_u64(state->period, clk_rate, NSEC_PER_SEC);
> > +	if (!tmp)
> > +		return -EINVAL;
> > +
> > +	if (tmp >= MCHPCOREPWM_PERIOD_MAX) {
> > +		*prescale = MCHPCOREPWM_PRESCALE_MAX;
> > +		*period_steps = MCHPCOREPWM_PERIOD_STEPS_MAX;
> > +
> > +		return 0;
> > +	}
> > +
> > +	/*
> > +	 * There are multiple strategies that could be used to choose the
> > +	 * prescale & period_steps values.
> > +	 * Here the idea is to pick values so that the selection of duty cycles
> > +	 * is as finegrain as possible.
> > +	 * This "optimal" value for prescale can be calculated using the maximum
> > +	 * permitted value of period_steps, 0xfe.
> > +	 *
> > +	 *                period * clk_rate
> > +	 * prescale = ------------------------- - 1
> > +	 *            NSEC_PER_SEC * (0xfe + 1)
> > +	 *
> > +	 * However, we are purely interested in the integer upper bound of this
> > +	 * calculation, so this division should be rounded up before subtracting
> > +	 * 1
> > +	 *
> > +	 *  period * clk_rate
> > +	 * ------------------- was precomputed as `tmp`
> > +	 *    NSEC_PER_SEC
> > +	 */
> > +	*prescale = DIV64_U64_ROUND_UP(tmp, MCHPCOREPWM_PERIOD_STEPS_MAX + 1) - 1;
> 
> If state->period * clk_rate is 765000000001 you get tmp = 765 and then
> *prescale = 2. However roundup(765000000001 / (1000000000 * 255)) - 1 is
> 3. The problem here is that you're rounding down in the calculation of
> tmp. Of course this is constructed because 765000000001 is prime, but
> I'm sure you get the point :-)

Hold that thought for a moment..

> Also we know that tmp is < 0xff00, so we don't need a 64 bit division
> here.

Neither here nor below, true.

> > +	/*
> > +	 * Because 0xff is not a permitted value some error will seep into the
> > +	 * calculation of prescale as prescale grows. Specifically, this error
> > +	 * occurs where the remainder of the prescale calculation is less than
> > +	 * prescale.
> > +	 * For small values of prescale, only a handful of values will need
> > +	 * correction, but overall this applies to almost half of the valid
> > +	 * values for tmp.
> > +	 *
> > +	 * To keep the algorithm's decision making consistent, this case is
> > +	 * checked for and the simple solution is to, in these cases,
> > +	 * decrement prescale and check that the resulting value of period_steps
> > +	 * is valid.
> > +	 *
> > +	 * period_steps can be computed from prescale:
> > +	 *                      period * clk_rate
> > +	 * period_steps = ----------------------------- - 1
> > +	 *                NSEC_PER_SEC * (prescale + 1)
> > +	 *
> > +	 */
> > +	div_u64_rem(tmp, (MCHPCOREPWM_PERIOD_STEPS_MAX + 1), &remainder);
> > +	if (remainder < *prescale) {
> > +		u16 smaller_prescale = *prescale - 1;
> > +
> > +		*period_steps = div_u64(tmp, smaller_prescale + 1) - 1;
> > +		if (*period_steps < 255) {
> > +			*prescale = smaller_prescale;
> > +
> > +			return 0;
> > +		}
> > +	}

...so in your prime case above, we would initially compute a prescale
value that is too large, and then wind up hitting the test of the
remainder here, thereby realising that the smaller prescale value is a
better fit?
Perhaps that's not an acceptable way to handle the issue though.

> I don't understand that part. It triggers for tmp = 511. So you prefer
> 
> 	prescale = 1
> 	period_steps = 254
> 
> yielding period = 510 / clkrate over
> 
> 	prescale = 2
> 	period_steps = 170
> 
> yielding 513 / clkrate. I wonder why.

Because 513 > 511 & 254 > 170!
Is the aim not to produce a period that is less than or equal to that
requested? The aim of this driver is to pick a prescale/period_steps
combo that satisfies that constraint, while also trying to maximise the
"finegrainness" of the duty cycle.
The latter should be stated in a comment above.


> Alsot tmp = 511 is the only value
> where this triggers. There is a mistake somewhere (maybe on my side).

It should trigger for any value 255 * n < x < 256 * n, no?
Say for tmp of 767:
*prescale = DIV64_U64_ROUND_UP(767, 254 + 1) - 1 = DIV64_U64_ROUND_UP(3.00784...) - 1 = 3
remainder = 0.00784.. * (254 + 1) = 2

Am I going nuts? Wouldn't be the first time that I've made a hames of
things here, there are 16 versions for a reason after all.

Cheers,
Conor.

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 228 bytes --]

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH v16 1/2] pwm: add microchip soft ip corePWM driver
  2023-04-11 13:56     ` Conor Dooley
@ 2023-04-11 16:25       ` Uwe Kleine-König
  2023-04-18 11:27         ` Conor Dooley
  0 siblings, 1 reply; 10+ messages in thread
From: Uwe Kleine-König @ 2023-04-11 16:25 UTC (permalink / raw)
  To: Conor Dooley
  Cc: Thierry Reding, Daire McNamara, linux-kernel, linux-pwm, linux-riscv

[-- Attachment #1: Type: text/plain, Size: 17350 bytes --]

Hello Conor,

On Tue, Apr 11, 2023 at 02:56:15PM +0100, Conor Dooley wrote:
> On Tue, Apr 11, 2023 at 12:55:47PM +0200, Uwe Kleine-König wrote:
> > On Tue, Apr 11, 2023 at 09:56:34AM +0100, Conor Dooley wrote:
> > > Add a driver that supports the Microchip FPGA "soft" PWM IP core.
> > > 
> > > Signed-off-by: Conor Dooley <conor.dooley@microchip.com>
> > > ---
> > >  drivers/pwm/Kconfig              |  10 +
> > >  drivers/pwm/Makefile             |   1 +
> > >  drivers/pwm/pwm-microchip-core.c | 509 +++++++++++++++++++++++++++++++
> > >  3 files changed, 520 insertions(+)
> > >  create mode 100644 drivers/pwm/pwm-microchip-core.c
> > > 
> > > diff --git a/drivers/pwm/Kconfig b/drivers/pwm/Kconfig
> > > index dae023d783a2..f42756a014ed 100644
> > > --- a/drivers/pwm/Kconfig
> > > +++ b/drivers/pwm/Kconfig
> > > @@ -393,6 +393,16 @@ config PWM_MEDIATEK
> > >  	  To compile this driver as a module, choose M here: the module
> > >  	  will be called pwm-mediatek.
> > >  
> > > +config PWM_MICROCHIP_CORE
> > > +	tristate "Microchip corePWM PWM support"
> > > +	depends on SOC_MICROCHIP_POLARFIRE || COMPILE_TEST
> > > +	depends on HAS_IOMEM && OF
> > > +	help
> > > +	  PWM driver for Microchip FPGA soft IP core.
> > > +
> > > +	  To compile this driver as a module, choose M here: the module
> > > +	  will be called pwm-microchip-core.
> > > +
> > >  config PWM_MXS
> > >  	tristate "Freescale MXS PWM support"
> > >  	depends on ARCH_MXS || COMPILE_TEST
> > > diff --git a/drivers/pwm/Makefile b/drivers/pwm/Makefile
> > > index 7bf1a29f02b8..a65625359ece 100644
> > > --- a/drivers/pwm/Makefile
> > > +++ b/drivers/pwm/Makefile
> > > @@ -34,6 +34,7 @@ obj-$(CONFIG_PWM_LPSS_PCI)	+= pwm-lpss-pci.o
> > >  obj-$(CONFIG_PWM_LPSS_PLATFORM)	+= pwm-lpss-platform.o
> > >  obj-$(CONFIG_PWM_MESON)		+= pwm-meson.o
> > >  obj-$(CONFIG_PWM_MEDIATEK)	+= pwm-mediatek.o
> > > +obj-$(CONFIG_PWM_MICROCHIP_CORE)	+= pwm-microchip-core.o
> > >  obj-$(CONFIG_PWM_MTK_DISP)	+= pwm-mtk-disp.o
> > >  obj-$(CONFIG_PWM_MXS)		+= pwm-mxs.o
> > >  obj-$(CONFIG_PWM_NTXEC)		+= pwm-ntxec.o
> > > diff --git a/drivers/pwm/pwm-microchip-core.c b/drivers/pwm/pwm-microchip-core.c
> > > new file mode 100644
> > > index 000000000000..0a69ec376c51
> > > --- /dev/null
> > > +++ b/drivers/pwm/pwm-microchip-core.c
> > > @@ -0,0 +1,509 @@
> > > +// SPDX-License-Identifier: GPL-2.0
> > > +/*
> > > + * corePWM driver for Microchip "soft" FPGA IP cores.
> > > + *
> > > + * Copyright (c) 2021-2023 Microchip Corporation. All rights reserved.
> > > + * Author: Conor Dooley <conor.dooley@microchip.com>
> > > + * Documentation:
> > > + * https://www.microsemi.com/document-portal/doc_download/1245275-corepwm-hb
> > > + *
> > > + * Limitations:
> > > + * - If the IP block is configured without "shadow registers", all register
> > > + *   writes will take effect immediately, causing glitches on the output.
> > > + *   If shadow registers *are* enabled, a write to the "SYNC_UPDATE" register
> > > + *   notifies the core that it needs to update the registers defining the
> > > + *   waveform from the contents of the "shadow registers".
> > 
> > You only write once to the sync update register (i.e. during probe). So
> > that register specifies that a period should be completed before a new
> > setting becomes active, right?
> 
> Correct.
> 
> > Even with sync update this is still racy, right?
> 
> I assume the period ticking over as we are updating the values is your
> concern here. I'm not sure that there's all that much we can do about
> that, so I guess I shall update the comment.

Yes, that's what I had in mind.

> Perhaps writing out period_steps and prescale should be done after
> computing the new duty cycle to reduce the possible window since that
> may require an expensive division on a 32-bit arch?

*nod*

> > > + * - The IP block has no concept of a duty cycle, only rising/falling edges of
> > > + *   the waveform. Unfortunately, if the rising & falling edges registers have
> > > + *   the same value written to them the IP block will do whichever of a rising
> > > + *   or a falling edge is possible. I.E. a 50% waveform at twice the requested
> > > + *   period. Therefore to get a 0% waveform, the output is set the max high/low
> > > + *   time depending on polarity.
> > > + *   If the duty cycle is 0%, and the requested period is less than the
> > > + *   available period resolution, this will manifest as a ~100% waveform (with
> > > + *   some output glitches) rather than 50%.
> > 
> > The last paragraph refers to negedge = 0, posedge = 0 and period_steps =
> > 0?
> 
> Yes. Although, I did some poking around with it just now & that actually
> only happens if prescale is also 0.
> If it is non-zero, get to see some other "interesting behaviour" where
> the period becomes gigantic - for example @ prescale = 0x3, the period
> becomes about a quarter of a second w/ a 50% duty cycle. clk_rate is
> 62.5 MHz. I'd need to dig out the RTL to justify that one!
> 
> I've just gone and made apply() return -EINVAL for this, which the
> subsystem does for requests of zero periods.

Sounds right.

> > > + * - The PWM period is set for the whole IP block not per channel. The driver
> > > + *   will only change the period if no other PWM output is enabled.
> > > + */
> > 
> > > +static void mchp_core_pwm_enable(struct pwm_chip *chip, struct pwm_device *pwm,
> > > +				 bool enable, u64 period)
> > > +{
> > > +	struct mchp_core_pwm_chip *mchp_core_pwm = to_mchp_core_pwm(chip);
> > > +	u8 channel_enable, reg_offset, shift;
> > > +
> > > +	/*
> > > +	 * There are two adjacent 8 bit control regs, the lower reg controls
> > > +	 * 0-7 and the upper reg 8-15. Check if the pwm is in the upper reg
> > > +	 * and if so, offset by the bus width.
> > > +	 */
> > > +	reg_offset = MCHPCOREPWM_EN(pwm->hwpwm >> 3);
> > > +	shift = pwm->hwpwm & 7;
> > > +
> > > +	channel_enable = readb_relaxed(mchp_core_pwm->base + reg_offset);
> > > +	channel_enable &= ~(1 << shift);
> > > +	channel_enable |= (enable << shift);
> > > +
> > > +	writel_relaxed(channel_enable, mchp_core_pwm->base + reg_offset);
> > > +	mchp_core_pwm->channel_enabled &= ~BIT(pwm->hwpwm);
> > > +	mchp_core_pwm->channel_enabled |= enable << pwm->hwpwm;
> > > +
> > > +	/*
> > > +	 * Notify the block to update the waveform from the shadow registers.
> > > +	 * The updated values will not appear on the bus until they have been
> > > +	 * applied to the waveform at the beginning of the next period.
> > > +	 * This is a NO-OP if the channel does not have shadow registers.
> > > +	 */
> > 
> > The code doesn't match the comment. I think that is a relict from the
> > times when we thought that a trigger was necessary to update the
> > operating settings from the shadow registers?
> 
> Yeah, I read this back to myself before sending v15 & thought that it
> didn't need to be changed. I think removing the first line should go.

good.

> > > +static u64 mchp_core_pwm_calc_duty(const struct pwm_state *state, u64 clk_rate,
> > > +				   u8 prescale, u8 period_steps)
> > > +{
> > > +	u64 duty_steps, tmp;
> > > +
> > > +	/*
> > > +	 * Calculate the duty cycle in multiples of the prescaled period:
> > > +	 * duty_steps = duty_in_ns / step_in_ns
> > > +	 * step_in_ns = (prescale * NSEC_PER_SEC) / clk_rate
> > > +	 * The code below is rearranged slightly to only divide once.
> > > +	 */
> > > +	tmp = (prescale + 1) * NSEC_PER_SEC;
> > > +	duty_steps = mul_u64_u64_div_u64(state->duty_cycle, clk_rate, tmp);
> > > +
> > > +	return duty_steps;
> > > +}
> > > +
> > > +static void mchp_core_pwm_apply_duty(struct pwm_chip *chip, struct pwm_device *pwm,
> > > +				     const struct pwm_state *state, u64 duty_steps,
> > > +				     u16 period_steps)
> > > +{
> > > +	struct mchp_core_pwm_chip *mchp_core_pwm = to_mchp_core_pwm(chip);
> > > +	u8 posedge, negedge;
> > > +	u8 first_edge = 0, second_edge = duty_steps;
> > > +
> > > +	/*
> > > +	 * Setting posedge == negedge doesn't yield a constant output,
> > > +	 * so that's an unsuitable setting to model duty_steps = 0.
> > > +	 * In that case set the unwanted edge to a value that never
> > > +	 * triggers.
> > > +	 */
> > > +	if (duty_steps == 0)
> > > +		first_edge = period_steps + 1;
> > > +
> > > +	if (state->polarity == PWM_POLARITY_INVERSED) {
> > > +		negedge = first_edge;
> > > +		posedge = second_edge;
> > > +	} else {
> > > +		posedge = first_edge;
> > > +		negedge = second_edge;
> > > +	}
> > > +
> > > +	writel_relaxed(posedge, mchp_core_pwm->base + MCHPCOREPWM_POSEDGE(pwm->hwpwm));
> > > +	writel_relaxed(negedge, mchp_core_pwm->base + MCHPCOREPWM_NEGEDGE(pwm->hwpwm));
> > 
> > Is this racy with sync update implemented in the firmware? A comment
> > about how the sync update is implemented would be good.
> 
> Unless this is a different fear of racing, see above.

OK, so a suitable comment would be along the lines of:

	/*
	 * Set the sync bit which ensures that periods that already
	 * started are completed unaltered. At each counter reset event
	 * the values are updated from the shadow registers.
	 */
 
> > > +}
> > > +
> > > +static int mchp_core_pwm_calc_period(const struct pwm_state *state, unsigned long clk_rate,
> > > +				     u16 *prescale, u16 *period_steps)
> > > +{
> > > +	u64 tmp;
> > > +	u32 remainder;
> > > +
> > > +	/*
> > > +	 * Calculate the period cycles and prescale values.
> > > +	 * The registers are each 8 bits wide & multiplied to compute the period
> > > +	 * using the formula:
> > > +	 *           (prescale + 1) * (period_steps + 1)
> > > +	 * period = -------------------------------------
> > > +	 *                      clk_rate
> > > +	 * so the maximum period that can be generated is 0x10000 times the
> > > +	 * period of the input clock.
> > > +	 * However, due to the design of the "hardware", it is not possible to
> > > +	 * attain a 100% duty cycle if the full range of period_steps is used.
> > > +	 * Therefore period_steps is restricted to 0xfe and the maximum multiple
> > > +	 * of the clock period attainable is (0xff + 1) * (0xfe + 1) = 0xff00
> > > +	 *
> > > +	 * The prescale and period_steps registers operate similarly to
> > > +	 * CLK_DIVIDER_ONE_BASED, where the value used by the hardware is that
> > > +	 * in the register plus one.
> > > +	 * It's therefore not possible to set a period lower than 1/clk_rate, so
> > > +	 * if tmp is 0, abort. Without aborting, we will set a period that is
> > > +	 * greater than that requested and, more importantly, will trigger the
> > > +	 * neg-/pos-edge issue described in the limitations.
> > > +	 */
> > > +	tmp = mul_u64_u64_div_u64(state->period, clk_rate, NSEC_PER_SEC);
> > > +	if (!tmp)
> > > +		return -EINVAL;
> > > +
> > > +	if (tmp >= MCHPCOREPWM_PERIOD_MAX) {
> > > +		*prescale = MCHPCOREPWM_PRESCALE_MAX;
> > > +		*period_steps = MCHPCOREPWM_PERIOD_STEPS_MAX;
> > > +
> > > +		return 0;
> > > +	}
> > > +
> > > +	/*
> > > +	 * There are multiple strategies that could be used to choose the
> > > +	 * prescale & period_steps values.
> > > +	 * Here the idea is to pick values so that the selection of duty cycles
> > > +	 * is as finegrain as possible.
> > > +	 * This "optimal" value for prescale can be calculated using the maximum
> > > +	 * permitted value of period_steps, 0xfe.
> > > +	 *
> > > +	 *                period * clk_rate
> > > +	 * prescale = ------------------------- - 1
> > > +	 *            NSEC_PER_SEC * (0xfe + 1)
> > > +	 *
> > > +	 * However, we are purely interested in the integer upper bound of this
> > > +	 * calculation, so this division should be rounded up before subtracting
> > > +	 * 1
> > > +	 *
> > > +	 *  period * clk_rate
> > > +	 * ------------------- was precomputed as `tmp`
> > > +	 *    NSEC_PER_SEC
> > > +	 */
> > > +	*prescale = DIV64_U64_ROUND_UP(tmp, MCHPCOREPWM_PERIOD_STEPS_MAX + 1) - 1;
> > 
> > If state->period * clk_rate is 765000000001 you get tmp = 765 and then
> > *prescale = 2. However roundup(765000000001 / (1000000000 * 255)) - 1 is
> > 3. The problem here is that you're rounding down in the calculation of
> > tmp. Of course this is constructed because 765000000001 is prime, but
> > I'm sure you get the point :-)
> 
> Hold that thought for a moment..

OK, so the correction below is to make up for the wrong rounding here.
I'd like to have that in a comment. Otherwise it wasn't clear to me.
 
> > Also we know that tmp is < 0xff00, so we don't need a 64 bit division
> > here.
> 
> Neither here nor below, true.
> 
> > > +	/*
> > > +	 * Because 0xff is not a permitted value some error will seep into the
> > > +	 * calculation of prescale as prescale grows. Specifically, this error
> > > +	 * occurs where the remainder of the prescale calculation is less than
> > > +	 * prescale.
> > > +	 * For small values of prescale, only a handful of values will need
> > > +	 * correction, but overall this applies to almost half of the valid
> > > +	 * values for tmp.
> > > +	 *
> > > +	 * To keep the algorithm's decision making consistent, this case is
> > > +	 * checked for and the simple solution is to, in these cases,
> > > +	 * decrement prescale and check that the resulting value of period_steps
> > > +	 * is valid.
> > > +	 *
> > > +	 * period_steps can be computed from prescale:
> > > +	 *                      period * clk_rate
> > > +	 * period_steps = ----------------------------- - 1
> > > +	 *                NSEC_PER_SEC * (prescale + 1)
> > > +	 *
> > > +	 */
> > > +	div_u64_rem(tmp, (MCHPCOREPWM_PERIOD_STEPS_MAX + 1), &remainder);
> > > +	if (remainder < *prescale) {
> > > +		u16 smaller_prescale = *prescale - 1;
> > > +
> > > +		*period_steps = div_u64(tmp, smaller_prescale + 1) - 1;
> > > +		if (*period_steps < 255) {
> > > +			*prescale = smaller_prescale;
> > > +
> > > +			return 0;
> > > +		}
> > > +	}
> 
> ...so in your prime case above, we would initially compute a prescale
> value that is too large, and then wind up hitting the test of the
> remainder here, thereby realising that the smaller prescale value is a
> better fit?
> Perhaps that's not an acceptable way to handle the issue though.

IMHO it is, but the comment explaining needs some improvement. It should
also make clear why rounding cannot lead to prescale - 2 being a
good/better candidate. (I haven't thought it through, maybe needs some
improvement?) I wonder if the computation can find all improvements
given that it only used tmp, but not ->period and clk_rate?!
 
> > I don't understand that part. It triggers for tmp = 511. So you prefer
> > 
> > 	prescale = 1
> > 	period_steps = 254
> > 
> > yielding period = 510 / clkrate over
> > 
> > 	prescale = 2
> > 	period_steps = 170
> > 
> > yielding 513 / clkrate. I wonder why.

I missed the -= 1 in my example. So it's:

	It triggers for tmp = 511. So you prefer

		prescale = 1
		period_steps = 254

	yielding period = 510 / clkrate over

		prescale = 2
		period_steps = 169

	yielding 510 / clkrate.

Here it's obvious that the former is the better one. But I wonder why
the former isn't found instantly. Wouldn't

	*prescale = tmp / (MCHPCOREPWM_PERIOD_STEPS_MAX + 1) - 1

give a better approximation in general? (Of course with an additional
check that *prescale >= 0 then.) 

... thinking a bit ... yes, I think that's true:

If you pick *prescale = tmp / (MCHPCOREPWM_PERIOD_STEPS_MAX + 1) - 1,
then for each period_steps value ≤ 254 we have:

	  (*prescale + 1) * (period_steps + 1)
	≤ (*prescale + 1) * 255
	≤ (tmp // (MCHPCOREPWM_PERIOD_STEPS_MAX + 1)) * 255
	≤ (tmp // 255) * 255
	≤ (tmp / 255) * 255
	= tmp

and you can use the maximal period_steps = 0xfe.
(Only for tmp < 255 that isn't possible, up to you if you refuse these
or pick a smaller value for period_steps.)

> Because 513 > 511 & 254 > 170!
> Is the aim not to produce a period that is less than or equal to that
> requested? The aim of this driver is to pick a prescale/period_steps
> combo that satisfies that constraint, while also trying to maximise the
> "finegrainness" of the duty cycle.
> The latter should be stated in a comment above.

ack. I also wonder if such a change breaks the other assumptions a
consumer might have. I'll spend some more cycles about that in the next
round :-)

> > Also tmp = 511 is the only value
> > where this triggers. There is a mistake somewhere (maybe on my side).
> 
> It should trigger for any value 255 * n < x < 256 * n, no?
> Say for tmp of 767:
> *prescale = DIV64_U64_ROUND_UP(767, 254 + 1) - 1 = DIV64_U64_ROUND_UP(3.00784...) - 1 = 3
> remainder = 0.00784.. * (254 + 1) = 2
> 
> Am I going nuts? Wouldn't be the first time that I've made a hames of
> things here, there are 16 versions for a reason after all.

Ah, I had a bogus break statement in my python test code. So it's me
who got it wrong.

Best regards
Uwe

-- 
Pengutronix e.K.                           | Uwe Kleine-König            |
Industrial Linux Solutions                 | https://www.pengutronix.de/ |

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH v16 1/2] pwm: add microchip soft ip corePWM driver
  2023-04-11 16:25       ` Uwe Kleine-König
@ 2023-04-18 11:27         ` Conor Dooley
  2023-04-18 13:08           ` Uwe Kleine-König
  0 siblings, 1 reply; 10+ messages in thread
From: Conor Dooley @ 2023-04-18 11:27 UTC (permalink / raw)
  To: Uwe Kleine-König
  Cc: Thierry Reding, Daire McNamara, linux-kernel, linux-pwm, linux-riscv

[-- Attachment #1: Type: text/plain, Size: 9537 bytes --]

Hey Uwe,

On Tue, Apr 11, 2023 at 06:25:54PM +0200, Uwe Kleine-König wrote:
> On Tue, Apr 11, 2023 at 02:56:15PM +0100, Conor Dooley wrote:
> > On Tue, Apr 11, 2023 at 12:55:47PM +0200, Uwe Kleine-König wrote:
> > > On Tue, Apr 11, 2023 at 09:56:34AM +0100, Conor Dooley wrote:
> > > > Add a driver that supports the Microchip FPGA "soft" PWM IP core.

> > > > +static int mchp_core_pwm_calc_period(const struct pwm_state *state, unsigned long clk_rate,
> > > > +				     u16 *prescale, u16 *period_steps)
> > > > +{
> > > > +	u64 tmp;
> > > > +	u32 remainder;
> > > > +
> > > > +	/*
> > > > +	 * Calculate the period cycles and prescale values.
> > > > +	 * The registers are each 8 bits wide & multiplied to compute the period
> > > > +	 * using the formula:
> > > > +	 *           (prescale + 1) * (period_steps + 1)
> > > > +	 * period = -------------------------------------
> > > > +	 *                      clk_rate
> > > > +	 * so the maximum period that can be generated is 0x10000 times the
> > > > +	 * period of the input clock.
> > > > +	 * However, due to the design of the "hardware", it is not possible to
> > > > +	 * attain a 100% duty cycle if the full range of period_steps is used.
> > > > +	 * Therefore period_steps is restricted to 0xfe and the maximum multiple
> > > > +	 * of the clock period attainable is (0xff + 1) * (0xfe + 1) = 0xff00
> > > > +	 *
> > > > +	 * The prescale and period_steps registers operate similarly to
> > > > +	 * CLK_DIVIDER_ONE_BASED, where the value used by the hardware is that
> > > > +	 * in the register plus one.
> > > > +	 * It's therefore not possible to set a period lower than 1/clk_rate, so
> > > > +	 * if tmp is 0, abort. Without aborting, we will set a period that is
> > > > +	 * greater than that requested and, more importantly, will trigger the
> > > > +	 * neg-/pos-edge issue described in the limitations.
> > > > +	 */
> > > > +	tmp = mul_u64_u64_div_u64(state->period, clk_rate, NSEC_PER_SEC);
> > > > +	if (!tmp)
> > > > +		return -EINVAL;
> > > > +
> > > > +	if (tmp >= MCHPCOREPWM_PERIOD_MAX) {
> > > > +		*prescale = MCHPCOREPWM_PRESCALE_MAX;
> > > > +		*period_steps = MCHPCOREPWM_PERIOD_STEPS_MAX;
> > > > +
> > > > +		return 0;
> > > > +	}
> > > > +
> > > > +	/*
> > > > +	 * There are multiple strategies that could be used to choose the
> > > > +	 * prescale & period_steps values.
> > > > +	 * Here the idea is to pick values so that the selection of duty cycles
> > > > +	 * is as finegrain as possible.
> > > > +	 * This "optimal" value for prescale can be calculated using the maximum
> > > > +	 * permitted value of period_steps, 0xfe.
> > > > +	 *
> > > > +	 *                period * clk_rate
> > > > +	 * prescale = ------------------------- - 1
> > > > +	 *            NSEC_PER_SEC * (0xfe + 1)
> > > > +	 *
> > > > +	 * However, we are purely interested in the integer upper bound of this
> > > > +	 * calculation, so this division should be rounded up before subtracting
> > > > +	 * 1
> > > > +	 *
> > > > +	 *  period * clk_rate
> > > > +	 * ------------------- was precomputed as `tmp`
> > > > +	 *    NSEC_PER_SEC
> > > > +	 */
> > > > +	*prescale = DIV64_U64_ROUND_UP(tmp, MCHPCOREPWM_PERIOD_STEPS_MAX + 1) - 1;
> > > 
> > > If state->period * clk_rate is 765000000001 you get tmp = 765 and then
> > > *prescale = 2. However roundup(765000000001 / (1000000000 * 255)) - 1 is
> > > 3. The problem here is that you're rounding down in the calculation of
> > > tmp. Of course this is constructed because 765000000001 is prime, but
> > > I'm sure you get the point :-)
> > 
> > Hold that thought for a moment..
> 
> OK, so the correction below is to make up for the wrong rounding here.
> I'd like to have that in a comment. Otherwise it wasn't clear to me.

Sure, no problem.

> > > Also we know that tmp is < 0xff00, so we don't need a 64 bit division
> > > here.
> > 
> > Neither here nor below, true.
> > 
> > > > +	/*
> > > > +	 * Because 0xff is not a permitted value some error will seep into the
> > > > +	 * calculation of prescale as prescale grows. Specifically, this error
> > > > +	 * occurs where the remainder of the prescale calculation is less than
> > > > +	 * prescale.
> > > > +	 * For small values of prescale, only a handful of values will need
> > > > +	 * correction, but overall this applies to almost half of the valid
> > > > +	 * values for tmp.
> > > > +	 *
> > > > +	 * To keep the algorithm's decision making consistent, this case is
> > > > +	 * checked for and the simple solution is to, in these cases,
> > > > +	 * decrement prescale and check that the resulting value of period_steps
> > > > +	 * is valid.
> > > > +	 *
> > > > +	 * period_steps can be computed from prescale:
> > > > +	 *                      period * clk_rate
> > > > +	 * period_steps = ----------------------------- - 1
> > > > +	 *                NSEC_PER_SEC * (prescale + 1)
> > > > +	 *
> > > > +	 */
> > > > +	div_u64_rem(tmp, (MCHPCOREPWM_PERIOD_STEPS_MAX + 1), &remainder);
> > > > +	if (remainder < *prescale) {
> > > > +		u16 smaller_prescale = *prescale - 1;
> > > > +
> > > > +		*period_steps = div_u64(tmp, smaller_prescale + 1) - 1;
> > > > +		if (*period_steps < 255) {
> > > > +			*prescale = smaller_prescale;
> > > > +
> > > > +			return 0;
> > > > +		}
> > > > +	}
> > 
> > ...so in your prime case above, we would initially compute a prescale
> > value that is too large, and then wind up hitting the test of the
> > remainder here, thereby realising that the smaller prescale value is a
> > better fit?
> > Perhaps that's not an acceptable way to handle the issue though.
> 
> IMHO it is, but the comment explaining needs some improvement. It should
> also make clear why rounding cannot lead to prescale - 2 being a
> good/better candidate. (I haven't thought it through, maybe needs some
> improvement?) I wonder if the computation can find all improvements
> given that it only used tmp, but not ->period and clk_rate?!

For tmp > 0xff00 it may be off by more than 255, but that isn't possible
at this stage as we've already eliminated the possibility above.

> > > I don't understand that part. It triggers for tmp = 511. So you prefer
> > > 
> > > 	prescale = 1
> > > 	period_steps = 254
> > > 
> > > yielding period = 510 / clkrate over
> > > 
> > > 	prescale = 2
> > > 	period_steps = 170
> > > 
> > > yielding 513 / clkrate. I wonder why.
> 
> I missed the -= 1 in my example. So it's:
> 
> 	It triggers for tmp = 511. So you prefer
> 
> 		prescale = 1
> 		period_steps = 254
> 
> 	yielding period = 510 / clkrate over
> 
> 		prescale = 2
> 		period_steps = 169
> 
> 	yielding 510 / clkrate.
> 
> Here it's obvious that the former is the better one. But I wonder why
> the former isn't found instantly. Wouldn't
> 
> 	*prescale = tmp / (MCHPCOREPWM_PERIOD_STEPS_MAX + 1) - 1
> 
> give a better approximation in general? (Of course with an additional
> check that *prescale >= 0 then.) 
> 
> ... thinking a bit ... yes, I think that's true:
> 
> If you pick *prescale = tmp / (MCHPCOREPWM_PERIOD_STEPS_MAX + 1) - 1,
> then for each period_steps value ≤ 254 we have:
> 
> 	  (*prescale + 1) * (period_steps + 1)
> 	≤ (*prescale + 1) * 255
> 	≤ (tmp // (MCHPCOREPWM_PERIOD_STEPS_MAX + 1)) * 255
> 	≤ (tmp // 255) * 255
> 	≤ (tmp / 255) * 255
> 	= tmp

tmp = 256

*prescale = 256 // (254 + 1) - 1
          ≈ 0

*prescale = 256 // (prescale + 1) - 1
          = 256 / (0 + 1) - 1
	  = 255

That's then gonna give us one of the broken configurations from the
limitations.

tmp = 257

*prescale = 257 // (254 + 1) - 1
          ≈ 0

*prescale = 257 // (prescale + 1) - 1
          = 257 / (0 + 1) - 1
	  = 256
	  = 0 (registers are 8-bit)

And so on...

I'm quite obviously missing something that you may think is obvious
here, but is not immediately clear to me.

> and you can use the maximal period_steps = 0xfe.
> (Only for tmp < 255 that isn't possible, up to you if you refuse these
> or pick a smaller value for period_steps.)
> 
> > Because 513 > 511 & 254 > 170!
> > Is the aim not to produce a period that is less than or equal to that
> > requested? The aim of this driver is to pick a prescale/period_steps
> > combo that satisfies that constraint, while also trying to maximise the
> > "finegrainness" of the duty cycle.
> > The latter should be stated in a comment above.
> 
> ack. I also wonder if such a change breaks the other assumptions a
> consumer might have. I'll spend some more cycles about that in the next
> round :-)
> 
> > > Also tmp = 511 is the only value
> > > where this triggers. There is a mistake somewhere (maybe on my side).
> > 
> > It should trigger for any value 255 * n < x < 256 * n, no?
> > Say for tmp of 767:
> > *prescale = DIV64_U64_ROUND_UP(767, 254 + 1) - 1 = DIV64_U64_ROUND_UP(3.00784...) - 1 = 3
> > remainder = 0.00784.. * (254 + 1) = 2
> > 
> > Am I going nuts? Wouldn't be the first time that I've made a hames of
> > things here, there are 16 versions for a reason after all.
> 
> Ah, I had a bogus break statement in my python test code. So it's me
> who got it wrong.

No worries :)

Pending an explanation of your calculation above, I've gone and done the
rest of these things. I shan't resend until -rc1 or I get an explanation
of your calculation - whichever happens first!

Cheers,
Conor.

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 228 bytes --]

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH v16 1/2] pwm: add microchip soft ip corePWM driver
  2023-04-18 11:27         ` Conor Dooley
@ 2023-04-18 13:08           ` Uwe Kleine-König
  2023-04-18 13:27             ` Conor Dooley
  0 siblings, 1 reply; 10+ messages in thread
From: Uwe Kleine-König @ 2023-04-18 13:08 UTC (permalink / raw)
  To: Conor Dooley
  Cc: Thierry Reding, Daire McNamara, linux-kernel, linux-pwm, linux-riscv

[-- Attachment #1: Type: text/plain, Size: 9179 bytes --]

On Tue, Apr 18, 2023 at 12:27:33PM +0100, Conor Dooley wrote:
> Hey Uwe,
> 
> On Tue, Apr 11, 2023 at 06:25:54PM +0200, Uwe Kleine-König wrote:
> > On Tue, Apr 11, 2023 at 02:56:15PM +0100, Conor Dooley wrote:
> > > On Tue, Apr 11, 2023 at 12:55:47PM +0200, Uwe Kleine-König wrote:
> > > > On Tue, Apr 11, 2023 at 09:56:34AM +0100, Conor Dooley wrote:
> > > > > Add a driver that supports the Microchip FPGA "soft" PWM IP core.
> 
> > > > > +static int mchp_core_pwm_calc_period(const struct pwm_state *state, unsigned long clk_rate,
> > > > > +				     u16 *prescale, u16 *period_steps)
> > > > > +{
> > > > > +	u64 tmp;
> > > > > +	u32 remainder;
> > > > > +
> > > > > +	/*
> > > > > +	 * Calculate the period cycles and prescale values.
> > > > > +	 * The registers are each 8 bits wide & multiplied to compute the period
> > > > > +	 * using the formula:
> > > > > +	 *           (prescale + 1) * (period_steps + 1)
> > > > > +	 * period = -------------------------------------
> > > > > +	 *                      clk_rate
> > > > > +	 * so the maximum period that can be generated is 0x10000 times the
> > > > > +	 * period of the input clock.
> > > > > +	 * However, due to the design of the "hardware", it is not possible to
> > > > > +	 * attain a 100% duty cycle if the full range of period_steps is used.
> > > > > +	 * Therefore period_steps is restricted to 0xfe and the maximum multiple
> > > > > +	 * of the clock period attainable is (0xff + 1) * (0xfe + 1) = 0xff00
> > > > > +	 *
> > > > > +	 * The prescale and period_steps registers operate similarly to
> > > > > +	 * CLK_DIVIDER_ONE_BASED, where the value used by the hardware is that
> > > > > +	 * in the register plus one.
> > > > > +	 * It's therefore not possible to set a period lower than 1/clk_rate, so
> > > > > +	 * if tmp is 0, abort. Without aborting, we will set a period that is
> > > > > +	 * greater than that requested and, more importantly, will trigger the
> > > > > +	 * neg-/pos-edge issue described in the limitations.
> > > > > +	 */
> > > > > +	tmp = mul_u64_u64_div_u64(state->period, clk_rate, NSEC_PER_SEC);
> > > > > +	if (!tmp)
> > > > > +		return -EINVAL;
> > > > > +
> > > > > +	if (tmp >= MCHPCOREPWM_PERIOD_MAX) {
> > > > > +		*prescale = MCHPCOREPWM_PRESCALE_MAX;
> > > > > +		*period_steps = MCHPCOREPWM_PERIOD_STEPS_MAX;
> > > > > +
> > > > > +		return 0;
> > > > > +	}
> > > > > +
> > > > > +	/*
> > > > > +	 * There are multiple strategies that could be used to choose the
> > > > > +	 * prescale & period_steps values.
> > > > > +	 * Here the idea is to pick values so that the selection of duty cycles
> > > > > +	 * is as finegrain as possible.
> > > > > +	 * This "optimal" value for prescale can be calculated using the maximum
> > > > > +	 * permitted value of period_steps, 0xfe.
> > > > > +	 *
> > > > > +	 *                period * clk_rate
> > > > > +	 * prescale = ------------------------- - 1
> > > > > +	 *            NSEC_PER_SEC * (0xfe + 1)
> > > > > +	 *
> > > > > +	 * However, we are purely interested in the integer upper bound of this
> > > > > +	 * calculation, so this division should be rounded up before subtracting
> > > > > +	 * 1
> > > > > +	 *
> > > > > +	 *  period * clk_rate
> > > > > +	 * ------------------- was precomputed as `tmp`
> > > > > +	 *    NSEC_PER_SEC
> > > > > +	 */
> > > > > +	*prescale = DIV64_U64_ROUND_UP(tmp, MCHPCOREPWM_PERIOD_STEPS_MAX + 1) - 1;
> > > > 
> > > > If state->period * clk_rate is 765000000001 you get tmp = 765 and then
> > > > *prescale = 2. However roundup(765000000001 / (1000000000 * 255)) - 1 is
> > > > 3. The problem here is that you're rounding down in the calculation of
> > > > tmp. Of course this is constructed because 765000000001 is prime, but
> > > > I'm sure you get the point :-)
> > > 
> > > Hold that thought for a moment..
> > 
> > OK, so the correction below is to make up for the wrong rounding here.
> > I'd like to have that in a comment. Otherwise it wasn't clear to me.
> 
> Sure, no problem.
> 
> > > > Also we know that tmp is < 0xff00, so we don't need a 64 bit division
> > > > here.
> > > 
> > > Neither here nor below, true.
> > > 
> > > > > +	/*
> > > > > +	 * Because 0xff is not a permitted value some error will seep into the
> > > > > +	 * calculation of prescale as prescale grows. Specifically, this error
> > > > > +	 * occurs where the remainder of the prescale calculation is less than
> > > > > +	 * prescale.
> > > > > +	 * For small values of prescale, only a handful of values will need
> > > > > +	 * correction, but overall this applies to almost half of the valid
> > > > > +	 * values for tmp.
> > > > > +	 *
> > > > > +	 * To keep the algorithm's decision making consistent, this case is
> > > > > +	 * checked for and the simple solution is to, in these cases,
> > > > > +	 * decrement prescale and check that the resulting value of period_steps
> > > > > +	 * is valid.
> > > > > +	 *
> > > > > +	 * period_steps can be computed from prescale:
> > > > > +	 *                      period * clk_rate
> > > > > +	 * period_steps = ----------------------------- - 1
> > > > > +	 *                NSEC_PER_SEC * (prescale + 1)
> > > > > +	 *
> > > > > +	 */
> > > > > +	div_u64_rem(tmp, (MCHPCOREPWM_PERIOD_STEPS_MAX + 1), &remainder);
> > > > > +	if (remainder < *prescale) {
> > > > > +		u16 smaller_prescale = *prescale - 1;
> > > > > +
> > > > > +		*period_steps = div_u64(tmp, smaller_prescale + 1) - 1;
> > > > > +		if (*period_steps < 255) {
> > > > > +			*prescale = smaller_prescale;
> > > > > +
> > > > > +			return 0;
> > > > > +		}
> > > > > +	}
> > > 
> > > ...so in your prime case above, we would initially compute a prescale
> > > value that is too large, and then wind up hitting the test of the
> > > remainder here, thereby realising that the smaller prescale value is a
> > > better fit?
> > > Perhaps that's not an acceptable way to handle the issue though.
> > 
> > IMHO it is, but the comment explaining needs some improvement. It should
> > also make clear why rounding cannot lead to prescale - 2 being a
> > good/better candidate. (I haven't thought it through, maybe needs some
> > improvement?) I wonder if the computation can find all improvements
> > given that it only used tmp, but not ->period and clk_rate?!
> 
> For tmp > 0xff00 it may be off by more than 255, but that isn't possible
> at this stage as we've already eliminated the possibility above.
> 
> > > > I don't understand that part. It triggers for tmp = 511. So you prefer
> > > > 
> > > > 	prescale = 1
> > > > 	period_steps = 254
> > > > 
> > > > yielding period = 510 / clkrate over
> > > > 
> > > > 	prescale = 2
> > > > 	period_steps = 170
> > > > 
> > > > yielding 513 / clkrate. I wonder why.
> > 
> > I missed the -= 1 in my example. So it's:
> > 
> > 	It triggers for tmp = 511. So you prefer
> > 
> > 		prescale = 1
> > 		period_steps = 254
> > 
> > 	yielding period = 510 / clkrate over
> > 
> > 		prescale = 2
> > 		period_steps = 169
> > 
> > 	yielding 510 / clkrate.
> > 
> > Here it's obvious that the former is the better one. But I wonder why
> > the former isn't found instantly. Wouldn't
> > 
> > 	*prescale = tmp / (MCHPCOREPWM_PERIOD_STEPS_MAX + 1) - 1
> > 
> > give a better approximation in general? (Of course with an additional
> > check that *prescale >= 0 then.) 
> > 
> > ... thinking a bit ... yes, I think that's true:
> > 
> > If you pick *prescale = tmp / (MCHPCOREPWM_PERIOD_STEPS_MAX + 1) - 1,
> > then for each period_steps value ≤ 254 we have:
> > 
> > 	  (*prescale + 1) * (period_steps + 1)
> > 	≤ (*prescale + 1) * 255
> > 	≤ (tmp // (MCHPCOREPWM_PERIOD_STEPS_MAX + 1)) * 255
> > 	≤ (tmp // 255) * 255
> > 	≤ (tmp / 255) * 255
> > 	= tmp
> 
> tmp = 256
> 
> *prescale = 256 // (254 + 1) - 1
>           ≈ 0
> 
> *prescale = 256 // (prescale + 1) - 1
>           = 256 / (0 + 1) - 1
> 	  = 255

I don't understand what you wanna say here. If tmp = 256 my suggestion
is to pick prescale = 0 and period_steps = 254. Then

	 (prescale + 1) * (period_steps + 1) ≤ tmp

and period_steps is big enough to ensure a a finegrained choice for the
duty_cycle.

> That's then gonna give us one of the broken configurations from the
> limitations.
> 
> tmp = 257
> 
> *prescale = 257 // (254 + 1) - 1
>           ≈ 0
> 
> *prescale = 257 // (prescale + 1) - 1
>           = 257 / (0 + 1) - 1
> 	  = 256
> 	  = 0 (registers are 8-bit)

I think you mean s/prescale/period_steps/ in the second part, but that's
not what I meant. I meant to suggest:

	*prescale = tmp / (MCHPCOREPWM_PERIOD_STEPS_MAX + 1) - 1
	period_steps = MCHPCOREPWM_PERIOD_STEPS_MAX = 254

> I'm quite obviously missing something that you may think is obvious
> here, but is not immediately clear to me.

That would be an explanation, yes. :-)

Best regards
Uwe

-- 
Pengutronix e.K.                           | Uwe Kleine-König            |
Industrial Linux Solutions                 | https://www.pengutronix.de/ |

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH v16 1/2] pwm: add microchip soft ip corePWM driver
  2023-04-18 13:08           ` Uwe Kleine-König
@ 2023-04-18 13:27             ` Conor Dooley
  2023-04-18 15:18               ` Uwe Kleine-König
  0 siblings, 1 reply; 10+ messages in thread
From: Conor Dooley @ 2023-04-18 13:27 UTC (permalink / raw)
  To: Uwe Kleine-König
  Cc: Thierry Reding, Daire McNamara, linux-kernel, linux-pwm, linux-riscv

[-- Attachment #1: Type: text/plain, Size: 1874 bytes --]

On Tue, Apr 18, 2023 at 03:08:37PM +0200, Uwe Kleine-König wrote:
> On Tue, Apr 18, 2023 at 12:27:33PM +0100, Conor Dooley wrote:
> > On Tue, Apr 11, 2023 at 06:25:54PM +0200, Uwe Kleine-König wrote:


> I don't understand what you wanna say here. If tmp = 256 my suggestion
> is to pick prescale = 0 and period_steps = 254. Then
> 
> 	 (prescale + 1) * (period_steps + 1) ≤ tmp
> 
> and period_steps is big enough to ensure a a finegrained choice for the
> duty_cycle.
> 
> > That's then gonna give us one of the broken configurations from the
> > limitations.
> > 
> > tmp = 257
> > 
> > *prescale = 257 // (254 + 1) - 1
> >           ≈ 0
> > 
> > *prescale = 257 // (prescale + 1) - 1
> >           = 257 / (0 + 1) - 1
> > 	  = 256
> > 	  = 0 (registers are 8-bit)
> 
> I think you mean s/prescale/period_steps/ in the second part, but that's
> not what I meant. I meant to suggest:

I did, yeah!

> 	*prescale = tmp / (MCHPCOREPWM_PERIOD_STEPS_MAX + 1) - 1
> 	period_steps = MCHPCOREPWM_PERIOD_STEPS_MAX = 254
> 
> > I'm quite obviously missing something that you may think is obvious
> > here, but is not immediately clear to me.
> 
> That would be an explanation, yes. :-)

Right, it makes a lot more sense now. Definitely was not clear to me
that that was what you were suggesting.
I'm not sure that disallowing tmp < 255 is something I want to do
though, as this is mainly used as a "soft" IP core in the FPGA fabric,
the clock provided to it may not be particularly high.
Probably not the end of the world though, once added to the limitations.

The implemented period is also going to be quite a ways off with this
method (compared to the method I have been using until now) - although
it is of course far simpler.
You're the PWM expert and are suggesting it, so maybe I should just shut
up and go do it...

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 228 bytes --]

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH v16 1/2] pwm: add microchip soft ip corePWM driver
  2023-04-18 13:27             ` Conor Dooley
@ 2023-04-18 15:18               ` Uwe Kleine-König
  0 siblings, 0 replies; 10+ messages in thread
From: Uwe Kleine-König @ 2023-04-18 15:18 UTC (permalink / raw)
  To: Conor Dooley
  Cc: Thierry Reding, Daire McNamara, linux-kernel, linux-pwm, linux-riscv

[-- Attachment #1: Type: text/plain, Size: 1414 bytes --]

Hello Conor,

On Tue, Apr 18, 2023 at 02:27:09PM +0100, Conor Dooley wrote:
> On Tue, Apr 18, 2023 at 03:08:37PM +0200, Uwe Kleine-König wrote:
> > On Tue, Apr 18, 2023 at 12:27:33PM +0100, Conor Dooley wrote:
> > > I'm quite obviously missing something that you may think is obvious
> > > here, but is not immediately clear to me.
> > 
> > That would be an explanation, yes. :-)
> 
> Right, it makes a lot more sense now. Definitely was not clear to me
> that that was what you were suggesting.

I reread what I wrote and maybe I wouldn't have understood it either :-)

> I'm not sure that disallowing tmp < 255 is something I want to do
> though, as this is mainly used as a "soft" IP core in the FPGA fabric,
> the clock provided to it may not be particularly high.
> Probably not the end of the world though, once added to the limitations.
> 
> The implemented period is also going to be quite a ways off with this
> method (compared to the method I have been using until now) - although
> it is of course far simpler.

"simpler" is in my eyes a good approach until someone comes who has
stronger needs than can be done with a simple approach. Until this
happens a simple approach is good for everyone.

Best regards
Uwe

-- 
Pengutronix e.K.                           | Uwe Kleine-König            |
Industrial Linux Solutions                 | https://www.pengutronix.de/ |

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2023-04-18 15:18 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-04-11  8:56 [PATCH v16 0/2] Microchip Soft IP corePWM driver Conor Dooley
2023-04-11  8:56 ` [PATCH v16 1/2] pwm: add microchip soft ip " Conor Dooley
2023-04-11 10:55   ` Uwe Kleine-König
2023-04-11 13:56     ` Conor Dooley
2023-04-11 16:25       ` Uwe Kleine-König
2023-04-18 11:27         ` Conor Dooley
2023-04-18 13:08           ` Uwe Kleine-König
2023-04-18 13:27             ` Conor Dooley
2023-04-18 15:18               ` Uwe Kleine-König
2023-04-11  8:56 ` [PATCH v16 2/2] MAINTAINERS: add pwm to PolarFire SoC entry Conor Dooley

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).