linux-riscv.lists.infradead.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v14 0/2] Microchip Soft IP corePWM driver
@ 2023-03-06  9:48 Conor Dooley
  2023-03-06  9:48 ` [PATCH v14 1/2] pwm: add microchip soft ip " Conor Dooley
  2023-03-06  9:48 ` [PATCH v14 2/2] MAINTAINERS: add pwm to PolarFire SoC entry Conor Dooley
  0 siblings, 2 replies; 8+ messages in thread
From: Conor Dooley @ 2023-03-06  9:48 UTC (permalink / raw)
  To: Thierry Reding, Uwe Kleine-König
  Cc: Daire McNamara, linux-kernel, linux-pwm, linux-riscv, Conor Dooley

Hey Uwe, all,

v14 is rebased on top of v6.3-rc1.

Uwe & I had a long back and forth about period calculations on v13,
my ultimate conclusion being that, after some testing of the "corrected"
calculation in hardware, the original calculation was correct.
I think we had gotten sucked into discussion the calculation of the
period itself, when we were in fact trying to calculate a bound on the
period instead. That discussion is here:
https://lore.kernel.org/linux-pwm/Y+ow8tfAHo1yv1XL@wendy/

Thanks,
Conor.

Changes since v13:
- couple bits of cleanup to apply_locked(), suggested by Uwe
- move the overhead waiting for a change to be applied, for channels
  with shadow registers, to subsequent calls to apply(). This has the
  benefit of only waiting when two calls to apply() are close in time
  rather than eating the delay in every call.

Changes since v11:
- swap a "bare" multiply & divide for the corresponding helper to
  prevent overflow
- factor out duplicate clk rate acquisition & period calculation
- make the period calculation return void by checking the validity of
  the clock rate in the caller
- drop the binding & dt patch, they're on-track for v6.2 via my tree

Changes since v10:
- reword some comments
- try to assign the period if a disable is requested
- drop a cast around a u8 -> u16 conversion
- fix a check on period_steps that should be on the hw_ variant
- split up the period calculation in get_state() to fix the result on
  32 bit
- add a rate variable in get_state() to only call get_rate() once
- redo the locking as suggested to make it more straightforward.
- stop checking for enablement in get_state() that was working around
 intended behaviour of the sysfs interface

Changes since v9:
- fixed the missing unlock that Dan reported

Changes since v8:
- fixed a(nother) raw 64 bit division (& built it for riscv32!)
- added a check to make sure we don't try to sleep for 0 us

Changes since v7:
- rebased on 6.0-rc1
- reworded comments you highlighted in v7
- fixed the overkill sleeping
- removed the unused variables in calc_duty
- added some extra comments to explain behaviours you questioned in v7
- make the mutexes un-interruptible
- fixed added the 1s you suggested for the if(period_locked) logic
- added setup of the channel_enabled shadowing
- fixed the period reporting for the negedge == posedge case in
  get_state() I had to add the enabled check, as otherwise it broke
  setting the period for the first time out of reset.
- added a test for invalid PERIOD_STEPS values, in which case we abort
  if we cannot fix the period

Changes from v6:
- Dropped an unused variable that I'd missed
- Actually check the return values of the mutex lock()s
- Re-rebased on -next for the MAINTAINERS patch (again...)

Changes from v5:
- switched to a mutex b/c we must sleep with the lock taken
- simplified the locking in apply() and added locking to get_state()
- reworked apply() as requested
- removed the loop in the period calculation (thanks Uwe!)
- add a copy of the enable registers in the driver to save on reads.
- remove the second (useless) write to sync_update
- added some missing rounding in get_state()
- couple other minor cleanups as requested in:
https://lore.kernel.org/linux-riscv/20220709160206.cw5luo7kxdshoiua@pengutronix.de/

Changes from v4:
- dropped some accidentally added files

Changes before v4:
https://lore.kernel.org/linux-pwm/20220721172109.941900-1-mail@conchuod.ie

Conor Dooley (2):
  pwm: add microchip soft ip corePWM driver
  MAINTAINERS: add pwm to PolarFire SoC entry

 MAINTAINERS                      |   1 +
 drivers/pwm/Kconfig              |  10 +
 drivers/pwm/Makefile             |   1 +
 drivers/pwm/pwm-microchip-core.c | 441 +++++++++++++++++++++++++++++++
 4 files changed, 453 insertions(+)
 create mode 100644 drivers/pwm/pwm-microchip-core.c

-- 
2.39.2


_______________________________________________
linux-riscv mailing list
linux-riscv@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-riscv

^ permalink raw reply	[flat|nested] 8+ messages in thread

* [PATCH v14 1/2] pwm: add microchip soft ip corePWM driver
  2023-03-06  9:48 [PATCH v14 0/2] Microchip Soft IP corePWM driver Conor Dooley
@ 2023-03-06  9:48 ` Conor Dooley
  2023-03-22 10:55   ` Uwe Kleine-König
  2023-03-06  9:48 ` [PATCH v14 2/2] MAINTAINERS: add pwm to PolarFire SoC entry Conor Dooley
  1 sibling, 1 reply; 8+ messages in thread
From: Conor Dooley @ 2023-03-06  9:48 UTC (permalink / raw)
  To: Thierry Reding, Uwe Kleine-König
  Cc: Daire McNamara, linux-kernel, linux-pwm, linux-riscv, Conor Dooley

Add a driver that supports the Microchip FPGA "soft" PWM IP core.

Signed-off-by: Conor Dooley <conor.dooley@microchip.com>
---
 drivers/pwm/Kconfig              |  10 +
 drivers/pwm/Makefile             |   1 +
 drivers/pwm/pwm-microchip-core.c | 441 +++++++++++++++++++++++++++++++
 3 files changed, 452 insertions(+)
 create mode 100644 drivers/pwm/pwm-microchip-core.c

diff --git a/drivers/pwm/Kconfig b/drivers/pwm/Kconfig
index dae023d783a22..f42756a014ed4 100644
--- a/drivers/pwm/Kconfig
+++ b/drivers/pwm/Kconfig
@@ -393,6 +393,16 @@ config PWM_MEDIATEK
 	  To compile this driver as a module, choose M here: the module
 	  will be called pwm-mediatek.
 
+config PWM_MICROCHIP_CORE
+	tristate "Microchip corePWM PWM support"
+	depends on SOC_MICROCHIP_POLARFIRE || COMPILE_TEST
+	depends on HAS_IOMEM && OF
+	help
+	  PWM driver for Microchip FPGA soft IP core.
+
+	  To compile this driver as a module, choose M here: the module
+	  will be called pwm-microchip-core.
+
 config PWM_MXS
 	tristate "Freescale MXS PWM support"
 	depends on ARCH_MXS || COMPILE_TEST
diff --git a/drivers/pwm/Makefile b/drivers/pwm/Makefile
index 7bf1a29f02b84..a65625359ece4 100644
--- a/drivers/pwm/Makefile
+++ b/drivers/pwm/Makefile
@@ -34,6 +34,7 @@ obj-$(CONFIG_PWM_LPSS_PCI)	+= pwm-lpss-pci.o
 obj-$(CONFIG_PWM_LPSS_PLATFORM)	+= pwm-lpss-platform.o
 obj-$(CONFIG_PWM_MESON)		+= pwm-meson.o
 obj-$(CONFIG_PWM_MEDIATEK)	+= pwm-mediatek.o
+obj-$(CONFIG_PWM_MICROCHIP_CORE)	+= pwm-microchip-core.o
 obj-$(CONFIG_PWM_MTK_DISP)	+= pwm-mtk-disp.o
 obj-$(CONFIG_PWM_MXS)		+= pwm-mxs.o
 obj-$(CONFIG_PWM_NTXEC)		+= pwm-ntxec.o
diff --git a/drivers/pwm/pwm-microchip-core.c b/drivers/pwm/pwm-microchip-core.c
new file mode 100644
index 0000000000000..a2a2e28f39031
--- /dev/null
+++ b/drivers/pwm/pwm-microchip-core.c
@@ -0,0 +1,441 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * corePWM driver for Microchip "soft" FPGA IP cores.
+ *
+ * Copyright (c) 2021-2023 Microchip Corporation. All rights reserved.
+ * Author: Conor Dooley <conor.dooley@microchip.com>
+ * Documentation:
+ * https://www.microsemi.com/document-portal/doc_download/1245275-corepwm-hb
+ *
+ * Limitations:
+ * - If the IP block is configured without "shadow registers", all register
+ *   writes will take effect immediately, causing glitches on the output.
+ *   If shadow registers *are* enabled, a write to the "SYNC_UPDATE" register
+ *   notifies the core that it needs to update the registers defining the
+ *   waveform from the contents of the "shadow registers".
+ * - The IP block has no concept of a duty cycle, only rising/falling edges of
+ *   the waveform. Unfortunately, if the rising & falling edges registers have
+ *   the same value written to them the IP block will do whichever of a rising
+ *   or a falling edge is possible. I.E. a 50% waveform at twice the requested
+ *   period. Therefore to get a 0% waveform, the output is set the max high/low
+ *   time depending on polarity.
+ * - The PWM period is set for the whole IP block not per channel. The driver
+ *   will only change the period if no other PWM output is enabled.
+ */
+
+#include <linux/clk.h>
+#include <linux/delay.h>
+#include <linux/err.h>
+#include <linux/io.h>
+#include <linux/ktime.h>
+#include <linux/math.h>
+#include <linux/module.h>
+#include <linux/mutex.h>
+#include <linux/of_device.h>
+#include <linux/platform_device.h>
+#include <linux/pwm.h>
+
+#define PREG_TO_VAL(PREG) ((PREG) + 1)
+
+#define MCHPCOREPWM_PRESCALE_MAX	0x100
+#define MCHPCOREPWM_PERIOD_STEPS_MAX	0xff
+#define MCHPCOREPWM_PERIOD_MAX		0xff00
+
+#define MCHPCOREPWM_PRESCALE	0x00
+#define MCHPCOREPWM_PERIOD	0x04
+#define MCHPCOREPWM_EN(i)	(0x08 + 0x04 * (i)) /* 0x08, 0x0c */
+#define MCHPCOREPWM_POSEDGE(i)	(0x10 + 0x08 * (i)) /* 0x10, 0x18, ..., 0x88 */
+#define MCHPCOREPWM_NEGEDGE(i)	(0x14 + 0x08 * (i)) /* 0x14, 0x1c, ..., 0x8c */
+#define MCHPCOREPWM_SYNC_UPD	0xe4
+#define MCHPCOREPWM_TIMEOUT_MS	100u
+
+struct mchp_core_pwm_chip {
+	struct pwm_chip chip;
+	struct clk *clk;
+	void __iomem *base;
+	struct mutex lock; /* protects the shared period */
+	ktime_t update_timestamp;
+	u32 sync_update_mask;
+	u16 channel_enabled;
+};
+
+static inline struct mchp_core_pwm_chip *to_mchp_core_pwm(struct pwm_chip *chip)
+{
+	return container_of(chip, struct mchp_core_pwm_chip, chip);
+}
+
+static void mchp_core_pwm_enable(struct pwm_chip *chip, struct pwm_device *pwm,
+				 bool enable, u64 period)
+{
+	struct mchp_core_pwm_chip *mchp_core_pwm = to_mchp_core_pwm(chip);
+	u8 channel_enable, reg_offset, shift;
+
+	/*
+	 * There are two adjacent 8 bit control regs, the lower reg controls
+	 * 0-7 and the upper reg 8-15. Check if the pwm is in the upper reg
+	 * and if so, offset by the bus width.
+	 */
+	reg_offset = MCHPCOREPWM_EN(pwm->hwpwm >> 3);
+	shift = pwm->hwpwm & 7;
+
+	channel_enable = readb_relaxed(mchp_core_pwm->base + reg_offset);
+	channel_enable &= ~(1 << shift);
+	channel_enable |= (enable << shift);
+
+	writel_relaxed(channel_enable, mchp_core_pwm->base + reg_offset);
+	mchp_core_pwm->channel_enabled &= ~BIT(pwm->hwpwm);
+	mchp_core_pwm->channel_enabled |= enable << pwm->hwpwm;
+
+	/*
+	 * Notify the block to update the waveform from the shadow registers.
+	 * The updated values will not appear on the bus until they have been
+	 * applied to the waveform at the beginning of the next period.
+	 * This is a NO-OP if the channel does not have shadow registers.
+	 */
+	if (mchp_core_pwm->sync_update_mask & (1 << pwm->hwpwm))
+		mchp_core_pwm->update_timestamp = ktime_add_ns(ktime_get(), period);
+}
+
+static void mchp_core_pwm_wait_for_sync_update(struct mchp_core_pwm_chip *mchp_core_pwm,
+					       unsigned int channel)
+{
+	/*
+	 * If a shadow register is used for this PWM channel, and iff there is
+	 * a pending update to the waveform, we must wait for it to be applied
+	 * before attempting to read its state. Reading the registers yields
+	 * the currently implemented settings & the new ones are only readable
+	 * once the current period has ended.
+	 */
+
+	if (mchp_core_pwm->sync_update_mask & (1 << channel)) {
+		ktime_t current_time = ktime_get();
+		s64 remaining_ns;
+		u32 delay_us;
+
+		remaining_ns = ktime_to_ns(ktime_sub(mchp_core_pwm->update_timestamp,
+						     current_time));
+
+		/*
+		 * If the update has gone through, don't bother waiting for
+		 * obvious reasons. Otherwise wait around for an appropriate
+		 * amount of time for the update to go through.
+		 */
+		if (remaining_ns <= 0)
+			return;
+
+		delay_us = DIV_ROUND_UP_ULL(remaining_ns, NSEC_PER_USEC);
+		fsleep(delay_us);
+	}
+}
+
+static u64 mchp_core_pwm_calc_duty(const struct pwm_state *state, u64 clk_rate,
+				   u8 prescale, u8 period_steps)
+{
+	u64 duty_steps, tmp;
+	u16 prescale_val = PREG_TO_VAL(prescale);
+
+	/*
+	 * Calculate the duty cycle in multiples of the prescaled period:
+	 * duty_steps = duty_in_ns / step_in_ns
+	 * step_in_ns = (prescale * NSEC_PER_SEC) / clk_rate
+	 * The code below is rearranged slightly to only divide once.
+	 */
+	tmp = prescale_val * NSEC_PER_SEC;
+	duty_steps = mul_u64_u64_div_u64(state->duty_cycle, clk_rate, tmp);
+
+	return duty_steps;
+}
+
+static void mchp_core_pwm_apply_duty(struct pwm_chip *chip, struct pwm_device *pwm,
+				     const struct pwm_state *state, u64 duty_steps,
+				     u16 period_steps)
+{
+	struct mchp_core_pwm_chip *mchp_core_pwm = to_mchp_core_pwm(chip);
+	u8 posedge, negedge;
+	u8 first_edge = 0, second_edge = duty_steps;
+
+	/*
+	 * Setting posedge == negedge doesn't yield a constant output,
+	 * so that's an unsuitable setting to model duty_steps = 0.
+	 * In that case set the unwanted edge to a value that never
+	 * triggers.
+	 */
+	if (duty_steps == 0)
+		first_edge = PREG_TO_VAL(period_steps);
+
+	if (state->polarity == PWM_POLARITY_INVERSED) {
+		negedge = first_edge;
+		posedge = second_edge;
+	} else {
+		posedge = first_edge;
+		negedge = second_edge;
+	}
+
+	writel_relaxed(posedge, mchp_core_pwm->base + MCHPCOREPWM_POSEDGE(pwm->hwpwm));
+	writel_relaxed(negedge, mchp_core_pwm->base + MCHPCOREPWM_NEGEDGE(pwm->hwpwm));
+}
+
+static void mchp_core_pwm_calc_period(const struct pwm_state *state, unsigned long clk_rate,
+				      u16 *prescale, u16 *period_steps)
+{
+	u64 tmp;
+
+	/*
+	 * Calculate the period cycles and prescale values.
+	 * The registers are each 8 bits wide & multiplied to compute the period
+	 * using the formula:
+	 *           (prescale + 1) * (period_steps + 1)
+	 * period = -------------------------------------
+	 *                      clk_rate
+	 * so the maximum period that can be generated is 0x10000 times the
+	 * period of the input clock.
+	 * However, due to the design of the "hardware", it is not possible to
+	 * attain a 100% duty cycle if the full range of period_steps is used.
+	 * Therefore period_steps is restricted to 0xFE and the maximum multiple
+	 * of the clock period attainable is 0xFF00.
+	 */
+	tmp = mul_u64_u64_div_u64(state->period, clk_rate, NSEC_PER_SEC);
+
+	/*
+	 * The hardware adds one to the register value, so decrement by one to
+	 * account for the offset
+	 */
+	if (tmp >= MCHPCOREPWM_PERIOD_MAX) {
+		*prescale = MCHPCOREPWM_PRESCALE_MAX - 1;
+		*period_steps = MCHPCOREPWM_PERIOD_STEPS_MAX - 1;
+
+		return;
+	}
+
+	/*
+	 * The optimal value for prescale can be calculated using the maximum
+	 * permitted value of period_steps, 0xff.
+	 *
+	 *             period * clk_rate
+	 * prescale = ------------------- - 1
+	 *            NSEC_PER_SEC * 0xff
+	 *
+	 * However, we are purely interested in the integer upper bound of this
+	 * calculation, so ignore the subtraction & rely on the truncation done
+	 * by the division.
+	 *
+	 *  period * clk_rate
+	 * ------------------- was precomputed as `tmp`
+	 *    NSEC_PER_SEC
+	 *
+	 * period_steps is then computed using the result:
+	 *                      period * clk_rate
+	 * period_steps = ----------------------------- - 1
+	 *                NSEC_PER_SEC * (prescale + 1)
+	 */
+	*prescale = div_u64(tmp, MCHPCOREPWM_PERIOD_STEPS_MAX);
+	*period_steps = div_u64(tmp, PREG_TO_VAL(*prescale)) - 1;
+}
+
+static int mchp_core_pwm_apply_locked(struct pwm_chip *chip, struct pwm_device *pwm,
+				      const struct pwm_state *state)
+{
+	struct mchp_core_pwm_chip *mchp_core_pwm = to_mchp_core_pwm(chip);
+	bool period_locked;
+	unsigned long clk_rate;
+	u64 duty_steps;
+	u16 prescale, period_steps;
+
+	if (!state->enabled) {
+		mchp_core_pwm_enable(chip, pwm, false, pwm->state.period);
+		return 0;
+	}
+
+	/*
+	 * If clk_rate is too big, the following multiplication might overflow.
+	 * However this is implausible, as the fabric of current FPGAs cannot
+	 * provide clocks at a rate high enough.
+	 */
+	clk_rate = clk_get_rate(mchp_core_pwm->clk);
+	if (clk_rate >= NSEC_PER_SEC)
+		return -EINVAL;
+
+	mchp_core_pwm_calc_period(state, clk_rate, &prescale, &period_steps);
+
+	/*
+	 * If the only thing that has changed is the duty cycle or the polarity,
+	 * we can shortcut the calculations and just compute/apply the new duty
+	 * cycle pos & neg edges
+	 * As all the channels share the same period, do not allow it to be
+	 * changed if any other channels are enabled.
+	 * If the period is locked, it may not be possible to use a period
+	 * less than that requested. In that case, we just abort.
+	 */
+	period_locked = mchp_core_pwm->channel_enabled & ~(1 << pwm->hwpwm);
+
+	if (period_locked) {
+		u16 hw_prescale;
+		u16 hw_period_steps;
+
+		hw_prescale = readb_relaxed(mchp_core_pwm->base + MCHPCOREPWM_PRESCALE);
+		hw_period_steps = readb_relaxed(mchp_core_pwm->base + MCHPCOREPWM_PERIOD);
+
+		if ((period_steps + 1) * (prescale + 1) <
+		    (hw_period_steps + 1) * (hw_prescale + 1))
+			return -EINVAL;
+
+		/*
+		 * It is possible that something could have set the period_steps
+		 * register to 0xff, which would prevent us from setting a 100%
+		 * or 0% relative duty cycle, as explained above in
+		 * mchp_core_pwm_calc_period().
+		 * The period is locked and we cannot change this, so we abort.
+		 */
+		if (hw_period_steps == MCHPCOREPWM_PERIOD_STEPS_MAX)
+			return -EINVAL;
+
+		prescale = hw_prescale;
+		period_steps = hw_period_steps;
+	} else {
+		writel_relaxed(prescale, mchp_core_pwm->base + MCHPCOREPWM_PRESCALE);
+		writel_relaxed(period_steps, mchp_core_pwm->base + MCHPCOREPWM_PERIOD);
+	}
+
+	duty_steps = mchp_core_pwm_calc_duty(state, clk_rate, prescale, period_steps);
+
+	/*
+	 * Because the period is per channel, it is possible that the requested
+	 * duty cycle is longer than the period, in which case cap it to the
+	 * period, IOW a 100% duty cycle.
+	 */
+	if (duty_steps > period_steps)
+		duty_steps = period_steps + 1;
+
+	mchp_core_pwm_apply_duty(chip, pwm, state, duty_steps, period_steps);
+
+	mchp_core_pwm_enable(chip, pwm, true, pwm->state.period);
+
+	return 0;
+}
+
+static int mchp_core_pwm_apply(struct pwm_chip *chip, struct pwm_device *pwm,
+			       const struct pwm_state *state)
+{
+	struct mchp_core_pwm_chip *mchp_core_pwm = to_mchp_core_pwm(chip);
+	int ret;
+
+	mutex_lock(&mchp_core_pwm->lock);
+
+	mchp_core_pwm_wait_for_sync_update(mchp_core_pwm, pwm->hwpwm);
+
+	ret = mchp_core_pwm_apply_locked(chip, pwm, state);
+
+	mutex_unlock(&mchp_core_pwm->lock);
+
+	return ret;
+}
+
+static int mchp_core_pwm_get_state(struct pwm_chip *chip, struct pwm_device *pwm,
+				   struct pwm_state *state)
+{
+	struct mchp_core_pwm_chip *mchp_core_pwm = to_mchp_core_pwm(chip);
+	u64 rate;
+	u16 prescale, period_steps;
+	u8 duty_steps, posedge, negedge;
+
+	mutex_lock(&mchp_core_pwm->lock);
+
+	mchp_core_pwm_wait_for_sync_update(mchp_core_pwm, pwm->hwpwm);
+
+	if (mchp_core_pwm->channel_enabled & (1 << pwm->hwpwm))
+		state->enabled = true;
+	else
+		state->enabled = false;
+
+	rate = clk_get_rate(mchp_core_pwm->clk);
+
+	prescale = PREG_TO_VAL(readb_relaxed(mchp_core_pwm->base + MCHPCOREPWM_PRESCALE));
+
+	period_steps = PREG_TO_VAL(readb_relaxed(mchp_core_pwm->base + MCHPCOREPWM_PERIOD));
+	state->period = period_steps * prescale;
+	state->period *= NSEC_PER_SEC;
+	state->period = DIV64_U64_ROUND_UP(state->period, rate);
+
+	posedge = readb_relaxed(mchp_core_pwm->base + MCHPCOREPWM_POSEDGE(pwm->hwpwm));
+	negedge = readb_relaxed(mchp_core_pwm->base + MCHPCOREPWM_NEGEDGE(pwm->hwpwm));
+
+	mutex_unlock(&mchp_core_pwm->lock);
+
+	if (negedge == posedge) {
+		state->duty_cycle = state->period;
+		state->period *= 2;
+	} else {
+		duty_steps = abs((s16)posedge - (s16)negedge);
+		state->duty_cycle = duty_steps * prescale * NSEC_PER_SEC;
+		state->duty_cycle = DIV64_U64_ROUND_UP(state->duty_cycle, rate);
+	}
+
+	state->polarity = negedge < posedge ? PWM_POLARITY_INVERSED : PWM_POLARITY_NORMAL;
+
+	return 0;
+}
+
+static const struct pwm_ops mchp_core_pwm_ops = {
+	.apply = mchp_core_pwm_apply,
+	.get_state = mchp_core_pwm_get_state,
+	.owner = THIS_MODULE,
+};
+
+static const struct of_device_id mchp_core_of_match[] = {
+	{
+		.compatible = "microchip,corepwm-rtl-v4",
+	},
+	{ /* sentinel */ }
+};
+MODULE_DEVICE_TABLE(of, mchp_core_of_match);
+
+static int mchp_core_pwm_probe(struct platform_device *pdev)
+{
+	struct mchp_core_pwm_chip *mchp_core_pwm;
+	struct resource *regs;
+
+	mchp_core_pwm = devm_kzalloc(&pdev->dev, sizeof(*mchp_core_pwm), GFP_KERNEL);
+	if (!mchp_core_pwm)
+		return -ENOMEM;
+
+	mchp_core_pwm->base = devm_platform_get_and_ioremap_resource(pdev, 0, &regs);
+	if (IS_ERR(mchp_core_pwm->base))
+		return PTR_ERR(mchp_core_pwm->base);
+
+	mchp_core_pwm->clk = devm_clk_get_enabled(&pdev->dev, NULL);
+	if (IS_ERR(mchp_core_pwm->clk))
+		return dev_err_probe(&pdev->dev, PTR_ERR(mchp_core_pwm->clk),
+				     "failed to get PWM clock\n");
+
+	if (of_property_read_u32(pdev->dev.of_node, "microchip,sync-update-mask",
+				 &mchp_core_pwm->sync_update_mask))
+		mchp_core_pwm->sync_update_mask = 0;
+
+	mutex_init(&mchp_core_pwm->lock);
+
+	mchp_core_pwm->chip.dev = &pdev->dev;
+	mchp_core_pwm->chip.ops = &mchp_core_pwm_ops;
+	mchp_core_pwm->chip.npwm = 16;
+
+	mchp_core_pwm->channel_enabled = readb_relaxed(mchp_core_pwm->base + MCHPCOREPWM_EN(0));
+	mchp_core_pwm->channel_enabled |=
+		readb_relaxed(mchp_core_pwm->base + MCHPCOREPWM_EN(1)) << 8;
+
+	writel_relaxed(1U, mchp_core_pwm->base + MCHPCOREPWM_SYNC_UPD);
+	mchp_core_pwm->update_timestamp = ktime_get();
+
+	return devm_pwmchip_add(&pdev->dev, &mchp_core_pwm->chip);
+}
+
+static struct platform_driver mchp_core_pwm_driver = {
+	.driver = {
+		.name = "mchp-core-pwm",
+		.of_match_table = mchp_core_of_match,
+	},
+	.probe = mchp_core_pwm_probe,
+};
+module_platform_driver(mchp_core_pwm_driver);
+
+MODULE_LICENSE("GPL");
+MODULE_AUTHOR("Conor Dooley <conor.dooley@microchip.com>");
+MODULE_DESCRIPTION("corePWM driver for Microchip FPGAs");
-- 
2.39.2


_______________________________________________
linux-riscv mailing list
linux-riscv@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-riscv

^ permalink raw reply related	[flat|nested] 8+ messages in thread

* [PATCH v14 2/2] MAINTAINERS: add pwm to PolarFire SoC entry
  2023-03-06  9:48 [PATCH v14 0/2] Microchip Soft IP corePWM driver Conor Dooley
  2023-03-06  9:48 ` [PATCH v14 1/2] pwm: add microchip soft ip " Conor Dooley
@ 2023-03-06  9:48 ` Conor Dooley
  1 sibling, 0 replies; 8+ messages in thread
From: Conor Dooley @ 2023-03-06  9:48 UTC (permalink / raw)
  To: Thierry Reding, Uwe Kleine-König
  Cc: Daire McNamara, linux-kernel, linux-pwm, linux-riscv, Conor Dooley

Add the newly introduced pwm driver to the existing PolarFire SoC entry.

Signed-off-by: Conor Dooley <conor.dooley@microchip.com>
---
 MAINTAINERS | 1 +
 1 file changed, 1 insertion(+)

diff --git a/MAINTAINERS b/MAINTAINERS
index 8d5bc223f3053..128cc89a47d8e 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -17993,6 +17993,7 @@ F:	drivers/clk/microchip/clk-mpfs.c
 F:	drivers/i2c/busses/i2c-microchip-corei2c.c
 F:	drivers/mailbox/mailbox-mpfs.c
 F:	drivers/pci/controller/pcie-microchip-host.c
+F:	drivers/pwm/pwm-microchip-core.c
 F:	drivers/reset/reset-mpfs.c
 F:	drivers/rtc/rtc-mpfs.c
 F:	drivers/soc/microchip/mpfs-sys-controller.c
-- 
2.39.2


_______________________________________________
linux-riscv mailing list
linux-riscv@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-riscv

^ permalink raw reply related	[flat|nested] 8+ messages in thread

* Re: [PATCH v14 1/2] pwm: add microchip soft ip corePWM driver
  2023-03-06  9:48 ` [PATCH v14 1/2] pwm: add microchip soft ip " Conor Dooley
@ 2023-03-22 10:55   ` Uwe Kleine-König
  2023-03-22 22:49     ` Conor Dooley
  0 siblings, 1 reply; 8+ messages in thread
From: Uwe Kleine-König @ 2023-03-22 10:55 UTC (permalink / raw)
  To: Conor Dooley
  Cc: Thierry Reding, Daire McNamara, linux-kernel, linux-pwm, linux-riscv


[-- Attachment #1.1: Type: text/plain, Size: 15011 bytes --]

Hello,

On Mon, Mar 06, 2023 at 09:48:58AM +0000, Conor Dooley wrote:
> Add a driver that supports the Microchip FPGA "soft" PWM IP core.
> 
> Signed-off-by: Conor Dooley <conor.dooley@microchip.com>
> ---
>  drivers/pwm/Kconfig              |  10 +
>  drivers/pwm/Makefile             |   1 +
>  drivers/pwm/pwm-microchip-core.c | 441 +++++++++++++++++++++++++++++++
>  3 files changed, 452 insertions(+)
>  create mode 100644 drivers/pwm/pwm-microchip-core.c
> 
> diff --git a/drivers/pwm/Kconfig b/drivers/pwm/Kconfig
> index dae023d783a22..f42756a014ed4 100644
> --- a/drivers/pwm/Kconfig
> +++ b/drivers/pwm/Kconfig
> @@ -393,6 +393,16 @@ config PWM_MEDIATEK
>  	  To compile this driver as a module, choose M here: the module
>  	  will be called pwm-mediatek.
>  
> +config PWM_MICROCHIP_CORE
> +	tristate "Microchip corePWM PWM support"
> +	depends on SOC_MICROCHIP_POLARFIRE || COMPILE_TEST
> +	depends on HAS_IOMEM && OF
> +	help
> +	  PWM driver for Microchip FPGA soft IP core.
> +
> +	  To compile this driver as a module, choose M here: the module
> +	  will be called pwm-microchip-core.
> +
>  config PWM_MXS
>  	tristate "Freescale MXS PWM support"
>  	depends on ARCH_MXS || COMPILE_TEST
> diff --git a/drivers/pwm/Makefile b/drivers/pwm/Makefile
> index 7bf1a29f02b84..a65625359ece4 100644
> --- a/drivers/pwm/Makefile
> +++ b/drivers/pwm/Makefile
> @@ -34,6 +34,7 @@ obj-$(CONFIG_PWM_LPSS_PCI)	+= pwm-lpss-pci.o
>  obj-$(CONFIG_PWM_LPSS_PLATFORM)	+= pwm-lpss-platform.o
>  obj-$(CONFIG_PWM_MESON)		+= pwm-meson.o
>  obj-$(CONFIG_PWM_MEDIATEK)	+= pwm-mediatek.o
> +obj-$(CONFIG_PWM_MICROCHIP_CORE)	+= pwm-microchip-core.o
>  obj-$(CONFIG_PWM_MTK_DISP)	+= pwm-mtk-disp.o
>  obj-$(CONFIG_PWM_MXS)		+= pwm-mxs.o
>  obj-$(CONFIG_PWM_NTXEC)		+= pwm-ntxec.o
> diff --git a/drivers/pwm/pwm-microchip-core.c b/drivers/pwm/pwm-microchip-core.c
> new file mode 100644
> index 0000000000000..a2a2e28f39031
> --- /dev/null
> +++ b/drivers/pwm/pwm-microchip-core.c
> @@ -0,0 +1,441 @@
> +// SPDX-License-Identifier: GPL-2.0
> +/*
> + * corePWM driver for Microchip "soft" FPGA IP cores.
> + *
> + * Copyright (c) 2021-2023 Microchip Corporation. All rights reserved.
> + * Author: Conor Dooley <conor.dooley@microchip.com>
> + * Documentation:
> + * https://www.microsemi.com/document-portal/doc_download/1245275-corepwm-hb
> + *
> + * Limitations:
> + * - If the IP block is configured without "shadow registers", all register
> + *   writes will take effect immediately, causing glitches on the output.
> + *   If shadow registers *are* enabled, a write to the "SYNC_UPDATE" register
> + *   notifies the core that it needs to update the registers defining the
> + *   waveform from the contents of the "shadow registers".
> + * - The IP block has no concept of a duty cycle, only rising/falling edges of
> + *   the waveform. Unfortunately, if the rising & falling edges registers have
> + *   the same value written to them the IP block will do whichever of a rising
> + *   or a falling edge is possible. I.E. a 50% waveform at twice the requested
> + *   period. Therefore to get a 0% waveform, the output is set the max high/low
> + *   time depending on polarity.
> + * - The PWM period is set for the whole IP block not per channel. The driver
> + *   will only change the period if no other PWM output is enabled.
> + */
> +
> +#include <linux/clk.h>
> +#include <linux/delay.h>
> +#include <linux/err.h>
> +#include <linux/io.h>
> +#include <linux/ktime.h>
> +#include <linux/math.h>
> +#include <linux/module.h>
> +#include <linux/mutex.h>
> +#include <linux/of_device.h>
> +#include <linux/platform_device.h>
> +#include <linux/pwm.h>
> +
> +#define PREG_TO_VAL(PREG) ((PREG) + 1)
> +
> +#define MCHPCOREPWM_PRESCALE_MAX	0x100
> +#define MCHPCOREPWM_PERIOD_STEPS_MAX	0xff
> +#define MCHPCOREPWM_PERIOD_MAX		0xff00
> +
> +#define MCHPCOREPWM_PRESCALE	0x00
> +#define MCHPCOREPWM_PERIOD	0x04
> +#define MCHPCOREPWM_EN(i)	(0x08 + 0x04 * (i)) /* 0x08, 0x0c */
> +#define MCHPCOREPWM_POSEDGE(i)	(0x10 + 0x08 * (i)) /* 0x10, 0x18, ..., 0x88 */
> +#define MCHPCOREPWM_NEGEDGE(i)	(0x14 + 0x08 * (i)) /* 0x14, 0x1c, ..., 0x8c */
> +#define MCHPCOREPWM_SYNC_UPD	0xe4
> +#define MCHPCOREPWM_TIMEOUT_MS	100u
> +
> +struct mchp_core_pwm_chip {
> +	struct pwm_chip chip;
> +	struct clk *clk;
> +	void __iomem *base;
> +	struct mutex lock; /* protects the shared period */
> +	ktime_t update_timestamp;
> +	u32 sync_update_mask;
> +	u16 channel_enabled;
> +};
> +
> +static inline struct mchp_core_pwm_chip *to_mchp_core_pwm(struct pwm_chip *chip)
> +{
> +	return container_of(chip, struct mchp_core_pwm_chip, chip);
> +}
> +
> +static void mchp_core_pwm_enable(struct pwm_chip *chip, struct pwm_device *pwm,
> +				 bool enable, u64 period)
> +{
> +	struct mchp_core_pwm_chip *mchp_core_pwm = to_mchp_core_pwm(chip);
> +	u8 channel_enable, reg_offset, shift;
> +
> +	/*
> +	 * There are two adjacent 8 bit control regs, the lower reg controls
> +	 * 0-7 and the upper reg 8-15. Check if the pwm is in the upper reg
> +	 * and if so, offset by the bus width.
> +	 */
> +	reg_offset = MCHPCOREPWM_EN(pwm->hwpwm >> 3);
> +	shift = pwm->hwpwm & 7;
> +
> +	channel_enable = readb_relaxed(mchp_core_pwm->base + reg_offset);
> +	channel_enable &= ~(1 << shift);
> +	channel_enable |= (enable << shift);
> +
> +	writel_relaxed(channel_enable, mchp_core_pwm->base + reg_offset);
> +	mchp_core_pwm->channel_enabled &= ~BIT(pwm->hwpwm);
> +	mchp_core_pwm->channel_enabled |= enable << pwm->hwpwm;
> +
> +	/*
> +	 * Notify the block to update the waveform from the shadow registers.
> +	 * The updated values will not appear on the bus until they have been
> +	 * applied to the waveform at the beginning of the next period.
> +	 * This is a NO-OP if the channel does not have shadow registers.
> +	 */
> +	if (mchp_core_pwm->sync_update_mask & (1 << pwm->hwpwm))
> +		mchp_core_pwm->update_timestamp = ktime_add_ns(ktime_get(), period);
> +}
> +
> +static void mchp_core_pwm_wait_for_sync_update(struct mchp_core_pwm_chip *mchp_core_pwm,
> +					       unsigned int channel)
> +{
> +	/*
> +	 * If a shadow register is used for this PWM channel, and iff there is
> +	 * a pending update to the waveform, we must wait for it to be applied
> +	 * before attempting to read its state. Reading the registers yields
> +	 * the currently implemented settings & the new ones are only readable
> +	 * once the current period has ended.
> +	 */
> +
> +	if (mchp_core_pwm->sync_update_mask & (1 << channel)) {
> +		ktime_t current_time = ktime_get();
> +		s64 remaining_ns;
> +		u32 delay_us;
> +
> +		remaining_ns = ktime_to_ns(ktime_sub(mchp_core_pwm->update_timestamp,
> +						     current_time));
> +
> +		/*
> +		 * If the update has gone through, don't bother waiting for
> +		 * obvious reasons. Otherwise wait around for an appropriate
> +		 * amount of time for the update to go through.
> +		 */
> +		if (remaining_ns <= 0)
> +			return;
> +
> +		delay_us = DIV_ROUND_UP_ULL(remaining_ns, NSEC_PER_USEC);
> +		fsleep(delay_us);
> +	}
> +}
> +
> +static u64 mchp_core_pwm_calc_duty(const struct pwm_state *state, u64 clk_rate,
> +				   u8 prescale, u8 period_steps)
> +{
> +	u64 duty_steps, tmp;
> +	u16 prescale_val = PREG_TO_VAL(prescale);
> +
> +	/*
> +	 * Calculate the duty cycle in multiples of the prescaled period:
> +	 * duty_steps = duty_in_ns / step_in_ns
> +	 * step_in_ns = (prescale * NSEC_PER_SEC) / clk_rate
> +	 * The code below is rearranged slightly to only divide once.
> +	 */
> +	tmp = prescale_val * NSEC_PER_SEC;
> +	duty_steps = mul_u64_u64_div_u64(state->duty_cycle, clk_rate, tmp);
> +
> +	return duty_steps;
> +}
> +
> +static void mchp_core_pwm_apply_duty(struct pwm_chip *chip, struct pwm_device *pwm,
> +				     const struct pwm_state *state, u64 duty_steps,
> +				     u16 period_steps)
> +{
> +	struct mchp_core_pwm_chip *mchp_core_pwm = to_mchp_core_pwm(chip);
> +	u8 posedge, negedge;
> +	u8 first_edge = 0, second_edge = duty_steps;
> +
> +	/*
> +	 * Setting posedge == negedge doesn't yield a constant output,
> +	 * so that's an unsuitable setting to model duty_steps = 0.
> +	 * In that case set the unwanted edge to a value that never
> +	 * triggers.
> +	 */
> +	if (duty_steps == 0)
> +		first_edge = PREG_TO_VAL(period_steps);
> +
> +	if (state->polarity == PWM_POLARITY_INVERSED) {
> +		negedge = first_edge;
> +		posedge = second_edge;
> +	} else {
> +		posedge = first_edge;
> +		negedge = second_edge;
> +	}
> +
> +	writel_relaxed(posedge, mchp_core_pwm->base + MCHPCOREPWM_POSEDGE(pwm->hwpwm));
> +	writel_relaxed(negedge, mchp_core_pwm->base + MCHPCOREPWM_NEGEDGE(pwm->hwpwm));
> +}
> +
> +static void mchp_core_pwm_calc_period(const struct pwm_state *state, unsigned long clk_rate,
> +				      u16 *prescale, u16 *period_steps)
> +{
> +	u64 tmp;
> +
> +	/*
> +	 * Calculate the period cycles and prescale values.
> +	 * The registers are each 8 bits wide & multiplied to compute the period
> +	 * using the formula:
> +	 *           (prescale + 1) * (period_steps + 1)
> +	 * period = -------------------------------------
> +	 *                      clk_rate
> +	 * so the maximum period that can be generated is 0x10000 times the
> +	 * period of the input clock.
> +	 * However, due to the design of the "hardware", it is not possible to
> +	 * attain a 100% duty cycle if the full range of period_steps is used.
> +	 * Therefore period_steps is restricted to 0xFE and the maximum multiple
> +	 * of the clock period attainable is 0xFF00.
> +	 */
> +	tmp = mul_u64_u64_div_u64(state->period, clk_rate, NSEC_PER_SEC);
> +
> +	/*
> +	 * The hardware adds one to the register value, so decrement by one to
> +	 * account for the offset
> +	 */
> +	if (tmp >= MCHPCOREPWM_PERIOD_MAX) {
> +		*prescale = MCHPCOREPWM_PRESCALE_MAX - 1;
> +		*period_steps = MCHPCOREPWM_PERIOD_STEPS_MAX - 1;
> +
> +		return;
> +	}
> +
> +	/*
> +	 * The optimal value for prescale can be calculated using the maximum
> +	 * permitted value of period_steps, 0xff.

I had to think about that one for a while. The maximal value for
(period_steps + 1) is 0xff with the reasoning above?! That's also what
the code uses.

Also as the comment is written here, it's wrong (or misleading)
depending on the semantic of "optimal". If you want to achive

	(prescale + 1) * (period_steps + 1) <= 64009

you should pick prescale == period_steps == 252 to get optimally near
64009.
However the idea is to pick a set of values with period_steps being big
to allow a finegrained selection for the duty cycle, right?

Consider

	clk_rate = 1000000
	period = 64009000

then your code gives:

              period * clk_rate
	tmp = ----------------- = 64009
                NSEC_PER_SEC

and so *prescale = 251 and *period_steps = 253. 

Wasn't the intention to pick *prescale = 250 and then
*period_steps = 255?

Depending on your semantics of "optimal", either (252, 252) or (250,
255) is better than (251, 253). I think that means you shouldn't ignore
the -1?

One thing I think is strange is that with clk_rate = 1000001 and your
algorithm we get:

requested period = 1274998 ns -> real period = 1269998.73000127  (prescale = 4, period_steps = 253)
requested period = 1274999 ns -> real period = 1271998.728001272 (prescale = 5, period_steps = 211)

while 1271998.728001272 would be a better match for a request with
period = 1274998 than 1269998.73000127.

I spend too much time to think about that now. I'm unsure if this is
because the -1 is missing, or if there is a bug in the idea to pick a
small prescale to allow a big period_steps value (in combination with
the request to pick the biggest possible period).

Hmm, maybe you understand that better than me? I'll have to think about
it.

> +	 *
> +	 *             period * clk_rate
> +	 * prescale = ------------------- - 1
> +	 *            NSEC_PER_SEC * 0xff
> +	 *
> +	 * However, we are purely interested in the integer upper bound of this
> +	 * calculation, so ignore the subtraction & rely on the truncation done
> +	 * by the division.
> +	 *
> +	 *  period * clk_rate
> +	 * ------------------- was precomputed as `tmp`
> +	 *    NSEC_PER_SEC
> +	 *
> +	 * period_steps is then computed using the result:
> +	 *                      period * clk_rate
> +	 * period_steps = ----------------------------- - 1
> +	 *                NSEC_PER_SEC * (prescale + 1)
> +	 */
> +	*prescale = div_u64(tmp, MCHPCOREPWM_PERIOD_STEPS_MAX);
> +	*period_steps = div_u64(tmp, PREG_TO_VAL(*prescale)) - 1;
> +}
> +
> [..]
> +static int mchp_core_pwm_probe(struct platform_device *pdev)
> +{
> +	struct mchp_core_pwm_chip *mchp_core_pwm;
> +	struct resource *regs;
> +
> +	mchp_core_pwm = devm_kzalloc(&pdev->dev, sizeof(*mchp_core_pwm), GFP_KERNEL);
> +	if (!mchp_core_pwm)
> +		return -ENOMEM;
> +
> +	mchp_core_pwm->base = devm_platform_get_and_ioremap_resource(pdev, 0, &regs);
> +	if (IS_ERR(mchp_core_pwm->base))
> +		return PTR_ERR(mchp_core_pwm->base);
> +
> +	mchp_core_pwm->clk = devm_clk_get_enabled(&pdev->dev, NULL);
> +	if (IS_ERR(mchp_core_pwm->clk))
> +		return dev_err_probe(&pdev->dev, PTR_ERR(mchp_core_pwm->clk),
> +				     "failed to get PWM clock\n");
> +
> +	if (of_property_read_u32(pdev->dev.of_node, "microchip,sync-update-mask",
> +				 &mchp_core_pwm->sync_update_mask))
> +		mchp_core_pwm->sync_update_mask = 0;
> +
> +	mutex_init(&mchp_core_pwm->lock);
> +
> +	mchp_core_pwm->chip.dev = &pdev->dev;
> +	mchp_core_pwm->chip.ops = &mchp_core_pwm_ops;
> +	mchp_core_pwm->chip.npwm = 16;
> +
> +	mchp_core_pwm->channel_enabled = readb_relaxed(mchp_core_pwm->base + MCHPCOREPWM_EN(0));
> +	mchp_core_pwm->channel_enabled |=
> +		readb_relaxed(mchp_core_pwm->base + MCHPCOREPWM_EN(1)) << 8;
> +
> +	writel_relaxed(1U, mchp_core_pwm->base + MCHPCOREPWM_SYNC_UPD);

This one is just for the case where there is an unapplied configuration
in the registers, right?

> +	mchp_core_pwm->update_timestamp = ktime_get();
> +
> +	return devm_pwmchip_add(&pdev->dev, &mchp_core_pwm->chip);

An error message if devm_pwmchip_add() fails would be nice.

> +}
> +
> +static struct platform_driver mchp_core_pwm_driver = {
> +	.driver = {
> +		.name = "mchp-core-pwm",
> +		.of_match_table = mchp_core_of_match,
> +	},
> +	.probe = mchp_core_pwm_probe,
> +};
> +module_platform_driver(mchp_core_pwm_driver);
> +
> +MODULE_LICENSE("GPL");
> +MODULE_AUTHOR("Conor Dooley <conor.dooley@microchip.com>");
> +MODULE_DESCRIPTION("corePWM driver for Microchip FPGAs");
> -- 
> 2.39.2
> 
> 

-- 
Pengutronix e.K.                           | Uwe Kleine-König            |
Industrial Linux Solutions                 | https://www.pengutronix.de/ |

[-- Attachment #1.2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

[-- Attachment #2: Type: text/plain, Size: 161 bytes --]

_______________________________________________
linux-riscv mailing list
linux-riscv@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-riscv

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH v14 1/2] pwm: add microchip soft ip corePWM driver
  2023-03-22 10:55   ` Uwe Kleine-König
@ 2023-03-22 22:49     ` Conor Dooley
  2023-03-23  9:14       ` Uwe Kleine-König
  0 siblings, 1 reply; 8+ messages in thread
From: Conor Dooley @ 2023-03-22 22:49 UTC (permalink / raw)
  To: Uwe Kleine-König
  Cc: Conor Dooley, Thierry Reding, Daire McNamara, linux-kernel,
	linux-pwm, linux-riscv


[-- Attachment #1.1: Type: text/plain, Size: 7484 bytes --]

On Wed, Mar 22, 2023 at 11:55:36AM +0100, Uwe Kleine-König wrote:
> On Mon, Mar 06, 2023 at 09:48:58AM +0000, Conor Dooley wrote:
> > Add a driver that supports the Microchip FPGA "soft" PWM IP core.

> > +static void mchp_core_pwm_calc_period(const struct pwm_state *state, unsigned long clk_rate,
> > +				      u16 *prescale, u16 *period_steps)
> > +{
> > +	u64 tmp;
> > +
> > +	/*
> > +	 * Calculate the period cycles and prescale values.
> > +	 * The registers are each 8 bits wide & multiplied to compute the period
> > +	 * using the formula:
> > +	 *           (prescale + 1) * (period_steps + 1)
> > +	 * period = -------------------------------------
> > +	 *                      clk_rate
> > +	 * so the maximum period that can be generated is 0x10000 times the
> > +	 * period of the input clock.
> > +	 * However, due to the design of the "hardware", it is not possible to
> > +	 * attain a 100% duty cycle if the full range of period_steps is used.
> > +	 * Therefore period_steps is restricted to 0xFE and the maximum multiple
> > +	 * of the clock period attainable is 0xFF00.
> > +	 */
> > +	tmp = mul_u64_u64_div_u64(state->period, clk_rate, NSEC_PER_SEC);
> > +
> > +	/*
> > +	 * The hardware adds one to the register value, so decrement by one to
> > +	 * account for the offset
> > +	 */
> > +	if (tmp >= MCHPCOREPWM_PERIOD_MAX) {
> > +		*prescale = MCHPCOREPWM_PRESCALE_MAX - 1;
> > +		*period_steps = MCHPCOREPWM_PERIOD_STEPS_MAX - 1;
> > +
> > +		return;
> > +	}
> > +
> > +	/*
> > +	 * The optimal value for prescale can be calculated using the maximum
> > +	 * permitted value of period_steps, 0xff.
> 
> I had to think about that one for a while. The maximal value for
> (period_steps + 1) is 0xff with the reasoning above?! That's also what
> the code uses.

I've become really disenfranchised with these register/variable names.
I feel like just changing them to disconnect the variables used for
calculation from the register names a little, so that the "is there a +1
needed here or not" stuff becomes a little clearer.
It always makes sense to be when I am in an "I respun the patch today"
mode, but by the time we're in the review stage I get muddled.
God forbid I have to look at this in 10 years time.

That said, there is a bit of a mistake here. The comment two above says
"Therefore period_steps is restricted to 0xFE" when I'm capping things
off. Some inaccuracies have probably snuck in during the various
respins, and I think the comment above is "correct" but misleading, as
it muddies the waters about variable versus register names.

> Also as the comment is written here, it's wrong (or misleading)
> depending on the semantic of "optimal". If you want to achive
> 
> 	(prescale + 1) * (period_steps + 1) <= 64009
> 
> you should pick prescale == period_steps == 252 to get optimally near
> 64009.
> However the idea is to pick a set of values with period_steps being big
> to allow a finegrained selection for the duty cycle, right?

Correct. I'll update the comments with an explanation as to what the
objective is, rather than just referring to it as "optimal".

> Consider
> 
> 	clk_rate = 1000000
> 	period = 64009000
> 
> then your code gives:
> 
>               period * clk_rate
> 	tmp = ----------------- = 64009
>                 NSEC_PER_SEC
> 
> and so *prescale = 251 and *period_steps = 253. 
> 
> Wasn't the intention to pick *prescale = 250 and then
> *period_steps = 255?
> 
> Depending on your semantics of "optimal", either (252, 252) or (250,
> 255) is better than (251, 253). I think that means you shouldn't ignore
> the -1?
> 
> One thing I think is strange is that with clk_rate = 1000001 and your
> algorithm we get:
> 
> requested period = 1274998 ns -> real period = 1269998.73000127  (prescale = 4, period_steps = 253)
> requested period = 1274999 ns -> real period = 1271998.728001272 (prescale = 5, period_steps = 211)
> 
> while 1271998.728001272 would be a better match for a request with
> period = 1274998 than 1269998.73000127.
> 
> I spend too much time to think about that now. I'm unsure if this is
> because the -1 is missing, or if there is a bug in the idea to pick a
> small prescale to allow a big period_steps value (in combination with
> the request to pick the biggest possible period).
> 
> Hmm, maybe you understand that better than me? I'll have to think about
> it.

I'll have to think about it too, I'll clear a space among the todo-lists
on my whiteboard tomorrow or Friday and get back to you...

> 
> > +	 *
> > +	 *             period * clk_rate
> > +	 * prescale = ------------------- - 1
> > +	 *            NSEC_PER_SEC * 0xff
> > +	 *
> > +	 * However, we are purely interested in the integer upper bound of this
> > +	 * calculation, so ignore the subtraction & rely on the truncation done
> > +	 * by the division.
> > +	 *
> > +	 *  period * clk_rate
> > +	 * ------------------- was precomputed as `tmp`
> > +	 *    NSEC_PER_SEC
> > +	 *
> > +	 * period_steps is then computed using the result:
> > +	 *                      period * clk_rate
> > +	 * period_steps = ----------------------------- - 1
> > +	 *                NSEC_PER_SEC * (prescale + 1)
> > +	 */
> > +	*prescale = div_u64(tmp, MCHPCOREPWM_PERIOD_STEPS_MAX);
> > +	*period_steps = div_u64(tmp, PREG_TO_VAL(*prescale)) - 1;
> > +}
> > +
> > [..]
> > +static int mchp_core_pwm_probe(struct platform_device *pdev)
> > +{
> > +	struct mchp_core_pwm_chip *mchp_core_pwm;
> > +	struct resource *regs;
> > +
> > +	mchp_core_pwm = devm_kzalloc(&pdev->dev, sizeof(*mchp_core_pwm), GFP_KERNEL);
> > +	if (!mchp_core_pwm)
> > +		return -ENOMEM;
> > +
> > +	mchp_core_pwm->base = devm_platform_get_and_ioremap_resource(pdev, 0, &regs);
> > +	if (IS_ERR(mchp_core_pwm->base))
> > +		return PTR_ERR(mchp_core_pwm->base);
> > +
> > +	mchp_core_pwm->clk = devm_clk_get_enabled(&pdev->dev, NULL);
> > +	if (IS_ERR(mchp_core_pwm->clk))
> > +		return dev_err_probe(&pdev->dev, PTR_ERR(mchp_core_pwm->clk),
> > +				     "failed to get PWM clock\n");
> > +
> > +	if (of_property_read_u32(pdev->dev.of_node, "microchip,sync-update-mask",
> > +				 &mchp_core_pwm->sync_update_mask))
> > +		mchp_core_pwm->sync_update_mask = 0;
> > +
> > +	mutex_init(&mchp_core_pwm->lock);
> > +
> > +	mchp_core_pwm->chip.dev = &pdev->dev;
> > +	mchp_core_pwm->chip.ops = &mchp_core_pwm_ops;
> > +	mchp_core_pwm->chip.npwm = 16;
> > +
> > +	mchp_core_pwm->channel_enabled = readb_relaxed(mchp_core_pwm->base + MCHPCOREPWM_EN(0));
> > +	mchp_core_pwm->channel_enabled |=
> > +		readb_relaxed(mchp_core_pwm->base + MCHPCOREPWM_EN(1)) << 8;
> > +
> > +	writel_relaxed(1U, mchp_core_pwm->base + MCHPCOREPWM_SYNC_UPD);
> 
> This one is just for the case where there is an unapplied configuration
> in the registers, right?

No, this is me realising that I had a misconception about how that
register works. You write the bit once, and it enables the mode for
channels that have been configured that way at synthesis-time, rather
than how I somehow thought it worked which was as a "flush" from the
shadow registers into the "real" ones.

> 
> > +	mchp_core_pwm->update_timestamp = ktime_get();
> > +
> > +	return devm_pwmchip_add(&pdev->dev, &mchp_core_pwm->chip);
> 
> An error message if devm_pwmchip_add() fails would be nice.

Sure, can do!

Thanks,
Conor.

[-- Attachment #1.2: signature.asc --]
[-- Type: application/pgp-signature, Size: 228 bytes --]

[-- Attachment #2: Type: text/plain, Size: 161 bytes --]

_______________________________________________
linux-riscv mailing list
linux-riscv@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-riscv

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH v14 1/2] pwm: add microchip soft ip corePWM driver
  2023-03-22 22:49     ` Conor Dooley
@ 2023-03-23  9:14       ` Uwe Kleine-König
  2023-03-24 18:42         ` Conor Dooley
  0 siblings, 1 reply; 8+ messages in thread
From: Uwe Kleine-König @ 2023-03-23  9:14 UTC (permalink / raw)
  To: Conor Dooley
  Cc: Conor Dooley, Thierry Reding, Daire McNamara, linux-kernel,
	linux-pwm, linux-riscv


[-- Attachment #1.1: Type: text/plain, Size: 6093 bytes --]

Hello Conor,

On Wed, Mar 22, 2023 at 10:49:40PM +0000, Conor Dooley wrote:
> On Wed, Mar 22, 2023 at 11:55:36AM +0100, Uwe Kleine-König wrote:
> > On Mon, Mar 06, 2023 at 09:48:58AM +0000, Conor Dooley wrote:
> > > Add a driver that supports the Microchip FPGA "soft" PWM IP core.
> 
> > > +static void mchp_core_pwm_calc_period(const struct pwm_state *state, unsigned long clk_rate,
> > > +				      u16 *prescale, u16 *period_steps)
> > > +{
> > > +	u64 tmp;
> > > +
> > > +	/*
> > > +	 * Calculate the period cycles and prescale values.
> > > +	 * The registers are each 8 bits wide & multiplied to compute the period
> > > +	 * using the formula:
> > > +	 *           (prescale + 1) * (period_steps + 1)
> > > +	 * period = -------------------------------------
> > > +	 *                      clk_rate
> > > +	 * so the maximum period that can be generated is 0x10000 times the
> > > +	 * period of the input clock.
> > > +	 * However, due to the design of the "hardware", it is not possible to
> > > +	 * attain a 100% duty cycle if the full range of period_steps is used.
> > > +	 * Therefore period_steps is restricted to 0xFE and the maximum multiple
> > > +	 * of the clock period attainable is 0xFF00.
> > > +	 */
> > > +	tmp = mul_u64_u64_div_u64(state->period, clk_rate, NSEC_PER_SEC);
> > > +
> > > +	/*
> > > +	 * The hardware adds one to the register value, so decrement by one to
> > > +	 * account for the offset
> > > +	 */
> > > +	if (tmp >= MCHPCOREPWM_PERIOD_MAX) {
> > > +		*prescale = MCHPCOREPWM_PRESCALE_MAX - 1;
> > > +		*period_steps = MCHPCOREPWM_PERIOD_STEPS_MAX - 1;
> > > +
> > > +		return;
> > > +	}
> > > +
> > > +	/*
> > > +	 * The optimal value for prescale can be calculated using the maximum
> > > +	 * permitted value of period_steps, 0xff.
> > 
> > I had to think about that one for a while. The maximal value for
> > (period_steps + 1) is 0xff with the reasoning above?! That's also what
> > the code uses.
> 
> I've become really disenfranchised with these register/variable names.
> I feel like just changing them to disconnect the variables used for
> calculation from the register names a little, so that the "is there a +1
> needed here or not" stuff becomes a little clearer.

Full ack, I considered asking for that, but after some time I was in the
"I reviewed the patch today"-mode (which is quite similar to the mode
you described :-) and forgot. (Even in that mode the PREG_TO_VAL macro
annoyed me a bit.)

> It always makes sense to be when I am in an "I respun the patch today"
> mode, but by the time we're in the review stage I get muddled.
> God forbid I have to look at this in 10 years time.
> 
> That said, there is a bit of a mistake here. The comment two above says
> "Therefore period_steps is restricted to 0xFE" when I'm capping things
> off. Some inaccuracies have probably snuck in during the various
> respins, and I think the comment above is "correct" but misleading, as
> it muddies the waters about variable versus register names.

I think it's sensible to only talk about either the register values or
the factors. I tend to think that talking about the register values is
easier at the end and recommend not to hide the +1 (or -1) in a macro.

Having said that here are my results of thinking a bit about how to
choose register values:

Let r(p) denote the period that is actually configured if p is
requested.

The nice properties we want (i.e. those a consumer might expect?) are:

 a) ∀p: r(p) ≤ p
    i.e. never configure a bigger period than requested
    (This is a bit arbitrary, but nice to get a consistent behaviour for
    all drivers and easier to handle than requesting the nearest
    possible period.)

 b) ∀p, q: p ≤ q ⟹ r(p) ≤ r(q)
    i.e. monotonicity

 c) ∀p: r(roundup(r(p))) = r(p)
    i.e. idempotency

 d) ∀p, q: r(p) ≤ q ⟹ r(p) ≤ r(q)
    i.e. pick the biggest possible period

(Sidenote: d) and a) imply b) and c))

If you always set period_steps to 0xfe
(in Python syntax:

	tmp = p * clkrate // NSPS
	period_steps = 0xfe
	prescale = tmp // (period_steps + 1) - 1

) you get this consistent behaviour.

This has the upside of being easy to implement and cheap to run.
Downside is that it's coarse and fails to implement periods that would
require e.g prescale = 0 and period_steps < 0xfe. As period_steps is
big, the domain to chose the duty cycle from is good.

If you pick period_steps and prescale such that
	(period_steps + 1) * (prescale + 1)
is the maximal value that makes r(p) ≤ p you have an increased runtime
because you have to test multiple values for period_steps. I don't think
there is an algorithm without a loop to determine an optimal pair?!
Usually this gives a better match that the easy algorithm above for the
period, but the domain for duty_cycle is (usually) smaller.
I think we have a) and d) here, so that's good.

I think for most consumers a big domain for duty_cycle is more important
that a good match for the requested period. So I tend to recommend the
easy algorithm, but I'm not religious about that and open for other
opinion and reasoning.

> > > +	writel_relaxed(1U, mchp_core_pwm->base + MCHPCOREPWM_SYNC_UPD);
> > 
> > This one is just for the case where there is an unapplied configuration
> > in the registers, right?
> 
> No, this is me realising that I had a misconception about how that
> register works. You write the bit once, and it enables the mode for
> channels that have been configured that way at synthesis-time, rather
> than how I somehow thought it worked which was as a "flush" from the
> shadow registers into the "real" ones.

Then I think I got that wrong in the same way.  Should this be better
described in a comment then? (I didn't recheck, maybe there is a good
description already?)

Best regards
Uwe

-- 
Pengutronix e.K.                           | Uwe Kleine-König            |
Industrial Linux Solutions                 | https://www.pengutronix.de/ |

[-- Attachment #1.2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

[-- Attachment #2: Type: text/plain, Size: 161 bytes --]

_______________________________________________
linux-riscv mailing list
linux-riscv@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-riscv

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH v14 1/2] pwm: add microchip soft ip corePWM driver
  2023-03-23  9:14       ` Uwe Kleine-König
@ 2023-03-24 18:42         ` Conor Dooley
  2023-03-25 11:52           ` Conor Dooley
  0 siblings, 1 reply; 8+ messages in thread
From: Conor Dooley @ 2023-03-24 18:42 UTC (permalink / raw)
  To: Uwe Kleine-König
  Cc: Conor Dooley, Thierry Reding, Daire McNamara, linux-kernel,
	linux-pwm, linux-riscv


[-- Attachment #1.1: Type: text/plain, Size: 10623 bytes --]

On Thu, Mar 23, 2023 at 10:14:09AM +0100, Uwe Kleine-König wrote:
> On Wed, Mar 22, 2023 at 10:49:40PM +0000, Conor Dooley wrote:
> > On Wed, Mar 22, 2023 at 11:55:36AM +0100, Uwe Kleine-König wrote:
> > > On Mon, Mar 06, 2023 at 09:48:58AM +0000, Conor Dooley wrote:
> > > > Add a driver that supports the Microchip FPGA "soft" PWM IP core.

> > > > +static void mchp_core_pwm_calc_period(const struct pwm_state *state, unsigned long clk_rate,
> > > > +				      u16 *prescale, u16 *period_steps)
> > > > +{
> > > > +	u64 tmp;
> > > > +
> > > > +	/*
> > > > +	 * Calculate the period cycles and prescale values.
> > > > +	 * The registers are each 8 bits wide & multiplied to compute the period
> > > > +	 * using the formula:
> > > > +	 *           (prescale + 1) * (period_steps + 1)
> > > > +	 * period = -------------------------------------
> > > > +	 *                      clk_rate
> > > > +	 * so the maximum period that can be generated is 0x10000 times the
> > > > +	 * period of the input clock.
> > > > +	 * However, due to the design of the "hardware", it is not possible to
> > > > +	 * attain a 100% duty cycle if the full range of period_steps is used.
> > > > +	 * Therefore period_steps is restricted to 0xFE and the maximum multiple
> > > > +	 * of the clock period attainable is 0xFF00.
> > > > +	 */
> > > > +	tmp = mul_u64_u64_div_u64(state->period, clk_rate, NSEC_PER_SEC);
> > > > +
> > > > +	/*
> > > > +	 * The hardware adds one to the register value, so decrement by one to
> > > > +	 * account for the offset
> > > > +	 */
> > > > +	if (tmp >= MCHPCOREPWM_PERIOD_MAX) {
> > > > +		*prescale = MCHPCOREPWM_PRESCALE_MAX - 1;
> > > > +		*period_steps = MCHPCOREPWM_PERIOD_STEPS_MAX - 1;
> > > > +
> > > > +		return;
> > > > +	}
> > > > +
> > > > +	/*
> > > > +	 * The optimal value for prescale can be calculated using the maximum
> > > > +	 * permitted value of period_steps, 0xff.
> > > 
> > > I had to think about that one for a while. The maximal value for
> > > (period_steps + 1) is 0xff with the reasoning above?! That's also what
> > > the code uses.
> > 
> > I've become really disenfranchised with these register/variable names.
> > I feel like just changing them to disconnect the variables used for
> > calculation from the register names a little, so that the "is there a +1
> > needed here or not" stuff becomes a little clearer.
> 
> Full ack, I considered asking for that, but after some time I was in the
> "I reviewed the patch today"-mode (which is quite similar to the mode
> you described :-) and forgot. (Even in that mode the PREG_TO_VAL macro
> annoyed me a bit.)
> 
> > It always makes sense to be when I am in an "I respun the patch today"
> > mode, but by the time we're in the review stage I get muddled.
> > God forbid I have to look at this in 10 years time.
> > 
> > That said, there is a bit of a mistake here. The comment two above says
> > "Therefore period_steps is restricted to 0xFE" when I'm capping things
> > off. Some inaccuracies have probably snuck in during the various
> > respins, and I think the comment above is "correct" but misleading, as
> > it muddies the waters about variable versus register names.
> 
> I think it's sensible to only talk about either the register values or
> the factors. I tend to think that talking about the register values is
> easier at the end and recommend not to hide the +1 (or -1) in a macro.

Yeah, I think the macro had value about 14 versions ago, but the number
of users has dropped over the revisions.
I think what I am going to to do for the next version is drop that
macro, and only ever hold the registers values in variables.
That had the advantage of making the maths in get_state() better match
the comments that are now quite prevalent in the driver, as the +1s done
there are more obvious.
I'm loathe to link a git tree but, without changes to the period
calculation logic, this is what it looks like w/ a consistent approach
to what period_steps and prescale mean:
https://git.kernel.org/pub/scm/linux/kernel/git/conor/linux.git/tree/drivers/pwm/pwm-microchip-core.c?h=pwm-dev-v15
[I blindly pushed that before leaving work & without even building it, so
there's probably some silly mistake in it, but that's besides the point
of displaying variable/comment changes]

From the chopped out bits of the previous email:
> Consider
> 
> 	clk_rate = 1000000
> 	period = 64009000
> 
> then your code gives:
> 
>               period * clk_rate
> 	tmp = ----------------- = 64009
>                 NSEC_PER_SEC
> 
> and so *prescale = 251 and *period_steps = 253. 
> 
> Wasn't the intention to pick *prescale = 250 and then
> *period_steps = 255?
> 
> Depending on your semantics of "optimal", either (252, 252) or (250,
> 255) is better than (251, 253). I think that means you shouldn't ignore
> the -1?

So, putting this one aside because it merits more thinking about.
I went through and checked some arbitrary values of tmp, rather than
dealing with "real" ones computed from frequencies I know are easily
available for me to use in the FPGA bitstream I use to test this stuff.

I think my "integer maths" truncation approach is not actually valid.
Consider tmp = 255... *prescale gets computed as 255/(254 + 1) = 1, per my
algorithm. Then, *period_steps is 255/(1 + 1) - 1 = 127.
The period is (1 + 1)(127 + 1), which is not 255.

Or take tmp = 510. prescale is 510/(254 + 1) = 2. period_steps is then
510/(2 + 1) - 1 = 169. period is (2 + 1)(169 + 1), which is 510. The
calculated period is right, but that is not the "optimal" value!

I think you're right in that I do actually need to consider the -1, and
do a ceiling operation, when calculating prescale, IOW:
	*prescale = ceil(div_u64(tmp, MCHPCOREPWM_PERIOD_STEPS_MAX + 1)) - 1;
	*period_steps = div_u64(tmp, *prescale + 1) - 1;
	[I know I can't actually call ceil()]

That'd do the same thing as the truncation where
div_u64(tmp, MCHPCOREPWM_PERIOD_STEPS_MAX + 1) is not a round number,
but it improves the round number case, eg tmp = 510:
prescale = 510/(254 + 1) - 1 = 1, period_steps = 510/(1 + 1) - 1 = 254
period = (1 + 1)(254 + 1) = 510

It does mean a zero period would need to be special cased, but I don't
mind that.

> One thing I think is strange is that with clk_rate = 1000001 and your
> algorithm we get:
> 
> requested period = 1274998 ns -> real period = 1269998.73000127  (prescale = 4, period_steps = 253)
> requested period = 1274999 ns -> real period = 1271998.728001272 (prescale = 5, period_steps = 211)

This second one here, where tmp = 1275, is a good example of the above.
With the ceil() change, this would be prescale = 4, period_steps = 254
which I think makes more sense.
> 
> while 1271998.728001272 would be a better match for a request with
> period = 1274998 than 1269998.73000127.
> 
> I spend too much time to think about that now. I'm unsure if this is
> because the -1 is missing, or if there is a bug in the idea to pick a
> small prescale to allow a big period_steps value (in combination with
> the request to pick the biggest possible period).

With the inconsistency fixed, I think getting the slightly less accurate
period is a byproduct of prioritising the finegrainedness of the duty
cycle.

> Hmm, maybe you understand that better than me? I'll have to think about
> it.

[end of snip from the last mail]

> 
> Having said that here are my results of thinking a bit about how to
> choose register values:
> 
> Let r(p) denote the period that is actually configured if p is
> requested.
> 
> The nice properties we want (i.e. those a consumer might expect?) are:
> 
>  a) ∀p: r(p) ≤ p
>     i.e. never configure a bigger period than requested
>     (This is a bit arbitrary, but nice to get a consistent behaviour for
>     all drivers and easier to handle than requesting the nearest
>     possible period.)
> 
>  b) ∀p, q: p ≤ q ⟹ r(p) ≤ r(q)
>     i.e. monotonicity
> 
>  c) ∀p: r(roundup(r(p))) = r(p)
>     i.e. idempotency
> 
>  d) ∀p, q: r(p) ≤ q ⟹ r(p) ≤ r(q)
>     i.e. pick the biggest possible period
> 
> (Sidenote: d) and a) imply b) and c))
> 
> If you always set period_steps to 0xfe
> (in Python syntax:
> 
> 	tmp = p * clkrate // NSPS
> 	period_steps = 0xfe
> 	prescale = tmp // (period_steps + 1) - 1
> 
> ) you get this consistent behaviour.
> 
> This has the upside of being easy to implement and cheap to run.
> Downside is that it's coarse and fails to implement periods that would
> require e.g prescale = 0 and period_steps < 0xfe. As period_steps is
> big, the domain to chose the duty cycle from is good.

I want to maintain support for prescale = 0, so I'm not really
interested in a computation that forsakes that.

> If you pick period_steps and prescale such that
> 	(period_steps + 1) * (prescale + 1)
> is the maximal value that makes r(p) ≤ p you have an increased runtime
> because you have to test multiple values for period_steps. I don't think
> there is an algorithm without a loop to determine an optimal pair?!
> Usually this gives a better match that the easy algorithm above for the
> period, but the domain for duty_cycle is (usually) smaller.
> I think we have a) and d) here, so that's good.
> 
> I think for most consumers a big domain for duty_cycle is more important
> that a good match for the requested period. So I tend to recommend the
> easy algorithm, but I'm not religious about that and open for other
> opinion and reasoning.

I'll be honest and say that I am getting a bit fatigued with the way
that issues w/ the calculations keep cropping up. I'll put a bit of time
into trying to figure out how to fix the tmp = 6400900 case that you
mentioned above, but if it comes out in the wash that that is a facet of
the way this stuff is computed, then I might be inclined to forge ahead
without resolving it... I'd certainly want to explain in a comment
somewhere (limitations section?) the point at which it starts getting
less accurate though. I'll write a script to iterate through the values
of tmp & see if the reason is obvious.

I would like to keep (something resembling) the current simple
implementation of the period calculation, as simplicity is a significant
advantage! I do also like the strategy of trying to maximise the number
of available duty cycle steps, I think it makes perfect sense to make
that the goal.

Thanks again for spending time on this Uwe,
Conor.


[-- Attachment #1.2: signature.asc --]
[-- Type: application/pgp-signature, Size: 228 bytes --]

[-- Attachment #2: Type: text/plain, Size: 161 bytes --]

_______________________________________________
linux-riscv mailing list
linux-riscv@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-riscv

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH v14 1/2] pwm: add microchip soft ip corePWM driver
  2023-03-24 18:42         ` Conor Dooley
@ 2023-03-25 11:52           ` Conor Dooley
  0 siblings, 0 replies; 8+ messages in thread
From: Conor Dooley @ 2023-03-25 11:52 UTC (permalink / raw)
  To: Uwe Kleine-König
  Cc: Conor Dooley, Thierry Reding, Daire McNamara, linux-kernel,
	linux-pwm, linux-riscv


[-- Attachment #1.1: Type: text/plain, Size: 12087 bytes --]

On Fri, Mar 24, 2023 at 06:43:03PM +0000, Conor Dooley wrote:
> On Thu, Mar 23, 2023 at 10:14:09AM +0100, Uwe Kleine-König wrote:
> > On Wed, Mar 22, 2023 at 10:49:40PM +0000, Conor Dooley wrote:
> > > On Wed, Mar 22, 2023 at 11:55:36AM +0100, Uwe Kleine-König wrote:
> > > > On Mon, Mar 06, 2023 at 09:48:58AM +0000, Conor Dooley wrote:
> > > > > Add a driver that supports the Microchip FPGA "soft" PWM IP core.
> 
> > > > > +static void mchp_core_pwm_calc_period(const struct pwm_state *state, unsigned long clk_rate,
> > > > > +				      u16 *prescale, u16 *period_steps)
> > > > > +{
> > > > > +	u64 tmp;
> > > > > +
> > > > > +	/*
> > > > > +	 * Calculate the period cycles and prescale values.
> > > > > +	 * The registers are each 8 bits wide & multiplied to compute the period
> > > > > +	 * using the formula:
> > > > > +	 *           (prescale + 1) * (period_steps + 1)
> > > > > +	 * period = -------------------------------------
> > > > > +	 *                      clk_rate
> > > > > +	 * so the maximum period that can be generated is 0x10000 times the
> > > > > +	 * period of the input clock.
> > > > > +	 * However, due to the design of the "hardware", it is not possible to
> > > > > +	 * attain a 100% duty cycle if the full range of period_steps is used.
> > > > > +	 * Therefore period_steps is restricted to 0xFE and the maximum multiple
> > > > > +	 * of the clock period attainable is 0xFF00.
> > > > > +	 */
> > > > > +	tmp = mul_u64_u64_div_u64(state->period, clk_rate, NSEC_PER_SEC);
> > > > > +
> > > > > +	/*
> > > > > +	 * The hardware adds one to the register value, so decrement by one to
> > > > > +	 * account for the offset
> > > > > +	 */
> > > > > +	if (tmp >= MCHPCOREPWM_PERIOD_MAX) {
> > > > > +		*prescale = MCHPCOREPWM_PRESCALE_MAX - 1;
> > > > > +		*period_steps = MCHPCOREPWM_PERIOD_STEPS_MAX - 1;
> > > > > +
> > > > > +		return;
> > > > > +	}
> > > > > +
> > > > > +	/*
> > > > > +	 * The optimal value for prescale can be calculated using the maximum
> > > > > +	 * permitted value of period_steps, 0xff.
> > > > 
> > > > I had to think about that one for a while. The maximal value for
> > > > (period_steps + 1) is 0xff with the reasoning above?! That's also what
> > > > the code uses.
> > > 
> > > I've become really disenfranchised with these register/variable names.
> > > I feel like just changing them to disconnect the variables used for
> > > calculation from the register names a little, so that the "is there a +1
> > > needed here or not" stuff becomes a little clearer.
> > 
> > Full ack, I considered asking for that, but after some time I was in the
> > "I reviewed the patch today"-mode (which is quite similar to the mode
> > you described :-) and forgot. (Even in that mode the PREG_TO_VAL macro
> > annoyed me a bit.)
> > 
> > > It always makes sense to be when I am in an "I respun the patch today"
> > > mode, but by the time we're in the review stage I get muddled.
> > > God forbid I have to look at this in 10 years time.
> > > 
> > > That said, there is a bit of a mistake here. The comment two above says
> > > "Therefore period_steps is restricted to 0xFE" when I'm capping things
> > > off. Some inaccuracies have probably snuck in during the various
> > > respins, and I think the comment above is "correct" but misleading, as
> > > it muddies the waters about variable versus register names.
> > 
> > I think it's sensible to only talk about either the register values or
> > the factors. I tend to think that talking about the register values is
> > easier at the end and recommend not to hide the +1 (or -1) in a macro.
> 
> Yeah, I think the macro had value about 14 versions ago, but the number
> of users has dropped over the revisions.
> I think what I am going to to do for the next version is drop that
> macro, and only ever hold the registers values in variables.
> That had the advantage of making the maths in get_state() better match
> the comments that are now quite prevalent in the driver, as the +1s done
> there are more obvious.
> I'm loathe to link a git tree but, without changes to the period
> calculation logic, this is what it looks like w/ a consistent approach
> to what period_steps and prescale mean:
> https://git.kernel.org/pub/scm/linux/kernel/git/conor/linux.git/tree/drivers/pwm/pwm-microchip-core.c?h=pwm-dev-v15
> [I blindly pushed that before leaving work & without even building it, so
> there's probably some silly mistake in it, but that's besides the point
> of displaying variable/comment changes]
> 
> From the chopped out bits of the previous email:
> > Consider
> > 
> > 	clk_rate = 1000000
> > 	period = 64009000
> > 
> > then your code gives:
> > 
> >               period * clk_rate
> > 	tmp = ----------------- = 64009
> >                 NSEC_PER_SEC
> > 
> > and so *prescale = 251 and *period_steps = 253. 
> > 
> > Wasn't the intention to pick *prescale = 250 and then
> > *period_steps = 255?
> > 
> > Depending on your semantics of "optimal", either (252, 252) or (250,
> > 255) is better than (251, 253). I think that means you shouldn't ignore
> > the -1?
> 
> So, putting this one aside because it merits more thinking about.
> 
> I went through and checked some arbitrary values of tmp, rather than
> dealing with "real" ones computed from frequencies I know are easily
> available for me to use in the FPGA bitstream I use to test this stuff.
> 
> I think my "integer maths" truncation approach is not actually valid.
> Consider tmp = 255... *prescale gets computed as 255/(254 + 1) = 1, per my
> algorithm. Then, *period_steps is 255/(1 + 1) - 1 = 127.
> The period is (1 + 1)(127 + 1), which is not 255.
> 
> Or take tmp = 510. prescale is 510/(254 + 1) = 2. period_steps is then
> 510/(2 + 1) - 1 = 169. period is (2 + 1)(169 + 1), which is 510. The
> calculated period is right, but that is not the "optimal" value!
> 
> I think you're right in that I do actually need to consider the -1, and
> do a ceiling operation, when calculating prescale, IOW:
> 	*prescale = ceil(div_u64(tmp, MCHPCOREPWM_PERIOD_STEPS_MAX + 1)) - 1;
> 	*period_steps = div_u64(tmp, *prescale + 1) - 1;
> 	[I know I can't actually call ceil()]
> 
> That'd do the same thing as the truncation where
> div_u64(tmp, MCHPCOREPWM_PERIOD_STEPS_MAX + 1) is not a round number,
> but it improves the round number case, eg tmp = 510:
> prescale = 510/(254 + 1) - 1 = 1, period_steps = 510/(1 + 1) - 1 = 254
> period = (1 + 1)(254 + 1) = 510
> 
> It does mean a zero period would need to be special cased, but I don't
> mind that.
> 
> > One thing I think is strange is that with clk_rate = 1000001 and your
> > algorithm we get:
> > 
> > requested period = 1274998 ns -> real period = 1269998.73000127  (prescale = 4, period_steps = 253)
> > requested period = 1274999 ns -> real period = 1271998.728001272 (prescale = 5, period_steps = 211)
> 
> This second one here, where tmp = 1275, is a good example of the above.
> With the ceil() change, this would be prescale = 4, period_steps = 254
> which I think makes more sense.
> > 
> > while 1271998.728001272 would be a better match for a request with
> > period = 1274998 than 1269998.73000127.
> > 
> > I spend too much time to think about that now. I'm unsure if this is
> > because the -1 is missing, or if there is a bug in the idea to pick a
> > small prescale to allow a big period_steps value (in combination with
> > the request to pick the biggest possible period).
> 
> With the inconsistency fixed, I think getting the slightly less accurate
> period is a byproduct of prioritising the finegrainedness of the duty
> cycle.
> 
> > Hmm, maybe you understand that better than me? I'll have to think about
> > it.
> 
> [end of snip from the last mail]
> 
> > 
> > Having said that here are my results of thinking a bit about how to
> > choose register values:
> > 
> > Let r(p) denote the period that is actually configured if p is
> > requested.
> > 
> > The nice properties we want (i.e. those a consumer might expect?) are:
> > 
> >  a) ∀p: r(p) ≤ p
> >     i.e. never configure a bigger period than requested
> >     (This is a bit arbitrary, but nice to get a consistent behaviour for
> >     all drivers and easier to handle than requesting the nearest
> >     possible period.)
> > 
> >  b) ∀p, q: p ≤ q ⟹ r(p) ≤ r(q)
> >     i.e. monotonicity
> > 
> >  c) ∀p: r(roundup(r(p))) = r(p)
> >     i.e. idempotency
> > 
> >  d) ∀p, q: r(p) ≤ q ⟹ r(p) ≤ r(q)
> >     i.e. pick the biggest possible period
> > 
> > (Sidenote: d) and a) imply b) and c))
> > 
> > If you always set period_steps to 0xfe
> > (in Python syntax:
> > 
> > 	tmp = p * clkrate // NSPS
> > 	period_steps = 0xfe
> > 	prescale = tmp // (period_steps + 1) - 1
> > 
> > ) you get this consistent behaviour.
> > 
> > This has the upside of being easy to implement and cheap to run.
> > Downside is that it's coarse and fails to implement periods that would
> > require e.g prescale = 0 and period_steps < 0xfe. As period_steps is
> > big, the domain to chose the duty cycle from is good.
> 
> I want to maintain support for prescale = 0, so I'm not really
> interested in a computation that forsakes that.
> 
> > If you pick period_steps and prescale such that
> > 	(period_steps + 1) * (prescale + 1)
> > is the maximal value that makes r(p) ≤ p you have an increased runtime
> > because you have to test multiple values for period_steps. I don't think
> > there is an algorithm without a loop to determine an optimal pair?!
> > Usually this gives a better match that the easy algorithm above for the
> > period, but the domain for duty_cycle is (usually) smaller.
> > I think we have a) and d) here, so that's good.
> > 
> > I think for most consumers a big domain for duty_cycle is more important
> > that a good match for the requested period. So I tend to recommend the
> > easy algorithm, but I'm not religious about that and open for other
> > opinion and reasoning.
> 
> I'll be honest and say that I am getting a bit fatigued with the way
> that issues w/ the calculations keep cropping up. I'll put a bit of time
> into trying to figure out how to fix the tmp = 6400900 case that you
> mentioned above, but if it comes out in the wash that that is a facet of
> the way this stuff is computed, then I might be inclined to forge ahead
> without resolving it... I'd certainly want to explain in a comment
> somewhere (limitations section?) the point at which it starts getting
> less accurate though. I'll write a script to iterate through the values
> of tmp & see if the reason is obvious.

I threw together a python script to go through the various values of tmp
& check which ones ended up with these "non-optimal" prescale values.
For tmp < 1000, I saw:
tmp: 511 | 255 - period_steps: 86
tmp: 766 | 255 - period_steps: 65
tmp: 767 | 255 - period_steps: 65
That then grows as tmp moves upwards, so:
tmp: 1021  | 255 - period_steps: 52
tmp: 1022  | 255 - period_steps: 52
tmp: 1023  | 255 - period_steps: 52

That looks oddly like (tmp % (254 + 1)) < prescale and seems to be an
"artifact" of using 254 to calculate the prescaler rather than 255.
I think that that can be worked around while keeping the current simple
approach to calculating prescale/period_steps. Looks like 32385 out of
the 65280 valid values for tmp can benefit! That's in comparison to the
127 that would produce the invalid period_steps value of 255.

I'll go submit something fixing this next week then I suppose.

> I would like to keep (something resembling) the current simple
> implementation of the period calculation, as simplicity is a significant
> advantage! I do also like the strategy of trying to maximise the number
> of available duty cycle steps, I think it makes perfect sense to make
> that the goal.
> 
> Thanks again for spending time on this Uwe,
> Conor.
> 



[-- Attachment #1.2: signature.asc --]
[-- Type: application/pgp-signature, Size: 228 bytes --]

[-- Attachment #2: Type: text/plain, Size: 161 bytes --]

_______________________________________________
linux-riscv mailing list
linux-riscv@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-riscv

^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2023-03-25 11:53 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-03-06  9:48 [PATCH v14 0/2] Microchip Soft IP corePWM driver Conor Dooley
2023-03-06  9:48 ` [PATCH v14 1/2] pwm: add microchip soft ip " Conor Dooley
2023-03-22 10:55   ` Uwe Kleine-König
2023-03-22 22:49     ` Conor Dooley
2023-03-23  9:14       ` Uwe Kleine-König
2023-03-24 18:42         ` Conor Dooley
2023-03-25 11:52           ` Conor Dooley
2023-03-06  9:48 ` [PATCH v14 2/2] MAINTAINERS: add pwm to PolarFire SoC entry Conor Dooley

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).