On Fri, Jul 15, 2016 at 12:38:54PM +0200, Ondřej Jirman wrote:
> On 15.7.2016 10:53, Maxime Ripard wrote:
> > On Fri, Jul 01, 2016 at 02:50:57AM +0200, Ondřej Jirman wrote:
> >>>>  /**
> >>>> + * sun8i_h3_apply_pll1_factors() - applies n, k, m, p factors to the
> >>>> + * register using an algorithm that tries to reserve the PLL lock
> >>>> + */
> >>>> +
> >>>> +static void sun8i_h3_apply_pll1_factors(struct clk_factors *factors, struct factors_request *req)
> >>>> +{
> >>>> +	const struct clk_factors_config *config = factors->config;
> >>>> +	u32 reg;
> >>>> +
> >>>> +	/* Fetch the register value */
> >>>> +	reg = readl(factors->reg);
> >>>> +
> >>>> +	if (FACTOR_GET(config->pshift, config->pwidth, reg) < req->p) {
> >>>> +		reg = FACTOR_SET(config->pshift, config->pwidth, reg, req->p);
> >>>> +
> >>>> +		writel(reg, factors->reg);
> >>>> +		__delay(2000);
> >>>> +	}
> >>>
> >>> So there was some doubts about the fact that P was being used, or at
> >>> least that it was useful.
> >>
> >> p is necessary to reduce frequencies below 288 MHz according to the
> >> datasheet.
> > 
> > Yes, but you could reach those frequencies without P, too, and it's
> > not part of any OPP provided by Allwinner.
> 
> The arisc firmware for H3 contains table of factors for frequences from
> 0 to 2GHz and, P is used below 240MHz. M is never used, BTW. (other
> datasheets specify M as for testing use only, whatever that means - not
> H3, but it seems it's the same PLL block)

Interesting. Which SoCs in particular?

> >>>> +	if (FACTOR_GET(config->mshift, config->mwidth, reg) < req->m) {
> >>>> +		reg = FACTOR_SET(config->mshift, config->mwidth, reg, req->m);
> >>>> +
> >>>> +		writel(reg, factors->reg);
> >>>> +		__delay(2000);
> >>>> +	}
> >>>> +
> >>>> +	reg = FACTOR_SET(config->nshift, config->nwidth, reg, req->n);
> >>>> +	reg = FACTOR_SET(config->kshift, config->kwidth, reg, req->k);
> >>>> +
> >>>> +	writel(reg, factors->reg);
> >>>> +	__delay(20);
> >>>> +
> >>>> +	while (!(readl(factors->reg) & (1 << config->lock)));
> >>>
> >>> So, they are applying the dividers first, and then applying the
> >>> multipliers, and then wait for the PLL to stabilize.
> >>
> >> Not exactly, first we are increasing dividers if the new dividers are
> >> higher that that what's already set. This ensures that because
> >> application of dividers is immediate by the design of the PLL, the
> >> application of multipliers isn't. So the VCO would still run at the same
> >> frequency for a while gradually rising to a new value for example,
> >> while the dividers would be reduced immediately. Leading to crash.
> >>
> >> PLL
> >> --------------------------
> >> PRE DIV(f0) -> VCO(f1) -> POST DIV(f2)
> >>    P             K,N           M
> >>
> >> Example: (we set all factors at once, reducing dividers and multipliers
> >> at the same time at 0ms - this should lead to no change in the output
> >> frequency, but...)
> >>
> >> -1ms: f0 = 24MHz, f1 = 2GHz,   f2 = 1GHz
> >>  0ms: f0 = 24MHz, f1 = 2GHz,   f2 = 2GHz       - boom
> >>  1ms: f0 = 24MHz, f1 = 1.5GHz, f2 = 1.5GHz
> >>  2ms: f0 = 24MHz, f1 = 1GHz,   f2 = 1GHz
> >>
> >> The current code crashes exactly at boom, you don't get any more
> >> instructions to execute.
> >>
> >> See.
> >>
> >> So this patch first increases dividers (only if necessary), changes
> >> multipliers and waits for change to happen (takes around 2000 cycles),
> >> and then decreases dividers (only if necessary).
> >>
> >> So we get:
> >>
> >> -1ms: f0 = 24MHz, f1 = 2GHz,   f2 = 1GHz
> >>  0ms: f0 = 24MHz, f1 = 2GHz,   f2 = 1GHz   - no boom, multiplier
> >>                                              reduced
> >>  1ms: f0 = 24MHz, f1 = 1.5GHz, f2 = 0.75GHz
> >> 1.9ms: f0 = 24MHz, f1 = 1GHz,   f2 = 0.5GHz - we got PLL sync
> >>  2ms: f0 = 24MHz, f1 = 1GHz,   f2 = 1GHz   - and here we reduce divider
> >> at last
> > 
> > Awesome explanation, thanks!
> > 
> > So I guess it really all boils down to the fact that the CPU is
> > clocked way outside of it's operating frequency while the PLL
> > stabilizes, right?
> 
> It may be, depending on the factors before and after change. I haven't
> tested what factors the mainline kernel calculates for each frequency.
> The arisc never uses M, and only uses P at frequencies that would not
> allow for this behavior.
> 
> I created a test program for arisc that runs a loop on the main CPU
> using msgbox to send pings to the arisc CPU, and the vary the PLL1
> randomly from the arisc, and haven't been able to lockup the main CPU
> yet with either method.
> 
> There's also AXI clock, that depends on PLL1. Arisc firmware uses the
> same method to change it while changing CPUX clock. If the clock would
> rise above certain frequency, AXI divider is increased before the PLL1
> change. If it would fall below certain frequency it is decreased after
> the PLL1 change.

If we ever need to change that, we can always rely on a clock notifier
to adjust the divider before the change happen (and support all the
scenarios, like the clock change has been aborted).

> > If so, then yes, trying to switch to the 24MHz oscillator before
> > applying the factors, and then switching back when the PLL is stable
> > would be a nice solution.
> > 
> > I just checked, and all the SoCs we've had so far have that
> > possibility, so if it works, for now, I'd like to stick to that.
> 
> It would need to be tested. U-boot does the change only once, while the
> kernel would be doing it all the time and between various frequencies
> and PLL settings. So the issues may show up with this solution too.

That would have the benefit of being quite easy to document, not be a
huge amount of code and it would work on all the CPUs PLLs we have so
far, so still, a pretty big win. If it doesn't, of course, we don't
really have the choice.

Maxime

-- 
Maxime Ripard, Free Electrons
Embedded Linux and Kernel engineering
http://free-electrons.com