All of lore.kernel.org
 help / color / mirror / Atom feed
From: Maxime Ripard <maxime@cerno.tech>
To: Stephen Boyd <sboyd@kernel.org>
Cc: Jerome Brunet <jbrunet@baylibre.com>,
	Mike Turquette <mturquette@baylibre.com>,
	linux-clk@vger.kernel.org,
	Naresh Kamboju <naresh.kamboju@linaro.org>,
	Alexander Stein <alexander.stein@ew.tq-group.com>,
	Marek Szyprowski <m.szyprowski@samsung.com>,
	Tony Lindgren <tony@atomide.com>,
	Yassine Oudjana <y.oudjana@protonmail.com>,
	Neil Armstrong <narmstrong@baylibre.com>
Subject: Re: [PATCH 22/22] clk: Prevent a clock without a rate to register
Date: Sat, 23 Apr 2022 11:17:24 +0200	[thread overview]
Message-ID: <20220423091724.crm7szplzidutxvd@houat> (raw)
In-Reply-To: <20220423044228.2AA7AC385A0@smtp.kernel.org>

[-- Attachment #1: Type: text/plain, Size: 8406 bytes --]

On Fri, Apr 22, 2022 at 09:42:26PM -0700, Stephen Boyd wrote:
> Quoting Maxime Ripard (2022-04-08 03:41:27)
> > On Fri, Apr 08, 2022 at 11:18:58AM +0200, Jerome Brunet wrote:
> > > On Fri 08 Apr 2022 at 11:10, Maxime Ripard <maxime@cerno.tech> wrote:
> > > > A rate of 0 for a clock is considered an error, as evidenced by the
> > > > documentation of clk_get_rate() and the code of clk_get_rate() and
> > > > clk_core_get_rate_nolock().
> 
> Where?
> 
> > > >
> > > > The main source of that error is if the clock is supposed to have a
> > > > parent but is orphan at the moment of the call. This is likely to be
> > > > transient and solved later in the life of the system as more clocks are
> > > > registered.
> > > >
> > > > The corollary is thus that if a clock is not an orphan, has a parent that
> > > > has a rate (so is not an orphan itself either) but returns a rate of 0,
> > > > something is wrong in the driver. Let's return an error in such a case.
> > > >
> > > > Signed-off-by: Maxime Ripard <maxime@cerno.tech>
> > > > ---
> > > >  drivers/clk/clk.c | 10 ++++++++++
> > > >  1 file changed, 10 insertions(+)
> > > >
> > > > diff --git a/drivers/clk/clk.c b/drivers/clk/clk.c
> > > > index 8bbb6adeeead..e8c55678da85 100644
> > > > --- a/drivers/clk/clk.c
> > > > +++ b/drivers/clk/clk.c
> > > > @@ -3773,6 +3773,16 @@ static int __clk_core_init(struct clk_core *core)
> > > >             rate = 0;
> > > >     core->rate = core->req_rate = rate;
> > > >  
> > > > +   /*
> > > > +    * If we're not an orphan clock and our parent has a rate, then
> > > > +    * if our rate is 0, something is badly broken in recalc_rate.
> > > > +    */
> > > > +   if (!core->orphan && (parent && parent->rate) && !core->rate) {
> > > > +           ret = -EINVAL;
> > > > +           pr_warn("%s: recalc_rate returned a null rate\n", core->name);
> > > > +           goto out;
> > > > +   }
> > > > +
> > > 
> > > As hinted in the cover letter, I don't really agree with that.
> > > 
> > > There are situations where we can't compute the rate. Getting invalid
> > > value in the register is one reason.
> > > 
> > > You mentioned the PLLs of the Amlogic SoCs (it is not limited to g12 - all
> > > SoCs would be affected):
> > > 
> > > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/clk/meson/clk-pll.c#n82
> > > Yes, PLL that have not been previously used (by the ROMCode or the
> > > bootloader) tend to have the value of the divider set to 0 which in
> > > invalid as it would result in a division by zero.
> > > 
> > > I don't think this is a bug. It is just what the HW is, an unlocked,
> > > uninitialized PLL. There is no problem here and the PLL can remain like
> > > that until it is needed.
> > 
> > I think the larger issue is around the semantics of clk_get_rate(), and
> > especially whether we can call it without a clk_enable(), and whether
> > returning 0 is fine.
> > 
> > The (clk.h) documentation of clk_get_rate() mentions that "This is only
> > valid once the clock source has been enabled", and it's fairly
> > ambiguous. I can see how it could be interpreted as "you need to call
> > clk_enable() before calling clk_get_rate()", but it can also be
> > interpreted as "The returned rate will only be valid once clk_enable()
> > is called".
> 
> I enjoy the ambiguity! :) This question has come up before and it
> doesn't really matter. Drivers can call clk_prepare_enable() if they
> want to be sure that clk_get_rate() is meaningful to them, or they can
> not.

Right, but that alone already removes the ambiguity :)

If not calling clk_enable() prior to calling clk_get_rate() is ok, then
we don't *need* to call it, both are valid.

Therefore, the "you need to call clk_enable() before calling
clk_get_rate()" interpretation isn't the right one, right?

> The CCF returns a rate that it gets from calling recalc_rate, which
> could be inaccurate for others reasons, either because some driver has
> called clk_set_rate() after the clk_get_rate()

Right, but I guess that's ok, the CCF never claimed to support atomicity
in any way. That could also be the result of a parent changing its rate,
or a reparenting, etc.

Still, it means that it was the best estimate we could do when
clk_get_rate() was called. It just turned out to be short-lived.

> or because the clk is an orphan still and clk_get() succeeded

In this case, core->parent is NULL, so we could argue that it's the same
case than clk_get_rate(NULL). It makes sense.

> or because the clk_op couldn't calculate it at the time of caching.

I guess this is what the discussion is all about, whether it's actually
something that we should find it acceptable or not.

From what you're saying, it is indeed acceptable and thus clk_get_rate()
can return 0 not only from a NULL clock, but also if it's incapable of
figuring out a rate for that clock. We should thus document that
recalc_rate can indeed return 0 in such a case, and that users should
take this into account and not always expect a valid rate.

Would a documentation for recalc_rate like that:

Recalculate the rate of this clock, by querying hardware. The parent
rate is an input parameter. It is up to the caller to ensure that the
prepare_mutex is held across this call. Returns the calculated rate. If
the driver cannot figure out a rate for this clock, it must return 0.
Optional, but recommended - if this op is not set then clock rate will
be initialized to 0.

And for clk_get_rate() (in drivers/clk/clk.c):

Simply returns the cached rate of the clk, unless CLK_GET_RATE_NOCACHE
flag is set, which means a recalc_rate will be issued. Can be called if
the clock is enabled or not. If clk is NULL, or if an error occurred,
then returns 0.

Work for everyone?

> Indeed the CCF doesn't try to recalc the rate after enabling the clk.
> Maybe we should do that? It would mean that we have to schedule a work
> from the enable path to update the rate accounting outside of any
> atomic context.
> 
> Just thinking out loud, the simpler solution is to probably drop all
> rate caching in the CCF and get the frequency on a clk_get_rate() call.

The above is just enough for me. recalc_rate returning 0 is valid, and
should be taken properly into account.

Longer term I guess it makes sense to drop the rate caching, but I've
introduced enough clock regressions already for a couple of releases :)

> It complicates some of the core though when we check to see if we need
> to update clk rates. We could have some middle ground where drivers
> indicate that they want to update their cached rate because it's valid
> now (either from their enable path or from somewhere else). This may be
> nice actually because we could have clk providers call this to force a
> recalc down the tree from where they've updated. I think the qcom
> DisplayPort phy would want this.

A good workaround for now would be to set CLK_GET_RATE_NOCACHE for those
clocks.

> > 
> > I think the latter is the proper interpretation though based on what the
> > drivers are doing, and even the CCF itself will call recalc_rate without
> > making sure that the clock is enabled (in __clk_core_init() for example).
> > 
> > Then there is the question of whether returning 0 is fine. Again
> > clk_get_rate() (clk.c) documentation states that "If clk is NULL then
> > returns 0.". This is indeed returned in case of an error condition (in
> > clk_get_rate() itself, but also in clk_core_get_rate_nolock()).
> 
> A NULL clk isn't an error. We use NULL in the CCF to indicate that it's
> an optional clk. Returning 0 from clk_get_rate() is not an error. If
> clk_get() returns an error pointer then it's an error. And NULL isn't an
> error value per PTR_ERR() (because NULL == 0 when casted, this isn't
> golang).

Ok. So we can either group both the "we couldn't figure out a rate" and
the "our clock is NULL" case as the same case in clk_get_rate(), or we
can convert clk_get_rate() to return signed integers and a negative
error code. The latter seems much more intrusive to me and will probably
lead to a whole bunch of regressions, so I'd be more inclined to stick
to the former. Especially since drivers seem to either don't care or
treat 0 as an error already.

Maxime

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 228 bytes --]

  reply	other threads:[~2022-04-23  9:17 UTC|newest]

Thread overview: 53+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-04-08  9:10 [PATCH 00/22] clk: More clock rate fixes and tests Maxime Ripard
2022-04-08  9:10 ` [PATCH 01/22] clk: Drop the rate range on clk_put() Maxime Ripard
2022-04-08  9:10 ` [PATCH 02/22] clk: tests: Add test suites description Maxime Ripard
2022-04-23  4:06   ` Stephen Boyd
2022-04-08  9:10 ` [PATCH 03/22] clk: tests: Add reference to the orphan mux bug report Maxime Ripard
2022-04-08  9:10 ` [PATCH 04/22] clk: tests: Add tests for uncached clock Maxime Ripard
2022-04-08  9:10 ` [PATCH 05/22] clk: tests: Add tests for single parent mux Maxime Ripard
2022-04-08  9:10 ` [PATCH 06/22] clk: tests: Add tests for mux with multiple parents Maxime Ripard
2022-04-08  9:10 ` [PATCH 07/22] clk: tests: Add some tests for orphan " Maxime Ripard
2022-04-08  9:10 ` [PATCH 08/22] clk: Take into account uncached clocks in clk_set_rate_range() Maxime Ripard
2022-04-08  9:10 ` [PATCH 09/22] clk: Fix clk_get_parent() documentation Maxime Ripard
2022-04-08  9:10 ` [PATCH 10/22] clk: Set req_rate on reparenting Maxime Ripard
2022-04-08  9:10 ` [PATCH 11/22] clk: Skip set_rate_range if our clock is orphan Maxime Ripard
2022-04-08  9:10 ` [PATCH 12/22] clk: Add our request boundaries in clk_core_init_rate_req Maxime Ripard
2022-04-08  9:10 ` [PATCH 13/22] clk: Change clk_core_init_rate_req prototype Maxime Ripard
2022-04-08  9:10 ` [PATCH 14/22] clk: Introduce clk_hw_init_rate_request() Maxime Ripard
2022-04-23  3:46   ` Stephen Boyd
2022-04-23  7:17     ` Maxime Ripard
2022-04-08  9:10 ` [PATCH 15/22] clk: Add missing clk_core_init_rate_req calls Maxime Ripard
2022-04-23  3:51   ` Stephen Boyd
2022-04-23  7:32     ` Maxime Ripard
2022-04-08  9:10 ` [PATCH 16/22] clk: Remove redundant clk_core_init_rate_req() call Maxime Ripard
2022-04-23  4:02   ` Stephen Boyd
2022-04-23  7:44     ` Maxime Ripard
2022-04-08  9:10 ` [PATCH 17/22] clk: Switch from __clk_determine_rate to clk_core_round_rate_nolock Maxime Ripard
2022-04-08  9:10 ` [PATCH 18/22] clk: Introduce clk_core_has_parent() Maxime Ripard
2022-04-08  9:10 ` [PATCH 19/22] clk: Stop forwarding clk_rate_requests to the parent Maxime Ripard
2022-04-08  9:10 ` [PATCH 20/22] clk: Zero the clk_rate_request structure Maxime Ripard
2022-04-08  9:10 ` [PATCH 21/22] clk: Test the clock pointer in clk_hw_get_name() Maxime Ripard
2022-04-08  9:10 ` [PATCH 22/22] clk: Prevent a clock without a rate to register Maxime Ripard
2022-04-08  9:18   ` Jerome Brunet
2022-04-08 10:41     ` Maxime Ripard
2022-04-08 11:24       ` Jerome Brunet
2022-04-08 12:55         ` Maxime Ripard
2022-04-08 14:48           ` Jerome Brunet
2022-04-08 15:36             ` Maxime Ripard
2022-04-11  7:40               ` Neil Armstrong
2022-04-12 12:56                 ` Maxime Ripard
2022-04-11  8:20               ` Jerome Brunet
2022-04-23  4:42       ` Stephen Boyd
2022-04-23  9:17         ` Maxime Ripard [this message]
2022-04-29  2:08           ` Stephen Boyd
2022-04-29 15:45             ` Maxime Ripard
2022-04-08 12:17     ` Marek Szyprowski
2022-04-08 12:25       ` Maxime Ripard
2022-04-08 13:46         ` Marek Szyprowski
2022-04-23  4:12   ` Stephen Boyd
2022-04-23  7:49     ` Maxime Ripard
2022-04-10 12:06 ` [PATCH 00/22] clk: More clock rate fixes and tests Yassine Oudjana
2022-04-11 11:39   ` Maxime Ripard
2022-04-11  6:25 ` (EXT) " Alexander Stein
2022-04-11  7:24   ` Alexander Stein
2022-04-11 11:54     ` Maxime Ripard

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20220423091724.crm7szplzidutxvd@houat \
    --to=maxime@cerno.tech \
    --cc=alexander.stein@ew.tq-group.com \
    --cc=jbrunet@baylibre.com \
    --cc=linux-clk@vger.kernel.org \
    --cc=m.szyprowski@samsung.com \
    --cc=mturquette@baylibre.com \
    --cc=naresh.kamboju@linaro.org \
    --cc=narmstrong@baylibre.com \
    --cc=sboyd@kernel.org \
    --cc=tony@atomide.com \
    --cc=y.oudjana@protonmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.