All of lore.kernel.org
 help / color / mirror / Atom feed
From: Doug Anderson <dianders@chromium.org>
To: Vijaya Krishna Nivarthi <quic_vnivarth@quicinc.com>
Cc: "Vijaya Krishna Nivarthi (Temp)" <vnivarth@qti.qualcomm.com>,
	Andy Gross <agross@kernel.org>,
	"bjorn.andersson@linaro.org" <bjorn.andersson@linaro.org>,
	Konrad Dybcio <konrad.dybcio@somainline.org>,
	Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	Jiri Slaby <jirislaby@kernel.org>,
	linux-arm-msm <linux-arm-msm@vger.kernel.org>,
	"linux-serial@vger.kernel.org" <linux-serial@vger.kernel.org>,
	LKML <linux-kernel@vger.kernel.org>,
	"Mukesh Savaliya (QUIC)" <quic_msavaliy@quicinc.com>,
	Matthias Kaehlcke <mka@chromium.org>,
	Stephen Boyd <swboyd@chromium.org>
Subject: Re: [V2] tty: serial: qcom-geni-serial: Fix get_clk_div_rate() which otherwise could return a sub-optimal clock rate.
Date: Wed, 6 Jul 2022 11:21:36 -0700	[thread overview]
Message-ID: <CAD=FV=VhA=iGnip_DOdNOZn_mjqgC=37o4tWFQXNpTAqf=09sA@mail.gmail.com> (raw)
In-Reply-To: <69e3fec3-bd49-8877-f1f8-453b09b8c940@quicinc.com>

Hi,

On Wed, Jul 6, 2022 at 10:44 AM Vijaya Krishna Nivarthi
<quic_vnivarth@quicinc.com> wrote:
>
> Hi,
>
>
> On 7/6/2022 8:56 PM, Doug Anderson wrote:
> > Hi,
> >
> > On Mon, Jul 4, 2022 at 11:57 AM Vijaya Krishna Nivarthi (Temp)
> > <vnivarth@qti.qualcomm.com> wrote:
> >> Hi,
> >>
> >>
> >>> -----Original Message-----
> >>> From: Doug Anderson <dianders@chromium.org>
> >>> Sent: Friday, July 1, 2022 8:38 PM
> >>> To: Vijaya Krishna Nivarthi (Temp) (QUIC) <quic_vnivarth@quicinc.com>
> >>> Cc: Andy Gross <agross@kernel.org>; bjorn.andersson@linaro.org; Konrad
> >>> Dybcio <konrad.dybcio@somainline.org>; Greg Kroah-Hartman
> >>> <gregkh@linuxfoundation.org>; Jiri Slaby <jirislaby@kernel.org>; linux-arm-
> >>> msm <linux-arm-msm@vger.kernel.org>; linux-serial@vger.kernel.org; LKML
> >>> <linux-kernel@vger.kernel.org>; Mukesh Savaliya (QUIC)
> >>> <quic_msavaliy@quicinc.com>; Matthias Kaehlcke <mka@chromium.org>;
> >>> Stephen Boyd <swboyd@chromium.org>
> >>> Subject: Re: [V2] tty: serial: qcom-geni-serial: Fix get_clk_div_rate() which
> >>> otherwise could return a sub-optimal clock rate.
> >>>
> >>> WARNING: This email originated from outside of Qualcomm. Please be wary
> >>> of any links or attachments, and do not enable macros.
> >>>
> >>> Hi,
> >>>
> >>> On Fri, Jul 1, 2022 at 4:04 AM Vijaya Krishna Nivarthi (Temp) (QUIC)
> >>> <quic_vnivarth@quicinc.com> wrote:
> >>>> Hi,
> >>>>
> >>>>
> >>>>> -----Original Message-----
> >>>>> From: Doug Anderson <dianders@chromium.org>
> >>>>> Sent: Thursday, June 30, 2022 4:45 AM
> >>>>> To: Vijaya Krishna Nivarthi (Temp) (QUIC)
> >>>>> <quic_vnivarth@quicinc.com>
> >>>>> Cc: Andy Gross <agross@kernel.org>; bjorn.andersson@linaro.org;
> >>>>> Konrad Dybcio <konrad.dybcio@somainline.org>; Greg Kroah-Hartman
> >>>>> <gregkh@linuxfoundation.org>; Jiri Slaby <jirislaby@kernel.org>;
> >>>>> linux-arm- msm <linux-arm-msm@vger.kernel.org>;
> >>>>> linux-serial@vger.kernel.org; LKML <linux-kernel@vger.kernel.org>;
> >>>>> Mukesh Savaliya (QUIC) <quic_msavaliy@quicinc.com>; Matthias
> >>>>> Kaehlcke <mka@chromium.org>; Stephen Boyd
> >>> <swboyd@chromium.org>
> >>>>> Subject: Re: [V2] tty: serial: qcom-geni-serial: Fix
> >>>>> get_clk_div_rate() which otherwise could return a sub-optimal clock rate.
> >>>>>
> >>>>>
> >>>>>
> >>>>>> +                               /* Save the first (lowest freq) within tolerance */
> >>>>>> +                               ser_clk = freq;
> >>>>>> +                               *clk_div = new_div;
> >>>>>> +                               /* no more search for exact match required in 2nd run
> >>> */
> >>>>>> +                               if (!exact_match)
> >>>>>> +                                       break;
> >>>>>> +                       }
> >>>>>> +               }
> >>>>>>
> >>>>>> -               prev = freq;
> >>>>>> +               div = freq / desired_clk + 1;
> >>>>> Can't you infinite loop now?
> >>>>>
> >>>>> Start with:
> >>>>>
> >>>>> desired_clk = 10000
> >>>>> div = 1
> >>>>> percent_tol = 2
> >>>>>
> >>>>>
> >>>>> Now:
> >>>>>
> >>>>> mult = 10000
> >>>>> offset = 200
> >>>>> test_freq = 9800
> >>>>> freq = 9800
> >>>>> div = 9800 / 10000 + 1 = 0 + 1 = 1
> >>>>>
> >>>>> ...and then you'll loop again with "div = 1", won't you? ...or did I
> >>>>> get something wrong in my analysis? This is the reason my proposed
> >>>>> algorithm had two loops.
> >>>>>
> >>>>>
> >>>> I went back to your proposed algorithm and made couple of simple
> >>> changes, and it seemed like what we need.
> >>>> a) look only for exact match once a clock rate within tolerance is
> >>>> found
> >>>> b) swap test_freq and freq at end of while loops to make it run as
> >>>> desired
> >>>>
> >>>>
> >>>>          maxdiv = CLK_DIV_MSK >> CLK_DIV_SHFT;
> >>>>          div = 1;
> >>>>
> >>>>          while (div < maxdiv) {
> >>>>                  mult = (unsigned long long)div * desired_clk;
> >>>>                  if (mult != (unsigned long)mult)
> >>>>                          break;
> >>>>
> >>>>                  if (ser_clk)
> >>>>                          offset = 0;
> >>>>                  ===================a=====================
> >>>>                  else
> >>>>                          offset = div_u64(mult * percent_tol, 100);
> >>>>
> >>>>                  /*
> >>>>                   * Loop requesting (freq - 2%) and possibly (freq).
> >>>>                   *
> >>>>                   * We'll keep track of the lowest freq inexact match we found
> >>>>                   * but always try to find a perfect match. NOTE: this algorithm
> >>>>                   * could miss a slightly better freq if there's more than one
> >>>>                   * freq between (freq - 2%) and (freq) but (freq) can't be made
> >>>>                   * exactly, but that's OK.
> >>>>                   *
> >>>>                   * This absolutely relies on the fact that the Qualcomm clock
> >>>>                   * driver always rounds up.
> >>>>                   */
> >>>>                  test_freq = mult - offset;
> >>>>                  while (test_freq <= mult) {
> >>>>                          freq = clk_round_rate(clk, test_freq);
> >>>>
> >>>>                          /*
> >>>>                           * A dead-on freq is an insta-win. This implicitly
> >>>>                           * handles when "freq == mult"
> >>>>                           */
> >>>>                          if (!(freq % desired_clk)) {
> >>>>                                  *clk_div = freq / desired_clk;
> >>>>                                  return freq;
> >>>>                          }
> >>>>
> >>>>                          /*
> >>>>                           * Only time clock framework doesn't round up is if
> >>>>                           * we're past the max clock rate. We're done searching
> >>>>                           * if that's the case.
> >>>>                           */
> >>>>                          if (freq < test_freq)
> >>>>                                  return ser_clk;
> >>>>
> >>>>                          /* Save the first (lowest freq) within tolerance */
> >>>>                          if (!ser_clk && freq <= mult + offset) {
> >>>>                                  ser_clk = freq;
> >>>>                                  *clk_div = div;
> >>>>                          }
> >>>>
> >>>>                          /*
> >>>>                           * If we already rounded up past mult then this will
> >>>>                           * cause the loop to exit. If not then this will run
> >>>>                           * the loop a second time with exactly mult.
> >>>>                           */
> >>>>                          test_freq = max(test_freq + 1, mult);
> >>>>                                                       ====b====
> >>>>                  }
> >>>>
> >>>>                  /*
> >>>>                   * freq will always be bigger than mult by at least 1.
> >>>>                   * That means we can get the next divider with a DIV_ROUND_UP.
> >>>>                   * This has the advantage of skipping by a whole bunch of divs
> >>>>                   * If the clock framework already bypassed them.
> >>>>                   */
> >>>>                  div = DIV_ROUND_UP(freq, desired_clk);
> >>>>                                                         ===b==
> >>>>          }
> >>>>
> >>>>
> >>>> Will also drop exact_match now.
> >>>>
> >>>> Will upload v3 after testing.
> >>> The more I've been thinking about it, the more I wonder if we even need the
> >>> special case of looking for an exact match at all. It feels like we should choose
> >>> one: we either look for the best match or we look for the one with the
> >>> lowest clock source rate. The weird half-half approach that we have right
> >>> now feels like over-engineering and complicates things.
> >>>
> >>> How about this (again, only lightly tested). Worst case if we _truly_ need a
> >>> close-to-exact match we could pass a tolerance of 0 in and we'd get
> >>> something that's nearly exact, though I'm not suggesting we actually do that.
> >>> If we think 2% is good enough then we should just accept the first (and
> >>> lowest clock rate) 2% match we find.
> >>>
> >>>      abs_tol = div_u64((u64)desired_clk * percent_tol, 100);
> >>>      maxdiv = CLK_DIV_MSK >> CLK_DIV_SHFT;
> >>>      div = 1;
> >>>      while (div <= maxdiv) {
> >>>          mult = (u64)div * desired_clk;
> >>>          if (mult != (unsigned long)mult)
> >>>              break;
> >>>
> >>>          offset = div * abs_tol;
> >>>          freq = clk_round_rate(clk, mult - offset);
> >>>
> >>>          /* Can only get lower if we're done */
> >>>          if (freq < mult - offset)
> >>>              break;
> >>>
> >>>          /*
> >>>           * Re-calculate div in case rounding skipped rates but we
> >>>           * ended up at a good one, then check for a match.
> >>>           */
> >>>          div = DIV_ROUND_CLOSEST(freq, desired_clk);
> >>>          achieved = DIV_ROUND_CLOSEST(freq, div);
> >>>          if (achieved <= desired_clk + abs_tol &&
> >>>              achieved >= desired_clk - abs_tol) {
> >>>              *clk_div = div;
> >>>              return freq;
> >>>          }
> >>>
> >>>          /*
> >>>           * Always increase div by at least one, but we'll go more than
> >>>           * one if clk_round_rate() gave us something higher.
> >>>           */
> >>>          div = DIV_ROUND_UP(max(freq, (unsigned long)mult) + 1, desired_clk);
> >> Wouldn’t DIV_ROUND_UP(freq, desired_clk) suffice here?
> >> freq >= mult-offset, else we would have hit break.
> > No. As you say, freq >= "mult-offset". That means that freq could be
> > == "mult-offset", right? If offset > 0 then freq could be < mult. Then
> > your DIV_ROUND_UP() would just take you right back to where you
> > started the loop with and you'd end up with an infinite loop, wouldn't
> > you?
> >
> Probably No.
>
> ( (freq >= mult-offset) && (freq <= mult) ) =>
>
> ( (freq >= mult-offset) && (freq <= mult+offset) )
>
> would mean that
>
> div = DIV_ROUND_CLOSEST(freq, desired_clk);
> evaluates to original div and we are within tolerance and hence we can return and hence don't even reach DIV_ROUND_UP?
>
> Please note that freq can skip any multiples and land up anywhere.
>
> As long as it has not gone beyond clock rate table, either it lands
> within tolerance of nearest multiple of desired_clk (in which case we
> return)
>
> OR
>
> We move on to next div = (freq/desired_clk + 1)

Ah, you are correct. So just:

div = DIV_ROUND_UP(freq, desired_clk);

...because freq _has_ to be greater than mult. If it was < "mult -
offset" we would have ended the loop. If it was between "mult -
offset" and "mult + offset" (inclusive) then we would have success. So
freq must be > "mult + offset" at the end of the loop.

-Doug

  reply	other threads:[~2022-07-06 18:22 UTC|newest]

Thread overview: 14+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-06-29 10:00 [V2] tty: serial: qcom-geni-serial: Fix get_clk_div_rate() which otherwise could return a sub-optimal clock rate Vijaya Krishna Nivarthi
2022-06-29 13:02 ` kernel test robot
2022-06-29 23:15 ` Doug Anderson
2022-07-01 11:04   ` Vijaya Krishna Nivarthi (Temp) (QUIC)
2022-07-01 15:08     ` Doug Anderson
2022-07-04 18:57       ` Vijaya Krishna Nivarthi (Temp)
2022-07-06 15:26         ` Doug Anderson
2022-07-06 17:43           ` Vijaya Krishna Nivarthi
2022-07-06 18:21             ` Doug Anderson [this message]
2022-07-07 19:24               ` Vijaya Krishna Nivarthi
2022-06-30 15:00 ` Greg KH
2022-06-30 17:19   ` Vijaya Krishna Nivarthi (Temp) (QUIC)
2022-06-30 21:57     ` Doug Anderson
2022-07-04 21:20 ` kernel test robot

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='CAD=FV=VhA=iGnip_DOdNOZn_mjqgC=37o4tWFQXNpTAqf=09sA@mail.gmail.com' \
    --to=dianders@chromium.org \
    --cc=agross@kernel.org \
    --cc=bjorn.andersson@linaro.org \
    --cc=gregkh@linuxfoundation.org \
    --cc=jirislaby@kernel.org \
    --cc=konrad.dybcio@somainline.org \
    --cc=linux-arm-msm@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-serial@vger.kernel.org \
    --cc=mka@chromium.org \
    --cc=quic_msavaliy@quicinc.com \
    --cc=quic_vnivarth@quicinc.com \
    --cc=swboyd@chromium.org \
    --cc=vnivarth@qti.qualcomm.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.