From: Roman Stratiienko <r.stratiienko@gmail.com>
To: "Jernej Škrabec" <jernej.skrabec@gmail.com>
Cc: "Clément Péron" <peron.clem@gmail.com>,
	"Michael Turquette" <mturquette@baylibre.com>,
	sboyd@kernel.org, mripard@kernel.org, wens@csie.org,
	linux-clk@vger.kernel.org, linux-arm-kernel@lists.infradead.org,
	linux-sunxi@lists.linux.dev, linux-kernel@vger.kernel.org
Subject: Re: [PATCH] clk: sunxi-ng: sun50i: h6: Modify GPU clock configuration to support DFS
Date: Sat, 25 Jun 2022 16:27:18 +0300
Message-ID: <CAGphcd=ND93Kek+8=WhXQcU2Z54megQJOetZnwx-EhHwG8ZjDQ@mail.gmail.com>
In-Reply-To: <1779483.VLH7GnMWUR@jernej-laptop>

Hi,

DVFS was tested as real DVFS, using the devfreq driver, not the script.

The following OPP table was used:
https://github.com/clementperon/linux/commit/add3ef683238095d2721de03601d5b01f2d9ce22

As already mentioned in the commit message, P causes issues as well.

Regards,
Roman

On Sat, 25 Jun 2022 at 13:43, Jernej Škrabec <jernej.skrabec@gmail.com> wrote:

>
> Hi Roman,
>
> On Friday, 24 June 2022 at 18:52:11 CEST, Roman Stratiienko wrote:
> > Using a simple bash script, it was discovered that not all CCU registers
> > can be safely used for DFS, e.g.:
> >
> >     while true
> >     do
> >         devmem 0x3001030 4 0xb0003e02
> >         devmem 0x3001030 4 0xb0001e02
> >     done
> >
> > The script above changes the GPU_PLL multiplier register value. While the
> > script is running, the user should interact with the user interface.
> >
> > Using this method the following results were obtained:
> > | Register  | Name           | Bits  | Values | Result |
> > | --        | --             | --    | --     | --     |
> > | 0x3001030 | GPU_PLL.MULT   | 15..8 | 20-62  | OK     |
> > | 0x3001030 | GPU_PLL.INDIV  |     1 | 0-1    | OK     |
> > | 0x3001030 | GPU_PLL.OUTDIV |     0 | 0-1    | FAIL   |
> > | 0x3001670 | GPU_CLK.DIV    |  3..0 | ANY    | FAIL   |
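For readers following the register values: below is a minimal sketch of how the two values written by the script decode into PLL factors and an output rate, using the bit layout from the table above. It assumes the usual sunxi-ng encoding (N and M stored minus one, P stored as a power-of-two exponent) and a 24 MHz parent ("osc24M" in the patch). The names pll_gpu_decode and pll_gpu_rate_hz are hypothetical and are not part of the driver.

```c
#include <stdint.h>

/* Assumed parent rate: the patch names "osc24M" as the PLL parent. */
#define OSC24M_HZ 24000000ULL

struct pll_gpu_factors {
	unsigned int n; /* multiplier, bits 15..8, assumed stored as N - 1 */
	unsigned int m; /* input divider, bit 1, assumed stored as M - 1 */
	unsigned int p; /* output divider, bit 0, assumed stored as log2(P) */
};

/* Hypothetical helper: split a GPU_PLL register value into its factors. */
static struct pll_gpu_factors pll_gpu_decode(uint32_t reg)
{
	struct pll_gpu_factors f;

	f.n = ((reg >> 8) & 0xff) + 1;
	f.m = ((reg >> 1) & 0x1) + 1;
	f.p = 1u << (reg & 0x1);
	return f;
}

/* Hypothetical helper: rate = 24 MHz * N / (M * P) under the assumptions above. */
static uint64_t pll_gpu_rate_hz(uint32_t reg)
{
	struct pll_gpu_factors f = pll_gpu_decode(reg);

	return OSC24M_HZ * f.n / (f.m * f.p);
}
```

Under these assumptions, 0xb0003e02 decodes to N=63, M=2, P=1 (756 MHz) and 0xb0001e02 to N=31, M=2, P=1 (372 MHz), so the script toggles only the multiplier field.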
> >
> > Once the bits that caused system failure were disabled (kept at the
> > default 0), it was discovered that GPU_CLK.MUX was used during DFS for
> > some reason and was causing the failure too.
> >
> > After disabling GPU_PLL.OUTDIV, the system started to fail during
> > boot for some reason until the maximum frequency of the GPU_PLL
> > clock was limited to 756 MHz.
> >
> > After all these changes, DVFS started to work seamlessly.
>
> I appreciate the testing effort, but I don't think a userspace approach is a
> good way of testing DVFS. I see two issues:
> - As the name already suggests, voltage also plays a crucial role in
> stability. You didn't say which board you tested this on, but I assume it has
> a PMIC. Did you make sure the GPU voltage regulator is always at 1.04 V, which
> is needed for 756 MHz?
> - The kernel clock driver always goes through the proper procedure for a clock
> rate change, which involves several steps. Bypassing them might also cause
> stability problems.
>
> I agree that the GPU PLL should be limited to 756 MHz max. This seems to be
> the maximum operating point specified in the vendor DT. But I managed to
> extract some more information from the vendor GPU driver. More specifically,
> from this snippet, located in
> modules/gpu/mali-midgard/kernel_mode/driver/drivers/gpu/arm/midgard/platform/sunxi/mali_kbase_config_sunxi.c:
>
> pll_freq = target->freq;
> while (pll_freq < 288000000)
>         pll_freq *= 2;
>
> err = clk_set_rate(sunxi_mali->gpu_pll_clk, pll_freq);
> <...>
> err = clk_set_rate(kbdev->clock, target->freq);
> <...>
>
> Apparently, the minimum stable PLL frequency is 288 MHz (this should be
> added) and the divider in the peripheral clock can really be used, although
> preferably not. The vendor GPU operating points specify only two points lower
> than 288 MHz, at 264 MHz and 216 MHz. I'm fully aware that they may not be
> really stable, and given that these two and the next two all share the minimum
> voltage of 810 mV, power and thermal savings are probably not that great, so
> we can skip them and pin the peripheral divider to 1, as you already did.
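To make the vendor logic concrete: below is a minimal sketch of the doubling loop from the quoted snippet, extended to also report the peripheral divider that the second clk_set_rate() call would imply. The 288 MHz threshold comes from the vendor code; the struct, the function name gpu_clk_plan_for, and the assumption that the peripheral divider simply equals pll_hz / target_hz are illustrative only.

```c
#include <assert.h>
#include <stdint.h>

/* Threshold taken from the vendor snippet quoted above. */
#define PLL_GPU_MIN_HZ 288000000ULL

struct gpu_clk_plan {
	uint64_t pll_hz;  /* rate that would be requested from the GPU PLL */
	unsigned int div; /* implied peripheral clock divider (assumption) */
};

/* Hypothetical helper mirroring the vendor loop: double until >= 288 MHz. */
static struct gpu_clk_plan gpu_clk_plan_for(uint64_t target_hz)
{
	struct gpu_clk_plan plan = { target_hz, 1 };

	assert(target_hz > 0); /* a zero target would never reach the threshold */

	while (plan.pll_hz < PLL_GPU_MIN_HZ) {
		plan.pll_hz *= 2;
		plan.div *= 2;
	}
	return plan;
}
```

With this sketch, the 264 MHz and 216 MHz points map to a PLL rate of 528 MHz and 432 MHz with a divider of 2, while every point at or above 288 MHz keeps a divider of 1.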
>
> Another discrepancy I see is that the vendor DT has two operating points, at
> 336 MHz and 384 MHz, which also use factor P (also known as d2 in the vendor
> clock source). This could again be an oversight or, alternatively, it could be
> that the P factor can actually be used, but only at lower frequencies.
>
> Can you please run another test with the GPU operating points specified in
> the DT and check if it works with the P factor left in?
>
> For reference, the vendor DT has the following operating points (kHz, uV):
> 756000 1040000
> 624000 950000
> 576000 930000
> 540000 910000
> 504000 890000
> 456000 870000
> 432000 860000
> 420000 850000
> 408000 840000
> 384000 830000
> 360000 820000
> 336000 810000
> 312000 810000
> 264000 810000
> 216000 810000
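As a purely arithmetic cross-check of the table above: below is a minimal sketch that searches for N and M with P fixed to 1, assuming rate = 24 MHz * N / M, N in 12..256 (minimum from _SUNXI_CCU_MULT_MIN(8, 8, 12), maximum assuming the 8-bit field stores N - 1) and M in 1..2. The function name pll_gpu_find_nm is hypothetical. It says nothing about whether the hardware is stable at a given combination.

```c
#include <stdbool.h>
#include <stdint.h>

#define OSC24M_HZ 24000000ULL /* assumed parent rate ("osc24M") */
#define N_MIN 12u  /* from _SUNXI_CCU_MULT_MIN(8, 8, 12) in the patch */
#define N_MAX 256u /* 8-bit field, assuming it stores N - 1 */
#define M_MAX 2u   /* 1-bit field, assuming it stores M - 1 */

/*
 * Hypothetical helper: find N and M such that 24 MHz * N / M == target_hz,
 * with the output divider P fixed to 1. Returns false if no exact match.
 */
static bool pll_gpu_find_nm(uint64_t target_hz, unsigned int *n, unsigned int *m)
{
	for (unsigned int mi = 1; mi <= M_MAX; mi++) {
		uint64_t scaled = target_hz * mi;
		uint64_t ni;

		if (scaled % OSC24M_HZ != 0)
			continue;

		ni = scaled / OSC24M_HZ;
		if (ni < N_MIN || ni > N_MAX)
			continue;

		*n = (unsigned int)ni;
		*m = mi;
		return true;
	}
	return false;
}
```

Every frequency in the list is a multiple of 12 MHz, so under these assumptions each one has an exact N/M solution without P (for example 384 MHz as N=16, M=1 and 756 MHz as N=63, M=2). Whether the PLL actually locks reliably there, in particular for the two points below 288 MHz, is a separate question that only testing can answer.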
>
> Best regards,
> Jernej
>
> >
> > Signed-off-by: Roman Stratiienko <r.stratiienko@gmail.com>
> > ---
> >  drivers/clk/sunxi-ng/ccu-sun50i-h6.c | 12 +++++-------
> >  1 file changed, 5 insertions(+), 7 deletions(-)
> >
> > diff --git a/drivers/clk/sunxi-ng/ccu-sun50i-h6.c b/drivers/clk/sunxi-ng/ccu-sun50i-h6.c
> > index 2ddf0a0da526f..d941238cd178a 100644
> > --- a/drivers/clk/sunxi-ng/ccu-sun50i-h6.c
> > +++ b/drivers/clk/sunxi-ng/ccu-sun50i-h6.c
> > @@ -95,13 +95,14 @@ static struct ccu_nkmp pll_periph1_clk = {
> >       },
> >  };
> >
> > +/* For GPU PLL, using an output divider for DFS causes system to fail */
> >  #define SUN50I_H6_PLL_GPU_REG                0x030
> >  static struct ccu_nkmp pll_gpu_clk = {
> >       .enable         = BIT(31),
> >       .lock           = BIT(28),
> >       .n              = _SUNXI_CCU_MULT_MIN(8, 8, 12),
> >       .m              = _SUNXI_CCU_DIV(1, 1), /* input divider */
> > -     .p              = _SUNXI_CCU_DIV(0, 1), /* output divider */
> > +     .max_rate       = 756000000UL,
> >       .common         = {
> >               .reg            = 0x030,
> >               .hw.init        = CLK_HW_INIT("pll-gpu", "osc24M",
> > @@ -294,12 +295,9 @@ static SUNXI_CCU_M_WITH_MUX_GATE(deinterlace_clk, "deinterlace",
> >  static SUNXI_CCU_GATE(bus_deinterlace_clk, "bus-deinterlace", "psi-ahb1-ahb2",
> >                     0x62c, BIT(0), 0);
> >
> > -static const char * const gpu_parents[] = { "pll-gpu" };
> > -static SUNXI_CCU_M_WITH_MUX_GATE(gpu_clk, "gpu", gpu_parents, 0x670,
> > -                                    0, 3,    /* M */
> > -                                    24, 1,   /* mux */
> > -                                    BIT(31), /* gate */
> > -                                    CLK_SET_RATE_PARENT);
> > +/* GPU_CLK divider kept disabled to avoid interferences with DFS */
> > +static SUNXI_CCU_GATE(gpu_clk, "gpu", "pll-gpu", 0x670,
> > +                   BIT(31), CLK_SET_RATE_PARENT);
> >
> >  static SUNXI_CCU_GATE(bus_gpu_clk, "bus-gpu", "psi-ahb1-ahb2",
> >                     0x67c, BIT(0), 0);
>
>
>
>
