* [PATCH v2 0/2] Fix 64 bit issues in common clock framework @ 2023-05-26 17:10 Sebastian Reichel 2023-05-26 17:10 ` [PATCH v2 1/2] clk: composite: Fix handling of high clock rates Sebastian Reichel 2023-05-26 17:10 ` [PATCH v2 2/2] clk: divider: Fix divisions Sebastian Reichel 0 siblings, 2 replies; 11+ messages in thread From: Sebastian Reichel @ 2023-05-26 17:10 UTC (permalink / raw) To: Michael Turquette, Stephen Boyd, linux-clk, linux-kernel Cc: Christopher Obbard, David Laight, Sebastian Reichel, kernel Hi, I found two independent issues related to handling of big rates in the common clock framework. Changes since PATCHv1: * https://lore.kernel.org/linux-clk/20230519190522.194729-1-sebastian.reichel@collabora.com/ * Add Christopher Obbard's Reviewed-by to the first patch * Update the second patch to use DIV_ROUND_UP instead of DIV64_U64_ROUND_UP Thanks, -- Sebastian Sebastian Reichel (2): clk: composite: Fix handling of high clock rates clk: divider: Fix divisions drivers/clk/clk-composite.c | 5 ++++- drivers/clk/clk-divider.c | 6 +++--- 2 files changed, 7 insertions(+), 4 deletions(-) -- 2.39.2 ^ permalink raw reply [flat|nested] 11+ messages in thread
* [PATCH v2 1/2] clk: composite: Fix handling of high clock rates 2023-05-26 17:10 [PATCH v2 0/2] Fix 64 bit issues in common clock framework Sebastian Reichel @ 2023-05-26 17:10 ` Sebastian Reichel 2023-05-29 8:50 ` AngeloGioacchino Del Regno 2023-06-13 0:10 ` Stephen Boyd 2023-05-26 17:10 ` [PATCH v2 2/2] clk: divider: Fix divisions Sebastian Reichel 1 sibling, 2 replies; 11+ messages in thread From: Sebastian Reichel @ 2023-05-26 17:10 UTC (permalink / raw) To: Michael Turquette, Stephen Boyd, linux-clk, linux-kernel Cc: Christopher Obbard, David Laight, Sebastian Reichel, kernel, stable ULONG_MAX is used by a few drivers to figure out the highest available clock rate via clk_round_rate(clk, ULONG_MAX). Since abs() takes a signed value as input, the current logic effectively calculates with ULONG_MAX = -1, which results in the worst parent clock being chosen instead of the best one. For example on Rockchip RK3588 the eMMC driver tries to figure out the highest available clock rate. There are three parent clocks available resulting in the following rate diffs with the existing logic: GPLL: abs(18446744073709551615 - 1188000000) = 1188000001 CPLL: abs(18446744073709551615 - 1500000000) = 1500000001 XIN24M: abs(18446744073709551615 - 24000000) = 24000001 As a result the clock framework will promote a maximum supported clock rate of 24 MHz, even though 1.5GHz are possible. With the updated logic any casting between signed and unsigned is avoided and the numbers look like this instead: GPLL: 18446744073709551615 - 1188000000 = 18446744072521551615 CPLL: 18446744073709551615 - 1500000000 = 18446744072209551615 XIN24M: 18446744073709551615 - 24000000 = 18446744073685551615 As a result the parent with the highest acceptable rate is chosen instead of the parent clock with the lowest one. Cc: stable@vger.kernel.org Fixes: 49502408007b ("mmc: sdhci-of-dwcmshc: properly determine max clock on Rockchip") Tested-by: Christopher Obbard <chris.obbard@collabora.com> Signed-off-by: Sebastian Reichel <sebastian.reichel@collabora.com> --- drivers/clk/clk-composite.c | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-) diff --git a/drivers/clk/clk-composite.c b/drivers/clk/clk-composite.c index edfa94641bbf..66759fe28fad 100644 --- a/drivers/clk/clk-composite.c +++ b/drivers/clk/clk-composite.c @@ -119,7 +119,10 @@ static int clk_composite_determine_rate(struct clk_hw *hw, if (ret) continue; - rate_diff = abs(req->rate - tmp_req.rate); + if (req->rate >= tmp_req.rate) + rate_diff = req->rate - tmp_req.rate; + else + rate_diff = tmp_req.rate - req->rate; if (!rate_diff || !req->best_parent_hw || best_rate_diff > rate_diff) { -- 2.39.2 ^ permalink raw reply related [flat|nested] 11+ messages in thread
* Re: [PATCH v2 1/2] clk: composite: Fix handling of high clock rates 2023-05-26 17:10 ` [PATCH v2 1/2] clk: composite: Fix handling of high clock rates Sebastian Reichel @ 2023-05-29 8:50 ` AngeloGioacchino Del Regno 2023-06-13 0:10 ` Stephen Boyd 1 sibling, 0 replies; 11+ messages in thread From: AngeloGioacchino Del Regno @ 2023-05-29 8:50 UTC (permalink / raw) To: Sebastian Reichel, Michael Turquette, Stephen Boyd, linux-clk, linux-kernel Cc: Christopher Obbard, David Laight, kernel, stable Il 26/05/23 19:10, Sebastian Reichel ha scritto: > ULONG_MAX is used by a few drivers to figure out the highest available > clock rate via clk_round_rate(clk, ULONG_MAX). Since abs() takes a > signed value as input, the current logic effectively calculates with > ULONG_MAX = -1, which results in the worst parent clock being chosen > instead of the best one. > > For example on Rockchip RK3588 the eMMC driver tries to figure out > the highest available clock rate. There are three parent clocks > available resulting in the following rate diffs with the existing > logic: > > GPLL: abs(18446744073709551615 - 1188000000) = 1188000001 > CPLL: abs(18446744073709551615 - 1500000000) = 1500000001 > XIN24M: abs(18446744073709551615 - 24000000) = 24000001 > > As a result the clock framework will promote a maximum supported > clock rate of 24 MHz, even though 1.5GHz are possible. With the > updated logic any casting between signed and unsigned is avoided > and the numbers look like this instead: > > GPLL: 18446744073709551615 - 1188000000 = 18446744072521551615 > CPLL: 18446744073709551615 - 1500000000 = 18446744072209551615 > XIN24M: 18446744073709551615 - 24000000 = 18446744073685551615 > > As a result the parent with the highest acceptable rate is chosen > instead of the parent clock with the lowest one. > > Cc: stable@vger.kernel.org > Fixes: 49502408007b ("mmc: sdhci-of-dwcmshc: properly determine max clock on Rockchip") > Tested-by: Christopher Obbard <chris.obbard@collabora.com> > Signed-off-by: Sebastian Reichel <sebastian.reichel@collabora.com> Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com> ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH v2 1/2] clk: composite: Fix handling of high clock rates 2023-05-26 17:10 ` [PATCH v2 1/2] clk: composite: Fix handling of high clock rates Sebastian Reichel 2023-05-29 8:50 ` AngeloGioacchino Del Regno @ 2023-06-13 0:10 ` Stephen Boyd 2023-06-13 12:14 ` Maxime Ripard 1 sibling, 1 reply; 11+ messages in thread From: Stephen Boyd @ 2023-06-13 0:10 UTC (permalink / raw) To: Michael Turquette, Sebastian Reichel, linux-clk, linux-kernel Cc: Christopher Obbard, David Laight, Sebastian Reichel, kernel, stable, Maxime Ripard Quoting Sebastian Reichel (2023-05-26 10:10:56) > ULONG_MAX is used by a few drivers to figure out the highest available > clock rate via clk_round_rate(clk, ULONG_MAX). Since abs() takes a > signed value as input, the current logic effectively calculates with > ULONG_MAX = -1, which results in the worst parent clock being chosen > instead of the best one. > > For example on Rockchip RK3588 the eMMC driver tries to figure out > the highest available clock rate. There are three parent clocks > available resulting in the following rate diffs with the existing > logic: > > GPLL: abs(18446744073709551615 - 1188000000) = 1188000001 > CPLL: abs(18446744073709551615 - 1500000000) = 1500000001 > XIN24M: abs(18446744073709551615 - 24000000) = 24000001 I had to read the abs() macro to understand this and also look at the types for 'req->rate' and 'tmp_req.rate' (both are unsigned long) to understand what's going on. I'm not sure I'd say that abs() takes the input as a signed value. Instead, it converts the input to a signed type to figure out if it should negate the value or not. The problem is the subtraction result is larger than can fit in a signed long long on a 64-bit machine, so we can't use the macro at all if the type is unsigned long and the sign bit is set. > > As a result the clock framework will promote a maximum supported > clock rate of 24 MHz, even though 1.5GHz are possible. With the > updated logic any casting between signed and unsigned is avoided > and the numbers look like this instead: > > GPLL: 18446744073709551615 - 1188000000 = 18446744072521551615 > CPLL: 18446744073709551615 - 1500000000 = 18446744072209551615 > XIN24M: 18446744073709551615 - 24000000 = 18446744073685551615 > > As a result the parent with the highest acceptable rate is chosen > instead of the parent clock with the lowest one. > > Cc: stable@vger.kernel.org > Fixes: 49502408007b ("mmc: sdhci-of-dwcmshc: properly determine max clock on Rockchip") > Tested-by: Christopher Obbard <chris.obbard@collabora.com> > Signed-off-by: Sebastian Reichel <sebastian.reichel@collabora.com> > --- > drivers/clk/clk-composite.c | 5 ++++- > 1 file changed, 4 insertions(+), 1 deletion(-) > > diff --git a/drivers/clk/clk-composite.c b/drivers/clk/clk-composite.c > index edfa94641bbf..66759fe28fad 100644 > --- a/drivers/clk/clk-composite.c > +++ b/drivers/clk/clk-composite.c > @@ -119,7 +119,10 @@ static int clk_composite_determine_rate(struct clk_hw *hw, > if (ret) > continue; > > - rate_diff = abs(req->rate - tmp_req.rate); > + if (req->rate >= tmp_req.rate) > + rate_diff = req->rate - tmp_req.rate; > + else > + rate_diff = tmp_req.rate - req->rate; This problem is widespread $ git grep abs\(.*- -- drivers/clk/ | wc -l 52 so we may have a bigger problem here. Maybe some sort of coccinelle script or smatch checker can be written to look for abs() usage with an unsigned long type or a subtraction expression. This may also be worse after converting drivers to clk_hw_forward_rate_request() and clk_hw_init_rate_request() because those set the rate to ULONG_MAX. +Maxime for that as an FYI. Maybe we need an abs_diff() macro instead, that checks the type and on CONFIG_64BIT uses a conditional like above, but if it is a smaller type then it just uses abs() on the expression because it knows the difference will fit in the signed type conversion. I see that such a macro exists in some drm driver, so maybe it can be promoted to linux/math.h and then every grep hit above can use this macro instead. Care to take that on? Either way, I've applied this to clk-fixes as it is a regression. I'm just worried that this problem is more extensive. ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH v2 1/2] clk: composite: Fix handling of high clock rates 2023-06-13 0:10 ` Stephen Boyd @ 2023-06-13 12:14 ` Maxime Ripard 2023-06-13 18:25 ` Stephen Boyd 0 siblings, 1 reply; 11+ messages in thread From: Maxime Ripard @ 2023-06-13 12:14 UTC (permalink / raw) To: Stephen Boyd Cc: Michael Turquette, Sebastian Reichel, linux-clk, linux-kernel, Christopher Obbard, David Laight, kernel, stable [-- Attachment #1: Type: text/plain, Size: 4433 bytes --] On Mon, Jun 12, 2023 at 05:10:35PM -0700, Stephen Boyd wrote: > Quoting Sebastian Reichel (2023-05-26 10:10:56) > > ULONG_MAX is used by a few drivers to figure out the highest available > > clock rate via clk_round_rate(clk, ULONG_MAX). Since abs() takes a > > signed value as input, the current logic effectively calculates with > > ULONG_MAX = -1, which results in the worst parent clock being chosen > > instead of the best one. > > > > For example on Rockchip RK3588 the eMMC driver tries to figure out > > the highest available clock rate. There are three parent clocks > > available resulting in the following rate diffs with the existing > > logic: > > > > GPLL: abs(18446744073709551615 - 1188000000) = 1188000001 > > CPLL: abs(18446744073709551615 - 1500000000) = 1500000001 > > XIN24M: abs(18446744073709551615 - 24000000) = 24000001 > > I had to read the abs() macro to understand this and also look at the > types for 'req->rate' and 'tmp_req.rate' (both are unsigned long) to > understand what's going on. I'm not sure I'd say that abs() takes the > input as a signed value. Instead, it converts the input to a signed type > to figure out if it should negate the value or not. The problem is the > subtraction result is larger than can fit in a signed long long on a > 64-bit machine, so we can't use the macro at all if the type is unsigned > long and the sign bit is set. > > > > > As a result the clock framework will promote a maximum supported > > clock rate of 24 MHz, even though 1.5GHz are possible. With the > > updated logic any casting between signed and unsigned is avoided > > and the numbers look like this instead: > > > > GPLL: 18446744073709551615 - 1188000000 = 18446744072521551615 > > CPLL: 18446744073709551615 - 1500000000 = 18446744072209551615 > > XIN24M: 18446744073709551615 - 24000000 = 18446744073685551615 > > > > As a result the parent with the highest acceptable rate is chosen > > instead of the parent clock with the lowest one. > > > > Cc: stable@vger.kernel.org > > Fixes: 49502408007b ("mmc: sdhci-of-dwcmshc: properly determine max clock on Rockchip") > > Tested-by: Christopher Obbard <chris.obbard@collabora.com> > > Signed-off-by: Sebastian Reichel <sebastian.reichel@collabora.com> > > --- > > drivers/clk/clk-composite.c | 5 ++++- > > 1 file changed, 4 insertions(+), 1 deletion(-) > > > > diff --git a/drivers/clk/clk-composite.c b/drivers/clk/clk-composite.c > > index edfa94641bbf..66759fe28fad 100644 > > --- a/drivers/clk/clk-composite.c > > +++ b/drivers/clk/clk-composite.c > > @@ -119,7 +119,10 @@ static int clk_composite_determine_rate(struct clk_hw *hw, > > if (ret) > > continue; > > > > - rate_diff = abs(req->rate - tmp_req.rate); > > + if (req->rate >= tmp_req.rate) > > + rate_diff = req->rate - tmp_req.rate; > > + else > > + rate_diff = tmp_req.rate - req->rate; > > This problem is widespread > > $ git grep abs\(.*- -- drivers/clk/ | wc -l > 52 > > so we may have a bigger problem here. Maybe some sort of coccinelle > script or smatch checker can be written to look for abs() usage with an > unsigned long type or a subtraction expression. This may also be worse > after converting drivers to clk_hw_forward_rate_request() and > clk_hw_init_rate_request() because those set the rate to ULONG_MAX. > +Maxime for that as an FYI. We set max_rate to ULONG_MAX in those functions, and I don't think we have a lot of driver that will call clk_round_rate on the max rate, so we should be somewhat ok? > Maybe we need an abs_diff() macro instead, that checks the type and on > CONFIG_64BIT uses a conditional like above, but if it is a smaller type > then it just uses abs() on the expression because it knows the > difference will fit in the signed type conversion. I see that such a > macro exists in some drm driver, so maybe it can be promoted to > linux/math.h and then every grep hit above can use this macro instead. > Care to take that on? > > Either way, I've applied this to clk-fixes as it is a regression. I'm > just worried that this problem is more extensive. Yeah, that construct is pretty much everywhere, including in the core :/ Maxime [-- Attachment #2: signature.asc --] [-- Type: application/pgp-signature, Size: 228 bytes --] ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH v2 1/2] clk: composite: Fix handling of high clock rates 2023-06-13 12:14 ` Maxime Ripard @ 2023-06-13 18:25 ` Stephen Boyd 2023-06-14 10:28 ` Maxime Ripard 0 siblings, 1 reply; 11+ messages in thread From: Stephen Boyd @ 2023-06-13 18:25 UTC (permalink / raw) To: Maxime Ripard Cc: Michael Turquette, Sebastian Reichel, linux-clk, linux-kernel, Christopher Obbard, David Laight, kernel, stable Quoting Maxime Ripard (2023-06-13 05:14:25) > On Mon, Jun 12, 2023 at 05:10:35PM -0700, Stephen Boyd wrote: > > Quoting Sebastian Reichel (2023-05-26 10:10:56) > > > diff --git a/drivers/clk/clk-composite.c b/drivers/clk/clk-composite.c > > > index edfa94641bbf..66759fe28fad 100644 > > > --- a/drivers/clk/clk-composite.c > > > +++ b/drivers/clk/clk-composite.c > > > @@ -119,7 +119,10 @@ static int clk_composite_determine_rate(struct clk_hw *hw, > > > if (ret) > > > continue; > > > > > > - rate_diff = abs(req->rate - tmp_req.rate); > > > + if (req->rate >= tmp_req.rate) > > > + rate_diff = req->rate - tmp_req.rate; > > > + else > > > + rate_diff = tmp_req.rate - req->rate; > > > > This problem is widespread > > > > $ git grep abs\(.*- -- drivers/clk/ | wc -l > > 52 > > > > so we may have a bigger problem here. Maybe some sort of coccinelle > > script or smatch checker can be written to look for abs() usage with an > > unsigned long type or a subtraction expression. This may also be worse > > after converting drivers to clk_hw_forward_rate_request() and > > clk_hw_init_rate_request() because those set the rate to ULONG_MAX. > > +Maxime for that as an FYI. > > We set max_rate to ULONG_MAX in those functions, and I don't think we > have a lot of driver that will call clk_round_rate on the max rate, so > we should be somewhat ok? Good point. I haven't looked to see if any drivers are using the max_rate directly. Hopefully they aren't. ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH v2 1/2] clk: composite: Fix handling of high clock rates 2023-06-13 18:25 ` Stephen Boyd @ 2023-06-14 10:28 ` Maxime Ripard 0 siblings, 0 replies; 11+ messages in thread From: Maxime Ripard @ 2023-06-14 10:28 UTC (permalink / raw) To: Stephen Boyd Cc: Michael Turquette, Sebastian Reichel, linux-clk, linux-kernel, Christopher Obbard, David Laight, kernel, stable [-- Attachment #1: Type: text/plain, Size: 1931 bytes --] On Tue, Jun 13, 2023 at 11:25:12AM -0700, Stephen Boyd wrote: > Quoting Maxime Ripard (2023-06-13 05:14:25) > > On Mon, Jun 12, 2023 at 05:10:35PM -0700, Stephen Boyd wrote: > > > Quoting Sebastian Reichel (2023-05-26 10:10:56) > > > > diff --git a/drivers/clk/clk-composite.c b/drivers/clk/clk-composite.c > > > > index edfa94641bbf..66759fe28fad 100644 > > > > --- a/drivers/clk/clk-composite.c > > > > +++ b/drivers/clk/clk-composite.c > > > > @@ -119,7 +119,10 @@ static int clk_composite_determine_rate(struct clk_hw *hw, > > > > if (ret) > > > > continue; > > > > > > > > - rate_diff = abs(req->rate - tmp_req.rate); > > > > + if (req->rate >= tmp_req.rate) > > > > + rate_diff = req->rate - tmp_req.rate; > > > > + else > > > > + rate_diff = tmp_req.rate - req->rate; > > > > > > This problem is widespread > > > > > > $ git grep abs\(.*- -- drivers/clk/ | wc -l > > > 52 > > > > > > so we may have a bigger problem here. Maybe some sort of coccinelle > > > script or smatch checker can be written to look for abs() usage with an > > > unsigned long type or a subtraction expression. This may also be worse > > > after converting drivers to clk_hw_forward_rate_request() and > > > clk_hw_init_rate_request() because those set the rate to ULONG_MAX. > > > +Maxime for that as an FYI. > > > > We set max_rate to ULONG_MAX in those functions, and I don't think we > > have a lot of driver that will call clk_round_rate on the max rate, so > > we should be somewhat ok? > > Good point. I haven't looked to see if any drivers are using the > max_rate directly. Hopefully they aren't. I had a quick grep for 'req->max_rate' under drivers/clk and there's no driver using abs on that field. Maxime [-- Attachment #2: signature.asc --] [-- Type: application/pgp-signature, Size: 228 bytes --] ^ permalink raw reply [flat|nested] 11+ messages in thread
* [PATCH v2 2/2] clk: divider: Fix divisions 2023-05-26 17:10 [PATCH v2 0/2] Fix 64 bit issues in common clock framework Sebastian Reichel 2023-05-26 17:10 ` [PATCH v2 1/2] clk: composite: Fix handling of high clock rates Sebastian Reichel @ 2023-05-26 17:10 ` Sebastian Reichel 2023-05-29 8:50 ` AngeloGioacchino Del Regno 2023-06-13 0:41 ` Stephen Boyd 1 sibling, 2 replies; 11+ messages in thread From: Sebastian Reichel @ 2023-05-26 17:10 UTC (permalink / raw) To: Michael Turquette, Stephen Boyd, linux-clk, linux-kernel Cc: Christopher Obbard, David Laight, Sebastian Reichel, kernel The clock framework handles clock rates as "unsigned long", so u32 on 32-bit architectures and u64 on 64-bit architectures. The current code pointlessly casts the dividend to u64 on 32-bit architectures and thus pointlessly reducing the performance. On the other hand on 64-bit architectures the divisor is masked and only the lower 32-bit are used. Thus requesting a frequency >= 4.3GHz results in incorrect values. For example requesting 4300000000 (4.3 GHz) will effectively request ca. 5 MHz. Requesting clk_round_rate(clk, ULONG_MAX) is a bit of a special case, since that still returns correct values as long as the parent clock is below 8.5 GHz. Signed-off-by: Sebastian Reichel <sebastian.reichel@collabora.com> --- drivers/clk/clk-divider.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/drivers/clk/clk-divider.c b/drivers/clk/clk-divider.c index a2c2b5203b0a..c38e8aa60e54 100644 --- a/drivers/clk/clk-divider.c +++ b/drivers/clk/clk-divider.c @@ -220,7 +220,7 @@ static int _div_round_up(const struct clk_div_table *table, unsigned long parent_rate, unsigned long rate, unsigned long flags) { - int div = DIV_ROUND_UP_ULL((u64)parent_rate, rate); + int div = DIV_ROUND_UP(parent_rate, rate); if (flags & CLK_DIVIDER_POWER_OF_TWO) div = __roundup_pow_of_two(div); @@ -237,7 +237,7 @@ static int _div_round_closest(const struct clk_div_table *table, int up, down; unsigned long up_rate, down_rate; - up = DIV_ROUND_UP_ULL((u64)parent_rate, rate); + up = DIV_ROUND_UP(parent_rate, rate); down = parent_rate / rate; if (flags & CLK_DIVIDER_POWER_OF_TWO) { @@ -473,7 +473,7 @@ int divider_get_val(unsigned long rate, unsigned long parent_rate, { unsigned int div, value; - div = DIV_ROUND_UP_ULL((u64)parent_rate, rate); + div = DIV_ROUND_UP(parent_rate, rate); if (!_is_valid_div(table, div, flags)) return -EINVAL; -- 2.39.2 ^ permalink raw reply related [flat|nested] 11+ messages in thread
* Re: [PATCH v2 2/2] clk: divider: Fix divisions 2023-05-26 17:10 ` [PATCH v2 2/2] clk: divider: Fix divisions Sebastian Reichel @ 2023-05-29 8:50 ` AngeloGioacchino Del Regno 2023-06-13 0:41 ` Stephen Boyd 1 sibling, 0 replies; 11+ messages in thread From: AngeloGioacchino Del Regno @ 2023-05-29 8:50 UTC (permalink / raw) To: Sebastian Reichel, Michael Turquette, Stephen Boyd, linux-clk, linux-kernel Cc: Christopher Obbard, David Laight, kernel Il 26/05/23 19:10, Sebastian Reichel ha scritto: > The clock framework handles clock rates as "unsigned long", so u32 on > 32-bit architectures and u64 on 64-bit architectures. > > The current code pointlessly casts the dividend to u64 on 32-bit > architectures and thus pointlessly reducing the performance. > > On the other hand on 64-bit architectures the divisor is masked and only > the lower 32-bit are used. Thus requesting a frequency >= 4.3GHz results > in incorrect values. For example requesting 4300000000 (4.3 GHz) will > effectively request ca. 5 MHz. Requesting clk_round_rate(clk, ULONG_MAX) > is a bit of a special case, since that still returns correct values as > long as the parent clock is below 8.5 GHz. > > Signed-off-by: Sebastian Reichel <sebastian.reichel@collabora.com> Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com> ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH v2 2/2] clk: divider: Fix divisions 2023-05-26 17:10 ` [PATCH v2 2/2] clk: divider: Fix divisions Sebastian Reichel 2023-05-29 8:50 ` AngeloGioacchino Del Regno @ 2023-06-13 0:41 ` Stephen Boyd 2023-06-13 8:05 ` David Laight 1 sibling, 1 reply; 11+ messages in thread From: Stephen Boyd @ 2023-06-13 0:41 UTC (permalink / raw) To: Michael Turquette, Sebastian Reichel, linux-clk, linux-kernel Cc: Christopher Obbard, David Laight, Sebastian Reichel, kernel Quoting Sebastian Reichel (2023-05-26 10:10:57) > The clock framework handles clock rates as "unsigned long", so u32 on > 32-bit architectures and u64 on 64-bit architectures. > > The current code pointlessly casts the dividend to u64 on 32-bit > architectures and thus pointlessly reducing the performance. It looks like that was done to make the DIV_ROUND_UP() macro not overflow the dividend on 32-bit machines (from 9556f9dad8f5): DIV_ROUND_UP(3000000000, 1500000000) = (3.0G + 1.5G - 1) / 1.5G = OVERFLOW / 1.5G but I agree, the u64 cast is not necessary if DIV_ROUND_UP_ULL() is used as that macro casts the dividend to unsigned long long anyway. > > On the other hand on 64-bit architectures the divisor is masked and only > the lower 32-bit are used. Thus requesting a frequency >= 4.3GHz results > in incorrect values. For example requesting 4300000000 (4.3 GHz) will > effectively request ca. 5 MHz. Nice catch. But I'm concerned that the case above is broken by changing to DIV_ROUND_UP(). As this code is generic, I fear we'll have to change this code that divides rates to use DIV64_U64_ROUND_UP() because we don't know how large the rate is (i.e. it could be larger than 32-bits on a 64-bit machine). > Requesting clk_round_rate(clk, ULONG_MAX) > is a bit of a special case, since that still returns correct values as > long as the parent clock is below 8.5 GHz. > > Signed-off-by: Sebastian Reichel <sebastian.reichel@collabora.com> > --- > drivers/clk/clk-divider.c | 6 +++--- > 1 file changed, 3 insertions(+), 3 deletions(-) > > diff --git a/drivers/clk/clk-divider.c b/drivers/clk/clk-divider.c > index a2c2b5203b0a..c38e8aa60e54 100644 > --- a/drivers/clk/clk-divider.c > +++ b/drivers/clk/clk-divider.c > @@ -220,7 +220,7 @@ static int _div_round_up(const struct clk_div_table *table, > unsigned long parent_rate, unsigned long rate, > unsigned long flags) > { > - int div = DIV_ROUND_UP_ULL((u64)parent_rate, rate); > + int div = DIV_ROUND_UP(parent_rate, rate); > > if (flags & CLK_DIVIDER_POWER_OF_TWO) > div = __roundup_pow_of_two(div); > @@ -237,7 +237,7 @@ static int _div_round_closest(const struct clk_div_table *table, > int up, down; > unsigned long up_rate, down_rate; > > - up = DIV_ROUND_UP_ULL((u64)parent_rate, rate); > + up = DIV_ROUND_UP(parent_rate, rate); > down = parent_rate / rate; > > if (flags & CLK_DIVIDER_POWER_OF_TWO) { > @@ -473,7 +473,7 @@ int divider_get_val(unsigned long rate, unsigned long parent_rate, > { > unsigned int div, value; > > - div = DIV_ROUND_UP_ULL((u64)parent_rate, rate); > + div = DIV_ROUND_UP(parent_rate, rate); > > if (!_is_valid_div(table, div, flags)) > return -EINVAL; This is undoing parts of commit 9556f9dad8f5 ("clk: divider: handle integer overflow when dividing large clock rates"). Please pair this patch with extensive kunit tests in a new test suite clk-divider_test.c file. I don't know if UML supports changing sizeof(long), but that would be a cool feature to tease out these sorts of issues. I suppose we'll just have to run the kunit tests on various architectures to cover the possibilities. ^ permalink raw reply [flat|nested] 11+ messages in thread
* RE: [PATCH v2 2/2] clk: divider: Fix divisions 2023-06-13 0:41 ` Stephen Boyd @ 2023-06-13 8:05 ` David Laight 0 siblings, 0 replies; 11+ messages in thread From: David Laight @ 2023-06-13 8:05 UTC (permalink / raw) To: 'Stephen Boyd', Michael Turquette, Sebastian Reichel, linux-clk, linux-kernel Cc: Christopher Obbard, Sebastian Reichel, kernel From: Stephen Boyd > Sent: 13 June 2023 01:42 > > Quoting Sebastian Reichel (2023-05-26 10:10:57) > > The clock framework handles clock rates as "unsigned long", so u32 on > > 32-bit architectures and u64 on 64-bit architectures. > > > > The current code pointlessly casts the dividend to u64 on 32-bit > > architectures and thus pointlessly reducing the performance. > > It looks like that was done to make the DIV_ROUND_UP() macro not > overflow the dividend on 32-bit machines (from 9556f9dad8f5): > > DIV_ROUND_UP(3000000000, 1500000000) = (3.0G + 1.5G - 1) / 1.5G > = OVERFLOW / 1.5G Maybe add: #define DIV_ROUND_UP_NZ(x, y) (((x) - 1)/(y) + 1) which doesn't overflow but requires x != 0. David - Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK Registration No: 1397386 (Wales) ^ permalink raw reply [flat|nested] 11+ messages in thread
end of thread, other threads:[~2023-06-14 10:30 UTC | newest] Thread overview: 11+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2023-05-26 17:10 [PATCH v2 0/2] Fix 64 bit issues in common clock framework Sebastian Reichel 2023-05-26 17:10 ` [PATCH v2 1/2] clk: composite: Fix handling of high clock rates Sebastian Reichel 2023-05-29 8:50 ` AngeloGioacchino Del Regno 2023-06-13 0:10 ` Stephen Boyd 2023-06-13 12:14 ` Maxime Ripard 2023-06-13 18:25 ` Stephen Boyd 2023-06-14 10:28 ` Maxime Ripard 2023-05-26 17:10 ` [PATCH v2 2/2] clk: divider: Fix divisions Sebastian Reichel 2023-05-29 8:50 ` AngeloGioacchino Del Regno 2023-06-13 0:41 ` Stephen Boyd 2023-06-13 8:05 ` David Laight
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).