All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v4 0/3] Fix clock division overflow problem
@ 2023-10-24 16:18 Sebastian Reichel
  2023-10-24 16:18 ` [PATCH v4 1/3] math.h: add DIV_ROUND_UP_NO_OVERFLOW Sebastian Reichel
                   ` (2 more replies)
  0 siblings, 3 replies; 20+ messages in thread
From: Sebastian Reichel @ 2023-10-24 16:18 UTC (permalink / raw)
  To: Andy Shevchenko, Michael Turquette, Stephen Boyd, David Laight,
	linux-clk, linux-kernel
  Cc: Vasily Gorbik, Niklas Schnelle, Linus Torvalds,
	Sebastian Reichel, kernel

Hi,

I noticed the overflow issue while debugging an issue related to the Rockchip
RK3588 eMMC. It's not an issue on that particular platform and it does not seem
this has been an issue on any other platform so far, so I did not add any
stable Tags. Considering 64 bit platforms are the standard nowadays and 3GHz is
a reasonable small value, I expect this to become an issue soon, though.

Changes since PATCHv3:
 * https://lore.kernel.org/all/20230630183835.464216-1-sebastian.reichel@collabora.com/
 * Add some people to Cc, because similar overflow issue came up during s390 pull request
 * Split DIV_ROUND_UP_NO_OVERFLOW into its own patch
 * Add patch replacing open-coded abs_diff() from the previously applied patch

Changes since PATCHv2:
 * https://lore.kernel.org/all/20230526171057.66876-1-sebastian.reichel@collabora.com/
 * Drop first patch (applied)
 * Update second patch to use newly introduced DIV_ROUND_UP_NO_OVERFLOW

Changes since PATCHv1:
 * https://lore.kernel.org/linux-clk/20230519190522.194729-1-sebastian.reichel@collabora.com/
 * Add Christopher Obbard's Reviewed-by to the first patch
 * Update the second patch to use DIV_ROUND_UP instead of DIV64_U64_ROUND_UP

Greetings,

-- Sebastian


Sebastian Reichel (3):
  math.h: add DIV_ROUND_UP_NO_OVERFLOW
  clk: divider: Fix divisor masking on 64 bit platforms
  clk: composite: replace open-coded abs_diff()

 drivers/clk/clk-composite.c |  6 ++----
 drivers/clk/clk-divider.c   |  6 +++---
 include/linux/math.h        | 11 +++++++++++
 3 files changed, 16 insertions(+), 7 deletions(-)

-- 
2.42.0


^ permalink raw reply	[flat|nested] 20+ messages in thread

* [PATCH v4 1/3] math.h: add DIV_ROUND_UP_NO_OVERFLOW
  2023-10-24 16:18 [PATCH v4 0/3] Fix clock division overflow problem Sebastian Reichel
@ 2023-10-24 16:18 ` Sebastian Reichel
  2023-10-24 16:26   ` David Laight
  2023-10-24 19:32   ` Linus Torvalds
  2023-10-24 16:18 ` [PATCH v4 2/3] clk: divider: Fix divisor masking on 64 bit platforms Sebastian Reichel
  2023-10-24 16:18 ` [PATCH v4 3/3] clk: composite: replace open-coded abs_diff() Sebastian Reichel
  2 siblings, 2 replies; 20+ messages in thread
From: Sebastian Reichel @ 2023-10-24 16:18 UTC (permalink / raw)
  To: Andy Shevchenko, Michael Turquette, Stephen Boyd, David Laight,
	linux-clk, linux-kernel
  Cc: Vasily Gorbik, Niklas Schnelle, Linus Torvalds,
	Sebastian Reichel, kernel

Add a new DIV_ROUND_UP helper, which cannot overflow when
big numbers are being used.

Signed-off-by: Sebastian Reichel <sebastian.reichel@collabora.com>
---
 include/linux/math.h | 11 +++++++++++
 1 file changed, 11 insertions(+)

diff --git a/include/linux/math.h b/include/linux/math.h
index dd4152711de7..f80bfb375ab9 100644
--- a/include/linux/math.h
+++ b/include/linux/math.h
@@ -36,6 +36,17 @@
 
 #define DIV_ROUND_UP __KERNEL_DIV_ROUND_UP
 
+/**
+ * DIV_ROUND_UP_NO_OVERFLOW - divide two numbers and always round up
+ * @n: numerator / dividend
+ * @d: denominator / divisor
+ *
+ * This functions does the same as DIV_ROUND_UP, but internally uses a
+ * division and a modulo operation instead of math tricks. This way it
+ * avoids overflowing when handling big numbers.
+ */
+#define DIV_ROUND_UP_NO_OVERFLOW(n, d) (((n) / (d)) + !!((n) % (d)))
+
 #define DIV_ROUND_DOWN_ULL(ll, d) \
 	({ unsigned long long _tmp = (ll); do_div(_tmp, d); _tmp; })
 
-- 
2.42.0


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH v4 2/3] clk: divider: Fix divisor masking on 64 bit platforms
  2023-10-24 16:18 [PATCH v4 0/3] Fix clock division overflow problem Sebastian Reichel
  2023-10-24 16:18 ` [PATCH v4 1/3] math.h: add DIV_ROUND_UP_NO_OVERFLOW Sebastian Reichel
@ 2023-10-24 16:18 ` Sebastian Reichel
  2023-10-24 16:18 ` [PATCH v4 3/3] clk: composite: replace open-coded abs_diff() Sebastian Reichel
  2 siblings, 0 replies; 20+ messages in thread
From: Sebastian Reichel @ 2023-10-24 16:18 UTC (permalink / raw)
  To: Andy Shevchenko, Michael Turquette, Stephen Boyd, David Laight,
	linux-clk, linux-kernel
  Cc: Vasily Gorbik, Niklas Schnelle, Linus Torvalds,
	Sebastian Reichel, kernel

The clock framework handles clock rates as "unsigned long", so u32 on
32-bit architectures and u64 on 64-bit architectures.

The current code casts the dividend to u64 on 32-bit to avoid a
potential overflow. For example DIV_ROUND_UP(3000000000, 1500000000)
= (3.0G + 1.5G - 1) / 1.5G = = OVERFLOW / 1.5G, which has been
introduced in commit 9556f9dad8f5 ("clk: divider: handle integer overflow
when dividing large clock rates").

On 64 bit platforms this masks the divisor, so that only the lower
32 bit are used. Thus requesting a frequency >= 4.3GHz results
in incorrect values. For example requesting 4300000000 (4.3 GHz) will
effectively request ca. 5 MHz. Requesting clk_round_rate(clk, ULONG_MAX)
is a bit of a special case, since that still returns correct values as
long as the parent clock is below 8.5 GHz.

Fix this by switching to DIV_ROUND_UP_NO_OVERFLOW, which cannot
overflow. This avoids any requirements on the arguments (except
that divisor should not be 0 obviously).

Signed-off-by: Sebastian Reichel <sebastian.reichel@collabora.com>
---
 drivers/clk/clk-divider.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/clk/clk-divider.c b/drivers/clk/clk-divider.c
index a2c2b5203b0a..94b4fb66a60f 100644
--- a/drivers/clk/clk-divider.c
+++ b/drivers/clk/clk-divider.c
@@ -220,7 +220,7 @@ static int _div_round_up(const struct clk_div_table *table,
 			 unsigned long parent_rate, unsigned long rate,
 			 unsigned long flags)
 {
-	int div = DIV_ROUND_UP_ULL((u64)parent_rate, rate);
+	int div = DIV_ROUND_UP_NO_OVERFLOW(parent_rate, rate);
 
 	if (flags & CLK_DIVIDER_POWER_OF_TWO)
 		div = __roundup_pow_of_two(div);
@@ -237,7 +237,7 @@ static int _div_round_closest(const struct clk_div_table *table,
 	int up, down;
 	unsigned long up_rate, down_rate;
 
-	up = DIV_ROUND_UP_ULL((u64)parent_rate, rate);
+	up = DIV_ROUND_UP_NO_OVERFLOW(parent_rate, rate);
 	down = parent_rate / rate;
 
 	if (flags & CLK_DIVIDER_POWER_OF_TWO) {
@@ -473,7 +473,7 @@ int divider_get_val(unsigned long rate, unsigned long parent_rate,
 {
 	unsigned int div, value;
 
-	div = DIV_ROUND_UP_ULL((u64)parent_rate, rate);
+	div = DIV_ROUND_UP_NO_OVERFLOW(parent_rate, rate);
 
 	if (!_is_valid_div(table, div, flags))
 		return -EINVAL;
-- 
2.42.0


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH v4 3/3] clk: composite: replace open-coded abs_diff()
  2023-10-24 16:18 [PATCH v4 0/3] Fix clock division overflow problem Sebastian Reichel
  2023-10-24 16:18 ` [PATCH v4 1/3] math.h: add DIV_ROUND_UP_NO_OVERFLOW Sebastian Reichel
  2023-10-24 16:18 ` [PATCH v4 2/3] clk: divider: Fix divisor masking on 64 bit platforms Sebastian Reichel
@ 2023-10-24 16:18 ` Sebastian Reichel
  2023-10-24 16:23   ` Andy Shevchenko
  2 siblings, 1 reply; 20+ messages in thread
From: Sebastian Reichel @ 2023-10-24 16:18 UTC (permalink / raw)
  To: Andy Shevchenko, Michael Turquette, Stephen Boyd, David Laight,
	linux-clk, linux-kernel
  Cc: Vasily Gorbik, Niklas Schnelle, Linus Torvalds,
	Sebastian Reichel, kernel, Andy Shevchenko

Replace the open coded abs_diff() with the existing helper function.

Suggested-by: Andy Shevchenko <andriy.shevchenko@intel.com>
Signed-off-by: Sebastian Reichel <sebastian.reichel@collabora.com>
---
 drivers/clk/clk-composite.c | 6 ++----
 1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/drivers/clk/clk-composite.c b/drivers/clk/clk-composite.c
index 66759fe28fad..478a4e594336 100644
--- a/drivers/clk/clk-composite.c
+++ b/drivers/clk/clk-composite.c
@@ -6,6 +6,7 @@
 #include <linux/clk-provider.h>
 #include <linux/device.h>
 #include <linux/err.h>
+#include <linux/math.h>
 #include <linux/slab.h>
 
 static u8 clk_composite_get_parent(struct clk_hw *hw)
@@ -119,10 +120,7 @@ static int clk_composite_determine_rate(struct clk_hw *hw,
 			if (ret)
 				continue;
 
-			if (req->rate >= tmp_req.rate)
-				rate_diff = req->rate - tmp_req.rate;
-			else
-				rate_diff = tmp_req.rate - req->rate;
+			rate_diff = abs_diff(req->rate, tmp_req.rate);
 
 			if (!rate_diff || !req->best_parent_hw
 				       || best_rate_diff > rate_diff) {
-- 
2.42.0


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* Re: [PATCH v4 3/3] clk: composite: replace open-coded abs_diff()
  2023-10-24 16:18 ` [PATCH v4 3/3] clk: composite: replace open-coded abs_diff() Sebastian Reichel
@ 2023-10-24 16:23   ` Andy Shevchenko
  0 siblings, 0 replies; 20+ messages in thread
From: Andy Shevchenko @ 2023-10-24 16:23 UTC (permalink / raw)
  To: Sebastian Reichel
  Cc: Michael Turquette, Stephen Boyd, David Laight, linux-clk,
	linux-kernel, Vasily Gorbik, Niklas Schnelle, Linus Torvalds,
	kernel

On Tue, Oct 24, 2023 at 06:18:17PM +0200, Sebastian Reichel wrote:
> Replace the open coded abs_diff() with the existing helper function.

Reviewed-by: Andy Shevchenko <andriy.shevchenko@intel.com>

-- 
With Best Regards,
Andy Shevchenko



^ permalink raw reply	[flat|nested] 20+ messages in thread

* RE: [PATCH v4 1/3] math.h: add DIV_ROUND_UP_NO_OVERFLOW
  2023-10-24 16:18 ` [PATCH v4 1/3] math.h: add DIV_ROUND_UP_NO_OVERFLOW Sebastian Reichel
@ 2023-10-24 16:26   ` David Laight
  2023-10-24 19:32   ` Linus Torvalds
  1 sibling, 0 replies; 20+ messages in thread
From: David Laight @ 2023-10-24 16:26 UTC (permalink / raw)
  To: 'Sebastian Reichel',
	Andy Shevchenko, Michael Turquette, Stephen Boyd, linux-clk,
	linux-kernel
  Cc: Vasily Gorbik, Niklas Schnelle, Linus Torvalds, kernel

From: Sebastian Reichel
> Sent: 24 October 2023 17:18
> 
> Add a new DIV_ROUND_UP helper, which cannot overflow when
> big numbers are being used.

For non-zero you can use (n - 1)/d + 1 instead of (n + d - 1)/d
So maybe add:
#define DIV_ROUND_UP_NON_ZERO(n, d) (((n) - 1)/(d) + 1)

Saves the compiler having to get the remainder (if not
generated by a divide instruction.

	David

> 
> Signed-off-by: Sebastian Reichel <sebastian.reichel@collabora.com>
> ---
>  include/linux/math.h | 11 +++++++++++
>  1 file changed, 11 insertions(+)
> 
> diff --git a/include/linux/math.h b/include/linux/math.h
> index dd4152711de7..f80bfb375ab9 100644
> --- a/include/linux/math.h
> +++ b/include/linux/math.h
> @@ -36,6 +36,17 @@
> 
>  #define DIV_ROUND_UP __KERNEL_DIV_ROUND_UP
> 
> +/**
> + * DIV_ROUND_UP_NO_OVERFLOW - divide two numbers and always round up
> + * @n: numerator / dividend
> + * @d: denominator / divisor
> + *
> + * This functions does the same as DIV_ROUND_UP, but internally uses a
> + * division and a modulo operation instead of math tricks. This way it
> + * avoids overflowing when handling big numbers.
> + */
> +#define DIV_ROUND_UP_NO_OVERFLOW(n, d) (((n) / (d)) + !!((n) % (d)))
> +
>  #define DIV_ROUND_DOWN_ULL(ll, d) \
>  	({ unsigned long long _tmp = (ll); do_div(_tmp, d); _tmp; })
> 
> --
> 2.42.0

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v4 1/3] math.h: add DIV_ROUND_UP_NO_OVERFLOW
  2023-10-24 16:18 ` [PATCH v4 1/3] math.h: add DIV_ROUND_UP_NO_OVERFLOW Sebastian Reichel
  2023-10-24 16:26   ` David Laight
@ 2023-10-24 19:32   ` Linus Torvalds
  2023-10-24 22:53     ` Linus Torvalds
  2023-10-25 17:28     ` Sebastian Reichel
  1 sibling, 2 replies; 20+ messages in thread
From: Linus Torvalds @ 2023-10-24 19:32 UTC (permalink / raw)
  To: Sebastian Reichel
  Cc: Andy Shevchenko, Michael Turquette, Stephen Boyd, David Laight,
	linux-clk, linux-kernel, Vasily Gorbik, Niklas Schnelle, kernel

[ Since you're cc'ing the s390 people, I assume that means that this
all ended up being a follow-up to the s390 issue discussion ]

On Tue, 24 Oct 2023 at 06:19, Sebastian Reichel
<sebastian.reichel@collabora.com> wrote:
>
> Add a new DIV_ROUND_UP helper, which cannot overflow when
> big numbers are being used.

This is really horrendously bad on some architectures (ie it might
require *two* divisions):

> +#define DIV_ROUND_UP_NO_OVERFLOW(n, d) (((n) / (d)) + !!((n) % (d)))

but David Laight at least had a suggestion for that: when allowing
multiple uses, you can just do

   #define DIV_ROUND_UP(n,d) ((n) ? ((n)-1)/(d)+1 : 0)

so now you're back to just one division and no horrible extra expense.

But we can't allow those multiple uses in general, sadly.

I would really prefer to just make our regular DIV_ROUND_UP() DTRT.  But:

 - people do use it with complex first arguments (ie function calls
etc) that we don't want to evaluate twice

 - we can't make it an inline function, because the types aren't fixed

 - we can't even use a statement expression and __auto_type, because
these things are used in type definitions etc and need to be constant
expressions

That last thing means that even the "__builtin_choose_expr()" doesn't
work, because the statement expression still needs to be _parsed_ as
such, and that's not something we can do in those contexts.

Very annoying. Let me think about this.

                     Linus

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v4 1/3] math.h: add DIV_ROUND_UP_NO_OVERFLOW
  2023-10-24 19:32   ` Linus Torvalds
@ 2023-10-24 22:53     ` Linus Torvalds
  2023-10-25  8:03       ` Rasmus Villemoes
                         ` (3 more replies)
  2023-10-25 17:28     ` Sebastian Reichel
  1 sibling, 4 replies; 20+ messages in thread
From: Linus Torvalds @ 2023-10-24 22:53 UTC (permalink / raw)
  To: Sebastian Reichel
  Cc: Andy Shevchenko, Michael Turquette, Stephen Boyd, David Laight,
	linux-clk, linux-kernel, Vasily Gorbik, Niklas Schnelle, kernel

[-- Attachment #1: Type: text/plain, Size: 1141 bytes --]

On Tue, 24 Oct 2023 at 09:32, Linus Torvalds
<torvalds@linux-foundation.org> wrote:
>
> I would really prefer to just make our regular DIV_ROUND_UP() DTRT.  But:
>
>  - people do use it with complex first arguments (ie function calls
> etc) that we don't want to evaluate twice
>
>  - we can't make it an inline function, because the types aren't fixed
>
>  - we can't even use a statement expression and __auto_type, because
> these things are used in type definitions etc and need to be constant
> expressions

Ok. I have a potential beginning of a solution.

It is unbelievably disgustingly complicated. But it might approach
being correct.

And by that "it might approach being correct" I obviously mean "this
is untested, but builds at least some kernel code".

I'm almost certain it will fail on more complex cases, because I
already found a lot of questionable stuff that was simply hidden by
the old macro just silently doing the C arithmetic type conversions,
and this thing does type handling manually.

I'm hoping that somebody will go "Linus, you're just being
*completely* silly, it's much easier to do XYZ".

            Linus

[-- Attachment #2: 0001-Introduce-complicated-non-overflowing-DIV_ROUND_UP.patch --]
[-- Type: text/x-patch, Size: 5557 bytes --]

From 712c9eacded5b717cbb88db1df5e430342012269 Mon Sep 17 00:00:00 2001
From: Linus Torvalds <torvalds@linux-foundation.org>
Date: Tue, 24 Oct 2023 12:51:06 -1000
Subject: [PATCH] Introduce (complicated) non-overflowing DIV_ROUND_UP()

Doing a non-overflowing DIV_ROUND_UP() that is usable in all contexts is
actually very nasty.

This is a trial balloon..  The signed cases need more thought.  The best
option would be to disallow them (by not listing them in the _Generic()
rules). But they currently happen, often for bad reasons, ie wireless has

	DIV_ROUND_UP(interval, MSEC_PER_SEC);

and while 'interval' is a proper u32, MSEC_PER_SEC is defined to be
'1000L', so the resulting C arithmetic is done in signed 'long'.

We also have things like

	DIV_ROUND_UP(max_pfn << PAGE_SHIFT, 1UL << TB_SHIFT)

in the kaslr code, where the compiler complains if we ever then use that
'1UL << TB_SHIFT' as an 'int' in the macro expansion, so the
type-specific macros all take the divisor as a 'unsigned long' even if
that might be wider than the target type.

Not-yet-signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
---
 include/linux/div_round_up.h | 104 +++++++++++++++++++++++++++++++++++
 include/uapi/linux/const.h   |   6 +-
 2 files changed, 108 insertions(+), 2 deletions(-)
 create mode 100644 include/linux/div_round_up.h

diff --git a/include/linux/div_round_up.h b/include/linux/div_round_up.h
new file mode 100644
index 000000000000..d7fd319ba5c1
--- /dev/null
+++ b/include/linux/div_round_up.h
@@ -0,0 +1,104 @@
+#ifndef _DIV_ROUND_UP_H
+#define _DIV_ROUND_UP_H
+
+/*
+ * 'DIV_ROUND_UP(n,d)' is annoyingly complicated to do, because while
+ * we've historically evaluated 'd' multiple times using the simple
+ * expression
+ *
+ *    (((n) + (d) - 1) / (d))
+ *
+ * we do _not_ want to evaluate 'n' multiple times. And the simple
+ * expression can overflow to zero for big values of 'n'.
+ *
+ * A non-overflowing version (by David Laight) is
+ *
+ *    (((n) - 1)/(d) + 1)
+ *
+ * but there the 'n-1' will underflow when 'n' is zero, and adding the
+ * conditional for that would then evaluate 'n' twice.
+ *
+ * Avoiding the double evaluation with a statement expression using
+ *
+ *    ({ __auto_type __n = (n) .... })
+ *
+ * like we do with the similarly constrained 'min()' and 'max()'
+ * defines would be trivial, but cannot be used in a top-level
+ * context like a type definition, which is a common case.
+ *
+ * Even using __builtin_choose_expr() and only using the statement
+ * expression for the non-constant case, gcc refuses to even parse
+ * statement expressions outside of function context:
+ *
+ *   error: braced-group within expression allowed only inside a function
+ *
+ * The obvious solution is to use an inline function, but that fails
+ * due to forced type conversion. And widening the types to the maximum
+ * size generates horrific code.
+ *
+ * End result: this horror.
+ */
+
+/*
+ * Split the logic into the constant case (which can be used everywhere)
+ * and the non-constant case (limited to function context, but needs to
+ * parse everywhere).
+ */
+#define __keep_const(cond, expr) \
+	__builtin_choose_expr(__is_constexpr(cond), __const_##expr, __##expr)
+#define __KERNEL_DIV_ROUND_UP(n, d) \
+	__keep_const(n, div_round_up(n,d))
+
+/*
+ * The constant case is trivial, since then we can just re-evaluate to
+ * our hearts content.
+ */
+#define __const_div_round_up(n,d) ((n) != 0 ? ((n)-1)/(d)+1 : 0)
+
+/*
+ * The other cases need to be split by type.
+ *
+ * Signed cases seem badly defined, but do exist. We should
+ * consider removing them from this _Generic(), and fixing any
+ * result 'not compatible with any association' cases.
+ */
+#define __div_round_up(n,d) _Generic((n)+(d),		\
+	unsigned long long: __div_round_up_ull(n,d),	\
+	long long: __div_round_up_ll(n,d),		\
+	unsigned long: __div_round_up_ul(n,d),		\
+	long: __div_round_up_l(n,d),			\
+	unsigned int: __div_round_up_u(n,d),		\
+	int: __div_round_up_i(n,d))
+
+static inline unsigned long long __div_round_up_ull(unsigned long long n, unsigned long d)
+{
+#ifdef CONFIG_32BIT
+	if (!n)
+		return 0
+	do_div(n-1, d);
+	return n+1;
+#else
+	return __const_div_round_up(n,d);
+#endif
+}
+
+static inline unsigned long __div_round_up_ul(unsigned long n, unsigned long d)
+{ return __const_div_round_up(n,d); }
+
+static inline unsigned int __div_round_up_u(unsigned int n, unsigned long d)
+{ return __const_div_round_up(n,(unsigned int)d); }
+
+// Very questionable signed cases...
+static inline long long __div_round_up_ll(long long n, unsigned long d)
+{
+	if (n < 0) return 0;
+	return __div_round_up_ull(n,d);
+}
+
+static inline long __div_round_up_l(long n, unsigned long d)
+{ return __const_div_round_up(n,(long)d); }
+
+static inline int __div_round_up_i(int n, unsigned long d)
+{ return __const_div_round_up(n,(int)d); }
+
+#endif
diff --git a/include/uapi/linux/const.h b/include/uapi/linux/const.h
index a429381e7ca5..8efa1d77c414 100644
--- a/include/uapi/linux/const.h
+++ b/include/uapi/linux/const.h
@@ -20,6 +20,10 @@
 #define __AC(X,Y)	(X##Y)
 #define _AC(X,Y)	__AC(X,Y)
 #define _AT(T,X)	((T)(X))
+
+/* Odd historical placement */
+#include <linux/div_round_up.h>
+
 #endif
 
 #define _UL(x)		(_AC(x, UL))
@@ -31,6 +35,4 @@
 #define __ALIGN_KERNEL(x, a)		__ALIGN_KERNEL_MASK(x, (__typeof__(x))(a) - 1)
 #define __ALIGN_KERNEL_MASK(x, mask)	(((x) + (mask)) & ~(mask))
 
-#define __KERNEL_DIV_ROUND_UP(n, d) (((n) + (d) - 1) / (d))
-
 #endif /* _UAPI_LINUX_CONST_H */
-- 
2.42.0.398.ga9ecda2788


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* Re: [PATCH v4 1/3] math.h: add DIV_ROUND_UP_NO_OVERFLOW
  2023-10-24 22:53     ` Linus Torvalds
@ 2023-10-25  8:03       ` Rasmus Villemoes
  2023-10-25 17:37         ` Linus Torvalds
  2023-10-25  8:38       ` David Laight
                         ` (2 subsequent siblings)
  3 siblings, 1 reply; 20+ messages in thread
From: Rasmus Villemoes @ 2023-10-25  8:03 UTC (permalink / raw)
  To: Linus Torvalds, Sebastian Reichel
  Cc: Andy Shevchenko, Michael Turquette, Stephen Boyd, David Laight,
	linux-clk, linux-kernel, Vasily Gorbik, Niklas Schnelle, kernel

On 25/10/2023 00.53, Linus Torvalds wrote:

> I'm hoping that somebody will go "Linus, you're just being
> *completely* silly, it's much easier to do XYZ".

I don't have better suggestions, but a few notes:

Both the existing and new implementation are simply wrong for negative
n, because / doesn't do floor(), it does round-towards-0 (I do believe
that's a bug in the C standard, but not much one can do about that). So
both the old and new say that dru(-5, 5) == 0, dru(-7, 5) == 0, dru(-10,
5) == -1. They are correct for dru(-9, 5) == -1 and other "1 mod d" cases.

But the new implementation disagrees with the old for -d+1 < n < 0: The
old implementation correctly says dru(-3, 5) == 0, but the new gives 1 (!).

Preventing signed types from being used is probably too difficult. But
it would be nice if we could at least catch a negative _value_ and do
something sensible. Can we make the build fail for a negative,
compile-time constant, n? I have no idea what to do for run-time
negative values - we already don't do the math right, but starting to
return a positive number for some negative n's really doesn't seem right.

Aside: I don't think uapi/ stuff can include non-uapi/ stuff.

Aside2: do_div(n-1, d); probably doesn't compile :)

Rasmus


^ permalink raw reply	[flat|nested] 20+ messages in thread

* RE: [PATCH v4 1/3] math.h: add DIV_ROUND_UP_NO_OVERFLOW
  2023-10-24 22:53     ` Linus Torvalds
  2023-10-25  8:03       ` Rasmus Villemoes
@ 2023-10-25  8:38       ` David Laight
  2023-10-25 17:41         ` Linus Torvalds
  2023-10-25 15:05       ` Vasily Gorbik
  2023-10-25 17:36       ` Sebastian Reichel
  3 siblings, 1 reply; 20+ messages in thread
From: David Laight @ 2023-10-25  8:38 UTC (permalink / raw)
  To: 'Linus Torvalds', Sebastian Reichel
  Cc: Andy Shevchenko, Michael Turquette, Stephen Boyd, linux-clk,
	linux-kernel, Vasily Gorbik, Niklas Schnelle, kernel

From: Linus Torvalds
> Sent: 24 October 2023 23:53
> 
> On Tue, 24 Oct 2023 at 09:32, Linus Torvalds
> <torvalds@linux-foundation.org> wrote:
> >
> > I would really prefer to just make our regular DIV_ROUND_UP() DTRT.  But:
> >
> >  - people do use it with complex first arguments (ie function calls
> > etc) that we don't want to evaluate twice
> >
> >  - we can't make it an inline function, because the types aren't fixed
> >
> >  - we can't even use a statement expression and __auto_type, because
> > these things are used in type definitions etc and need to be constant
> > expressions

Doesn't min() get around that by using is_constexpr() and
__builtin_choose_exptr() - the same could be done here.

> 
> Ok. I have a potential beginning of a solution.
> 
> It is unbelievably disgustingly complicated. But it might approach
> being correct.
> 
> And by that "it might approach being correct" I obviously mean "this
> is untested, but builds at least some kernel code".
> 
> I'm almost certain it will fail on more complex cases, because I
> already found a lot of questionable stuff that was simply hidden by
> the old macro just silently doing the C arithmetic type conversions,
> and this thing does type handling manually.
> 
> I'm hoping that somebody will go "Linus, you're just being
> *completely* silly, it's much easier to do XYZ".

> Doing a non-overflowing DIV_ROUND_UP() that is usable in all contexts is
> actually very nasty.
>
> This is a trial balloon..  The signed cases need more thought.  The best
> option would be to disallow them (by not listing them in the _Generic()
> rules). But they currently happen, often for bad reasons, ie wireless has
>
> 	DIV_ROUND_UP(interval, MSEC_PER_SEC);
>
> and while 'interval' is a proper u32, MSEC_PER_SEC is defined to be
> '1000L', so the resulting C arithmetic is done in signed 'long'.

Maybe use some of the 'stuff' from min() and convert compile-time
constant 'd' to signed int to avoid promotions.

Indeed the whole thing really only makes sense for (d > 0 && n >= 0)
so forcing an unsigned divide wouldn't be a bad thing at all.
It will also generate better code when 'd' is a power of 2.

Ignoring the n==0 case I think this always generates an unsigned
divide, never does sign extension and does a 32bit divide
for 32bit arguments.

#define CVT_ULL(x) ((x) + 0u + 0ul + 0ull)
#define DIV_ROUND_UP(n, d) ((CVT_ULL(n) + CVT_ULL(d) - 1) / CVT_ULL(d) + 1)

It should be possible to error if 'd' is a signed variable or
a non-positive constant.
I'd guess most 'd' are constants.

Erroring signed 'n' is possible but might be annoying.

	David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v4 1/3] math.h: add DIV_ROUND_UP_NO_OVERFLOW
  2023-10-24 22:53     ` Linus Torvalds
  2023-10-25  8:03       ` Rasmus Villemoes
  2023-10-25  8:38       ` David Laight
@ 2023-10-25 15:05       ` Vasily Gorbik
  2023-10-25 17:43         ` Linus Torvalds
  2023-10-25 17:36       ` Sebastian Reichel
  3 siblings, 1 reply; 20+ messages in thread
From: Vasily Gorbik @ 2023-10-25 15:05 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Sebastian Reichel, Andy Shevchenko, Michael Turquette,
	Stephen Boyd, David Laight, linux-clk, linux-kernel,
	Niklas Schnelle, kernel

On Tue, Oct 24, 2023 at 12:53:03PM -1000, Linus Torvalds wrote:
>+/*
>+ * The other cases need to be split by type.
>+ *
>+ * Signed cases seem badly defined, but do exist. We should
>+ * consider removing them from this _Generic(), and fixing any
>+ * result 'not compatible with any association' cases.
>+ */
>+#define __div_round_up(n,d) _Generic((n)+(d),		\
>+	unsigned long long: __div_round_up_ull(n,d),	\
>+	long long: __div_round_up_ll(n,d),		\
>+	unsigned long: __div_round_up_ul(n,d),		\
>+	long: __div_round_up_l(n,d),			\
>+	unsigned int: __div_round_up_u(n,d),		\
>+	int: __div_round_up_i(n,d))

You probably want

 #define __div_round_up(n,d) _Generic((n)+(d),		\
	unsigned long long: __div_round_up_ull,		\
	long long: __div_round_up_ll,			\
	unsigned long: __div_round_up_ul,		\
	long: __div_round_up_l,				\
	unsigned int: __div_round_up_u,			\
	int: __div_round_up_i)(n,d)

to avoid early type-checking for expressions that will be discarded
and prevent errors like:

drivers/net/ethernet/microchip/sparx5/sparx5_tc_flower.c: In function 'sparx5_tc_flower_parse_act_police':
drivers/net/ethernet/microchip/sparx5/sparx5_main.h:435:34: error: conversion from 'long long unsigned int' to 'long unsigned int' changes value from '25000000000' to '3525163520' [-Werror=overflow]
  435 | #define SPX5_SDLB_GROUP_RATE_MAX 25000000000ULL
      |                                  ^~~~~~~~~~~~~~
./include/linux/div_round_up.h:68:42: note: in definition of macro '__div_round_up'
   68 |         unsigned long: __div_round_up_ul(n,d),          \
      |                                          ^
./include/linux/div_round_up.h:50:9: note: in expansion of macro '__keep_const'
   50 |         __keep_const(n, div_round_up(n,d))
      |         ^~~~~~~~~~~~
./include/linux/math.h:37:22: note: in expansion of macro '__KERNEL_DIV_ROUND_UP'
   37 | #define DIV_ROUND_UP __KERNEL_DIV_ROUND_UP
      |                      ^~~~~~~~~~~~~~~~~~~~~
drivers/net/ethernet/microchip/sparx5/sparx5_tc_flower.c:732:38: note: in expansion of macro 'SPX5_SDLB_GROUP_RATE_MAX'
  732 |         if (pol->rate > DIV_ROUND_UP(SPX5_SDLB_GROUP_RATE_MAX, 1000)) {
      |                                      ^~~~~~~~~~~~~~~~~~~~~~~~
drivers/net/ethernet/microchip/sparx5/sparx5_main.h:435:34: error: overflow in conversion from 'long long unsigned int' to 'long int' changes value from '25000000000' to '-769803776' [-Werror=overflow]
...
drivers/net/ethernet/microchip/sparx5/sparx5_main.h:435:34: error: conversion from 'long long unsigned int' to 'unsigned int' changes value from '25000000000' to '3525163520' [-Werror=overflow]
...
drivers/net/ethernet/microchip/sparx5/sparx5_main.h:435:34: error: overflow in conversion from 'long long unsigned int' to 'int' changes value from '25000000000' to '-769803776' [-Werror=overflow]

Plus typos fixes below passes allyesconfig for s390, 32-bit x86 and arm.

 static inline unsigned long long __div_round_up_ull(unsigned long long n, unsigned long d)
 {
 #ifdef CONFIG_32BIT
 	if (!n)
-		return 0
-	do_div(n-1, d);
-	return n+1;
+		return 0;
+	n--;
+	do_div(n, d);
+	return n + 1;
 #else
 	return __const_div_round_up(n,d);
 #endif

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v4 1/3] math.h: add DIV_ROUND_UP_NO_OVERFLOW
  2023-10-24 19:32   ` Linus Torvalds
  2023-10-24 22:53     ` Linus Torvalds
@ 2023-10-25 17:28     ` Sebastian Reichel
  1 sibling, 0 replies; 20+ messages in thread
From: Sebastian Reichel @ 2023-10-25 17:28 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Andy Shevchenko, Michael Turquette, Stephen Boyd, David Laight,
	linux-clk, linux-kernel, Vasily Gorbik, Niklas Schnelle, kernel

[-- Attachment #1: Type: text/plain, Size: 2592 bytes --]

Hi,

On Tue, Oct 24, 2023 at 09:32:27AM -1000, Linus Torvalds wrote:
> [ Since you're cc'ing the s390 people, I assume that means that this
> all ended up being a follow-up to the s390 issue discussion ]

Actually I've noticed the issue in the clock framework some time
ago and Andy Shevchenko pointed me towards the s390 thread, which
is more or less about the same fundamental problem. So I added
everyone in Cc, to make sure this is solved properly and not just
s390 specific.

> On Tue, 24 Oct 2023 at 06:19, Sebastian Reichel
> <sebastian.reichel@collabora.com> wrote:
> >
> > Add a new DIV_ROUND_UP helper, which cannot overflow when
> > big numbers are being used.
> 
> This is really horrendously bad on some architectures (ie it might
> require *two* divisions):

I came up with this helper for the issue in the clock framework,
which currently use DIV_ROUND_UP_ULL() as a workaround. That solves
the issue on 32 bit systems (input arguments are of type 'unsigned
long'). But it also means doing u64 / u32 and I expect that do be
much more more expensive on 32 bit platforms than two divisons.

On 64 bit platforms the workaround is wrong, since the second
argument is reduced from 64 bit to 32 bit. It effectively breaks
platforms trying to request a frequency above 4.3 GHz from a clock
divider. Since that's not yet a common thing I guess this has not
yet been found and fixed by somebody else. I also just found it as
bycatch.

> > +#define DIV_ROUND_UP_NO_OVERFLOW(n, d) (((n) / (d)) + !!((n) % (d)))
> 
> but David Laight at least had a suggestion for that: when allowing
> multiple uses, you can just do
> 
>    #define DIV_ROUND_UP(n,d) ((n) ? ((n)-1)/(d)+1 : 0)
> 
> so now you're back to just one division and no horrible extra expense.
> 
> But we can't allow those multiple uses in general, sadly.
> 
> I would really prefer to just make our regular DIV_ROUND_UP() DTRT.  But:
> 
>  - people do use it with complex first arguments (ie function calls
> etc) that we don't want to evaluate twice
> 
>  - we can't make it an inline function, because the types aren't fixed
> 
>  - we can't even use a statement expression and __auto_type, because
> these things are used in type definitions etc and need to be constant
> expressions
> 
> That last thing means that even the "__builtin_choose_expr()" doesn't
> work, because the statement expression still needs to be _parsed_ as
> such, and that's not something we can do in those contexts.
> 
> Very annoying. Let me think about this.

Thanks.

-- Sebastian

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v4 1/3] math.h: add DIV_ROUND_UP_NO_OVERFLOW
  2023-10-24 22:53     ` Linus Torvalds
                         ` (2 preceding siblings ...)
  2023-10-25 15:05       ` Vasily Gorbik
@ 2023-10-25 17:36       ` Sebastian Reichel
  3 siblings, 0 replies; 20+ messages in thread
From: Sebastian Reichel @ 2023-10-25 17:36 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Andy Shevchenko, Michael Turquette, Stephen Boyd, David Laight,
	linux-clk, linux-kernel, Vasily Gorbik, Niklas Schnelle, kernel

[-- Attachment #1: Type: text/plain, Size: 1414 bytes --]

Hi,

On Tue, Oct 24, 2023 at 12:53:03PM -1000, Linus Torvalds wrote:
> On Tue, 24 Oct 2023 at 09:32, Linus Torvalds
> <torvalds@linux-foundation.org> wrote:
> >
> > I would really prefer to just make our regular DIV_ROUND_UP() DTRT.  But:
> >
> >  - people do use it with complex first arguments (ie function calls
> > etc) that we don't want to evaluate twice
> >
> >  - we can't make it an inline function, because the types aren't fixed
> >
> >  - we can't even use a statement expression and __auto_type, because
> > these things are used in type definitions etc and need to be constant
> > expressions
> 
> Ok. I have a potential beginning of a solution.
> 
> It is unbelievably disgustingly complicated. But it might approach
> being correct.
> 
> And by that "it might approach being correct" I obviously mean "this
> is untested, but builds at least some kernel code".
> 
> I'm almost certain it will fail on more complex cases, because I
> already found a lot of questionable stuff that was simply hidden by
> the old macro just silently doing the C arithmetic type conversions,
> and this thing does type handling manually.
> 
> I'm hoping that somebody will go "Linus, you're just being
> *completely* silly, it's much easier to do XYZ".

I think your new __div_round_up_ull() should also be used by
DIV_ROUND_UP_ULL, which is defined in linux/math.h.

-- Sebastian

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v4 1/3] math.h: add DIV_ROUND_UP_NO_OVERFLOW
  2023-10-25  8:03       ` Rasmus Villemoes
@ 2023-10-25 17:37         ` Linus Torvalds
  0 siblings, 0 replies; 20+ messages in thread
From: Linus Torvalds @ 2023-10-25 17:37 UTC (permalink / raw)
  To: Rasmus Villemoes
  Cc: Sebastian Reichel, Andy Shevchenko, Michael Turquette,
	Stephen Boyd, David Laight, linux-clk, linux-kernel,
	Vasily Gorbik, Niklas Schnelle, kernel

On Tue, 24 Oct 2023 at 22:03, Rasmus Villemoes <linux@rasmusvillemoes.dk> wrote:
>
> Both the existing and new implementation are simply wrong for negative
> n, because / doesn't do floor(), it does round-towards-0 (I do believe
> that's a bug in the C standard, but not much one can do about that).

Honestly, while I found several cases where the arithmetic was done
with a signed type, all the ones I looked at were purely about the
_type_, not the values.

Any actual negative values would have been too random before to make a
difference. In that patch, I tried to keep the signed type around
mainly to at least get *similar* values to the old code, but honestly,
it was all broken before already if the values were actually signed.

If it were to change the result and somebody would find a bug due to
that, that would be a good thing, in other words. The negative value
would need to be fixed anyway - as you say, integer division of signed
values is simply not rgeat in the first place.

Side note: if you dislike the "round towards zero" behavior, I can
tell you that it is better than the historical alternative, which was
"implementation defined".

IIRC, the rule used to be that signed division could either round down
or towards zero, and the only rule was that (a) it had to be
documented (ie "implementation defined" rather than "undefined") and
(b) the modulus had to follow the same rules, ie regardless of
rounding, you'd have

  (a/b)*b + a%b = a

That said:

> Preventing signed types from being used is probably too difficult.

It might not be. My horror with _Generic() is certainly not pretty,
but it *does* have the advantage that we could error out for signed
cases.

I did that initially, and it didn't look horrific, particularly if you
just accept signed positive constants as unsigned values (which
requires another bit of macro magic but is not conceptually hard).

The patch I sent out was not it, though - it was meant really as a
minimal "something like this" that still compiled my defconfig.

> Aside: I don't think uapi/ stuff can include non-uapi/ stuff.

Yeah, no, I left it in that uapi header just for the same reason I
didn't bother trying to fix any signed type cases - actively avoiding
changes *outside* the DIV_ROUND_UP() case itself.

In reality, I'd just make any users of __KERNEL_DIV_ROUND_UP include
<linux/div_round_up.h> and get rid of the crazy UAPI thing entirely.

So ignore that part.

> Aside2: do_div(n-1, d); probably doesn't compile :)

.. I did say it was untested and "might approach being correct".

That 32-bit ULL case should also have some attempt at checking that
the divisor really fits in 32 bits. Because I still refuse to believe
that the full 64-by-64 divisions are sane on a 32-bit machine, and
anybody who does that needs to go to extra lengths for them. No
"silently do them without telling the programmer they are doing
something horrendously bad".

                Linus

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v4 1/3] math.h: add DIV_ROUND_UP_NO_OVERFLOW
  2023-10-25  8:38       ` David Laight
@ 2023-10-25 17:41         ` Linus Torvalds
  2023-10-26  8:41           ` David Laight
  0 siblings, 1 reply; 20+ messages in thread
From: Linus Torvalds @ 2023-10-25 17:41 UTC (permalink / raw)
  To: David Laight
  Cc: Sebastian Reichel, Andy Shevchenko, Michael Turquette,
	Stephen Boyd, linux-clk, linux-kernel, Vasily Gorbik,
	Niklas Schnelle, kernel

On Tue, 24 Oct 2023 at 22:38, David Laight <David.Laight@aculab.com> wrote:
>
> From: Linus Torvalds
> > >  - we can't even use a statement expression and __auto_type, because
> > > these things are used in type definitions etc and need to be constant
> > > expressions
>
> Doesn't min() get around that by using is_constexpr() and
> __builtin_choose_exptr() - the same could be done here.

Nope. I wanted to do it that way - it would have made things much
simpler and avoid the whole _Generic() thing, but try it - you cannot
use statement expressions in a non-function context even with
__builtin_choose_expr().

And no, min/max have never been usable in that context

                 Linus

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v4 1/3] math.h: add DIV_ROUND_UP_NO_OVERFLOW
  2023-10-25 15:05       ` Vasily Gorbik
@ 2023-10-25 17:43         ` Linus Torvalds
  2023-10-26  8:57           ` David Laight
  0 siblings, 1 reply; 20+ messages in thread
From: Linus Torvalds @ 2023-10-25 17:43 UTC (permalink / raw)
  To: Vasily Gorbik
  Cc: Sebastian Reichel, Andy Shevchenko, Michael Turquette,
	Stephen Boyd, David Laight, linux-clk, linux-kernel,
	Niklas Schnelle, kernel

On Wed, 25 Oct 2023 at 05:05, Vasily Gorbik <gor@linux.ibm.com> wrote:
>
> You probably want
>
>  #define __div_round_up(n,d) _Generic((n)+(d),          \
>         unsigned long long: __div_round_up_ull,         \
>         long long: __div_round_up_ll,                   \
>         unsigned long: __div_round_up_ul,               \
>         long: __div_round_up_l,                         \
>         unsigned int: __div_round_up_u,                 \
>         int: __div_round_up_i)(n,d)
>
> to avoid early type-checking for expressions that will be discarded
> and prevent errors like:

Ack. I noticed that later when I tried to do a bigger config build -
the compiler would warn about the implicit truncation of the integer
arguments (for the cases where they weren't used).

> Plus typos fixes below passes allyesconfig for s390, 32-bit x86 and arm.

Lovely.

It would have been even better if somebody told me that I was stupid
and there was some nice trick to it, but at least the _Generic()
approach doesn't seem broken - just a few tweaks needed.

               Linus

^ permalink raw reply	[flat|nested] 20+ messages in thread

* RE: [PATCH v4 1/3] math.h: add DIV_ROUND_UP_NO_OVERFLOW
  2023-10-25 17:41         ` Linus Torvalds
@ 2023-10-26  8:41           ` David Laight
  0 siblings, 0 replies; 20+ messages in thread
From: David Laight @ 2023-10-26  8:41 UTC (permalink / raw)
  To: 'Linus Torvalds'
  Cc: Sebastian Reichel, Andy Shevchenko, Michael Turquette,
	Stephen Boyd, linux-clk, linux-kernel, Vasily Gorbik,
	Niklas Schnelle, kernel

From: Linus Torvalds 
> Sent: 25 October 2023 18:41
> 
> On Tue, 24 Oct 2023 at 22:38, David Laight <David.Laight@aculab.com> wrote:
> >
> > From: Linus Torvalds
> > > >  - we can't even use a statement expression and __auto_type, because
> > > > these things are used in type definitions etc and need to be constant
> > > > expressions
> >
> > Doesn't min() get around that by using is_constexpr() and
> > __builtin_choose_exptr() - the same could be done here.
> 
> Nope. I wanted to do it that way - it would have made things much
> simpler and avoid the whole _Generic() thing, but try it - you cannot
> use statement expressions in a non-function context even with
> __builtin_choose_expr().

_Generic() has exactly the same issues as __builtin_choose_expr().
All the code has to be valid for all types.
Makes testing for negative constants a PITA.
Something like this worked:
	(__buitin_choose_expr(__is_constexpr((x) && is_signed_type(__typeof(x)), x, 1) < 0)
Although our is_signed_type() isn't constant (enough) for pointer types.
So you need another check for (typeof(x))1 being constant.

> And no, min/max have never been usable in that context

I've clearly not tested it and assumed the comment implied
that was ok.

It'd probably work in C++ - which is perfectly willing to
add functions to initialise static data - even when you'd
rather have a compile time error.

	David

> 
>                  Linus

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)

^ permalink raw reply	[flat|nested] 20+ messages in thread

* RE: [PATCH v4 1/3] math.h: add DIV_ROUND_UP_NO_OVERFLOW
  2023-10-25 17:43         ` Linus Torvalds
@ 2023-10-26  8:57           ` David Laight
  2023-10-26 16:54             ` Linus Torvalds
  0 siblings, 1 reply; 20+ messages in thread
From: David Laight @ 2023-10-26  8:57 UTC (permalink / raw)
  To: 'Linus Torvalds', Vasily Gorbik
  Cc: Sebastian Reichel, Andy Shevchenko, Michael Turquette,
	Stephen Boyd, linux-clk, linux-kernel, Niklas Schnelle, kernel

From: Linus Torvalds
> Sent: 25 October 2023 18:44
> 
> On Wed, 25 Oct 2023 at 05:05, Vasily Gorbik <gor@linux.ibm.com> wrote:
> >
> > You probably want
> >
> >  #define __div_round_up(n,d) _Generic((n)+(d),          \
> >         unsigned long long: __div_round_up_ull,         \
> >         long long: __div_round_up_ll,                   \
> >         unsigned long: __div_round_up_ul,               \
> >         long: __div_round_up_l,                         \
> >         unsigned int: __div_round_up_u,                 \
> >         int: __div_round_up_i)(n,d)
> >
> > to avoid early type-checking for expressions that will be discarded
> > and prevent errors like:
> 
> Ack. I noticed that later when I tried to do a bigger config build -
> the compiler would warn about the implicit truncation of the integer
> arguments (for the cases where they weren't used).
> 
> > Plus typos fixes below passes allyesconfig for s390, 32-bit x86 and arm.
> 
> Lovely.
> 
> It would have been even better if somebody told me that I was stupid
> and there was some nice trick to it, but at least the _Generic()
> approach doesn't seem broken - just a few tweaks needed.

Doesn't that version end up calling inline functions?
So won't be usable in static initialisers - the same as statement functions.

Will this trick work:
#define ZERO_UNLESS_TYPE(x, type) _Generic(x, type: x, default 0)
#define div_ru(type, fn, n, d) \
	type: fn(ZERO_UNLESS_TYPE(type, (n) + (d), n), ZERO_UNLESS_TYPE(type, (n) + (d), d)
then:
#define __div_round_up(n, d) _Generic((n)+(d),          \
	div_ru(unsigned long long, __div_round_up_ull, n, d),
	...

Which would allow the 'functions' be #defines.

But it still has the issues of doing 'big' divisions.
Although the compiler will use 32bit (and 64 by 32 bit) divisions
for 64 bit types if it knows the values are small.

clang seems to add conditionals to avoid 64x64 divides.
Probably worth it for intel x86 cpu, but not for amd ones.

	David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v4 1/3] math.h: add DIV_ROUND_UP_NO_OVERFLOW
  2023-10-26  8:57           ` David Laight
@ 2023-10-26 16:54             ` Linus Torvalds
  2023-10-27  7:24               ` David Laight
  0 siblings, 1 reply; 20+ messages in thread
From: Linus Torvalds @ 2023-10-26 16:54 UTC (permalink / raw)
  To: David Laight
  Cc: Vasily Gorbik, Sebastian Reichel, Andy Shevchenko,
	Michael Turquette, Stephen Boyd, linux-clk, linux-kernel,
	Niklas Schnelle, kernel

On Wed, 25 Oct 2023 at 22:57, David Laight <David.Laight@aculab.com> wrote:
>
> Doesn't that version end up calling inline functions?
> So won't be usable in static initialisers - the same as statement functions.

David, please either read what I write, or actually test things out,
instead of just adding noise.

It is not a problem to have a _Generic() - or a
__builtin_chooise_expr() - that has an inline function that doesn't
get used. The compiler will remove it.

The problem is literally that you can't use "__auto_type", because
*STATEMENT EXPRESSIONS* do not work outside of function definitions,
because gcc won't even *parse* them, and just says

  error: braced-group within expression allowed only inside a function

exactly like I explained in the comment of the patch that does the _Generic().

Which is why my solution ended up using those inline functions, and
splitting manually by type, instead of an expression statement and
__auto_type.

Whether you then split into different expressions with _Generic() or
by using __builtin_choose_expr() is irrelevant, although once you do
"split by type", _Generic() ends up being much more convenient because
that's what it's designed for (and you don't end up trying to figure
out the size with sizeof() and typeof tricks).

               Linus

^ permalink raw reply	[flat|nested] 20+ messages in thread

* RE: [PATCH v4 1/3] math.h: add DIV_ROUND_UP_NO_OVERFLOW
  2023-10-26 16:54             ` Linus Torvalds
@ 2023-10-27  7:24               ` David Laight
  0 siblings, 0 replies; 20+ messages in thread
From: David Laight @ 2023-10-27  7:24 UTC (permalink / raw)
  To: 'Linus Torvalds'
  Cc: Vasily Gorbik, Sebastian Reichel, Andy Shevchenko,
	Michael Turquette, Stephen Boyd, linux-clk, linux-kernel,
	Niklas Schnelle, kernel

From: Linus Torvalds
> Sent: 26 October 2023 17:54
> Subject: Re: [PATCH v4 1/3] math.h: add DIV_ROUND_UP_NO_OVERFLOW
> 
> On Wed, 25 Oct 2023 at 22:57, David Laight <David.Laight@aculab.com> wrote:
> >
> > Doesn't that version end up calling inline functions?
> > So won't be usable in static initialisers - the same as statement functions.
> 
> David, please either read what I write, or actually test things out,
> instead of just adding noise.

I do test a lot of stuff.
For some reason I'd assumed that you would be able to
have a call to a function inside a static initialiser.

Clearly not enough coffee :-(

	David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)

^ permalink raw reply	[flat|nested] 20+ messages in thread

end of thread, other threads:[~2023-10-27  7:24 UTC | newest]

Thread overview: 20+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-10-24 16:18 [PATCH v4 0/3] Fix clock division overflow problem Sebastian Reichel
2023-10-24 16:18 ` [PATCH v4 1/3] math.h: add DIV_ROUND_UP_NO_OVERFLOW Sebastian Reichel
2023-10-24 16:26   ` David Laight
2023-10-24 19:32   ` Linus Torvalds
2023-10-24 22:53     ` Linus Torvalds
2023-10-25  8:03       ` Rasmus Villemoes
2023-10-25 17:37         ` Linus Torvalds
2023-10-25  8:38       ` David Laight
2023-10-25 17:41         ` Linus Torvalds
2023-10-26  8:41           ` David Laight
2023-10-25 15:05       ` Vasily Gorbik
2023-10-25 17:43         ` Linus Torvalds
2023-10-26  8:57           ` David Laight
2023-10-26 16:54             ` Linus Torvalds
2023-10-27  7:24               ` David Laight
2023-10-25 17:36       ` Sebastian Reichel
2023-10-25 17:28     ` Sebastian Reichel
2023-10-24 16:18 ` [PATCH v4 2/3] clk: divider: Fix divisor masking on 64 bit platforms Sebastian Reichel
2023-10-24 16:18 ` [PATCH v4 3/3] clk: composite: replace open-coded abs_diff() Sebastian Reichel
2023-10-24 16:23   ` Andy Shevchenko

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.