Linux-arch Archive on lore.kernel.org
* [PATCH v11 0/4] Introduce the for_each_set_clump macro
@ 2020-10-06  9:20 Syed Nayyar Waris
  2020-10-06  9:22 ` [PATCH v11 1/4] bitops: " Syed Nayyar Waris
  2020-10-07  8:38 ` [PATCH v11 0/4] " Linus Walleij
  0 siblings, 2 replies; 7+ messages in thread
From: Syed Nayyar Waris @ 2020-10-06  9:20 UTC (permalink / raw)
  To: linus.walleij, akpm
  Cc: andriy.shevchenko, vilhelm.gray, michal.simek, arnd, rrichter,
	linus.walleij, bgolaszewski, yamada.masahiro, rui.zhang,
	daniel.lezcano, amit.kucheria, linux-arch, linux-gpio,
	linux-kernel, linux-arm-kernel, linux-pm

Hello Linus,

Since this patchset primarily affects GPIO drivers, would you like
to pick it up through your GPIO tree?

This patchset introduces a new generic version of for_each_set_clump.
The previous version, for_each_set_clump8, used a fixed 8-bit clump
size; the new generic version works with clumps of any size up to and
including BITS_PER_LONG. The patchset uses the new macro in several
GPIO drivers.

The earlier for_each_set_clump8 provided a for-loop syntax that iterates
over a memory region one 8-bit group at a time, visiting only the groups
that contain set bits.

For example, suppose you would like to iterate over a 32-bit integer 8
bits at a time, skipping over 8-bit groups with no set bit, where
XXXXXXXX represents the current 8-bit group:

    Example:        10111110 00000000 11111111 00110011
    First loop:     10111110 00000000 11111111 XXXXXXXX
    Second loop:    10111110 00000000 XXXXXXXX 00110011
    Third loop:     XXXXXXXX 00000000 11111111 00110011

Each iteration of the loop returns the next 8-bit group that has at
least one set bit.
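
The 8-bit walk above can be approximated in user space as follows. This
is only a sketch under stated assumptions: next_set_clump8() and
collect_clumps8() are hypothetical helpers that operate on a single
uint32_t, not the kernel's find_next_clump8(), which works on
unsigned long bitmaps.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: return the bit offset of the first 8-bit group at
 * or after 'offset' (a multiple of 8) that has a set bit, and store that
 * group in '*clump'. Returns 32 when no such group remains.
 */
static unsigned int next_set_clump8(uint32_t word, unsigned int offset,
				    uint8_t *clump)
{
	assert(offset % 8 == 0);

	for (; offset < 32; offset += 8) {
		uint8_t group = (uint8_t)(word >> offset);

		if (group) {
			*clump = group;
			return offset;
		}
	}
	return 32;
}

/* Collect every non-zero 8-bit group of 'word'; returns how many exist. */
static unsigned int collect_clumps8(uint32_t word, unsigned int offsets[4],
				    uint8_t clumps[4])
{
	unsigned int n = 0, start;
	uint8_t clump = 0;

	for (start = next_set_clump8(word, 0, &clump); start < 32;
	     start = next_set_clump8(word, start + 8, &clump)) {
		offsets[n] = start;
		clumps[n] = clump;
		n++;
	}
	return n;
}
```

For the example value 0xBE00FF33 (10111110 00000000 11111111 00110011),
the all-zero group at offset 16 is never reported.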

But with the new for_each_set_clump, the clump size can differ from 8 bits.
Moreover, a clump can be split across a word boundary when the word size
is not a multiple of the clump size. The following examples show how the
new macro works for clump sizes of 24 bits and 6 bits.

Example 1:
clump size: 24 bits, Number of clumps (or ports): 10
The bitmap below stores the bits from which successive clumps are retrieved.

     /* bitmap memory region */
        0x00aa0000ff000000;  /* Most significant bits */
        0xaaaaaa0000ff0000;
        0x000000aa000000aa;
        0xbbbbabcdeffedcba;  /* Least significant bits */

Different iterations of for_each_set_clump:
'offset' is the bit position and 'clump' is the 24-bit clump from the
above bitmap.
First iteration:        offset: 0 clump: 0xfedcba
Second iteration:       offset: 24 clump: 0xabcdef
Third iteration:        offset: 48 clump: 0xaabbbb
Fourth iteration:       offset: 96 clump: 0xaa
Fifth iteration:        offset: 144 clump: 0xff
Sixth iteration:        offset: 168 clump: 0xaaaaaa
Seventh iteration:      offset: 216 clump: 0xff
The loop ends because the bits that remain at the end (0x00aa, 16 bits)
are fewer than the clump size of 24 bits.

In the above example, the 24-bit clump retrieved in the third iteration
was split between bitmap[0] and bitmap[1]. The example also shows that
clumps of 24 zero bits in between were skipped (preserving the previous
for_each_set_clump8 behaviour).

Example 2:
clump size = 6 bits, Number of clumps (or ports) = 3.

     /* bitmap memory region */
        0x00aa0000ff000000;  /* Most significant bits */
        0xaaaaaa0000ff0000;
        0x0f00000000000000;
        0x0000000000000ac0;  /* Least significant bits */

Different iterations of for_each_set_clump:
'offset' is the bit position and 'clump' is the 6 bit clump from the
above bitmap.
First iteration:        offset: 6 clump: 0x2b
The loop ends because 6 * 3 = 18 bits of the bitmap have been traversed,
where 6 * 3 is clump size * number of clumps.
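
Both examples can be reproduced with a small user-space sketch. It makes
two assumptions that are not taken from the patch: words are fixed at 64
bits (uint64_t, matching the bitmaps shown above), and clumps are scanned
linearly rather than located with find_next_bit(). The names get_value()
and next_set_clump() are hypothetical stand-ins, not the kernel functions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define WORD_BITS 64u

/*
 * Read an nbits-wide value (1 <= nbits <= 64) starting at bit 'start' of a
 * bitmap stored as 64-bit words, least significant word first. The value
 * may straddle two adjacent words.
 */
static uint64_t get_value(const uint64_t *map, size_t start, unsigned int nbits)
{
	const size_t index = start / WORD_BITS;
	const unsigned int offset = start % WORD_BITS;
	const unsigned int space = WORD_BITS - offset; /* bits left in word */
	uint64_t value, mask;

	assert(nbits >= 1 && nbits <= WORD_BITS);

	/* Never shift by the full width when nbits == 64. */
	mask = nbits == WORD_BITS ? ~UINT64_C(0) : (UINT64_C(1) << nbits) - 1;

	value = map[index] >> offset;
	if (space < nbits)	/* the rest lives in the next higher word */
		value |= map[index + 1] << space;

	return value & mask;
}

/*
 * Return the offset of the first clump at or after 'offset' that has a set
 * bit, or 'size' if there is none. 'offset' is expected to be a multiple
 * of 'clump_size'; a trailing partial clump is not examined.
 */
static size_t next_set_clump(uint64_t *clump, const uint64_t *map, size_t size,
			     size_t offset, unsigned int clump_size)
{
	for (; offset + clump_size <= size; offset += clump_size) {
		uint64_t value = get_value(map, offset, clump_size);

		if (value) {
			*clump = value;
			return offset;
		}
	}
	return size;
}
```

With the Example 1 bitmap, size 240 and clump size 24, the third call
returns offset 48 and clump 0xaabbbb, assembled from the top 16 bits of
the first word and the low 8 bits of the second.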

Changes in v11:
 - [Patch 1/4]: Document range of values 'nbits' can take.
 - [Patch 4/4]: Change variable name 'flag' to 'flags'.

Changes in v10:
 - Patchset based on v5.9-rc1.

Changes in v9:
 - [Patch 4/4]: Remove the 'for_each_set_clump' loop and instead process the
   two halves of a 64-bit bitmap separately. Use a plain spin_lock call for
   the second, inner lock. Move the spin_lock_init call outside the 'if'
   condition in the driver's probe function.

Changes in v8:
 - [Patch 2/4]: Minor change: Use '__initdata' to fix a section mismatch
   in the 'clump_test_data' array.

Changes in v7:
 - [Patch 2/4]: Minor changes: Use macro 'DECLARE_BITMAP()' and split 'struct'
   definition and test data.

Changes in v6:
 - [Patch 2/4]: Make 'for loop' inside test_for_each_set_clump more
   succinct.

Changes in v5:
 - [Patch 4/4]: Minor change: Hardcode value for better code readability.

Changes in v4:
 - [Patch 2/4]: Use 'for' loop in test function of for_each_set_clump.
 - [Patch 3/4]: Minor change: Inline value for better code readability.
 - [Patch 4/4]: Minor change: Inline value for better code readability.

Changes in v3:
 - [Patch 3/4]: Change datatype of some variables from u64 to unsigned long
   in function thunderx_gpio_set_multiple.

Changes in v2:
 - [Patch 2/4]: Unify different tests for 'for_each_set_clump'. Pass test data as
   function parameters.
 - [Patch 2/4]: Remove unnecessary bitmap_zero calls.

Syed Nayyar Waris (4):
  bitops: Introduce the for_each_set_clump macro
  lib/test_bitmap.c: Add for_each_set_clump test cases
  gpio: thunderx: Utilize for_each_set_clump macro
  gpio: xilinx: Utilize generic bitmap_get_value and _set_value

 drivers/gpio/gpio-thunderx.c      |  11 ++-
 drivers/gpio/gpio-xilinx.c        |  64 ++++++-------
 include/asm-generic/bitops/find.h |  19 ++++
 include/linux/bitmap.h            |  63 +++++++++++++
 include/linux/bitops.h            |  13 +++
 lib/find_bit.c                    |  14 +++
 lib/test_bitmap.c                 | 144 ++++++++++++++++++++++++++++++
 7 files changed, 292 insertions(+), 36 deletions(-)


base-commit: 9123e3a74ec7b934a4a099e98af6a61c2f80bbf5
-- 
2.26.2



* [PATCH v11 1/4] bitops: Introduce the for_each_set_clump macro
  2020-10-06  9:20 [PATCH v11 0/4] Introduce the for_each_set_clump macro Syed Nayyar Waris
@ 2020-10-06  9:22 ` Syed Nayyar Waris
  2020-10-06 11:27   ` Andy Shevchenko
  2020-10-07  8:38 ` [PATCH v11 0/4] " Linus Walleij
  1 sibling, 1 reply; 7+ messages in thread
From: Syed Nayyar Waris @ 2020-10-06  9:22 UTC (permalink / raw)
  To: linus.walleij, akpm
  Cc: andriy.shevchenko, vilhelm.gray, arnd, linux-arch, linux-kernel

This macro iterates over each group of bits (clump) that has set bits
within a bitmap memory region. For each iteration, "start" is set to
the bit offset of the found clump, while the respective clump value is
stored to the location pointed to by "clump". Additionally, the
bitmap_get_value() and bitmap_set_value() functions are introduced to
respectively get and set an n-bit value in a bitmap memory region.
The value can have any size from 1 to BITS_PER_LONG bits.
When setting an n-bit value whose width would cross a word boundary,
the value is split: the lower portion is stored in the current word,
while the remaining portion is stored in the next higher word. The
same split applies when retrieving a value from the bitmap.

Cc: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Syed Nayyar Waris <syednwaris@gmail.com>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: William Breathitt Gray <vilhelm.gray@gmail.com>
---
Changes in v11:
 - Document valid range of values that 'nbits' can take.

Changes in v10:
 - No change.

Changes in v9:
 - No change.

Changes in v8:
 - No change.

Changes in v7:
 - No change.

Changes in v6:
 - No change.

Changes in v5:
 - No change.

Changes in v4:
 - No change.

Changes in v3:
 - No change.

Changes in v2:
 - No change.

 include/asm-generic/bitops/find.h | 19 ++++++++++
 include/linux/bitmap.h            | 63 +++++++++++++++++++++++++++++++
 include/linux/bitops.h            | 13 +++++++
 lib/find_bit.c                    | 14 +++++++
 4 files changed, 109 insertions(+)

diff --git a/include/asm-generic/bitops/find.h b/include/asm-generic/bitops/find.h
index 9fdf21302fdf..4e6600759455 100644
--- a/include/asm-generic/bitops/find.h
+++ b/include/asm-generic/bitops/find.h
@@ -97,4 +97,23 @@ extern unsigned long find_next_clump8(unsigned long *clump,
 #define find_first_clump8(clump, bits, size) \
 	find_next_clump8((clump), (bits), (size), 0)
 
+/**
+ * find_next_clump - find next clump with set bits in a memory region
+ * @clump: location to store copy of found clump
+ * @addr: address to base the search on
+ * @size: bitmap size in number of bits
+ * @offset: bit offset at which to start searching
+ * @clump_size: clump size in bits
+ *
+ * Returns the bit offset for the next set clump; the found clump value is
+ * copied to the location pointed by @clump. If no bits are set, returns @size.
+ */
+extern unsigned long find_next_clump(unsigned long *clump,
+				      const unsigned long *addr,
+				      unsigned long size, unsigned long offset,
+				      unsigned long clump_size);
+
+#define find_first_clump(clump, bits, size, clump_size) \
+	find_next_clump((clump), (bits), (size), 0, (clump_size))
+
 #endif /*_ASM_GENERIC_BITOPS_FIND_H_ */
diff --git a/include/linux/bitmap.h b/include/linux/bitmap.h
index 99058eb81042..6e0cc6877b68 100644
--- a/include/linux/bitmap.h
+++ b/include/linux/bitmap.h
@@ -75,7 +75,11 @@
  *  bitmap_from_arr32(dst, buf, nbits)          Copy nbits from u32[] buf to dst
  *  bitmap_to_arr32(buf, src, nbits)            Copy nbits from buf to u32[] dst
  *  bitmap_get_value8(map, start)               Get 8bit value from map at start
+ *  bitmap_get_value(map, start, nbits)		Get bit value of size
+ *						'nbits' from map at start
  *  bitmap_set_value8(map, value, start)        Set 8bit value to map at start
+ *  bitmap_set_value(map, value, start, nbits)	Set bit value of size 'nbits'
+ *						of map at start
  *
  * Note, bitmap_zero() and bitmap_fill() operate over the region of
  * unsigned longs, that is, bits behind bitmap till the unsigned long
@@ -563,6 +567,35 @@ static inline unsigned long bitmap_get_value8(const unsigned long *map,
 	return (map[index] >> offset) & 0xFF;
 }
 
+/**
+ * bitmap_get_value - get a value of n-bits from the memory region
+ * @map: address to the bitmap memory region
+ * @start: bit offset of the n-bit value
+ * @nbits: size of value in bits (must be between 1 and BITS_PER_LONG inclusive).
+ *	nbits less than 1 or more than BITS_PER_LONG causes undefined behaviour.
+ *
+ * Returns value of nbits located at the @start bit offset within the @map
+ * memory region.
+ */
+static inline unsigned long bitmap_get_value(const unsigned long *map,
+					      unsigned long start,
+					      unsigned long nbits)
+{
+	const size_t index = BIT_WORD(start);
+	const unsigned long offset = start % BITS_PER_LONG;
+	const unsigned long ceiling = roundup(start + 1, BITS_PER_LONG);
+	const unsigned long space = ceiling - start;
+	unsigned long value_low, value_high;
+
+	if (space >= nbits)
+		return (map[index] >> offset) & GENMASK(nbits - 1, 0);
+	else {
+		value_low = map[index] & BITMAP_FIRST_WORD_MASK(start);
+		value_high = map[index + 1] & BITMAP_LAST_WORD_MASK(start + nbits);
+		return (value_low >> offset) | (value_high << space);
+	}
+}
+
 /**
  * bitmap_set_value8 - set an 8-bit value within a memory region
  * @map: address to the bitmap memory region
@@ -579,6 +612,36 @@ static inline void bitmap_set_value8(unsigned long *map, unsigned long value,
 	map[index] |= value << offset;
 }
 
+/**
+ * bitmap_set_value - set n-bit value within a memory region
+ * @map: address to the bitmap memory region
+ * @value: value of nbits
+ * @start: bit offset of the n-bit value
+ * @nbits: size of value in bits (must be between 1 and BITS_PER_LONG inclusive).
+ *	nbits less than 1 or more than BITS_PER_LONG causes undefined behaviour.
+ */
+static inline void bitmap_set_value(unsigned long *map,
+				    unsigned long value,
+				    unsigned long start, unsigned long nbits)
+{
+	const size_t index = BIT_WORD(start);
+	const unsigned long offset = start % BITS_PER_LONG;
+	const unsigned long ceiling = roundup(start + 1, BITS_PER_LONG);
+	const unsigned long space = ceiling - start;
+
+	value &= GENMASK(nbits - 1, 0);
+
+	if (space >= nbits) {
+		map[index] &= ~(GENMASK(nbits + offset - 1, offset));
+		map[index] |= value << offset;
+	} else {
+		map[index + 0] &= ~BITMAP_FIRST_WORD_MASK(start);
+		map[index + 0] |= value << offset;
+		map[index + 1] &= ~BITMAP_LAST_WORD_MASK(start + nbits);
+		map[index + 1] |= value >> space;
+	}
+}
+
 #endif /* __ASSEMBLY__ */
 
 #endif /* __LINUX_BITMAP_H */
diff --git a/include/linux/bitops.h b/include/linux/bitops.h
index 99f2ac30b1d9..36a445e4a7cc 100644
--- a/include/linux/bitops.h
+++ b/include/linux/bitops.h
@@ -62,6 +62,19 @@ extern unsigned long __sw_hweight64(__u64 w);
 	     (start) < (size); \
 	     (start) = find_next_clump8(&(clump), (bits), (size), (start) + 8))
 
+/**
+ * for_each_set_clump - iterate over bitmap for each clump with set bits
+ * @start: bit offset to start search and to store the current iteration offset
+ * @clump: location to store copy of current clump
+ * @bits: bitmap address to base the search on
+ * @size: bitmap size in number of bits
+ * @clump_size: clump size in bits
+ */
+#define for_each_set_clump(start, clump, bits, size, clump_size) \
+	for ((start) = find_first_clump(&(clump), (bits), (size), (clump_size)); \
+	     (start) < (size); \
+	     (start) = find_next_clump(&(clump), (bits), (size), (start) + (clump_size), (clump_size)))
+
 static inline int get_bitmask_order(unsigned int count)
 {
 	int order;
diff --git a/lib/find_bit.c b/lib/find_bit.c
index 49f875f1baf7..1341bd39b32a 100644
--- a/lib/find_bit.c
+++ b/lib/find_bit.c
@@ -190,3 +190,17 @@ unsigned long find_next_clump8(unsigned long *clump, const unsigned long *addr,
 	return offset;
 }
 EXPORT_SYMBOL(find_next_clump8);
+
+unsigned long find_next_clump(unsigned long *clump, const unsigned long *addr,
+			       unsigned long size, unsigned long offset,
+			       unsigned long clump_size)
+{
+	offset = find_next_bit(addr, size, offset);
+	if (offset == size)
+		return size;
+
+	offset = rounddown(offset, clump_size);
+	*clump = bitmap_get_value(addr, offset, clump_size);
+	return offset;
+}
+EXPORT_SYMBOL(find_next_clump);
-- 
2.26.2



* Re: [PATCH v11 1/4] bitops: Introduce the for_each_set_clump macro
  2020-10-06  9:22 ` [PATCH v11 1/4] bitops: " Syed Nayyar Waris
@ 2020-10-06 11:27   ` Andy Shevchenko
  2020-10-15 22:53     ` Syed Nayyar Waris
  0 siblings, 1 reply; 7+ messages in thread
From: Andy Shevchenko @ 2020-10-06 11:27 UTC (permalink / raw)
  To: Syed Nayyar Waris
  Cc: linus.walleij, akpm, vilhelm.gray, arnd, linux-arch, linux-kernel

On Tue, Oct 06, 2020 at 02:52:16PM +0530, Syed Nayyar Waris wrote:
> This macro iterates for each group of bits (clump) with set bits,
> within a bitmap memory region. For each iteration, "start" is set to
> the bit offset of the found clump, while the respective clump value is
> stored to the location pointed by "clump". Additionally, the
> bitmap_get_value() and bitmap_set_value() functions are introduced to
> respectively get and set a value of n-bits in a bitmap memory region.
> The n-bits can have any size less than or equal to BITS_PER_LONG.
> Moreover, during setting value of n-bit in bitmap, if a situation arise
> that the width of next n-bit is exceeding the word boundary, then it
> will divide itself such that some portion of it is stored in that word,
> while the remaining portion is stored in the next higher word. Similar
> situation occurs while retrieving the value from bitmap.

...

> @@ -75,7 +75,11 @@
>   *  bitmap_from_arr32(dst, buf, nbits)          Copy nbits from u32[] buf to dst
>   *  bitmap_to_arr32(buf, src, nbits)            Copy nbits from buf to u32[] dst
>   *  bitmap_get_value8(map, start)               Get 8bit value from map at start
> + *  bitmap_get_value(map, start, nbits)		Get bit value of size
> + *						'nbits' from map at start
>   *  bitmap_set_value8(map, value, start)        Set 8bit value to map at start
> + *  bitmap_set_value(map, value, start, nbits)	Set bit value of size 'nbits'
> + *						of map at start

Formatting here is done with solely spaces, no TABs.

...

> +/**
> + * bitmap_get_value - get a value of n-bits from the memory region
> + * @map: address to the bitmap memory region
> + * @start: bit offset of the n-bit value
> + * @nbits: size of value in bits (must be between 1 and BITS_PER_LONG inclusive).


> + *	nbits less than 1 or more than BITS_PER_LONG causes undefined behaviour.

Please, detach this from field description and move to a main description.

> + *
> + * Returns value of nbits located at the @start bit offset within the @map
> + * memory region.
> + */

...

> +		return (map[index] >> offset) & GENMASK(nbits - 1, 0);

Have you considered using BIT{_ULL}(nbits) - 1 instead?
It may be better for code generation.

...

> +/**
> + * bitmap_set_value - set n-bit value within a memory region
> + * @map: address to the bitmap memory region
> + * @value: value of nbits
> + * @start: bit offset of the n-bit value
> + * @nbits: size of value in bits (must be between 1 and BITS_PER_LONG inclusive).

> + *	nbits less than 1 or more than BITS_PER_LONG causes undefined behaviour.

Please, detach this from field description and move to a main description.

> + */

...

> +	value &= GENMASK(nbits - 1, 0);

Ditto.

> +		map[index] &= ~(GENMASK(nbits + offset - 1, offset));

Last time I checked such GENMASK() use, it generated a lot of code, whereas
GENMASK(nbits - 1, 0) << offset works much better, but see also above.

-- 
With Best Regards,
Andy Shevchenko




* Re: [PATCH v11 0/4] Introduce the for_each_set_clump macro
  2020-10-06  9:20 [PATCH v11 0/4] Introduce the for_each_set_clump macro Syed Nayyar Waris
  2020-10-06  9:22 ` [PATCH v11 1/4] bitops: " Syed Nayyar Waris
@ 2020-10-07  8:38 ` Linus Walleij
  1 sibling, 0 replies; 7+ messages in thread
From: Linus Walleij @ 2020-10-07  8:38 UTC (permalink / raw)
  To: Syed Nayyar Waris
  Cc: Andrew Morton, Andy Shevchenko, William Breathitt Gray,
	Michal Simek, Arnd Bergmann, Robert Richter, Bartosz Golaszewski,
	Masahiro Yamada, Zhang Rui, Daniel Lezcano,
	(Exiting) Amit Kucheria, Linux-Arch, open list:GPIO SUBSYSTEM,
	linux-kernel, Linux ARM, Linux PM list

On Tue, Oct 6, 2020 at 11:20 AM Syed Nayyar Waris <syednwaris@gmail.com> wrote:

> Since this patchset primarily affects GPIO drivers, would you like
> to pick it up through your GPIO tree?

Definitely will, once we are finished!

I see Andy still has comments and we need more iterations.
That is fine, because we are not in any hurry. Just keep posting
it!

Let's merge this for v5.11 when we are finished with it.

Yours,
Linus Walleij


* Re: [PATCH v11 1/4] bitops: Introduce the for_each_set_clump macro
  2020-10-06 11:27   ` Andy Shevchenko
@ 2020-10-15 22:53     ` Syed Nayyar Waris
  2020-10-16  9:17       ` Andy Shevchenko
  0 siblings, 1 reply; 7+ messages in thread
From: Syed Nayyar Waris @ 2020-10-15 22:53 UTC (permalink / raw)
  To: Andy Shevchenko
  Cc: Linus Walleij, Andrew Morton, William Breathitt Gray,
	Arnd Bergmann, Linux-Arch, Linux Kernel Mailing List

On Tue, Oct 6, 2020 at 4:56 PM Andy Shevchenko
<andriy.shevchenko@linux.intel.com> wrote:
>
> On Tue, Oct 06, 2020 at 02:52:16PM +0530, Syed Nayyar Waris wrote:
> > This macro iterates for each group of bits (clump) with set bits,
> > within a bitmap memory region. For each iteration, "start" is set to
> > the bit offset of the found clump, while the respective clump value is
> > stored to the location pointed by "clump". Additionally, the
> > bitmap_get_value() and bitmap_set_value() functions are introduced to
> > respectively get and set a value of n-bits in a bitmap memory region.
> > The n-bits can have any size less than or equal to BITS_PER_LONG.
> > Moreover, during setting value of n-bit in bitmap, if a situation arise
> > that the width of next n-bit is exceeding the word boundary, then it
> > will divide itself such that some portion of it is stored in that word,
> > while the remaining portion is stored in the next higher word. Similar
> > situation occurs while retrieving the value from bitmap.
>
> ...
>
> > @@ -75,7 +75,11 @@
> >   *  bitmap_from_arr32(dst, buf, nbits)          Copy nbits from u32[] buf to dst
> >   *  bitmap_to_arr32(buf, src, nbits)            Copy nbits from buf to u32[] dst
> >   *  bitmap_get_value8(map, start)               Get 8bit value from map at start
> > + *  bitmap_get_value(map, start, nbits)              Get bit value of size
> > + *                                           'nbits' from map at start
> >   *  bitmap_set_value8(map, value, start)        Set 8bit value to map at start
> > + *  bitmap_set_value(map, value, start, nbits)       Set bit value of size 'nbits'
> > + *                                           of map at start
>
> Formatting here is done with solely spaces, no TABs.

Okay. Done

>
> ...
>
> > +/**
> > + * bitmap_get_value - get a value of n-bits from the memory region
> > + * @map: address to the bitmap memory region
> > + * @start: bit offset of the n-bit value
> > + * @nbits: size of value in bits (must be between 1 and BITS_PER_LONG inclusive).
>
>
> > + *   nbits less than 1 or more than BITS_PER_LONG causes undefined behaviour.
>
> Please, detach this from field description and move to a main description.

Okay. Done.
>
> > + *
> > + * Returns value of nbits located at the @start bit offset within the @map
> > + * memory region.
> > + */
>
> ...
>
> > +             return (map[index] >> offset) & GENMASK(nbits - 1, 0);
>
> Have you considered to use rather BIT{_ULL}(nbits) - 1?
> It maybe better for code generation.

Yes, I considered using BIT{_ULL} in earlier versions of the patchset.
It has a problem:

When this macro is used in the bitmap_get_value and bitmap_set_value
functions, it gives unexpected results when nbits (the clump size) is
BITS_PER_LONG (32 or 64 depending on arch).

Specifically, when nbits is 64 (with BITS_PER_LONG being 64, for example),
(BIT(nbits) - 1)
evaluates to zero, and ANDing any value with this zero clears it
completely. The resulting calculation is therefore incorrect.

What actually happens is that BIT(64) expands to 1UL << 64, a shift by
the full width of the type. This is undefined behavior in the C standard
(section 6.5.7 in N1124). In practice the shift count can wrap around to
0 (x86, for example, uses only the low bits of the count), so 1 << 64
becomes 0x1, and when another 1 is subtracted from this, the final
result becomes 0.
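
One way to build an nbits-wide mask that stays well-defined when nbits
equals the word width is to shift an all-ones value right instead of
shifting 1 left. The following is a minimal user-space sketch, not the
kernel's GENMASK; low_mask() and BITS_PER_ULONG are hypothetical names
introduced here for illustration.

```c
#include <assert.h>
#include <limits.h>

#define BITS_PER_ULONG ((unsigned int)(sizeof(unsigned long) * CHAR_BIT))

/*
 * Mask with the low 'nbits' bits set, for 1 <= nbits <= BITS_PER_ULONG.
 * The shift count is BITS_PER_ULONG - nbits, which always lies in
 * 0 .. BITS_PER_ULONG - 1, so the full-width case never shifts by the
 * width of the type.
 */
static unsigned long low_mask(unsigned int nbits)
{
	assert(nbits >= 1 && nbits <= BITS_PER_ULONG);
	return ~0UL >> (BITS_PER_ULONG - nbits);
}
```

With this form, low_mask(BITS_PER_ULONG) is all ones, which is the value
that (BIT(nbits) - 1) fails to produce for a full-width clump.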

>
> ...
>
> > +/**
> > + * bitmap_set_value - set n-bit value within a memory region
> > + * @map: address to the bitmap memory region
> > + * @value: value of nbits
> > + * @start: bit offset of the n-bit value
> > + * @nbits: size of value in bits (must be between 1 and BITS_PER_LONG inclusive).
>
> > + *   nbits less than 1 or more than BITS_PER_LONG causes undefined behaviour.
>
> Please, detach this from field description and move to a main description.

Okay. Done

>
> > + */
>
> ...
>
> > +     value &= GENMASK(nbits - 1, 0);
>
> Ditto.
>
> > +             map[index] &= ~(GENMASK(nbits + offset - 1, offset));
>
> Last time I checked such GENMASK) use, it gave a lot of code when
> GENMASK(nbits - 1, 0) << offset works much better, but see also above.

Yes, I have incorporated your suggestion to use the '<<' operator. Thank you.
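
For reference, the two mask forms discussed here select the same bits
whenever the field fits in one word. The sketch below checks only that
equivalence; it says nothing about generated code size, which was the
actual concern. genmask() is a simplified user-space stand-in written
for this illustration, and all names in it are hypothetical.

```c
#include <assert.h>
#include <limits.h>

#define BITS_PER_ULONG ((unsigned int)(sizeof(unsigned long) * CHAR_BIT))

/* Contiguous mask covering bits l..h inclusive (l <= h < word width). */
static unsigned long genmask(unsigned int h, unsigned int l)
{
	assert(l <= h && h < BITS_PER_ULONG);
	return (~0UL << l) & (~0UL >> (BITS_PER_ULONG - 1 - h));
}

/* Form used in the posted patch: both bounds depend on 'offset'. */
static unsigned long field_mask_direct(unsigned int nbits, unsigned int offset)
{
	return genmask(nbits + offset - 1, offset);
}

/* Suggested form: build the mask at bit 0, then shift it into place. */
static unsigned long field_mask_shifted(unsigned int nbits, unsigned int offset)
{
	return genmask(nbits - 1, 0) << offset;
}

/* Returns 1 if both forms agree for every field that fits in one word. */
static int field_masks_agree(void)
{
	unsigned int nbits, offset;

	for (nbits = 1; nbits <= BITS_PER_ULONG; nbits++)
		for (offset = 0; offset + nbits <= BITS_PER_ULONG; offset++)
			if (field_mask_direct(nbits, offset) !=
			    field_mask_shifted(nbits, offset))
				return 0;
	return 1;
}
```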


>
> --
> With Best Regards,
> Andy Shevchenko
>
>


* Re: [PATCH v11 1/4] bitops: Introduce the for_each_set_clump macro
  2020-10-15 22:53     ` Syed Nayyar Waris
@ 2020-10-16  9:17       ` Andy Shevchenko
  2020-10-16 11:45         ` Syed Nayyar Waris
  0 siblings, 1 reply; 7+ messages in thread
From: Andy Shevchenko @ 2020-10-16  9:17 UTC (permalink / raw)
  To: Syed Nayyar Waris
  Cc: Linus Walleij, Andrew Morton, William Breathitt Gray,
	Arnd Bergmann, Linux-Arch, Linux Kernel Mailing List

On Fri, Oct 16, 2020 at 04:23:05AM +0530, Syed Nayyar Waris wrote:
> On Tue, Oct 6, 2020 at 4:56 PM Andy Shevchenko
> <andriy.shevchenko@linux.intel.com> wrote:
> > On Tue, Oct 06, 2020 at 02:52:16PM +0530, Syed Nayyar Waris wrote:

...

> > > +             return (map[index] >> offset) & GENMASK(nbits - 1, 0);
> >
> > Have you considered to use rather BIT{_ULL}(nbits) - 1?
> > It maybe better for code generation.
> 
> Yes I have considered using BIT{_ULL} in earlier versions of patchset.
> It has a problem:
> 
> This macro when used in both bitmap_get_value and
> bitmap_set_value functions, it will give unexpected results when nbits or clump
> size is BITS_PER_LONG (32 or 64 depending on arch).
> 
> Actually when nbits (clump size) is 64 (BITS_PER_LONG is 64, for example),
> (BIT(nbits) - 1)
> gives a value of zero and when this zero is ANDed with any value, it
> makes it full zero. This is unexpected, and incorrect calculation occurs.
> 
> What actually happens is in the macro expansion of BIT(64), that is 1
> << 64, the '1' overflows from leftmost bit position (most significant
> bit) and re-enters at the rightmost bit position (least significant
> bit), therefore 1 << 64 becomes '0x1', and when another '1' is
> subtracted from this, the final result becomes 0.
> 
> This is undefined behavior in the C standard (section 6.5.7 in the N1124)

I see, indeed, for 64/32 it is like this.

...

> Yes I have incorporated your suggestion to use the '<<' operator. Thank You.

One side note: consider using round_up() vs. roundup(). I don't remember
which one is optimized for the divisor being a power of 2.

-- 
With Best Regards,
Andy Shevchenko




* Re: [PATCH v11 1/4] bitops: Introduce the for_each_set_clump macro
  2020-10-16  9:17       ` Andy Shevchenko
@ 2020-10-16 11:45         ` Syed Nayyar Waris
  0 siblings, 0 replies; 7+ messages in thread
From: Syed Nayyar Waris @ 2020-10-16 11:45 UTC (permalink / raw)
  To: Andy Shevchenko
  Cc: Linus Walleij, Andrew Morton, William Breathitt Gray,
	Arnd Bergmann, Linux-Arch, Linux Kernel Mailing List

On Fri, Oct 16, 2020 at 2:46 PM Andy Shevchenko
<andriy.shevchenko@linux.intel.com> wrote:
>
> On Fri, Oct 16, 2020 at 04:23:05AM +0530, Syed Nayyar Waris wrote:
> > On Tue, Oct 6, 2020 at 4:56 PM Andy Shevchenko
> > <andriy.shevchenko@linux.intel.com> wrote:
> > > On Tue, Oct 06, 2020 at 02:52:16PM +0530, Syed Nayyar Waris wrote:
>
> ...
>
> > > > +             return (map[index] >> offset) & GENMASK(nbits - 1, 0);
> > >
> > > Have you considered to use rather BIT{_ULL}(nbits) - 1?
> > > It maybe better for code generation.
> >
> > Yes I have considered using BIT{_ULL} in earlier versions of patchset.
> > It has a problem:
> >
> > This macro when used in both bitmap_get_value and
> > bitmap_set_value functions, it will give unexpected results when nbits or clump
> > size is BITS_PER_LONG (32 or 64 depending on arch).
> >
> > Actually when nbits (clump size) is 64 (BITS_PER_LONG is 64, for example),
> > (BIT(nbits) - 1)
> > gives a value of zero and when this zero is ANDed with any value, it
> > makes it full zero. This is unexpected, and incorrect calculation occurs.
> >
> > What actually happens is in the macro expansion of BIT(64), that is 1
> > << 64, the '1' overflows from leftmost bit position (most significant
> > bit) and re-enters at the rightmost bit position (least significant
> > bit), therefore 1 << 64 becomes '0x1', and when another '1' is
> > subtracted from this, the final result becomes 0.
> >
> > This is undefined behavior in the C standard (section 6.5.7 in the N1124)
>
> I see, indeed, for 64/32 it is like this.
>
> ...
>
> > Yes I have incorporated your suggestion to use the '<<' operator. Thank You.
>
> One side note, consider the use round_up() vs. roundup(). I don't remember
> which one is optimized to divisor being power of 2.

Yes, I have changed 'roundup' to 'round_up'. 'round_up' is optimized for
power-of-2 divisors. Thank you.
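
The difference between the two rounding styles can be sketched in user
space as below. These are simplified functions modelled on the general
idea (a division-based form versus a mask-based form), not copies of the
kernel macros; the names roundup_generic() and roundup_pow2() are
hypothetical.

```c
#include <assert.h>

/* Generic form: valid for any non-zero y, but needs a division. */
static unsigned long roundup_generic(unsigned long x, unsigned long y)
{
	assert(y != 0);
	return ((x + (y - 1)) / y) * y;
}

/* Power-of-two form: mask arithmetic only; y must be a power of two. */
static unsigned long roundup_pow2(unsigned long x, unsigned long y)
{
	assert(y != 0 && (y & (y - 1)) == 0);
	return ((x - 1) | (y - 1)) + 1;
}
```

Since BITS_PER_LONG is a power of two, either form gives the same result
for the 'ceiling' computation in the patch; only the generic form would
be usable for a divisor such as 24.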

Syed Nayyar Waris
