On Mon, Nov 09, 2020 at 08:18:51PM +0530, Syed Nayyar Waris wrote:
> On Mon, Nov 9, 2020 at 8:09 PM William Breathitt Gray wrote:
> >
> > On Mon, Nov 09, 2020 at 08:41:28AM -0500, William Breathitt Gray wrote:
> > > On Mon, Nov 09, 2020 at 06:04:11PM +0530, Syed Nayyar Waris wrote:
> > > > On Sun, Nov 01, 2020 at 09:08:29PM +0100, Arnd Bergmann wrote:
> > > > > On Sun, Nov 1, 2020 at 4:00 PM William Breathitt Gray wrote:
> > > > > >
> > > > > > On Thu, Oct 29, 2020 at 11:44:47PM +0100, Arnd Bergmann wrote:
> > > > > > > On Sun, Oct 18, 2020 at 11:44 PM Syed Nayyar Waris wrote:
> > > > > > > >
> > > > > > > > This patch reimplements the xgpio_set_multiple() function in
> > > > > > > > drivers/gpio/gpio-xilinx.c to use the new generic functions:
> > > > > > > > bitmap_get_value() and bitmap_set_value(). The code is now simpler
> > > > > > > > to read and understand. Moreover, instead of looping for each bit
> > > > > > > > in xgpio_set_multiple() function, now we can check each channel at
> > > > > > > > a time and save cycles.
> > > > > > >
> > > > > > > This now causes -Wtype-limits warnings in linux-next with gcc-10:
> > > > > >
> > > > > > Hi Arnd,
> > > > > >
> > > > > > What version of gcc-10 are you running? I'm having trouble generating
> > > > > > these warnings so I suspect I'm using a different version than you.
> > > > >
> > > > > I originally saw it with the binaries from
> > > > > https://mirrors.edge.kernel.org/pub/tools/crosstool/, but I have
> > > > > also been able to reproduce it with a minimal test case on the
> > > > > binaries from godbolt.org, see https://godbolt.org/z/Wq8q4n
> > > > >
> > > > > > Let me first verify that I understand the problem correctly. The issue
> > > > > > is the possibility of a stack smash in bitmap_set_value() when the value
> > > > > > of start + nbits is larger than the length of the map bitmap memory
> > > > > > region.
> > > > > > This is because index (or index + 1) could be outside the range
> > > > > > of the bitmap memory region passed in as map. Is my understanding
> > > > > > correct here?
> > > > >
> > > > > Yes, that seems to be the case here.
> > > > >
> > > > > > In xgpio_set_multiple(), the variables width[0] and width[1] serve as
> > > > > > possible start and nbits values for the bitmap_set_value() calls.
> > > > > > Because width[0] and width[1] are unsigned int variables, GCC considers
> > > > > > the possibility that the value of width[0]/width[1] might exceed the
> > > > > > length of the bitmap memory region named old and thus result in a stack
> > > > > > smash.
> > > > > >
> > > > > > I don't know if invalid width values are actually possible for the
> > > > > > Xilinx gpio device, but let's err on the side of safety and assume this
> > > > > > is actually a possibility. We should verify that the combined value of
> > > > > > gpio_width[0] + gpio_width[1] does not exceed 64 bits; we can add a
> > > > > > check for this in xgpio_probe() when we grab the gpio_width values.
> > > > > >
> > > > > > However, we're still left with the GCC warnings because GCC is not smart
> > > > > > enough to know that we've already checked the boundary and width[0] and
> > > > > > width[1] are valid values. I suspect we can avoid this warning if we
> > > > > > refactor bitmap_set_value() to increment map separately and then set it:
> > > > >
> > > > > As I understand it, part of the problem is that gcc sees the possible
> > > > > range as being constrained by the operations on 'start' and 'nbits',
> > > > > in particular the shift in BIT_WORD() that puts an upper bound on
> > > > > the index, but then it sees that the upper bound is higher than the
> > > > > upper bound of the array, i.e. element zero.
> > > > >
> > > > > I added a check
> > > > >
> > > > >     if (start >= 64 || start + size >= 64) return;
> > > > >
> > > > > in the godbolt.org testcase, which does help limit the start
> > > > > index appropriately, but it is not sufficient to let the compiler
> > > > > see that the 'if (space >= nbits)' condition is guaranteed to
> > > > > be true for all values here.
> > > > >
> > > > > > static inline void bitmap_set_value(unsigned long *map,
> > > > > >                                     unsigned long value,
> > > > > >                                     unsigned long start, unsigned long nbits)
> > > > > > {
> > > > > >         const unsigned long offset = start % BITS_PER_LONG;
> > > > > >         const unsigned long ceiling = round_up(start + 1, BITS_PER_LONG);
> > > > > >         const unsigned long space = ceiling - start;
> > > > > >
> > > > > >         map += BIT_WORD(start);
> > > > > >         value &= GENMASK(nbits - 1, 0);
> > > > > >
> > > > > >         if (space >= nbits) {
> > > > > >                 *map &= ~(GENMASK(nbits - 1, 0) << offset);
> > > > > >                 *map |= value << offset;
> > > > > >         } else {
> > > > > >                 *map &= ~BITMAP_FIRST_WORD_MASK(start);
> > > > > >                 *map |= value << offset;
> > > > > >                 map++;
> > > > > >                 *map &= ~BITMAP_LAST_WORD_MASK(start + nbits);
> > > > > >                 *map |= value >> space;
> > > > > >         }
> > > > > > }
> > > > > >
> > > > > > This avoids adding a costly conditional check inside bitmap_set_value()
> > > > > > when almost all bitmap_set_value() calls will have static arguments with
> > > > > > well-defined and obvious boundaries.
> > > > > >
> > > > > > Do you think this would be an acceptable solution to resolve your GCC
> > > > > > warnings?
> > > > >
> > > > > Unfortunately, it does not seem to make a difference, as gcc still
> > > > > knows that this compiles to the same result, and it produces the same
> > > > > warning as before (see https://godbolt.org/z/rjx34r)
> > > > >
> > > > > Arnd
> > > >
> > > > Hi Arnd,
> > > >
> > > > Sharing a different version of the bitmap_set_value() function. See below.
> > > >
> > > > Let me know if the below solution looks good to you and if it resolves
> > > > the above compiler warning.
> > > >
> > > >
> > > > @@ -1,5 +1,5 @@
> > > >  static inline void bitmap_set_value(unsigned long *map,
> > > > -                                    unsigned long value,
> > > > +                                    unsigned long value, const size_t length,
> > > >                                      unsigned long start, unsigned long nbits)
> > > >  {
> > > >          const size_t index = BIT_WORD(start);
> > > > @@ -7,6 +7,9 @@ static inline void bitmap_set_value(unsigned long *map,
> > > >          const unsigned long ceiling = round_up(start + 1, BITS_PER_LONG);
> > > >          const unsigned long space = ceiling - start;
> > > >
> > > > +        if (index >= length)
> > > > +                return;
> > > > +
> > > >          value &= GENMASK(nbits - 1, 0);
> > > >
> > > >          if (space >= nbits) {
> > > > @@ -15,6 +18,10 @@ static inline void bitmap_set_value(unsigned long *map,
> > > >          } else {
> > > >                  map[index + 0] &= ~BITMAP_FIRST_WORD_MASK(start);
> > > >                  map[index + 0] |= value << offset;
> > > > +
> > > > +                if (index + 1 >= length)
> > > > +                        return;
> > > > +
> > > >                  map[index + 1] &= ~BITMAP_LAST_WORD_MASK(start + nbits);
> > > >                  map[index + 1] |= value >> space;
> > > >          }
> > >
> > > One of my concerns is that we're incurring the latency of two additional
> > > conditional checks just to suppress a compiler warning about a case that
> > > wouldn't occur in the actual use of bitmap_set_value(). I'm hoping
> > > there's a way for us to suppress these warnings without adding onto the
> > > latency of this function; given that bitmap_set_value() is intended to
> > > be used in loops, conditionals here could significantly increase latency
> > > in drivers.
> > >
> > > I wonder if array_index_nospec() might have the side effect of
> > > suppressing these warnings for us.
> > > For example, would this work:
> > >
> > > static inline void bitmap_set_value(unsigned long *map,
> > >                                     unsigned long value,
> > >                                     unsigned long start, unsigned long nbits)
> > > {
> > >         const unsigned long offset = start % BITS_PER_LONG;
> > >         const unsigned long ceiling = round_up(start + 1, BITS_PER_LONG);
> > >         const unsigned long space = ceiling - start;
> > >         size_t index = BIT_WORD(start);
> > >
> > >         value &= GENMASK(nbits - 1, 0);
> > >
> > >         if (space >= nbits) {
> > >                 index = array_index_nospec(index, index + 1);
> > >
> > >                 map[index] &= ~(GENMASK(nbits - 1, 0) << offset);
> > >                 map[index] |= value << offset;
> > >         } else {
> > >                 index = array_index_nospec(index, index + 2);
> > >
> > >                 map[index + 0] &= ~BITMAP_FIRST_WORD_MASK(start);
> > >                 map[index + 0] |= value << offset;
> > >                 map[index + 1] &= ~BITMAP_LAST_WORD_MASK(start + nbits);
> > >                 map[index + 1] |= value >> space;
> > >         }
> > > }
> > >
> > > Or is this going to produce the same warning because we're not using an
> > > explicit check against the map array size?
> > >
> > > William Breathitt Gray
> >
> > After testing my suggestion, it looks like the warnings are still
> > present. :-(
> >
> > Something else I've also considered is perhaps using the GCC built-in
> > function __builtin_unreachable() instead of returning. So in Syed's code
> > we would have the following instead:
> >
> >         if (index + 1 >= length)
> >                 __builtin_unreachable();
> >
> > This might allow GCC to optimize better and avoid the conditional check
> > altogether, thus avoiding latency while also hinting enough context to
> > the compiler to suppress the warnings.
> >
> > William Breathitt Gray
>
> I also thought of another optimization. Arnd, William, let me know
> what you think about it.
>
> Since exceeding the array limit is a rather rare event, we can use the
> gcc extension: 'unlikely' for the boundary checks.
> We can use it at the two places where 'index' and 'index + 1' are being
> checked against the boundary limit.
>
> It might help optimize the code. Wouldn't it?
>
> Syed Nayyar Waris

We probably don't need unlikely() because __builtin_unreachable() should
suffice to inform GCC that this condition will never occur -- in other
words, GCC will compile optimized code to avoid the conditional
entirely.

By the way, I think we only need the (index + 1 >= length) check; the
first index conditional check is not needed and does not affect the
warnings at all, so we might as well get rid of it.

William Breathitt Gray
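
For anyone following along, the variant described in the last reply (a
length parameter, and only the index + 1 check, turned into a compiler
hint) could look roughly like the userspace sketch below. It is only an
illustration, not the patch that was eventually merged. The names
bitmap_set_value_sketch(), bitmap_sketch_demo() and LOW_MASK() are
hypothetical, and BITS_PER_LONG and BIT_WORD() are redefined here as
stand-ins for the kernel macros. It assumes GCC or Clang (for
__builtin_unreachable()) and 1 <= nbits <= BITS_PER_LONG.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Userspace stand-ins for the kernel macros (assumptions of this sketch). */
#define BITS_PER_LONG	(sizeof(unsigned long) * CHAR_BIT)
#define BIT_WORD(nr)	((nr) / BITS_PER_LONG)
/* Hypothetical helper: low n bits set, valid for 1 <= n <= BITS_PER_LONG. */
#define LOW_MASK(n)	(~0UL >> (BITS_PER_LONG - (n)))

/* Hypothetical name; length is the number of unsigned longs in map. */
static inline void bitmap_set_value_sketch(unsigned long *map, size_t length,
					   unsigned long value,
					   unsigned long start,
					   unsigned long nbits)
{
	const size_t index = BIT_WORD(start);
	const unsigned long offset = start % BITS_PER_LONG;
	/* Bits left in the first word, same as round_up(start + 1) - start. */
	const unsigned long space = BITS_PER_LONG - offset;

	value &= LOW_MASK(nbits);

	if (space >= nbits) {
		/* The value fits entirely in one word. */
		map[index] &= ~(LOW_MASK(nbits) << offset);
		map[index] |= value << offset;
	} else {
		/* Hint only: callers promise the second word exists. */
		if (index + 1 >= length)
			__builtin_unreachable();

		/* Clear bits offset..top of the first word, then fill them. */
		map[index] &= ~(~0UL << offset);
		map[index] |= value << offset;
		/* The remaining nbits - space bits go in the next word. */
		map[index + 1] &= ~LOW_MASK(nbits - space);
		map[index + 1] |= value >> space;
	}
}

/* Small usage example: write 8 bits straddling two words. */
void bitmap_sketch_demo(void)
{
	unsigned long map[2] = { 0, 0 };

	bitmap_set_value_sketch(map, 2, 0xFFUL, BITS_PER_LONG - 4, 8);
	assert(map[0] == (0xFUL << (BITS_PER_LONG - 4)));
	assert(map[1] == 0xFUL);
}
```

Note that __builtin_unreachable() is a promise, not a check: if a caller
ever passes arguments for which index + 1 >= length holds, the behaviour
is undefined. That is why the gpio_width validation in xgpio_probe()
mentioned earlier in the thread would still be needed on the caller
side. Whether this form actually silences the gcc-10 warning is not
something the sketch demonstrates.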