On Sun, Dec 27, 2020 at 11:03:06PM +0100, Arnd Bergmann wrote: > On Sat, Dec 26, 2020 at 7:42 AM Syed Nayyar Waris wrote: > > > > This macro iterates for each group of bits (clump) with set bits, > > within a bitmap memory region. For each iteration, "start" is set to > > the bit offset of the found clump, while the respective clump value is > > stored to the location pointed by "clump". Additionally, the > > bitmap_get_value() and bitmap_set_value() functions are introduced to > > respectively get and set a value of n-bits in a bitmap memory region. > > The n-bits can have any size from 1 to BITS_PER_LONG. size less > > than 1 or more than BITS_PER_LONG causes undefined behaviour. > > Moreover, during setting value of n-bit in bitmap, if a situation arise > > that the width of next n-bit is exceeding the word boundary, then it > > will divide itself such that some portion of it is stored in that word, > > while the remaining portion is stored in the next higher word. Similar > > situation occurs while retrieving the value from bitmap. > > > > GCC gives warning in bitmap_set_value(): https://godbolt.org/z/rjx34r > > Add explicit check to see if the value being written into the bitmap > > does not fall outside the bitmap. > > The situation that it is falling outside would never be possible in the > > code because the boundaries are required to be correct before the > > function is called. The responsibility is on the caller for ensuring the > > boundaries are correct. > > The code change is simply to silence the GCC warning messages > > because GCC is not aware that the boundaries have already been checked. > > As such, we're better off using __builtin_unreachable() here because we > > can avoid the latency of the conditional check entirely. > > Didn't the __builtin_unreachable() end up leading to an objtool > warning about incorrect stack frames for the code path that leads > into the undefined behavior? I thought I saw a message from the 0day > build bot about that and didn't expect to see it again after that. > > Can you actually measure any performance difference compared > to BUG_ON() that avoids the undefined behavior? Practically > all CPUs from the past 20 years have branch predictors that should > completely hide measurable overhead from this. > > Arnd When I initially recommended using __builtin_unreachable(), I was anticipating the use of bitmap_set_value() in kernel at large -- so the possible performance hit from a conditional check was a concern for me. However, now that we're restricting the scope of bitmap_set_value() to only the GPIO subsystem, such optimization is no longer a major concern I feel: gpio-xilinx is the only driver utilizing bitmap_set_value() -- and we know it won't be called in a loop -- so whatever hypothetical performance hit there might be is inconsequential in the end. Instead, we should focus on code clarity now. I believe it makes sense given the new scope of this function to revert back to the earlier suggestion of passing in and checking the boundary explicitly, and to remove the __builtin_unreachable() call for now. If bitmap_set_value() becomes available to the rest of the kernel in the future, we can reconsider whether or not to use __builtin_unreachable(). William Breathitt Gray