Message ID  CAM7yPRBPP6SFzdmwWF5Y99g+aWcp=OY9Uvp5h1MSDPmsORNw@mail.gmail.com 

On Wed, Dec 2, 2020 at 10:22 AM Yun Levi <ppbuk5246@gmail.com> wrote:
>
> On Thu, Dec 3, 2020 at 2:26 AM Yury Norov <yury.norov@gmail.com> wrote:
>
> > Also look at lib/find_bit_benchmark.c
> Thanks. I'll see.
>
> > We need find_next_*_bit() because find_first_*_bit() can start searching only at word-aligned
> > bits. In the case of find_last_*_bit(), we can start at any bit. So, if my understanding is correct,
> > for the purpose of reverse traversing we can go with already existing find_last_bit(),
>
> Thank you. I haven't thought that way.
> But I think if we implement reverse traversing using find_last_bit(),
> we have a problem.
> Suppose bits 0, 1 and 2 are set.
> If we start
> find_last_bit(bitmap, 3) ==> return 2;
> find_last_bit(bitmap, 2) ==> return 1;
> find_last_bit(bitmap, 1) ==> return 0;
> find_last_bit(bitmap, 0) ==> return 0? // here we couldn't
> distinguish size 0 input or 0 is set

If you traverse backward and reach bit #0, you're done. No need to continue.

> and the for_each traverse routines prevent the above case by returning size
> (nbits) using find_next_bit.
> So, for compatibility and the same expected return value as next traversing,
> I think we need a find_prev_*_bit routine, if my understanding is correct.
>
> > I think this patch has some good catches. We definitely need to implement
> > find_last_zero_bit(), as it is used by fs/ufs, and their local implementation
> > is not optimal.
> >
> > We also should consider adding reverse traversing macros based on
> > find_last_*_bit(), if there are proposed users.
> Not only this, I think 'steal_from_bitmap_to_front' can be improved
> using find_prev_zero_bit
> like
>
> diff --git a/fs/btrfs/free-space-cache.c b/fs/btrfs/free-space-cache.c
> index af0013d3df63..9debb9707390 100644
> --- a/fs/btrfs/free-space-cache.c
> +++ b/fs/btrfs/free-space-cache.c
> @@ -2372,7 +2372,6 @@ static bool steal_from_bitmap_to_front(struct
> btrfs_free_space_ctl *ctl,
>         u64 bitmap_offset;
>         unsigned long i;
>         unsigned long j;
> -       unsigned long prev_j;
>         u64 bytes;
>
>         bitmap_offset = offset_to_bitmap(ctl, info->offset);
> @@ -2388,20 +2387,15 @@ static bool steal_from_bitmap_to_front(struct
> btrfs_free_space_ctl *ctl,
>                 return false;
>
>         i = offset_to_bit(bitmap->offset, ctl->unit, info->offset) - 1;
> -       j = 0;
> -       prev_j = (unsigned long)-1;
> -       for_each_clear_bit_from(j, bitmap->bitmap, BITS_PER_BITMAP) {
> -               if (j > i)
> -                       break;
> -               prev_j = j;
> -       }
> -       if (prev_j == i)
> +       j = find_prev_zero_bit(bitmap->bitmap, BITS_PER_BITMAP, i);

This one may be implemented with find_last_zero_bit() as well:

        unsigned long j = find_last_zero_bit(bitmap, BITS_PER_BITMAP);
        if (j <= i || j >= BITS_PER_BITMAP)
                return false;

I believe the latter version is better because find_last_*_bit() is simpler in
implementation (and partially exists), has fewer parameters, and is therefore
simpler for users, and doesn't introduce functionality duplication.

The only consideration I can imagine to advocate find_prev_*() is the
performance advantage in the scenario when we know for sure that the first N
bits of the bitmap are all set/clear, and we can bypass traversing that area.
But again, in this case we can pass the bitmap address with the appropriate
offset, and stay with find_last_*().

> +
> +       if (j == i)
>                 return false;
>
> -       if (prev_j == (unsigned long)-1)
> +       if (j == BITS_PER_BITMAP)
>                 bytes = (i + 1) * ctl->unit;
>         else
> -               bytes = (i - prev_j) * ctl->unit;
> +               bytes = (i - j) * ctl->unit;
>
>         info->offset -= bytes;
>         info->bytes += bytes;
>
> Thanks.
>
> HTH
> Levi.
On Thu, Dec 3, 2020 at 6:26 AM Yury Norov <yury.norov@gmail.com> wrote:
>
> On Wed, Dec 2, 2020 at 10:22 AM Yun Levi <ppbuk5246@gmail.com> wrote:
> >
> > On Thu, Dec 3, 2020 at 2:26 AM Yury Norov <yury.norov@gmail.com> wrote:
> >
> > > Also look at lib/find_bit_benchmark.c
> > Thanks. I'll see.
> >
> > > We need find_next_*_bit() because find_first_*_bit() can start searching only at word-aligned
> > > bits. In the case of find_last_*_bit(), we can start at any bit. So, if my understanding is correct,
> > > for the purpose of reverse traversing we can go with already existing find_last_bit(),
> >
> > Thank you. I haven't thought that way.
> > But I think if we implement reverse traversing using find_last_bit(),
> > we have a problem.
> > Suppose bits 0, 1 and 2 are set.
> > If we start
> > find_last_bit(bitmap, 3) ==> return 2;
> > find_last_bit(bitmap, 2) ==> return 1;
> > find_last_bit(bitmap, 1) ==> return 0;
> > find_last_bit(bitmap, 0) ==> return 0? // here we couldn't
> > distinguish size 0 input or 0 is set
>
> If you traverse backward and reach bit #0, you're done. No need to continue.

Consider the case when I use this macro:

#define for_each_clear_bit_reverse(bit, addr, size) \
        for ((bit) = find_last_zero_bit((addr), (size)); \
             (bit) < (size); \
             (bit) = find_prev_zero_bit((addr), (size), (bit)))

If we implement the above macro only with find_last_zero_bit,
I think there is no way to finish the loop without adding an additional variable.
But I don't want to add an additional variable, to keep the same format
as for_each_clear_bit.
That's why I decided to implement the find_prev_*_bit series.

I don't know whether this thinking is correct or biased. Am I wrong?

> > and the for_each traverse routines prevent the above case by returning size
> > (nbits) using find_next_bit.
> > So, for compatibility and the same expected return value as next traversing,
> > I think we need a find_prev_*_bit routine, if my understanding is correct.
> > > I think this patch has some good catches. We definitely need to implement
> > > find_last_zero_bit(), as it is used by fs/ufs, and their local implementation
> > > is not optimal.
> > >
> > > We also should consider adding reverse traversing macros based on
> > > find_last_*_bit(), if there are proposed users.
> >
> > Not only this, I think 'steal_from_bitmap_to_front' can be improved
> > using find_prev_zero_bit
> > like
> >
> > diff --git a/fs/btrfs/free-space-cache.c b/fs/btrfs/free-space-cache.c
> > index af0013d3df63..9debb9707390 100644
> > --- a/fs/btrfs/free-space-cache.c
> > +++ b/fs/btrfs/free-space-cache.c
> > @@ -2372,7 +2372,6 @@ static bool steal_from_bitmap_to_front(struct
> > btrfs_free_space_ctl *ctl,
> >         u64 bitmap_offset;
> >         unsigned long i;
> >         unsigned long j;
> > -       unsigned long prev_j;
> >         u64 bytes;
> >
> >         bitmap_offset = offset_to_bitmap(ctl, info->offset);
> > @@ -2388,20 +2387,15 @@ static bool steal_from_bitmap_to_front(struct
> > btrfs_free_space_ctl *ctl,
> >                 return false;
> >
> >         i = offset_to_bit(bitmap->offset, ctl->unit, info->offset) - 1;
> > -       j = 0;
> > -       prev_j = (unsigned long)-1;
> > -       for_each_clear_bit_from(j, bitmap->bitmap, BITS_PER_BITMAP) {
> > -               if (j > i)
> > -                       break;
> > -               prev_j = j;
> > -       }
> > -       if (prev_j == i)
> > +       j = find_prev_zero_bit(bitmap->bitmap, BITS_PER_BITMAP, i);
>
> This one may be implemented with find_last_zero_bit() as well:
>
>         unsigned long j = find_last_zero_bit(bitmap, BITS_PER_BITMAP);
>         if (j <= i || j >= BITS_PER_BITMAP)
>                 return false;

Actually, in that code, we don't need to check the bits after i.
Originally, if my understanding is correct, the former code tries to find
the last 0 bit before i, and if all bits before i are set, it uses the
next one, i + 1. That's why I think the condition should be

        if (j >= i)

But the above condition couldn't discern the case when all bits before i
are set. Also, since we don't need to check the bits after i,
find_prev_zero_bit, which specifies the start point, makes the meaning of
the code clearer.

> I believe the latter version is better because find_last_*_bit() is simpler in
> implementation (and partially exists), has fewer parameters, and is therefore
> simpler for users, and doesn't introduce functionality duplication.
>
> The only consideration I can imagine to advocate find_prev_*() is the
> performance advantage in the scenario when we know for sure that the first N
> bits of the bitmap are all set/clear, and we can bypass traversing that area.
> But again, in this case we can pass the bitmap address with the appropriate
> offset, and stay with find_last_*().
>
> > +
> > +       if (j == i)
> >                 return false;
> >
> > -       if (prev_j == (unsigned long)-1)
> > +       if (j == BITS_PER_BITMAP)
> >                 bytes = (i + 1) * ctl->unit;
> >         else
> > -               bytes = (i - prev_j) * ctl->unit;
> > +               bytes = (i - j) * ctl->unit;
> >
> >         info->offset -= bytes;
> >         info->bytes += bytes;
> >
> > Thanks.
> >
> > HTH
> > Levi.

Thanks but
On Thu, Dec 3, 2020 at 7:51 AM Yun Levi <ppbuk5246@gmail.com> wrote:
>
> On Thu, Dec 3, 2020 at 6:26 AM Yury Norov <yury.norov@gmail.com> wrote:
> >
> > On Wed, Dec 2, 2020 at 10:22 AM Yun Levi <ppbuk5246@gmail.com> wrote:
> > >
> > > On Thu, Dec 3, 2020 at 2:26 AM Yury Norov <yury.norov@gmail.com> wrote:
> > >
> > > > Also look at lib/find_bit_benchmark.c
> > > Thanks. I'll see.
> > >
> > > > We need find_next_*_bit() because find_first_*_bit() can start searching only at word-aligned
> > > > bits. In the case of find_last_*_bit(), we can start at any bit. So, if my understanding is correct,
> > > > for the purpose of reverse traversing we can go with already existing find_last_bit(),
> > >
> > > Thank you. I haven't thought that way.
> > > But I think if we implement reverse traversing using find_last_bit(),
> > > we have a problem.
> > > Suppose bits 0, 1 and 2 are set.
> > > If we start
> > > find_last_bit(bitmap, 3) ==> return 2;
> > > find_last_bit(bitmap, 2) ==> return 1;
> > > find_last_bit(bitmap, 1) ==> return 0;
> > > find_last_bit(bitmap, 0) ==> return 0? // here we couldn't
> > > distinguish size 0 input or 0 is set
> >
> > If you traverse backward and reach bit #0, you're done. No need to continue.
>
> Consider the case when I use this macro:
>
> #define for_each_clear_bit_reverse(bit, addr, size) \
>         for ((bit) = find_last_zero_bit((addr), (size)); \
>              (bit) < (size); \
>              (bit) = find_prev_zero_bit((addr), (size), (bit)))
>
> If we implement the above macro only with find_last_zero_bit,
> I think there is no way to finish the loop without adding an additional variable.
> But I don't want to add an additional variable, to keep the same format
> as for_each_clear_bit.
> That's why I decided to implement the find_prev_*_bit series.
>
> I don't know whether this thinking is correct or biased. Am I wrong?
>
> > > and the for_each traverse routines prevent the above case by returning size
> > > (nbits) using find_next_bit.
> > > So, for compatibility and the same expected return value as next traversing,
> > > I think we need a find_prev_*_bit routine, if my understanding is correct.
> > >
> > > > I think this patch has some good catches. We definitely need to implement
> > > > find_last_zero_bit(), as it is used by fs/ufs, and their local implementation
> > > > is not optimal.
> > > >
> > > > We also should consider adding reverse traversing macros based on
> > > > find_last_*_bit(), if there are proposed users.
> > >
> > > Not only this, I think 'steal_from_bitmap_to_front' can be improved
> > > using find_prev_zero_bit
> > > like
> > >
> > > diff --git a/fs/btrfs/free-space-cache.c b/fs/btrfs/free-space-cache.c
> > > index af0013d3df63..9debb9707390 100644
> > > --- a/fs/btrfs/free-space-cache.c
> > > +++ b/fs/btrfs/free-space-cache.c
> > > @@ -2372,7 +2372,6 @@ static bool steal_from_bitmap_to_front(struct
> > > btrfs_free_space_ctl *ctl,
> > >         u64 bitmap_offset;
> > >         unsigned long i;
> > >         unsigned long j;
> > > -       unsigned long prev_j;
> > >         u64 bytes;
> > >
> > >         bitmap_offset = offset_to_bitmap(ctl, info->offset);
> > > @@ -2388,20 +2387,15 @@ static bool steal_from_bitmap_to_front(struct
> > > btrfs_free_space_ctl *ctl,
> > >                 return false;
> > >
> > >         i = offset_to_bit(bitmap->offset, ctl->unit, info->offset) - 1;
> > > -       j = 0;
> > > -       prev_j = (unsigned long)-1;
> > > -       for_each_clear_bit_from(j, bitmap->bitmap, BITS_PER_BITMAP) {
> > > -               if (j > i)
> > > -                       break;
> > > -               prev_j = j;
> > > -       }
> > > -       if (prev_j == i)
> > > +       j = find_prev_zero_bit(bitmap->bitmap, BITS_PER_BITMAP, i);
> >
> > This one may be implemented with find_last_zero_bit() as well:
> >
> >         unsigned long j = find_last_zero_bit(bitmap, BITS_PER_BITMAP);
> >         if (j <= i || j >= BITS_PER_BITMAP)
> >                 return false;
>
> Actually, in that code, we don't need to check the bits after i.
> Originally, if my understanding is correct, the former code tries to find
> the last 0 bit before i, and if all bits before i are set, it uses the
> next one, i + 1. That's why I think the condition should be
>
>         if (j >= i)
>
> But the above condition couldn't discern the case when all bits before i
> are set. Also, since we don't need to check the bits after i,
> find_prev_zero_bit, which specifies the start point, makes the meaning of
> the code clearer.
>
> > I believe the latter version is better because find_last_*_bit() is simpler in
> > implementation (and partially exists), has fewer parameters, and is therefore
> > simpler for users, and doesn't introduce functionality duplication.

I think it's not duplication. Actually, you taught me earlier that
find_first_*_bit() can start searching only at a word-aligned bit. But
find_first_*_bit() declares its argument as the "size of bitmap", not a
start offset. Though the bitmap size is word-aligned, that doesn't matter
for finding the first bit within the specified size of the bitmap (if
there is none, it will just return the size of the bitmap).

Likewise, find_last_*_bit() is similar in this context: fundamentally,
its argument is not a start offset into the bitmap, it's just the size of
the bitmap. That's the reason we need find_next_*_bit() to start at a
specified offset, and in the same manner, I think it's better to have
find_prev_*_bit().

So I think we can use both of these functions to achieve the goal. But
each function actually has a different concept, and that's why I don't
think it's duplication. If my understanding is wrong, forgive me, and let
me know.

Thanks.

> > The only consideration I can imagine to advocate find_prev_*() is the
> > performance advantage in the scenario when we know for sure that the first N
> > bits of the bitmap are all set/clear, and we can bypass traversing that area.
> > But again, in this case we can pass the bitmap address with the appropriate
> > offset, and stay with find_last_*().
>
> > > +
> > > +       if (j == i)
> > >                 return false;
> > >
> > > -       if (prev_j == (unsigned long)-1)
> > > +       if (j == BITS_PER_BITMAP)
> > >                 bytes = (i + 1) * ctl->unit;
> > >         else
> > > -               bytes = (i - prev_j) * ctl->unit;
> > > +               bytes = (i - j) * ctl->unit;
> > >
> > >         info->offset -= bytes;
> > >         info->bytes += bytes;
> > >
> > > Thanks.
> > >
> > > HTH
> > > Levi.
>
> Thanks but
On 03/12/2020 02.23, Yun Levi wrote:
> On Thu, Dec 3, 2020 at 7:51 AM Yun Levi <ppbuk5246@gmail.com> wrote:
>>
>> On Thu, Dec 3, 2020 at 6:26 AM Yury Norov <yury.norov@gmail.com> wrote:
>>>
>>> On Wed, Dec 2, 2020 at 10:22 AM Yun Levi <ppbuk5246@gmail.com> wrote:
>>>>
>>>> On Thu, Dec 3, 2020 at 2:26 AM Yury Norov <yury.norov@gmail.com> wrote:
>>>>
>>>>> Also look at lib/find_bit_benchmark.c
>>>> Thanks. I'll see.
>>>>
>>>>> We need find_next_*_bit() because find_first_*_bit() can start searching only at word-aligned
>>>>> bits. In the case of find_last_*_bit(), we can start at any bit. So, if my understanding is correct,
>>>>> for the purpose of reverse traversing we can go with already existing find_last_bit(),
>>>>
>>>> Thank you. I haven't thought that way.
>>>> But I think if we implement reverse traversing using find_last_bit(),
>>>> we have a problem.
>>>> Suppose bits 0, 1 and 2 are set.
>>>> If we start
>>>> find_last_bit(bitmap, 3) ==> return 2;
>>>> find_last_bit(bitmap, 2) ==> return 1;
>>>> find_last_bit(bitmap, 1) ==> return 0;
>>>> find_last_bit(bitmap, 0) ==> return 0? // here we couldn't

Either just make the return type of all find_prev/find_last be signed
int and use -1 as the sentinel to indicate "no such position exists", so
the loop condition would be foo >= 0. Or change the condition from
"stop if we get the size returned" to "only continue if we get something
strictly less than the size we passed in" (i.e., something which can
possibly be a valid bit index). In the latter case, both (unsigned)-1
aka UINT_MAX and the actual size value passed work equally well as a
sentinel.

If one uses UINT_MAX, a for_each_bit_reverse() macro would just be
something like

  for (i = find_last_bit(bitmap, size); i < size; i = find_last_bit(bitmap, i))

if one wants to use the size argument as the sentinel, the caller would
have to supply a scratch variable to keep track of the last i value:

  for (j = size, i = find_last_bit(bitmap, j); i < j; j = i, i = find_last_bit(bitmap, j))

which is probably a little less ergonomic.

Rasmus
> If one uses UINT_MAX, a for_each_bit_reverse() macro would just be
> something like
>
>   for (i = find_last_bit(bitmap, size); i < size; i = find_last_bit(bitmap, i))
>
> if one wants to use the size argument as the sentinel, the caller would
> have to supply a scratch variable to keep track of the last i value:
>
>   for (j = size, i = find_last_bit(bitmap, j); i < j; j = i, i = find_last_bit(bitmap, j))
>
> which is probably a little less ergonomic.

Actually, because I want to avoid modifying the return type of
find_last_*_bit for a new sentinel, I added find_prev_*_bit.
The big difference between find_last_bit and find_prev_bit is that
find_last_bit doesn't check the size bit and uses size as the sentinel,
while find_prev_bit checks the offset bit and uses size, passed as
another argument, as the sentinel.

So if we use find_prev_bit, we could have a clean iteration, like:

#define for_each_set_bit_reverse(bit, addr, size) \
        for ((bit) = find_last_bit((addr), (size)); \
             (bit) < (size); \
             (bit) = find_prev_bit((addr), (size), (bit) - 1))

#define for_each_set_bit_from_reverse(bit, addr, size) \
        for ((bit) = find_prev_bit((addr), (size), (bit)); \
             (bit) < (size); \
             (bit) = find_prev_bit((addr), (size), (bit) - 1))

Though find_prev_*_bit and find_last_*_bit have the same functionality,
they also have a small difference. I think this small difference doesn't
confuse users; rather, it helps solve the problem in a simple way (just
like the iteration above).

So I think I need the find_prev_*_bit series. Am I missing anything?

Thanks.

Levi.
On Thu, Dec 3, 2020 at 5:33 PM Rasmus Villemoes <linux@rasmusvillemoes.dk> wrote:
>
> On 03/12/2020 02.23, Yun Levi wrote:
> > On Thu, Dec 3, 2020 at 7:51 AM Yun Levi <ppbuk5246@gmail.com> wrote:
> >>
> >> On Thu, Dec 3, 2020 at 6:26 AM Yury Norov <yury.norov@gmail.com> wrote:
> >>>
> >>> On Wed, Dec 2, 2020 at 10:22 AM Yun Levi <ppbuk5246@gmail.com> wrote:
> >>>>
> >>>> On Thu, Dec 3, 2020 at 2:26 AM Yury Norov <yury.norov@gmail.com> wrote:
> >>>>
> >>>>> Also look at lib/find_bit_benchmark.c
> >>>> Thanks. I'll see.
> >>>>
> >>>>> We need find_next_*_bit() because find_first_*_bit() can start searching only at word-aligned
> >>>>> bits. In the case of find_last_*_bit(), we can start at any bit. So, if my understanding is correct,
> >>>>> for the purpose of reverse traversing we can go with already existing find_last_bit(),
> >>>>
> >>>> Thank you. I haven't thought that way.
> >>>> But I think if we implement reverse traversing using find_last_bit(),
> >>>> we have a problem.
> >>>> Suppose bits 0, 1 and 2 are set.
> >>>> If we start
> >>>> find_last_bit(bitmap, 3) ==> return 2;
> >>>> find_last_bit(bitmap, 2) ==> return 1;
> >>>> find_last_bit(bitmap, 1) ==> return 0;
> >>>> find_last_bit(bitmap, 0) ==> return 0? // here we couldn't
>
> Either just make the return type of all find_prev/find_last be signed
> int and use -1 as the sentinel to indicate "no such position exists", so
> the loop condition would be foo >= 0. Or change the condition from
> "stop if we get the size returned" to "only continue if we get something
> strictly less than the size we passed in" (i.e., something which can
> possibly be a valid bit index). In the latter case, both (unsigned)-1
> aka UINT_MAX and the actual size value passed work equally well as a
> sentinel.
>
> If one uses UINT_MAX, a for_each_bit_reverse() macro would just be
> something like
>
>   for (i = find_last_bit(bitmap, size); i < size; i = find_last_bit(bitmap, i))
>
> if one wants to use the size argument as the sentinel, the caller would
> have to supply a scratch variable to keep track of the last i value:
>
>   for (j = size, i = find_last_bit(bitmap, j); i < j; j = i, i = find_last_bit(bitmap, j))
>
> which is probably a little less ergonomic.
>
> Rasmus
Yun, could you please stop top-posting and excessive trimming in the thread?

On Thu, Dec 3, 2020 at 1:47 AM Yun Levi <ppbuk5246@gmail.com> wrote:
> > Either just make the return type of all find_prev/find_last be signed
> > int and use -1 as the sentinel to indicate "no such position exists", so
> > the loop condition would be foo >= 0. Or change the condition from
> > "stop if we get the size returned" to "only continue if we get something
> > strictly less than the size we passed in" (i.e., something which can
> > possibly be a valid bit index). In the latter case, both (unsigned)-1
> > aka UINT_MAX and the actual size value passed work equally well as a
> > sentinel.
> >
> > If one uses UINT_MAX, a for_each_bit_reverse() macro would just be
> > something like
> >
> >   for (i = find_last_bit(bitmap, size); i < size; i = find_last_bit(bitmap, i))
> >
> > if one wants to use the size argument as the sentinel, the caller would
> > have to supply a scratch variable to keep track of the last i value:
> >
> >   for (j = size, i = find_last_bit(bitmap, j); i < j; j = i, i = find_last_bit(bitmap, j))
> >
> > which is probably a little less ergonomic.
> >
> > Rasmus

I would prefer to avoid changing the find*bit() semantics. As for now, if any
of the find_*_bit() functions finds nothing, it returns the size of the bitmap
it was passed. Changing this for a single function would break the consistency
and may cause problems for those who rely on the existing behaviour.

Passing a non-positive size to find_*_bit() should produce undefined behaviour,
because we cannot dereference a pointer to the bitmap in this case; this is
most probably a sign of a problem on the caller's side anyway. Let's keep this
logic unchanged?

> Actually, because I want to avoid modifying the return type of
> find_last_*_bit for a new sentinel, I added find_prev_*_bit.
> The big difference between find_last_bit and find_prev_bit is that
> find_last_bit doesn't check the size bit and uses size as the sentinel,
> while find_prev_bit checks the offset bit and uses size, passed as
> another argument, as the sentinel.
>
> So if we use find_prev_bit, we could have a clean iteration, like:
>
> #define for_each_set_bit_reverse(bit, addr, size) \
>         for ((bit) = find_last_bit((addr), (size)); \
>              (bit) < (size); \
>              (bit) = find_prev_bit((addr), (size), (bit) - 1))
>
> #define for_each_set_bit_from_reverse(bit, addr, size) \
>         for ((bit) = find_prev_bit((addr), (size), (bit)); \
>              (bit) < (size); \
>              (bit) = find_prev_bit((addr), (size), (bit) - 1))
>
> Though find_prev_*_bit and find_last_*_bit have the same functionality,
> they also have a small difference. I think this small difference doesn't
> confuse users; rather, it helps solve the problem in a simple way (just
> like the iteration above).
>
> So I think I need the find_prev_*_bit series. Am I missing anything?
>
> Thanks.
>
> Levi.

As you said, find_last_bit() and the proposed find_prev_*_bit() have the same
functionality. If you really want to have find_prev_*_bit(), could you please
at least write it using find_last_bit()? Otherwise it would be just a
blottering.

Regarding reverse search, we can probably do it like this (not tested, just
an idea):

#define for_each_set_bit_reverse(bit, addr, size) \
        for ((bit) = find_last_bit((addr), (size)); \
             (bit) < (size); \
             (size) = (bit), (bit) = find_last_bit((addr), (bit)))

Thanks,
Yury
On Thu, Dec 03, 2020 at 10:46:25AM -0800, Yury Norov wrote:
> Yun, could you please stop top-posting and excessive trimming in the thread?
And reconfigure the mail agent to make the "Subject" field appear and
fill it.
Willy
On Fri, Dec 4, 2020 at 3:53 AM Willy Tarreau <w@1wt.eu> wrote:
>
> On Thu, Dec 03, 2020 at 10:46:25AM -0800, Yury Norov wrote:
> > Yun, could you please stop top-posting and excessive trimming in the thread?
>
> And reconfigure the mail agent to make the "Subject" field appear and
> fill it.

On Thu, Dec 03, 2020 at 10:46:25AM -0800, Yury Norov wrote:
> Yun, could you please stop top-posting and excessive trimming in the thread?

Sorry to make you uncomfortable. Thanks for the advice.

On Thu, Dec 03, 2020 at 10:46:25AM -0800, Yury Norov wrote:
> As you said, find_last_bit() and the proposed find_prev_*_bit() have the
> same functionality.
> If you really want to have find_prev_*_bit(), could you please at
> least write it using find_last_bit()? Otherwise it would be just a
> blottering.

Actually, find_prev_*_bit calls _find_prev_bit, which is a common helper
function like _find_next_bit. As you know, this function is required to
support the little-endian search under __BIG_ENDIAN.
find_prev_bit is actually a wrapper of _find_prev_bit that has the same
feature as find_last_bit.

What makes the semantic difference between find_last_bit and find_prev_bit
is specifying where you search from. In a loop, find_last_bit couldn't
sustain the original size as the sentinel return value (we have to change
the size argument for the next search, which means that on every call the
"not set" / "not clear" sentinel return value changes).

Because we have to have _find_prev_bit anyway, I think it's a matter of
choosing which is better to use in find_prev_bit (find_last_bit or
_find_prev_bit?) while sustaining the find_prev_bit feature (size as the
sentinel return, searching from a given offset), if my understanding is
correct.

In my view, I prefer to use _find_prev_bit, like find_next_bit, for an
integrated format.
But in some benchmarking, find_last_bit is better than _find_prev_bit;
here is what I tested (they look similar, but sometimes there is some
difference):

Start testing find_bit() with random-filled bitmap
[  +0.001850] find_next_bit:       842792 ns, 163788 iterations
[  +0.000873] find_prev_bit:       870914 ns, 163788 iterations
[  +0.000824] find_next_zero_bit:  821959 ns, 163894 iterations
[  +0.000677] find_prev_zero_bit:  676240 ns, 163894 iterations
[  +0.000777] find_last_bit:       659103 ns, 163788 iterations
[  +0.001822] find_first_bit:     1708041 ns,  16250 iterations
[  +0.000539] find_next_and_bit:   492182 ns,  73871 iterations
[  +0.000001]
Start testing find_bit() with sparse bitmap
[  +0.000222] find_next_bit:        13227 ns,    654 iterations
[  +0.000013] find_prev_bit:        11652 ns,    654 iterations
[  +0.001845] find_next_zero_bit: 1723869 ns, 327028 iterations
[  +0.001538] find_prev_zero_bit: 1355808 ns, 327028 iterations
[  +0.000010] find_last_bit:         8114 ns,    654 iterations
[  +0.000867] find_first_bit:      710639 ns,    654 iterations
[  +0.000006] find_next_and_bit:     4273 ns,      1 iterations
[  +0.000004] find_next_and_bit:     3278 ns,      1 iterations

Start testing find_bit() with random-filled bitmap
[  +0.001784] find_next_bit:       805553 ns, 164240 iterations
[  +0.000643] find_prev_bit:       632474 ns, 164240 iterations
[  +0.000950] find_next_zero_bit:  877215 ns, 163442 iterations
[  +0.000664] find_prev_zero_bit:  662339 ns, 163442 iterations
[  +0.000680] find_last_bit:       602204 ns, 164240 iterations
[  +0.001912] find_first_bit:     1758208 ns,  16408 iterations
[  +0.000760] find_next_and_bit:   531033 ns,  73798 iterations
[  +0.000002]
Start testing find_bit() with sparse bitmap
[  +0.000203] find_next_bit:        12468 ns,    656 iterations
[  +0.000205] find_prev_bit:        10948 ns,    656 iterations
[  +0.001759] find_next_zero_bit: 1579447 ns, 327026 iterations
[  +0.001935] find_prev_zero_bit: 1931961 ns, 327026 iterations
[  +0.000013] find_last_bit:         9543 ns,    656 iterations
[  +0.000732] find_first_bit:      562009 ns,    656 iterations
[  +0.000217] find_next_and_bit:     6804 ns,      1 iterations
[  +0.000007] find_next_and_bit:     4367 ns,      1 iterations

Is it better to write find_prev_bit using find_last_bit?
I question again.

Thanks for your great advice, but please forgive my faults and lack of
knowledge.

HTH.
Levi.
On Thu, Dec 3, 2020 at 5:36 PM Yun Levi <ppbuk5246@gmail.com> wrote:
>
> On Fri, Dec 4, 2020 at 3:53 AM Willy Tarreau <w@1wt.eu> wrote:
> >
> > On Thu, Dec 03, 2020 at 10:46:25AM -0800, Yury Norov wrote:
> > > Yun, could you please stop top-posting and excessive trimming in the thread?
> >
> > And reconfigure the mail agent to make the "Subject" field appear and
> > fill it.
>
> On Thu, Dec 03, 2020 at 10:46:25AM -0800, Yury Norov wrote:
> > Yun, could you please stop top-posting and excessive trimming in the thread?
>
> Sorry to make you uncomfortable. Thanks for the advice.
>
> On Thu, Dec 03, 2020 at 10:46:25AM -0800, Yury Norov wrote:
> > As you said, find_last_bit() and the proposed find_prev_*_bit() have the
> > same functionality.
> > If you really want to have find_prev_*_bit(), could you please at
> > least write it using find_last_bit()? Otherwise it would be just a
> > blottering.
>
> Actually, find_prev_*_bit calls _find_prev_bit, which is a common helper
> function like _find_next_bit. As you know, this function is required to
> support the little-endian search under __BIG_ENDIAN.
> find_prev_bit is actually a wrapper of _find_prev_bit that has the same
> feature as find_last_bit.
>
> What makes the semantic difference between find_last_bit and find_prev_bit
> is specifying where you search from. In a loop, find_last_bit couldn't
> sustain the original size as the sentinel return value (we have to change
> the size argument for the next search, which means that on every call the
> "not set" / "not clear" sentinel return value changes).
>
> Because we have to have _find_prev_bit anyway, I think it's a matter of
> choosing which is better to use in find_prev_bit (find_last_bit or
> _find_prev_bit?) while sustaining the find_prev_bit feature (size as the
> sentinel return, searching from a given offset), if my understanding is
> correct.
>
> In my view, I prefer to use _find_prev_bit, like find_next_bit, for an
> integrated format.
>
> But in some benchmarking, find_last_bit is better than _find_prev_bit;
> here is what I tested (they look similar, but sometimes there is some
> difference):
>
> Start testing find_bit() with random-filled bitmap
> [  +0.001850] find_next_bit:       842792 ns, 163788 iterations
> [  +0.000873] find_prev_bit:       870914 ns, 163788 iterations
> [  +0.000824] find_next_zero_bit:  821959 ns, 163894 iterations
> [  +0.000677] find_prev_zero_bit:  676240 ns, 163894 iterations
> [  +0.000777] find_last_bit:       659103 ns, 163788 iterations
> [  +0.001822] find_first_bit:     1708041 ns,  16250 iterations
> [  +0.000539] find_next_and_bit:   492182 ns,  73871 iterations
> [  +0.000001]
> Start testing find_bit() with sparse bitmap
> [  +0.000222] find_next_bit:        13227 ns,    654 iterations
> [  +0.000013] find_prev_bit:        11652 ns,    654 iterations
> [  +0.001845] find_next_zero_bit: 1723869 ns, 327028 iterations
> [  +0.001538] find_prev_zero_bit: 1355808 ns, 327028 iterations
> [  +0.000010] find_last_bit:         8114 ns,    654 iterations
> [  +0.000867] find_first_bit:      710639 ns,    654 iterations
> [  +0.000006] find_next_and_bit:     4273 ns,      1 iterations
> [  +0.000004] find_next_and_bit:     3278 ns,      1 iterations
>
> Start testing find_bit() with random-filled bitmap
> [  +0.001784] find_next_bit:       805553 ns, 164240 iterations
> [  +0.000643] find_prev_bit:       632474 ns, 164240 iterations
> [  +0.000950] find_next_zero_bit:  877215 ns, 163442 iterations
> [  +0.000664] find_prev_zero_bit:  662339 ns, 163442 iterations
> [  +0.000680] find_last_bit:       602204 ns, 164240 iterations
> [  +0.001912] find_first_bit:     1758208 ns,  16408 iterations
> [  +0.000760] find_next_and_bit:   531033 ns,  73798 iterations
> [  +0.000002]
> Start testing find_bit() with sparse bitmap
> [  +0.000203] find_next_bit:        12468 ns,    656 iterations
> [  +0.000205] find_prev_bit:        10948 ns,    656 iterations
> [  +0.001759] find_next_zero_bit: 1579447 ns, 327026 iterations
> [  +0.001935] find_prev_zero_bit: 1931961 ns, 327026 iterations
> [  +0.000013] find_last_bit:         9543 ns,    656 iterations
> [  +0.000732] find_first_bit:      562009 ns,    656 iterations
> [  +0.000217] find_next_and_bit:     6804 ns,      1 iterations
> [  +0.000007] find_next_and_bit:     4367 ns,      1 iterations
>
> Is it better to write find_prev_bit using find_last_bit?
> I question again.

I answer again. It's better not to write find_prev_bit at all and
learn how to use the existing functionality.

Yury

> Thanks for your great advice, but please forgive my faults and lack of
> knowledge.
>
> HTH.
> Levi.
> I answer again. It's better not to write find_prev_bit at all and
> learn how to use existing functionality.

Thanks for the answer. I'll fix it and send the patch again :)

On Sat, Dec 5, 2020 at 3:14 AM Yury Norov <yury.norov@gmail.com> wrote:
>
> On Thu, Dec 3, 2020 at 5:36 PM Yun Levi <ppbuk5246@gmail.com> wrote:
> >
> > On Fri, Dec 4, 2020 at 3:53 AM Willy Tarreau <w@1wt.eu> wrote:
> > >
> > > On Thu, Dec 03, 2020 at 10:46:25AM -0800, Yury Norov wrote:
> > > > Yun, could you please stop top-posting and excessive trimming in the thread?
> > >
> > > And reconfigure the mail agent to make the "Subject" field appear and
> > > fill it.
> >
> > Sorry to have made you uncomfortable... Thanks for the advice.
> >
> > > On Thu, Dec 03, 2020 at 10:46:25AM -0800, Yury Norov wrote:
> > > > As you said, find_last_bit() and the proposed find_prev_*_bit() have
> > > > the same functionality.
> > > > If you really want to have find_prev_*_bit(), could you please at
> > > > least write it using find_last_bit()? Otherwise it would be just
> > > > clutter.
> >
> > Actually, find_prev_*_bit() calls _find_prev_bit(), which is a common
> > helper function like _find_next_bit().
> > As you know, this helper is required to support __BIG_ENDIAN's
> > little-endian search.
> > find_prev_bit() is actually a wrapper of _find_prev_bit() that covers
> > what find_last_bit() does.
> >
> > That makes for a semantic difference between find_last_bit() and
> > find_prev_bit():
> >   - you specify where the search starts from, and
> >   - in a loop, find_last_bit() cannot sustain the original size as a
> >     sentinel return value (we have to shrink the size argument for the
> >     next search, so the "not set / not clear" sentinel changes on
> >     every call).
> >
> > Because we would need _find_prev_bit() anyway, I think it's a matter of
> > choosing which is better to use in find_prev_bit() (find_last_bit() or
> > _find_prev_bit()?) while keeping the find_prev_bit() contract (the size
> > as the sentinel return value, and a starting position) - if my
> > understanding is correct.
> >
> > In my view, I prefer to use _find_prev_bit(), like find_next_bit(), for
> > a uniform format.
> >
> > But in some of the benchmarking, find_last_bit() is better than
> > _find_prev_bit(); here is what I tested (the runs look similar but
> > sometimes differ):
> >
> > Start testing find_bit() with random-filled bitmap
> > [  +0.001850] find_next_bit: 842792 ns, 163788 iterations
> > [  +0.000873] find_prev_bit: 870914 ns, 163788 iterations
> > [  +0.000824] find_next_zero_bit: 821959 ns, 163894 iterations
> > [  +0.000677] find_prev_zero_bit: 676240 ns, 163894 iterations
> > [  +0.000777] find_last_bit: 659103 ns, 163788 iterations
> > [  +0.001822] find_first_bit: 1708041 ns, 16250 iterations
> > [  +0.000539] find_next_and_bit: 492182 ns, 73871 iterations
> > [  +0.000001]
> > Start testing find_bit() with sparse bitmap
> > [  +0.000222] find_next_bit: 13227 ns, 654 iterations
> > [  +0.000013] find_prev_bit: 11652 ns, 654 iterations
> > [  +0.001845] find_next_zero_bit: 1723869 ns, 327028 iterations
> > [  +0.001538] find_prev_zero_bit: 1355808 ns, 327028 iterations
> > [  +0.000010] find_last_bit: 8114 ns, 654 iterations
> > [  +0.000867] find_first_bit: 710639 ns, 654 iterations
> > [  +0.000006] find_next_and_bit: 4273 ns, 1 iterations
> > [  +0.000004] find_next_and_bit: 3278 ns, 1 iterations
> >
> > Start testing find_bit() with random-filled bitmap
> > [  +0.001784] find_next_bit: 805553 ns, 164240 iterations
> > [  +0.000643] find_prev_bit: 632474 ns, 164240 iterations
> > [  +0.000950] find_next_zero_bit: 877215 ns, 163442 iterations
> > [  +0.000664] find_prev_zero_bit: 662339 ns, 163442 iterations
> > [  +0.000680] find_last_bit: 602204 ns, 164240 iterations
> > [  +0.001912] find_first_bit: 1758208 ns, 16408 iterations
> > [  +0.000760] find_next_and_bit: 531033 ns, 73798 iterations
> > [  +0.000002]
> > Start testing find_bit() with sparse bitmap
> > [  +0.000203] find_next_bit: 12468 ns, 656 iterations
> > [  +0.000205] find_prev_bit: 10948 ns, 656 iterations
> > [  +0.001759] find_next_zero_bit: 1579447 ns, 327026 iterations
> > [  +0.001935] find_prev_zero_bit: 1931961 ns, 327026 iterations
> > [  +0.000013] find_last_bit: 9543 ns, 656 iterations
> > [  +0.000732] find_first_bit: 562009 ns, 656 iterations
> > [  +0.000217] find_next_and_bit: 6804 ns, 1 iterations
> > [  +0.000007] find_next_and_bit: 4367 ns, 1 iterations
> >
> > Is it better to write find_prev_bit using find_last_bit?
> > I question again.
>
> I answer again. It's better not to write find_prev_bit at all and
> learn how to use existing functionality.
>
> Yury
>
> > Thanks for your great advice, but please forgive my faults and lack of
> > knowledge.
> >
> > HTH.
> > Levi.
On 03/12/2020 19.46, Yury Norov wrote:

> I would prefer to avoid changing the find_*_bit() semantics. As for now,
> if any of find_*_bit() finds nothing, it returns the size of the bitmap
> it was passed.

Yeah, we should actually try to fix that; it causes bad code generation.
It's hard, because callers of course do that "if ret == size" check. But
it's really silly that something like find_first_bit needs to do that
"min(i*BPL + __ffs(word), size)" - the caller does a comparison anyway,
that comparison might as well be "ret >= size" rather than "ret ==
size", and then we could get rid of that branch (which min() necessarily
becomes) at the end of find_next_bit.

I haven't dug very deep into this, but I could also imagine the
arch-specific parts of this might become a little easier to do if the
semantics were just "if no such bit, return an indeterminate value >=
the size".

> Changing this for a single function would break the consistency, and
> may cause problems for those who rely on existing behaviour.

True. But I think it should be possible - I suppose most users are via
the iterator macros, which could all be updated at once. Changing "ret
== size" to "ret >= size" will still work even if the implementations
have not been switched over, so it should be doable.

> Passing non-positive size to find_*_bit() should produce undefined
> behaviour, because we cannot dereference a pointer to the bitmap in
> this case; this is most probably a sign of a problem on the caller's
> side anyway.

No, the out-of-line bitmap functions should all handle the case of a
zero-size bitmap sensibly:

Is the bitmap full? Yes (all of the 0 bits are set).
Is the bitmap empty? Yes (none of the 0 bits are set).
Find the first set bit: returns 0 (there's no such bit).
Etc.

The static inlines for small_const_nbits do assume that the pointer can
be dereferenced, which is why small_const_nbits was updated to mean
1 <= bits <= BITS_PER_LONG rather than just bits <= BITS_PER_LONG.

Rasmus
On Sat, Dec 5, 2020 at 3:10 AM Rasmus Villemoes
<linux@rasmusvillemoes.dk> wrote:
>
> On 03/12/2020 19.46, Yury Norov wrote:
>
> > I would prefer to avoid changing the find_*_bit() semantics. As for now,
> > if any of find_*_bit() finds nothing, it returns the size of the bitmap
> > it was passed.
>
> Yeah, we should actually try to fix that, it causes bad code generation.
> It's hard, because callers of course do that "if ret == size" check. But
> it's really silly that something like find_first_bit needs to do that
> "min(i*BPL + __ffs(word), size)" - the caller does a comparison anyway,
> that comparison might as well be "ret >= size" rather than "ret ==
> size", and then we could get rid of that branch (which min() necessarily
> becomes) at the end of find_next_bit.

We didn't do that 5 years ago because it's too invasive and the
improvement is barely measurable: the difference is 2 instructions (on
arm64). Has something changed since then?

0000000000000000 <find_first_bit_better>:
   0:	aa0003e3	mov	x3, x0
   4:	aa0103e0	mov	x0, x1
   8:	b4000181	cbz	x1, 38 <find_first_bit_better+0x38>
   c:	f9400064	ldr	x4, [x3]
  10:	d2800802	mov	x2, #0x40			// #64
  14:	91002063	add	x3, x3, #0x8
  18:	b40000c4	cbz	x4, 30 <find_first_bit_better+0x30>
  1c:	14000008	b	3c <find_first_bit_better+0x3c>
  20:	f8408464	ldr	x4, [x3], #8
  24:	91010045	add	x5, x2, #0x40
  28:	b50000c4	cbnz	x4, 40 <find_first_bit_better+0x40>
  2c:	aa0503e2	mov	x2, x5
  30:	eb00005f	cmp	x2, x0
  34:	54ffff63	b.cc	20 <find_first_bit_better+0x20>	// b.lo, b.ul, b.last
  38:	d65f03c0	ret
  3c:	d2800002	mov	x2, #0x0			// #0
  40:	dac00084	rbit	x4, x4
  44:	dac01084	clz	x4, x4
  48:	8b020080	add	x0, x4, x2
  4c:	d65f03c0	ret

0000000000000050 <find_first_bit_worse>:
  50:	aa0003e4	mov	x4, x0
  54:	aa0103e0	mov	x0, x1
  58:	b4000181	cbz	x1, 88 <find_first_bit_worse+0x38>
  5c:	f9400083	ldr	x3, [x4]
  60:	d2800802	mov	x2, #0x40			// #64
  64:	91002084	add	x4, x4, #0x8
  68:	b40000c3	cbz	x3, 80 <find_first_bit_worse+0x30>
  6c:	14000008	b	8c <find_first_bit_worse+0x3c>
  70:	f8408483	ldr	x3, [x4], #8
  74:	91010045	add	x5, x2, #0x40
  78:	b50000c3	cbnz	x3, 90 <find_first_bit_worse+0x40>
  7c:	aa0503e2	mov	x2, x5
  80:	eb02001f	cmp	x0, x2
  84:	54ffff68	b.hi	70 <find_first_bit_worse+0x20>	// b.pmore
  88:	d65f03c0	ret
  8c:	d2800002	mov	x2, #0x0			// #0
  90:	dac00063	rbit	x3, x3
  94:	dac01063	clz	x3, x3
  98:	8b020062	add	x2, x3, x2
  9c:	eb02001f	cmp	x0, x2
  a0:	9a829000	csel	x0, x0, x2, ls			// ls = plast
  a4:	d65f03c0	ret

> I haven't dug very deep into this, but I could also imagine the
> arch-specific parts of this might become a little easier to do if the
> semantics were just "if no such bit, return an indeterminate value >=
> the size".
>
> > Changing this for a single function would break the consistency, and
> > may cause problems for those who rely on existing behaviour.
>
> True. But I think it should be possible - I suppose most users are via
> the iterator macros, which could all be updated at once. Changing ret ==
> size to ret >= size will still work even if the implementations have not
> been switched over, so it should be doable.

Since there are no assembler users for it, we can do just:

#define find_first_bit(bitmap, size) \
        min(better_find_first_bit((bitmap), (size)), (size))

... and deprecate find_first_bit.

> > Passing non-positive size to find_*_bit() should produce undefined
> > behaviour, because we cannot dereference a pointer to the bitmap in
> > this case; this is most probably a sign of a problem on a caller side
> > anyways.
>
> No, the out-of-line bitmap functions should all handle the case of a
> zero-size bitmap sensibly.

I could be more specific: the behaviour is defined - don't dereference
the address and return an undefined value (which now is always 0).

> Is bitmap full? Yes (all the 0 bits are set).
> Is bitmap empty? Yes (none of the 0 bits are set).
> Find the first bit set (returns 0, there's no such bit).

I can't answer, because this object is not a map of bits - there's no
room for bits inside.

> Etc.
> The static inlines for small_const_nbits do assume that the pointer
> can be dereferenced, which is why small_const_nbits was updated to mean
> 1 <= bits <= BITS_PER_LONG rather than just bits <= BITS_PER_LONG.

I don't want to do something like

        if (size == 0)
                return 1;
        ...

because it legitimizes this kind of usage and hides problems on the
callers' side. Instead, I'd add WARN_ON(size == 0), but I don't think
it's so critical as to bother with it.

Yury

> Rasmus
diff --git a/fs/btrfs/free-space-cache.c b/fs/btrfs/free-space-cache.c
index af0013d3df63..9debb9707390 100644
--- a/fs/btrfs/free-space-cache.c
+++ b/fs/btrfs/free-space-cache.c
@@ -2372,7 +2372,6 @@ static bool steal_from_bitmap_to_front(struct btrfs_free_space_ctl *ctl,
 	u64 bitmap_offset;
 	unsigned long i;
 	unsigned long j;
-	unsigned long prev_j;
 	u64 bytes;