Re: [PATCH 1/2] lib/find: Make functions safe on changing bitmaps

From: Mirsad Goran Todorovac <mirsad.todorovac@alu.unizg.hr>
To: Yury Norov <yury.norov@gmail.com>, Jan Kara <jack@suse.cz>
Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com>,
	Rasmus Villemoes <linux@rasmusvillemoes.dk>,
	Matthew Wilcox <willy@infradead.org>,
	linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH 1/2] lib/find: Make functions safe on changing bitmaps
Date: Sat, 14 Oct 2023 04:21:50 +0200	[thread overview]
Message-ID: <021970ad-942a-4fe8-ac95-c8089527f7d2@alu.unizg.hr> (raw)
In-Reply-To: <ZSndoNcA7YWHXeUi@yury-ThinkPad>

On 10/14/2023 2:15 AM, Yury Norov wrote:
> Restore LKML
> 
> On Thu, Oct 12, 2023 at 02:21:10PM +0200, Jan Kara wrote:
>> On Wed 11-10-23 11:26:29, Yury Norov wrote:
>>> Long story short: KCSAN found some potential issues related to how
>>> people use bitmap API. And instead of working through that issues,
>>> the following code shuts down KCSAN by applying READ_ONCE() here
>>> and there.
>>
>> I'm sorry but this is not what the patch does. I'm not sure how to get the
>> message across so maybe let me start from a different angle:
>>
>> Bitmaps are perfectly fine to be used without any external locking if
>> only atomic bit ops (set_bit, clear_bit, test_and_{set/clear}_bit) are
>> used. This is a significant performance gain compared to using a spinlock
>> or other locking and people do this for a long time. I hope we agree on
>> that.
>>
>> Now it is also common that you need to find a set / clear bit in a bitmap.
>> To maintain lockless protocol and deal with races people employ schemes
>> like (the dumbest form):
>>
>> 	do {
>> 		bit = find_first_bit(bitmap, n);
>> 		if (bit >= n)
>> 			abort...
>> 	} while (!test_and_clear_bit(bit, bitmap));
>>
>> So the code loops until it finds a set bit that is successfully cleared by
>> it. This is perfectly fine and safe lockless code and such use should be
>> supported. Agreed?
> 
> Great example. When you're running non-atomic functions concurrently,
> the result may easily become incorrect, and this is what you're
> demonstrating here.
> 
> Regarding find_first_bit() it means that:
>   - it may erroneously return unset bit;
>   - it may erroneously return non-first set bit;
>   - it may erroneously return no bits for non-empty bitmap.
> 
> Effectively it means that find_first bit may just return a random number.
> 
> Let's take another example:
> 
> 	do {
> 		bit = get_random_number();
> 		if (bit >= n)
> 			abort...
> 	} while (!test_and_clear_bit(bit, bitmap));
> 
> When running concurrently, the difference between this and your code
> is only in probability of getting set bit somewhere from around the
> beginning of bitmap.
> 
> The key point is that find_bit() may return undef even if READ_ONCE() is
> used. If bitmap gets changed anytime in the process, the result becomes
> invalid. It may happen even after returning from find_first_bit().
> 
> And if my understanding correct, your code is designed in the
> assumption that find_first_bit() may return garbage, so handles it
> correctly.
> 
>> *Except* that the above actually is not safe due to find_first_bit()
>> implementation and KCSAN warns about that. The problem is that:
>>
>> Assume *addr == 1
>> CPU1			CPU2
>> find_first_bit(addr, 64)
>>    val = *addr;
>>    if (val) -> true
>> 			clear_bit(0, addr)
>>      val = *addr -> compiler decided to refetch addr contents for whatever
>> 		   reason in the generated assembly
>>      __ffs(val) -> now executed for value 0 which has undefined results.
> 
> Yes, __ffs(0) is undef. But the whole function is undef when accessing
> bitmap concurrently.
> 
>> And the READ_ONCE() this patch adds prevents the compiler from adding the
>> refetching of addr into the assembly.
> 
> That's true. But it doesn't improve on the situation. It was an undef
> before, and it's undef after, but a 2% slower undef.
> 
> Now on that KCSAN warning. If I understand things correctly, for the
> example above, KCSAN warning is false-positive, because you're
> intentionally running lockless.
> 
> But for some other people it may be a true error, and now they'll have
> no chance to catch it if KCSAN is forced to ignore find_bit() entirely.
> 
> We've got the whole class of lockless algorithms that allow safe concurrent
> access to the memory. And now that there's a tool that searches for them
> (concurrent accesses), we need to have an option to somehow teach it
> to suppress irrelevant warnings. Maybe something like this?
> 
>          lockless_algorithm_begin(bitmap, bitmap_size(nbits));
> 	do {
> 		bit = find_first_bit(bitmap, nbits);
> 		if (bit >= nbits)
> 			break;
> 	} while (!test_and_clear_bit(bit, bitmap));
>          lockless_algorithm_end(bitmap, bitmap_size(nbits));
> 
> And, of course, as I suggested a couple iterations ago, you can invent
> a thread-safe version of find_bit(), that would be perfectly correct
> for lockless use:
> 
>   unsigned long _find_and_clear_bit(volatile unsigned long *addr, unsigned long size)
>   {
>          unsigned long bit = 0;
>   
>          while (!test_and_clear_bit(bit, bitmap) {
>                  bit = FIND_FIRST_BIT(addr[idx], /* nop */, size);
>                  if (bit >= size)
>                          return size;
>          }
> 
>          return bit;
>   }

Hi, Yuri,

But the code above effectively does the same as the READ_ONCE() macro
as defined in rwonce.h:

#ifndef __READ_ONCE
#define __READ_ONCE(x)	(*(const volatile __unqual_scalar_typeof(x) *)&(x))
#endif

#define READ_ONCE(x)							\
({									\
	compiletime_assert_rwonce_type(x);				\
	__READ_ONCE(x);							\
})

Both uses only prevent the funny stuff the compiler might have done to the
read of the addr[idx], there's no black magic in READ_ONCE().

Both examples would probably result in the same assembly and produce the
same 2% slowdown ...

Only you declare volatile in one place, and READ_ONCE() in each read, but
this will only compile a bit slower and generate the same machine code.

Best regards,
Mirsad Todorovac

> Didn't test that, but I hope 'volatile' specifier should be enough
> for compiler to realize that it shouldn't optimize memory access, and
> for KCSAN that everything's OK here.
> 
> By the way, thank you for respectful and professional communication.
> 
> Thanks,
> Yury

-- 
Mirsad Todorovac
Sistem inženjer
Grafički fakultet | Akademija likovnih umjetnosti
Sveučilište u Zagrebu

System engineer
Faculty of Graphic Arts | Academy of Fine Arts
University of Zagreb, Republic of Croatia
tel. +385 (0)1 3711 451
mob. +385 91 57 88 355