All of lore.kernel.org
 help / color / mirror / Atom feed
From: Nikolay Borisov <nborisov@suse.com>
To: dsterba@suse.cz, Nikolay Borisov <nborisov@suse.com>,
	Josef Bacik <josef@toxicpanda.com>,
	linux-btrfs@vger.kernel.org, kernel-team@fb.com
Subject: Re: [PATCH v5 2/8] btrfs: only let one thread pre-flush delayed refs in commit
Date: Tue, 12 Jan 2021 11:17:45 +0200	[thread overview]
Message-ID: <e1fd5cc1-0f28-f670-69f4-e9958b4964e6@suse.com> (raw)
In-Reply-To: <20210111215051.GH6430@twin.jikos.cz>



On 11.01.21 г. 23:50 ч., David Sterba wrote:
> On Mon, Jan 11, 2021 at 10:33:42AM +0200, Nikolay Borisov wrote:
>> On 8.01.21 г. 18:01 ч., David Sterba wrote:
>>> On Fri, Dec 18, 2020 at 02:24:20PM -0500, Josef Bacik wrote:
>>>> @@ -2043,23 +2043,22 @@ int btrfs_commit_transaction(struct btrfs_trans_handle *trans)
>>>>  	btrfs_trans_release_metadata(trans);
>>>>  	trans->block_rsv = NULL;
>>>>  
>>>> -	/* make a pass through all the delayed refs we have so far
>>>> -	 * any runnings procs may add more while we are here
>>>> -	 */
>>>> -	ret = btrfs_run_delayed_refs(trans, 0);
>>>> -	if (ret) {
>>>> -		btrfs_end_transaction(trans);
>>>> -		return ret;
>>>> -	}
>>>> -
>>>> -	cur_trans = trans->transaction;
>>>> -
>>>>  	/*
>>>> -	 * set the flushing flag so procs in this transaction have to
>>>> -	 * start sending their work down.
>>>> +	 * We only want one transaction commit doing the flushing so we do not
>>>> +	 * waste a bunch of time on lock contention on the extent root node.
>>>>  	 */
>>>> -	cur_trans->delayed_refs.flushing = 1;
>>>> -	smp_wmb();
>>>
>>> This barrier obviously separates the flushing = 1 and the rest of the
>>> code, now implemented as test_and_set_bit, which implies full barrier.
>>>
>>> However, hunk in btrfs_should_end_transaction removes the barrier and
>>> I'm not sure whether this is correct:
>>>
>>> -	smp_mb();
>>>  	if (cur_trans->state >= TRANS_STATE_COMMIT_START ||
>>> -	    cur_trans->delayed_refs.flushing)
>>> +	    test_bit(BTRFS_DELAYED_REFS_FLUSHING,
>>> +		     &cur_trans->delayed_refs.flags))
>>>  		return true;
>>>
>>> This is never called under locks so we don't have complete
>>> synchronization of neither the transaction state nor the flushing bit.
>>> btrfs_should_end_transaction is merely a hint and not called in critical
>>> places so we could probably afford to keep it without a barrier, or keep
>>> it with comment(s).
>>
>> I think the point is moot in this case, because the test_bit either sees
>> the flag or it doesn't. It's not possible for the flag to be set AND
>> should_end_transaction return false that would be gross violation of
>> program correctness.
> 
> So that's for the flushing part, but what about cur_trans->state?

Looking at the code, the barrier was there to order the publishing of
the delayed_ref.flushing (now replaced by the bit flag) against
surrounding code.

So independently of this patch, let's reason about trans state. In
should_end_transaction it's read without holding any locks. (U)

It's modified in btrfs_cleanup_transaction without holding the
fs_info->trans_lock (U), but the STATE_ERROR flag is going to be set.

set in cleanup_transaction under fs_info->trans_lock (L)
set in btrfs_commit_trans to COMMIT_START under fs_info->trans_lock.(L)
set in btrfs_commit_trans to COMMIT_DOING under fs_info->trans_lock.(L)
set in btrfs_commit_trans to COMMIT_UNBLOCK under fs_info->trans_lock.(L)

set in btrfs_commit_trans to COMMIT_COMPLETED without locks but at this
point the transaction is finished and fs_info->running_trans is NULL (U
but irrelevant).

So by the looks of it we can have a concurrent READ race with a Write,
due to reads not taking a lock. In this case what we want to ensure is
we either see new or old state. I consulted with Will Deacon and he said
that in such a case we'd want to annotate the accesses to ->state with
(READ|WRITE)_ONCE so as to avoid a theoretical tear, in this case I
don't think this could happen but I imagine at some point kcsan would
flag such an access as racy (which it is).

> 

  reply	other threads:[~2021-01-12  9:19 UTC|newest]

Thread overview: 15+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-12-18 19:24 [PATCH v5 0/8] A variety of lock contention fixes Josef Bacik
2020-12-18 19:24 ` [PATCH v5 1/8] btrfs: do not block on deleted bgs mutex in the cleaner Josef Bacik
2020-12-18 19:24 ` [PATCH v5 2/8] btrfs: only let one thread pre-flush delayed refs in commit Josef Bacik
2021-01-08 16:01   ` David Sterba
2021-01-11  8:33     ` Nikolay Borisov
2021-01-11 21:50       ` David Sterba
2021-01-12  9:17         ` Nikolay Borisov [this message]
2021-01-26 17:36           ` David Sterba
2020-12-18 19:24 ` [PATCH v5 3/8] btrfs: delayed refs pre-flushing should only run the heads we have Josef Bacik
2020-12-18 19:24 ` [PATCH v5 4/8] btrfs: only run delayed refs once before committing Josef Bacik
2020-12-18 19:24 ` [PATCH v5 5/8] btrfs: move delayed ref flushing for qgroup into qgroup helper Josef Bacik
2020-12-18 19:24 ` [PATCH v5 6/8] btrfs: remove bogus BUG_ON in alloc_reserved_tree_block Josef Bacik
2020-12-18 19:24 ` [PATCH v5 7/8] btrfs: stop running all delayed refs during snapshot Josef Bacik
2020-12-18 19:24 ` [PATCH v5 8/8] btrfs: run delayed refs less often in commit_cowonly_roots Josef Bacik
2021-01-26 17:51 ` [PATCH v5 0/8] A variety of lock contention fixes David Sterba

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=e1fd5cc1-0f28-f670-69f4-e9958b4964e6@suse.com \
    --to=nborisov@suse.com \
    --cc=dsterba@suse.cz \
    --cc=josef@toxicpanda.com \
    --cc=kernel-team@fb.com \
    --cc=linux-btrfs@vger.kernel.org \
    --subject='Re: [PATCH v5 2/8] btrfs: only let one thread pre-flush delayed refs in commit' \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.