From: Jeff Mahoney <jeffm@suse.com>
To: Nikolay Borisov <nborisov@suse.com>,
dsterba@suse.com, linux-btrfs@vger.kernel.org
Subject: Re: [PATCH 1/3] btrfs: qgroups, fix rescan worker running races
Date: Thu, 3 May 2018 11:57:31 -0400
Message-ID: <cb1c47ef-1161-b60b-f176-c47d7b7df73e@suse.com>
In-Reply-To: <ce166087-96d9-43d7-b13c-0bdb316b1da4@suse.com>
On 5/3/18 11:52 AM, Nikolay Borisov wrote:
>
>
> On 3.05.2018 16:39, Jeff Mahoney wrote:
>> On 5/3/18 3:24 AM, Nikolay Borisov wrote:
>>>
>>>
>>> On 3.05.2018 00:11, jeffm@suse.com wrote:
>>>> From: Jeff Mahoney <jeffm@suse.com>
>>>>
>>>> Commit 8d9eddad194 (Btrfs: fix qgroup rescan worker initialization)
>>>> fixed the issue with BTRFS_IOC_QUOTA_RESCAN_WAIT being racy, but
>>>> ended up reintroducing the hang-on-unmount bug that the commit it
>>>> intended to fix addressed.
>>>>
>>>> The race this time is between qgroup_rescan_init setting
>>>> ->qgroup_rescan_running = true and the worker starting. There are
>>>> many scenarios where we initialize the worker and never start it. The
>>>> completion btrfs_ioctl_quota_rescan_wait waits for will never come.
>>>> This can happen even without involving error handling, since mounting
>>>> the file system read-only returns between initializing the worker and
>>>> queueing it.
>>>>
>>>> The right place to do it is when we're queuing the worker. The flag
>>>> really just means that btrfs_ioctl_quota_rescan_wait should wait for
>>>> a completion.
>>>>
>>>> Since the BTRFS_QGROUP_STATUS_FLAG_RESCAN flag is overloaded to
>>>> refer to both runtime behavior and on-disk state, we introduce a new
>>>> fs_info->qgroup_rescan_ready to indicate that we're initialized and
>>>> waiting to start.
>>>
>>> Am I correct in my understanding that this qgroup_rescan_ready flag is
>>> used to avoid qgroup_rescan_init being called AFTER it has already been
>>> called but BEFORE queue_rescan_worker ? Why wasn't the initial version
>>> of this patch without this flag sufficient?
>>
>> No, the race is between clearing the BTRFS_QGROUP_STATUS_FLAG_RESCAN
>> flag near the end of the worker and clearing the running flag. The
>> rescan lock is dropped in between, so qgroup_rescan_init will let a new
>> rescan request in while we update the status item on disk. We wouldn't
>> have queued another worker since that's what the warning catches, but if
>> there were already tasks waiting for completion, they wouldn't have been
>> woken since the wait queue list would be reinitialized. There's no way
>> to reorder clearing the flag without changing how we handle
>> ->qgroup_flags. I plan on doing that separately. This was just meant
>> to be the simple fix.
>
> Great, I think some of this information should go into the change log,
> in explaining what the symptoms of the race condition are.
You're right. I was treating it as a race that my patch introduced, but
it didn't. The patch just complained about it.
-Jeff
--
Jeff Mahoney
SUSE Labs