Linux-BTRFS Archive on lore.kernel.org
 help / color / Atom feed
From: Anand Jain <anand.jain@oracle.com>
To: dsterba@suse.cz, linux-btrfs@vger.kernel.org
Subject: Re: [PATCH v4 1/3] btrfs: scrub: fix circular locking dependency warning
Date: Thu, 31 Jan 2019 14:34:54 +0800
Message-ID: <be521dbc-0b2a-696a-7167-21d50c6c71a9@oracle.com> (raw)
In-Reply-To: <20190130140758.GR2900@twin.jikos.cz>



On 1/30/19 10:07 PM, David Sterba wrote:
> On Wed, Jan 30, 2019 at 02:45:00PM +0800, Anand Jain wrote:
>> v3->v4: Fix list corruption as reported by btrfs/073 by David.
>>             [1]
>>             https://patchwork.kernel.org/patch/10705741/
>> 	Which I was able to reproduce with an instrumented kernel but not with
>> 	btrfs/073.
>>          In v3 patch, it releases the fs_info::scrub_lock to destroy the work queue
>> 	which raced with new scrub requests, overwriting the scrub workers
>> 	pointers. So in v4, it kills the function scrub_workers_put(), and
>> 	performs the destroy_workqueue in two stages, with worker pointers
>> 	copied locally.
> 
>> @@ -3932,9 +3925,16 @@ int btrfs_scrub_dev(struct btrfs_fs_info *fs_info, u64 devid, u64 start,
>>   
>>   	mutex_lock(&fs_info->scrub_lock);
>>   	dev->scrub_ctx = NULL;
>> -	scrub_workers_put(fs_info);
>> +	if (--fs_info->scrub_workers_refcnt == 0) {
>> +		scrub_workers = fs_info->scrub_workers;
>> +		scrub_wr_comp = fs_info->scrub_wr_completion_workers;
>> +		scrub_parity = fs_info->scrub_parity_workers;
>> +	}
>>   	mutex_unlock(&fs_info->scrub_lock);
>>   
>> +	btrfs_destroy_workqueue(scrub_workers);
>> +	btrfs_destroy_workqueue(scrub_wr_comp);
>> +	btrfs_destroy_workqueue(scrub_parity);
> 
> https://lore.kernel.org/linux-btrfs/1543554924-17397-2-git-send-email-anand.jain@oracle.com/
> 
> Comparing to the previous version, it's almost the same I think. If
> scrub_workers_get races between the unlock and destroy_workers, anything
> that uses fs_info->scrub_wokers will soon use freed memory.
> 
> The difference is that the worker pointers are read from fs_info under a
> lock but are still used outside. I haven't tested this version but from
> the analysis of previous crash, I don't see how v4 is supposed to be
> better.
>

Consider v3 code as below:

When process-A is at [1] (below) start another
btrfs scrub start, lets call it process-B.
When process-A is at [1] it unlocks the fs_info::scrub_lock so the
process-B can overwrite fs_info::scrub_workers,
fs_info::scrub_wr_completion_workers, fs_info::scrub_parity_workers
which the process-A at [1] has not yet called destroyed.

Process-A
---------

btrfs scrub start /mnt

::
         mutex_lock(&fs_info->scrub_lock);
::
         if (dev->scrub_ctx ||
             (!is_dev_replace &&
              btrfs_dev_replace_is_ongoing(&fs_info->dev_replace))) {
                 up_read(&fs_info->dev_replace.rwsem);
                 mutex_unlock(&fs_info->scrub_lock);
                 mutex_unlock(&fs_info->fs_devices->device_list_mutex);
                 ret = -EINPROGRESS;
                 goto out_free_ctx;
         }
::
         ret = scrub_workers_get(fs_info, is_dev_replace);  <-- [2]
::
         dev->scrub_ctx = sctx;
         mutex_unlock(&fs_info->scrub_lock);

::
                 ret = scrub_enumerate_chunks(sctx, dev, start, end);
::
         atomic_dec(&fs_info->scrubs_running);
::

         mutex_lock(&fs_info->scrub_lock);
         dev->scrub_ctx = NULL;
         scrub_workers_put(fs_info);
         mutex_unlock(&fs_info->scrub_lock);



static noinline_for_stack void scrub_workers_put(struct btrfs_fs_info 
*fs_info)
{
         lockdep_assert_held(&fs_info->scrub_lock);
         if (--fs_info->scrub_workers_refcnt == 0) {
                 mutex_unlock(&fs_info->scrub_lock);

		<wait for process-B>  [1]

                 btrfs_destroy_workqueue(fs_info->scrub_workers);
 
btrfs_destroy_workqueue(fs_info->scrub_wr_completion_workers);
                 btrfs_destroy_workqueue(fs_info->scrub_parity_workers);
                 mutex_lock(&fs_info->scrub_lock);
         }
         WARN_ON(fs_info->scrub_workers_refcnt < 0);
}




Process-B
---------
Start when process-A is at [1] (above)
btrfs scrub start /mnt

::
  at [2] (above) the fs_info::scrub_workers,
fs_info::scrub_wr_completion_workers, fs_info::scrub_parity_workers
of process-A are overwritten.


So in v4.
--------

Similar to dev::scrub_ctx the fs_info::scrub_workers,
fs_info::scrub_wr_completion_workers, fs_info::scrub_parity_workers
are stored locally before fs_info::scrub_lock is released, so the
list pointers aren't corrupted.

Hope this clarifies.

Thanks, Anand

  reply index

Thread overview: 15+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-01-30  6:44 [PATCH v4 0/3] btrfs: scrub: fix scrub_lock Anand Jain
2019-01-30  6:45 ` [PATCH v4 1/3] btrfs: scrub: fix circular locking dependency warning Anand Jain
2019-01-30 14:07   ` David Sterba
2019-01-31  6:34     ` Anand Jain [this message]
2019-02-12 15:49       ` David Sterba
2019-02-12 16:45   ` David Sterba
2019-02-13 16:02     ` Anand Jain
2019-01-30  6:45 ` [PATCH v4 2/3] btrfs: scrub: add scrub_lock lockdep check in scrub_workers_get Anand Jain
2019-01-30  6:45 ` [PATCH v4 3/3] btrfs: scrub: convert scrub_workers_refcnt to refcount_t Anand Jain
2019-02-08 17:02 ` [PATCH v4 0/3] btrfs: scrub: fix scrub_lock David Sterba
2019-02-11  8:09   ` Anand Jain
2019-02-11  8:38 ` [PATCH v5 3/3] btrfs: scrub: convert scrub_workers_refcnt to refcount_t Anand Jain
2019-02-12 17:15 ` [PATCH] btrfs: scrub: add assertions for worker pointers David Sterba
2019-02-13 16:02   ` Anand Jain
2019-02-13 17:35 ` [PATCH v4 0/3] btrfs: scrub: fix scrub_lock David Sterba

Reply instructions:

You may reply publically to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=be521dbc-0b2a-696a-7167-21d50c6c71a9@oracle.com \
    --to=anand.jain@oracle.com \
    --cc=dsterba@suse.cz \
    --cc=linux-btrfs@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Linux-BTRFS Archive on lore.kernel.org

Archives are clonable:
	git clone --mirror https://lore.kernel.org/linux-btrfs/0 linux-btrfs/git/0.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 linux-btrfs linux-btrfs/ https://lore.kernel.org/linux-btrfs \
		linux-btrfs@vger.kernel.org linux-btrfs@archiver.kernel.org
	public-inbox-index linux-btrfs


Newsgroup available over NNTP:
	nntp://nntp.lore.kernel.org/org.kernel.vger.linux-btrfs


AGPL code for this site: git clone https://public-inbox.org/ public-inbox