From: Anand Jain <anand.jain@oracle.com>
To: Chris Mason <clm@fb.com>,
	dsterba@suse.cz, David Sterba <dsterba@suse.com>,
	linux-btrfs@vger.kernel.org
Subject: Re: [PULL] Btrfs for 4.7, part 2
Date: Tue, 14 Jun 2016 18:52:03 +0800
Message-ID: <1e7bc117-aaa3-c77f-b933-cd0c37b5ce68@oracle.com>
In-Reply-To: <20160529122103.GA8726@clm-mbp.masoncoding.com>


Chris,

   Sorry for the delay; I was on vacation.

More below.

On 05/29/2016 08:21 PM, Chris Mason wrote:
> On Sat, May 28, 2016 at 01:14:13PM +0800, Anand Jain wrote:
>>
>>
>> On 05/27/2016 11:42 PM, Chris Mason wrote:
>>>> I'm getting errors from btrfs fi show -d, after the very last round of
>>>> device replaces.  A little extra debugging:
>>>>
>>>> bytenr mismatch, want=4332716032, have=0
>>>> ERROR: cannot read chunk root
>>>> ERROR reading /dev/vdh
>>>> failed /dev/vdh
>>>>
>>>> Which is cute because the very next command we run fscks /dev/vdh and
>>>> succeeds.
>>
>> Checked the code paths both btrfs fi show -d and btrfs check,
>> both are calling flush during relative open_ctree in progs.
>>
>> However the flush is called after we have read superblock. That
>> means the read_superblock during 'show' cli (only) will read superblock
>> without flush, and 'check' won't, because 011 calls 'check' after
>> 'show'. But it still does not explain the above error, which is
>> during open_ctree not at read superblock. Remains strange case as
>> of now.
>
> It's because we're just not done writing it out yet when btrfs fi show
> is run.
> I think replace is special here.
>
>>
>> Also. I can't reproduce.
>>
>
> I'm in a relatively new test rig using kvm, which probably explains why
> I haven't seen it before.  You can probably make it easier by adding
> a sleep inside the actual __free_device() func.
>
>>>> So the page cache is stale and this isn't related to any of our
>>>> patches.
>>>
>>> close_ctree() calls into btrfs_close_devices(), which calls
>>> btrfs_close_one_device(), which uses:
>>>
>>> call_rcu(&device->rcu, free_device);
>>>
>>> close_ctree() also does an rcu_barrier() to make sure and wait for
>>> free_device() to finish.
>>>
>>> But, free_device() just puts the work into schedule_work(), so we don't
>>> know for sure the blkdev_put is done when we exit.
>>
>> Right, saw that before. Any idea why its like that ? Or if it
>> should be fixed?
>
> It's just trying to limit the work that is done from call_rcu, and it
> should definitely be fixed.  It might cause EBUSY or other problems.  Probably
> easiest to add a counter or completion object that gets changed by the
> __free_device function.


Yes, indeed, adding the sleep made the problem reproducible.

This problem also appears to have been identified earlier by the
commit below; however, its fix wasn't correct.
    ----
      commit bc178622d40d87e75abc131007342429c9b03351
      btrfs: use rcu_barrier() to wait for bdev puts at unmount

      ::
      Adding an rcu_barrier() to btrfs_close_devices() causes unmount
      to wait until all blkdev_put()s are done, and the device is truly
      free once unmount completes.
    ----

  Since free_device() spins off __free_device() to do the actual
  bdev put, we need to wait for __free_device(). But rcu_barrier()
  only waits for free_device() to complete, so when rcu_barrier()
  returns, the blkdev_put() may not have completed yet.
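
  To illustrate the gap, here is a minimal userspace sketch in C with
  pthreads. It is an analogy only: it is not the btrfs code and not the
  patch. All names in it (fake_fs_devices, fake_device, queue_put(),
  deferred_put(), wait_for_puts(), close_all_devices()) are made up for
  this example. queue_put() plays the role of free_device(), which only
  hands the work off; deferred_put() plays the role of __free_device();
  and the counter plus condition variable are one possible form of the
  "counter or completion object" suggested above.

```c
#define _POSIX_C_SOURCE 200809L
#include <pthread.h>
#include <stdlib.h>
#include <time.h>

/* Hypothetical stand-ins, not the real btrfs structures. */
struct fake_fs_devices {
	pthread_mutex_t lock;
	pthread_cond_t all_put;   /* signalled when pending_puts drops to 0 */
	int pending_puts;         /* deferred puts queued but not yet finished */
};

struct fake_device {
	struct fake_fs_devices *fs;
	int closed;               /* set once the simulated blkdev_put() ran */
};

/* Plays the role of __free_device(): the deferred work doing the real put. */
static void *deferred_put(void *arg)
{
	struct fake_device *dev = arg;
	struct timespec delay = { 0, 10 * 1000 * 1000 }; /* 10 ms, a slow put */

	nanosleep(&delay, NULL);

	pthread_mutex_lock(&dev->fs->lock);
	dev->closed = 1;
	if (--dev->fs->pending_puts == 0)
		pthread_cond_broadcast(&dev->fs->all_put);
	pthread_mutex_unlock(&dev->fs->lock);
	return NULL;
}

/* Plays the role of free_device(): it only queues the work and returns. */
static int queue_put(struct fake_device *dev, pthread_t *worker)
{
	int ret;

	pthread_mutex_lock(&dev->fs->lock);
	dev->fs->pending_puts++;
	pthread_mutex_unlock(&dev->fs->lock);

	ret = pthread_create(worker, NULL, deferred_put, dev);
	if (ret) {
		/* Nothing was queued, so undo the accounting. */
		pthread_mutex_lock(&dev->fs->lock);
		dev->fs->pending_puts--;
		pthread_mutex_unlock(&dev->fs->lock);
	}
	return ret;
}

/* Wait for the deferred work itself, not just for the queuing step. */
static void wait_for_puts(struct fake_fs_devices *fs)
{
	pthread_mutex_lock(&fs->lock);
	while (fs->pending_puts > 0)
		pthread_cond_wait(&fs->all_put, &fs->lock);
	pthread_mutex_unlock(&fs->lock);
}

/* Close ndev fake devices; return how many are closed once the wait is over. */
int close_all_devices(int ndev)
{
	struct fake_fs_devices fs;
	struct fake_device *devs;
	pthread_t *workers;
	int i, started, closed = 0;

	if (ndev <= 0)
		return 0;
	devs = calloc(ndev, sizeof(*devs));
	workers = calloc(ndev, sizeof(*workers));
	if (!devs || !workers) {
		free(devs);
		free(workers);
		return -1;
	}
	pthread_mutex_init(&fs.lock, NULL);
	pthread_cond_init(&fs.all_put, NULL);
	fs.pending_puts = 0;

	for (started = 0; started < ndev; started++) {
		devs[started].fs = &fs;
		if (queue_put(&devs[started], &workers[started]))
			break;
	}

	wait_for_puts(&fs);
	for (i = 0; i < started; i++)
		closed += devs[i].closed;

	for (i = 0; i < started; i++)   /* cleanup only; the wait is already done */
		pthread_join(workers[i], NULL);
	pthread_cond_destroy(&fs.all_put);
	pthread_mutex_destroy(&fs.lock);
	free(devs);
	free(workers);
	return closed;
}
```

  In this sketch, close_all_devices(4) is expected to return 4, because
  the close path blocks until every deferred put has decremented the
  counter. Waiting only for queue_put() to return, which is roughly what
  a barrier on the first-stage callback gives you, would not guarantee
  that.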


  I wrote a new fix, which is in the patch
   [PATCH 2/2] btrfs: wait for bdev put

  Please review and comment.


Thanks, -Anand

> -chris


Thread overview: 21+ messages
2016-05-26  9:27 [PULL] Btrfs for 4.7, part 2 David Sterba
2016-05-27  0:14 ` Chris Mason
2016-05-27 11:18   ` David Sterba
2016-05-27 14:35     ` Chris Mason
2016-05-27 15:42       ` Chris Mason
2016-05-28  5:14         ` Anand Jain
2016-05-29 12:21           ` Chris Mason
2016-06-14 10:52             ` Anand Jain [this message]
2016-06-14 10:55               ` [PATCH 1/2] btrfs: reorg btrfs_close_one_device() Anand Jain
2016-06-14 10:55                 ` [PATCH 2/2] btrfs: wait for bdev put Anand Jain
2016-06-18 16:34                   ` Holger Hoffstätte
2016-06-20  8:33                     ` Anand Jain
2016-06-21 10:24                   ` [PATCH v2 " Anand Jain
2016-06-21 11:46                     ` Holger Hoffstätte
2016-06-21 13:00                     ` Chris Mason
2016-06-22 10:18                       ` Anand Jain
2016-06-22 21:47                         ` Chris Mason
2016-06-23 13:07                           ` Anand Jain
2016-06-23 12:54                   ` [PATCH v3 2/2] btrfs: make sure device is synced before return Anand Jain
2016-06-23 14:27                     ` Chris Mason
2016-07-08 14:13                     ` David Sterba
