linux-btrfs.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 0/2] Fix aborts when dropping snapshots
@ 2018-11-30 16:52 Josef Bacik
  2018-11-30 16:52 ` [PATCH 1/2] btrfs: catch cow on deleting snapshots Josef Bacik
  2018-11-30 16:52 ` [PATCH 2/2] btrfs: run delayed items before dropping the snapshot Josef Bacik
  0 siblings, 2 replies; 10+ messages in thread
From: Josef Bacik @ 2018-11-30 16:52 UTC (permalink / raw)
  To: linux-btrfs, kernel-team

With more machines running my recent delayed ref rsv patches we started seeing a
spike in boxes aborting in __btrfs_free_extent with being unable to find the
extent ref.  The full details are in 2/2, but the summary is there's been a bug
ever since the original delayed inode item stuff was introduced where it could
run once the snapshot was being deleted, which will result in all sorts of
extent reference shenanigans.  This was tricky to hit before, but with my iput
changes it's become much easier to hit on our build boxes that are heavy users
of snapshot creation/deletion.  Thanks,

Josef

^ permalink raw reply	[flat|nested] 10+ messages in thread

* [PATCH 1/2] btrfs: catch cow on deleting snapshots
  2018-11-30 16:52 [PATCH 0/2] Fix aborts when dropping snapshots Josef Bacik
@ 2018-11-30 16:52 ` Josef Bacik
  2018-11-30 17:14   ` Filipe Manana
  2018-12-01  0:19   ` Qu Wenruo
  2018-11-30 16:52 ` [PATCH 2/2] btrfs: run delayed items before dropping the snapshot Josef Bacik
  1 sibling, 2 replies; 10+ messages in thread
From: Josef Bacik @ 2018-11-30 16:52 UTC (permalink / raw)
  To: linux-btrfs, kernel-team; +Cc: Josef Bacik

From: Josef Bacik <jbacik@fb.com>

When debugging some weird extent reference bug I suspected that we were
changing a snapshot while we were deleting it, which could explain my
bug.  This was indeed what was happening, and this patch helped me
verify my theory.  It is never correct to modify the snapshot once it's
being deleted, so mark the root when we are deleting it and make sure we
complain about it when it happens.

Signed-off-by: Josef Bacik <josef@toxicpanda.com>
---
 fs/btrfs/ctree.c       | 3 +++
 fs/btrfs/ctree.h       | 1 +
 fs/btrfs/extent-tree.c | 9 +++++++++
 3 files changed, 13 insertions(+)

diff --git a/fs/btrfs/ctree.c b/fs/btrfs/ctree.c
index 5912a97b07a6..5f82f86085e8 100644
--- a/fs/btrfs/ctree.c
+++ b/fs/btrfs/ctree.c
@@ -1440,6 +1440,9 @@ noinline int btrfs_cow_block(struct btrfs_trans_handle *trans,
 	u64 search_start;
 	int ret;
 
+	if (test_bit(BTRFS_ROOT_DELETING, &root->state))
+		WARN(1, KERN_CRIT "cow'ing blocks on a fs root thats being dropped\n");
+
 	if (trans->transaction != fs_info->running_transaction)
 		WARN(1, KERN_CRIT "trans %llu running %llu\n",
 		       trans->transid,
diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h
index facde70c15ed..5a3a94ccb65c 100644
--- a/fs/btrfs/ctree.h
+++ b/fs/btrfs/ctree.h
@@ -1199,6 +1199,7 @@ enum {
 	BTRFS_ROOT_FORCE_COW,
 	BTRFS_ROOT_MULTI_LOG_TASKS,
 	BTRFS_ROOT_DIRTY,
+	BTRFS_ROOT_DELETING,
 };
 
 /*
diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
index 581c2a0b2945..dcb699dd57f3 100644
--- a/fs/btrfs/extent-tree.c
+++ b/fs/btrfs/extent-tree.c
@@ -9333,6 +9333,15 @@ int btrfs_drop_snapshot(struct btrfs_root *root,
 	if (block_rsv)
 		trans->block_rsv = block_rsv;
 
+	/*
+	 * This will help us catch people modifying the fs tree while we're
+	 * dropping it.  It is unsafe to mess with the fs tree while it's being
+	 * dropped as we unlock the root node and parent nodes as we walk down
+	 * the tree, assuming nothing will change.  If something does change
+	 * then we'll have stale information and drop references to blocks we've
+	 * already dropped.
+	 */
+	set_bit(BTRFS_ROOT_DELETING, &root->state);
 	if (btrfs_disk_key_objectid(&root_item->drop_progress) == 0) {
 		level = btrfs_header_level(root->node);
 		path->nodes[level] = btrfs_lock_root_node(root);
-- 
2.14.3


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [PATCH 2/2] btrfs: run delayed items before dropping the snapshot
  2018-11-30 16:52 [PATCH 0/2] Fix aborts when dropping snapshots Josef Bacik
  2018-11-30 16:52 ` [PATCH 1/2] btrfs: catch cow on deleting snapshots Josef Bacik
@ 2018-11-30 16:52 ` Josef Bacik
  2018-11-30 17:12   ` Filipe Manana
  1 sibling, 1 reply; 10+ messages in thread
From: Josef Bacik @ 2018-11-30 16:52 UTC (permalink / raw)
  To: linux-btrfs, kernel-team; +Cc: Josef Bacik, stable

From: Josef Bacik <jbacik@fb.com>

With my delayed refs patches in place we started seeing a large amount
of aborts in __btrfs_free_extent

BTRFS error (device sdb1): unable to find ref byte nr 91947008 parent 0 root 35964  owner 1 offset 0
Call Trace:
 ? btrfs_merge_delayed_refs+0xaf/0x340
 __btrfs_run_delayed_refs+0x6ea/0xfc0
 ? btrfs_set_path_blocking+0x31/0x60
 btrfs_run_delayed_refs+0xeb/0x180
 btrfs_commit_transaction+0x179/0x7f0
 ? btrfs_check_space_for_delayed_refs+0x30/0x50
 ? should_end_transaction.isra.19+0xe/0x40
 btrfs_drop_snapshot+0x41c/0x7c0
 btrfs_clean_one_deleted_snapshot+0xb5/0xd0
 cleaner_kthread+0xf6/0x120
 kthread+0xf8/0x130
 ? btree_invalidatepage+0x90/0x90
 ? kthread_bind+0x10/0x10
 ret_from_fork+0x35/0x40

This was because btrfs_drop_snapshot depends on the root not being modified
while it's dropping the snapshot.  It will unlock the root node (and really
every node) as it walks down the tree, only to re-lock it when it needs to do
something.  This is a problem because if we modify the tree we could cow a block
in our path, which free's our reference to that block.  Then once we get back to
that shared block we'll free our reference to it again, and get ENOENT when
trying to lookup our extent reference to that block in __btrfs_free_extent.

This is ultimately happening because we have delayed items left to be processed
for our deleted snapshot _after_ all of the inodes are closed for the snapshot.
We only run the delayed inode item if we're deleting the inode, and even then we
do not run the delayed insertions or delayed removals.  These can be run at any
point after our final inode does it's last iput, which is what triggers the
snapshot deletion.  We can end up with the snapshot deletion happening and then
have the delayed items run on that file system, resulting in the above problem.

This problem has existed forever, however my patches made it much easier to hit
as I wake up the cleaner much more often to deal with delayed iputs, which made
us more likely to start the snapshot dropping work before the transaction
commits, which is when the delayed items would generally be run.  Before,
generally speaking, we would run the delayed items, commit the transaction, and
wakeup the cleaner thread to start deleting snapshots, which means we were less
likely to hit this problem.  You could still hit it if you had multiple
snapshots to be deleted and ended up with lots of delayed items, but it was
definitely harder.

Fix for now by simply running all the delayed items before starting to drop the
snapshot.  We could make this smarter in the future by making the delayed items
per-root, and then simply drop any delayed items for roots that we are going to
delete.  But for now just a quick and easy solution is the safest.

Cc: stable@vger.kernel.org
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
---
 fs/btrfs/extent-tree.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
index dcb699dd57f3..965702034b22 100644
--- a/fs/btrfs/extent-tree.c
+++ b/fs/btrfs/extent-tree.c
@@ -9330,6 +9330,8 @@ int btrfs_drop_snapshot(struct btrfs_root *root,
 		goto out_free;
 	}
 
+	btrfs_run_delayed_items(trans);
+
 	if (block_rsv)
 		trans->block_rsv = block_rsv;
 
-- 
2.14.3


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* Re: [PATCH 2/2] btrfs: run delayed items before dropping the snapshot
  2018-11-30 16:52 ` [PATCH 2/2] btrfs: run delayed items before dropping the snapshot Josef Bacik
@ 2018-11-30 17:12   ` Filipe Manana
  2018-11-30 21:15     ` Filipe Manana
  0 siblings, 1 reply; 10+ messages in thread
From: Filipe Manana @ 2018-11-30 17:12 UTC (permalink / raw)
  To: Josef Bacik; +Cc: linux-btrfs, kernel-team, Josef Bacik, stable

On Fri, Nov 30, 2018 at 4:53 PM Josef Bacik <josef@toxicpanda.com> wrote:
>
> From: Josef Bacik <jbacik@fb.com>
>
> With my delayed refs patches in place we started seeing a large amount
> of aborts in __btrfs_free_extent
>
> BTRFS error (device sdb1): unable to find ref byte nr 91947008 parent 0 root 35964  owner 1 offset 0
> Call Trace:
>  ? btrfs_merge_delayed_refs+0xaf/0x340
>  __btrfs_run_delayed_refs+0x6ea/0xfc0
>  ? btrfs_set_path_blocking+0x31/0x60
>  btrfs_run_delayed_refs+0xeb/0x180
>  btrfs_commit_transaction+0x179/0x7f0
>  ? btrfs_check_space_for_delayed_refs+0x30/0x50
>  ? should_end_transaction.isra.19+0xe/0x40
>  btrfs_drop_snapshot+0x41c/0x7c0
>  btrfs_clean_one_deleted_snapshot+0xb5/0xd0
>  cleaner_kthread+0xf6/0x120
>  kthread+0xf8/0x130
>  ? btree_invalidatepage+0x90/0x90
>  ? kthread_bind+0x10/0x10
>  ret_from_fork+0x35/0x40
>
> This was because btrfs_drop_snapshot depends on the root not being modified
> while it's dropping the snapshot.  It will unlock the root node (and really
> every node) as it walks down the tree, only to re-lock it when it needs to do
> something.  This is a problem because if we modify the tree we could cow a block
> in our path, which free's our reference to that block.  Then once we get back to
> that shared block we'll free our reference to it again, and get ENOENT when
> trying to lookup our extent reference to that block in __btrfs_free_extent.
>
> This is ultimately happening because we have delayed items left to be processed
> for our deleted snapshot _after_ all of the inodes are closed for the snapshot.
> We only run the delayed inode item if we're deleting the inode, and even then we
> do not run the delayed insertions or delayed removals.  These can be run at any
> point after our final inode does it's last iput, which is what triggers the
> snapshot deletion.  We can end up with the snapshot deletion happening and then
> have the delayed items run on that file system, resulting in the above problem.
>
> This problem has existed forever, however my patches made it much easier to hit
> as I wake up the cleaner much more often to deal with delayed iputs, which made
> us more likely to start the snapshot dropping work before the transaction
> commits, which is when the delayed items would generally be run.  Before,
> generally speaking, we would run the delayed items, commit the transaction, and
> wakeup the cleaner thread to start deleting snapshots, which means we were less
> likely to hit this problem.  You could still hit it if you had multiple
> snapshots to be deleted and ended up with lots of delayed items, but it was
> definitely harder.
>
> Fix for now by simply running all the delayed items before starting to drop the
> snapshot.  We could make this smarter in the future by making the delayed items
> per-root, and then simply drop any delayed items for roots that we are going to
> delete.  But for now just a quick and easy solution is the safest.
>
> Cc: stable@vger.kernel.org
> Signed-off-by: Josef Bacik <josef@toxicpanda.com>

Reviewed-by: Filipe Manana <fdmanana@suse.com>

Great catch!
I've hit this error from  __btrfs_free_extent() a handful of times
over the years, but never managed
to reproduce it on demand or figure out it was related to snapshot deletion.

> ---
>  fs/btrfs/extent-tree.c | 2 ++
>  1 file changed, 2 insertions(+)
>
> diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
> index dcb699dd57f3..965702034b22 100644
> --- a/fs/btrfs/extent-tree.c
> +++ b/fs/btrfs/extent-tree.c
> @@ -9330,6 +9330,8 @@ int btrfs_drop_snapshot(struct btrfs_root *root,
>                 goto out_free;
>         }
>
> +       btrfs_run_delayed_items(trans);
> +
>         if (block_rsv)
>                 trans->block_rsv = block_rsv;
>
> --
> 2.14.3
>


-- 
Filipe David Manana,

“Whether you think you can, or you think you can't — you're right.”

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH 1/2] btrfs: catch cow on deleting snapshots
  2018-11-30 16:52 ` [PATCH 1/2] btrfs: catch cow on deleting snapshots Josef Bacik
@ 2018-11-30 17:14   ` Filipe Manana
  2018-11-30 17:19     ` Josef Bacik
  2018-12-01  0:19   ` Qu Wenruo
  1 sibling, 1 reply; 10+ messages in thread
From: Filipe Manana @ 2018-11-30 17:14 UTC (permalink / raw)
  To: Josef Bacik; +Cc: linux-btrfs, kernel-team, Josef Bacik

On Fri, Nov 30, 2018 at 4:53 PM Josef Bacik <josef@toxicpanda.com> wrote:
>
> From: Josef Bacik <jbacik@fb.com>
>
> When debugging some weird extent reference bug I suspected that we were
> changing a snapshot while we were deleting it, which could explain my
> bug.  This was indeed what was happening, and this patch helped me
> verify my theory.  It is never correct to modify the snapshot once it's
> being deleted, so mark the root when we are deleting it and make sure we
> complain about it when it happens.
>
> Signed-off-by: Josef Bacik <josef@toxicpanda.com>
> ---
>  fs/btrfs/ctree.c       | 3 +++
>  fs/btrfs/ctree.h       | 1 +
>  fs/btrfs/extent-tree.c | 9 +++++++++
>  3 files changed, 13 insertions(+)
>
> diff --git a/fs/btrfs/ctree.c b/fs/btrfs/ctree.c
> index 5912a97b07a6..5f82f86085e8 100644
> --- a/fs/btrfs/ctree.c
> +++ b/fs/btrfs/ctree.c
> @@ -1440,6 +1440,9 @@ noinline int btrfs_cow_block(struct btrfs_trans_handle *trans,
>         u64 search_start;
>         int ret;
>
> +       if (test_bit(BTRFS_ROOT_DELETING, &root->state))
> +               WARN(1, KERN_CRIT "cow'ing blocks on a fs root thats being dropped\n");

Please use btrfs_warn(), it makes sure we use a consistent message
style, identifies the fs, etc.
Also, "thats" should be "that is" or "that's".

With that fixed,
Reviewed-by: Filipe Manana <fdmanana@suse.com>

> +
>         if (trans->transaction != fs_info->running_transaction)
>                 WARN(1, KERN_CRIT "trans %llu running %llu\n",
>                        trans->transid,
> diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h
> index facde70c15ed..5a3a94ccb65c 100644
> --- a/fs/btrfs/ctree.h
> +++ b/fs/btrfs/ctree.h
> @@ -1199,6 +1199,7 @@ enum {
>         BTRFS_ROOT_FORCE_COW,
>         BTRFS_ROOT_MULTI_LOG_TASKS,
>         BTRFS_ROOT_DIRTY,
> +       BTRFS_ROOT_DELETING,
>  };
>
>  /*
> diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
> index 581c2a0b2945..dcb699dd57f3 100644
> --- a/fs/btrfs/extent-tree.c
> +++ b/fs/btrfs/extent-tree.c
> @@ -9333,6 +9333,15 @@ int btrfs_drop_snapshot(struct btrfs_root *root,
>         if (block_rsv)
>                 trans->block_rsv = block_rsv;
>
> +       /*
> +        * This will help us catch people modifying the fs tree while we're
> +        * dropping it.  It is unsafe to mess with the fs tree while it's being
> +        * dropped as we unlock the root node and parent nodes as we walk down
> +        * the tree, assuming nothing will change.  If something does change
> +        * then we'll have stale information and drop references to blocks we've
> +        * already dropped.
> +        */
> +       set_bit(BTRFS_ROOT_DELETING, &root->state);
>         if (btrfs_disk_key_objectid(&root_item->drop_progress) == 0) {
>                 level = btrfs_header_level(root->node);
>                 path->nodes[level] = btrfs_lock_root_node(root);
> --
> 2.14.3
>


-- 
Filipe David Manana,

“Whether you think you can, or you think you can't — you're right.”

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH 1/2] btrfs: catch cow on deleting snapshots
  2018-11-30 17:14   ` Filipe Manana
@ 2018-11-30 17:19     ` Josef Bacik
  2018-12-06 18:05       ` David Sterba
  0 siblings, 1 reply; 10+ messages in thread
From: Josef Bacik @ 2018-11-30 17:19 UTC (permalink / raw)
  To: Filipe Manana; +Cc: Josef Bacik, linux-btrfs, kernel-team, Josef Bacik

On Fri, Nov 30, 2018 at 05:14:54PM +0000, Filipe Manana wrote:
> On Fri, Nov 30, 2018 at 4:53 PM Josef Bacik <josef@toxicpanda.com> wrote:
> >
> > From: Josef Bacik <jbacik@fb.com>
> >
> > When debugging some weird extent reference bug I suspected that we were
> > changing a snapshot while we were deleting it, which could explain my
> > bug.  This was indeed what was happening, and this patch helped me
> > verify my theory.  It is never correct to modify the snapshot once it's
> > being deleted, so mark the root when we are deleting it and make sure we
> > complain about it when it happens.
> >
> > Signed-off-by: Josef Bacik <josef@toxicpanda.com>
> > ---
> >  fs/btrfs/ctree.c       | 3 +++
> >  fs/btrfs/ctree.h       | 1 +
> >  fs/btrfs/extent-tree.c | 9 +++++++++
> >  3 files changed, 13 insertions(+)
> >
> > diff --git a/fs/btrfs/ctree.c b/fs/btrfs/ctree.c
> > index 5912a97b07a6..5f82f86085e8 100644
> > --- a/fs/btrfs/ctree.c
> > +++ b/fs/btrfs/ctree.c
> > @@ -1440,6 +1440,9 @@ noinline int btrfs_cow_block(struct btrfs_trans_handle *trans,
> >         u64 search_start;
> >         int ret;
> >
> > +       if (test_bit(BTRFS_ROOT_DELETING, &root->state))
> > +               WARN(1, KERN_CRIT "cow'ing blocks on a fs root thats being dropped\n");
> 
> Please use btrfs_warn(), it makes sure we use a consistent message
> style, identifies the fs, etc.
> Also, "thats" should be "that is" or "that's".
> 

Ah yeah, I was following the other convention in there but we should probably
convert all of those to btrfs_warn.  I'll fix the grammer thing as well, just a
leftover from the much less code of conduct friendly message I originally had
there.  Thanks,

Josef

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH 2/2] btrfs: run delayed items before dropping the snapshot
  2018-11-30 17:12   ` Filipe Manana
@ 2018-11-30 21:15     ` Filipe Manana
  0 siblings, 0 replies; 10+ messages in thread
From: Filipe Manana @ 2018-11-30 21:15 UTC (permalink / raw)
  To: Josef Bacik; +Cc: linux-btrfs, kernel-team, Josef Bacik, stable

On Fri, Nov 30, 2018 at 5:12 PM Filipe Manana <fdmanana@gmail.com> wrote:
>
> On Fri, Nov 30, 2018 at 4:53 PM Josef Bacik <josef@toxicpanda.com> wrote:
> >
> > From: Josef Bacik <jbacik@fb.com>
> >
> > With my delayed refs patches in place we started seeing a large amount
> > of aborts in __btrfs_free_extent
> >
> > BTRFS error (device sdb1): unable to find ref byte nr 91947008 parent 0 root 35964  owner 1 offset 0
> > Call Trace:
> >  ? btrfs_merge_delayed_refs+0xaf/0x340
> >  __btrfs_run_delayed_refs+0x6ea/0xfc0
> >  ? btrfs_set_path_blocking+0x31/0x60
> >  btrfs_run_delayed_refs+0xeb/0x180
> >  btrfs_commit_transaction+0x179/0x7f0
> >  ? btrfs_check_space_for_delayed_refs+0x30/0x50
> >  ? should_end_transaction.isra.19+0xe/0x40
> >  btrfs_drop_snapshot+0x41c/0x7c0
> >  btrfs_clean_one_deleted_snapshot+0xb5/0xd0
> >  cleaner_kthread+0xf6/0x120
> >  kthread+0xf8/0x130
> >  ? btree_invalidatepage+0x90/0x90
> >  ? kthread_bind+0x10/0x10
> >  ret_from_fork+0x35/0x40
> >
> > This was because btrfs_drop_snapshot depends on the root not being modified
> > while it's dropping the snapshot.  It will unlock the root node (and really
> > every node) as it walks down the tree, only to re-lock it when it needs to do
> > something.  This is a problem because if we modify the tree we could cow a block
> > in our path, which free's our reference to that block.  Then once we get back to
> > that shared block we'll free our reference to it again, and get ENOENT when
> > trying to lookup our extent reference to that block in __btrfs_free_extent.
> >
> > This is ultimately happening because we have delayed items left to be processed
> > for our deleted snapshot _after_ all of the inodes are closed for the snapshot.
> > We only run the delayed inode item if we're deleting the inode, and even then we
> > do not run the delayed insertions or delayed removals.  These can be run at any
> > point after our final inode does it's last iput, which is what triggers the
> > snapshot deletion.  We can end up with the snapshot deletion happening and then
> > have the delayed items run on that file system, resulting in the above problem.
> >
> > This problem has existed forever, however my patches made it much easier to hit
> > as I wake up the cleaner much more often to deal with delayed iputs, which made
> > us more likely to start the snapshot dropping work before the transaction
> > commits, which is when the delayed items would generally be run.  Before,
> > generally speaking, we would run the delayed items, commit the transaction, and
> > wakeup the cleaner thread to start deleting snapshots, which means we were less
> > likely to hit this problem.  You could still hit it if you had multiple
> > snapshots to be deleted and ended up with lots of delayed items, but it was
> > definitely harder.
> >
> > Fix for now by simply running all the delayed items before starting to drop the
> > snapshot.  We could make this smarter in the future by making the delayed items
> > per-root, and then simply drop any delayed items for roots that we are going to
> > delete.  But for now just a quick and easy solution is the safest.
> >
> > Cc: stable@vger.kernel.org
> > Signed-off-by: Josef Bacik <josef@toxicpanda.com>
>
> Reviewed-by: Filipe Manana <fdmanana@suse.com>
>
> Great catch!
> I've hit this error from  __btrfs_free_extent() a handful of times
> over the years, but never managed
> to reproduce it on demand or figure out it was related to snapshot deletion.
>
> > ---
> >  fs/btrfs/extent-tree.c | 2 ++
> >  1 file changed, 2 insertions(+)
> >
> > diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
> > index dcb699dd57f3..965702034b22 100644
> > --- a/fs/btrfs/extent-tree.c
> > +++ b/fs/btrfs/extent-tree.c
> > @@ -9330,6 +9330,8 @@ int btrfs_drop_snapshot(struct btrfs_root *root,
> >                 goto out_free;
> >         }
> >
> > +       btrfs_run_delayed_items(trans);
> > +

Btw, we should check the return value of this and return it if it's an error?
We can't do nothing with it in the context of the cleaner thread,
which is why, I suppose, you chose to ignore the value (besides that
the error might have been for some other root).
But this can be used in the context of relocation, where we can return
the error back to userspace.

Thanks.

> >         if (block_rsv)
> >                 trans->block_rsv = block_rsv;
> >
> > --
> > 2.14.3
> >
>
>
> --
> Filipe David Manana,
>
> “Whether you think you can, or you think you can't — you're right.”



-- 
Filipe David Manana,

“Whether you think you can, or you think you can't — you're right.”

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH 1/2] btrfs: catch cow on deleting snapshots
  2018-11-30 16:52 ` [PATCH 1/2] btrfs: catch cow on deleting snapshots Josef Bacik
  2018-11-30 17:14   ` Filipe Manana
@ 2018-12-01  0:19   ` Qu Wenruo
  2018-12-01  3:02     ` Hans van Kranenburg
  1 sibling, 1 reply; 10+ messages in thread
From: Qu Wenruo @ 2018-12-01  0:19 UTC (permalink / raw)
  To: Josef Bacik, linux-btrfs, kernel-team; +Cc: Josef Bacik


[-- Attachment #1.1: Type: text/plain, Size: 2651 bytes --]



On 2018/12/1 上午12:52, Josef Bacik wrote:
> From: Josef Bacik <jbacik@fb.com>
> 
> When debugging some weird extent reference bug I suspected that we were
> changing a snapshot while we were deleting it, which could explain my
> bug.

May I ask under which case we're going to modify an unlinked snapshot?

Maybe metadata relocation?

Thanks,
Qu

> This was indeed what was happening, and this patch helped me
> verify my theory.  It is never correct to modify the snapshot once it's
> being deleted, so mark the root when we are deleting it and make sure we
> complain about it when it happens.
> 
> Signed-off-by: Josef Bacik <josef@toxicpanda.com>
> ---
>  fs/btrfs/ctree.c       | 3 +++
>  fs/btrfs/ctree.h       | 1 +
>  fs/btrfs/extent-tree.c | 9 +++++++++
>  3 files changed, 13 insertions(+)
> 
> diff --git a/fs/btrfs/ctree.c b/fs/btrfs/ctree.c
> index 5912a97b07a6..5f82f86085e8 100644
> --- a/fs/btrfs/ctree.c
> +++ b/fs/btrfs/ctree.c
> @@ -1440,6 +1440,9 @@ noinline int btrfs_cow_block(struct btrfs_trans_handle *trans,
>  	u64 search_start;
>  	int ret;
>  
> +	if (test_bit(BTRFS_ROOT_DELETING, &root->state))
> +		WARN(1, KERN_CRIT "cow'ing blocks on a fs root thats being dropped\n");
> +
>  	if (trans->transaction != fs_info->running_transaction)
>  		WARN(1, KERN_CRIT "trans %llu running %llu\n",
>  		       trans->transid,
> diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h
> index facde70c15ed..5a3a94ccb65c 100644
> --- a/fs/btrfs/ctree.h
> +++ b/fs/btrfs/ctree.h
> @@ -1199,6 +1199,7 @@ enum {
>  	BTRFS_ROOT_FORCE_COW,
>  	BTRFS_ROOT_MULTI_LOG_TASKS,
>  	BTRFS_ROOT_DIRTY,
> +	BTRFS_ROOT_DELETING,
>  };
>  
>  /*
> diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
> index 581c2a0b2945..dcb699dd57f3 100644
> --- a/fs/btrfs/extent-tree.c
> +++ b/fs/btrfs/extent-tree.c
> @@ -9333,6 +9333,15 @@ int btrfs_drop_snapshot(struct btrfs_root *root,
>  	if (block_rsv)
>  		trans->block_rsv = block_rsv;
>  
> +	/*
> +	 * This will help us catch people modifying the fs tree while we're
> +	 * dropping it.  It is unsafe to mess with the fs tree while it's being
> +	 * dropped as we unlock the root node and parent nodes as we walk down
> +	 * the tree, assuming nothing will change.  If something does change
> +	 * then we'll have stale information and drop references to blocks we've
> +	 * already dropped.
> +	 */
> +	set_bit(BTRFS_ROOT_DELETING, &root->state);
>  	if (btrfs_disk_key_objectid(&root_item->drop_progress) == 0) {
>  		level = btrfs_header_level(root->node);
>  		path->nodes[level] = btrfs_lock_root_node(root);
> 


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH 1/2] btrfs: catch cow on deleting snapshots
  2018-12-01  0:19   ` Qu Wenruo
@ 2018-12-01  3:02     ` Hans van Kranenburg
  0 siblings, 0 replies; 10+ messages in thread
From: Hans van Kranenburg @ 2018-12-01  3:02 UTC (permalink / raw)
  To: Qu Wenruo, Josef Bacik, linux-btrfs, kernel-team; +Cc: Josef Bacik


[-- Attachment #1.1: Type: text/plain, Size: 3213 bytes --]

On 12/1/18 1:19 AM, Qu Wenruo wrote:
> 
> 
> On 2018/12/1 上午12:52, Josef Bacik wrote:
>> From: Josef Bacik <jbacik@fb.com>
>>
>> When debugging some weird extent reference bug I suspected that we were
>> changing a snapshot while we were deleting it, which could explain my
>> bug.
> 
> May I ask under which case we're going to modify an unlinked snapshot?
> 
> Maybe metadata relocation?

I have some old logging from a filesystem that got corrupted when doing
a subvolume delete while doing metadata balance:

https://syrinx.knorrie.org/~knorrie/btrfs/domokun-balance-skinny-lockup.txt

This is one of the very few moments when I really lost a btrfs
filesystem and had to mkfs again to move on.

I'm wondering if this one looks related. I never found an explanation
for it.

Hans

> 
> Thanks,
> Qu
> 
>> This was indeed what was happening, and this patch helped me
>> verify my theory.  It is never correct to modify the snapshot once it's
>> being deleted, so mark the root when we are deleting it and make sure we
>> complain about it when it happens.
>>
>> Signed-off-by: Josef Bacik <josef@toxicpanda.com>
>> ---
>>  fs/btrfs/ctree.c       | 3 +++
>>  fs/btrfs/ctree.h       | 1 +
>>  fs/btrfs/extent-tree.c | 9 +++++++++
>>  3 files changed, 13 insertions(+)
>>
>> diff --git a/fs/btrfs/ctree.c b/fs/btrfs/ctree.c
>> index 5912a97b07a6..5f82f86085e8 100644
>> --- a/fs/btrfs/ctree.c
>> +++ b/fs/btrfs/ctree.c
>> @@ -1440,6 +1440,9 @@ noinline int btrfs_cow_block(struct btrfs_trans_handle *trans,
>>  	u64 search_start;
>>  	int ret;
>>  
>> +	if (test_bit(BTRFS_ROOT_DELETING, &root->state))
>> +		WARN(1, KERN_CRIT "cow'ing blocks on a fs root thats being dropped\n");
>> +
>>  	if (trans->transaction != fs_info->running_transaction)
>>  		WARN(1, KERN_CRIT "trans %llu running %llu\n",
>>  		       trans->transid,
>> diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h
>> index facde70c15ed..5a3a94ccb65c 100644
>> --- a/fs/btrfs/ctree.h
>> +++ b/fs/btrfs/ctree.h
>> @@ -1199,6 +1199,7 @@ enum {
>>  	BTRFS_ROOT_FORCE_COW,
>>  	BTRFS_ROOT_MULTI_LOG_TASKS,
>>  	BTRFS_ROOT_DIRTY,
>> +	BTRFS_ROOT_DELETING,
>>  };
>>  
>>  /*
>> diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
>> index 581c2a0b2945..dcb699dd57f3 100644
>> --- a/fs/btrfs/extent-tree.c
>> +++ b/fs/btrfs/extent-tree.c
>> @@ -9333,6 +9333,15 @@ int btrfs_drop_snapshot(struct btrfs_root *root,
>>  	if (block_rsv)
>>  		trans->block_rsv = block_rsv;
>>  
>> +	/*
>> +	 * This will help us catch people modifying the fs tree while we're
>> +	 * dropping it.  It is unsafe to mess with the fs tree while it's being
>> +	 * dropped as we unlock the root node and parent nodes as we walk down
>> +	 * the tree, assuming nothing will change.  If something does change
>> +	 * then we'll have stale information and drop references to blocks we've
>> +	 * already dropped.
>> +	 */
>> +	set_bit(BTRFS_ROOT_DELETING, &root->state);
>>  	if (btrfs_disk_key_objectid(&root_item->drop_progress) == 0) {
>>  		level = btrfs_header_level(root->node);
>>  		path->nodes[level] = btrfs_lock_root_node(root);
>>
> 


-- 
Hans van Kranenburg


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH 1/2] btrfs: catch cow on deleting snapshots
  2018-11-30 17:19     ` Josef Bacik
@ 2018-12-06 18:05       ` David Sterba
  0 siblings, 0 replies; 10+ messages in thread
From: David Sterba @ 2018-12-06 18:05 UTC (permalink / raw)
  To: Josef Bacik; +Cc: Filipe Manana, linux-btrfs, kernel-team, Josef Bacik

On Fri, Nov 30, 2018 at 12:19:18PM -0500, Josef Bacik wrote:
> On Fri, Nov 30, 2018 at 05:14:54PM +0000, Filipe Manana wrote:
> > On Fri, Nov 30, 2018 at 4:53 PM Josef Bacik <josef@toxicpanda.com> wrote:
> > >
> > > From: Josef Bacik <jbacik@fb.com>
> > >
> > > When debugging some weird extent reference bug I suspected that we were
> > > changing a snapshot while we were deleting it, which could explain my
> > > bug.  This was indeed what was happening, and this patch helped me
> > > verify my theory.  It is never correct to modify the snapshot once it's
> > > being deleted, so mark the root when we are deleting it and make sure we
> > > complain about it when it happens.
> > >
> > > Signed-off-by: Josef Bacik <josef@toxicpanda.com>
> > > ---
> > >  fs/btrfs/ctree.c       | 3 +++
> > >  fs/btrfs/ctree.h       | 1 +
> > >  fs/btrfs/extent-tree.c | 9 +++++++++
> > >  3 files changed, 13 insertions(+)
> > >
> > > diff --git a/fs/btrfs/ctree.c b/fs/btrfs/ctree.c
> > > index 5912a97b07a6..5f82f86085e8 100644
> > > --- a/fs/btrfs/ctree.c
> > > +++ b/fs/btrfs/ctree.c
> > > @@ -1440,6 +1440,9 @@ noinline int btrfs_cow_block(struct btrfs_trans_handle *trans,
> > >         u64 search_start;
> > >         int ret;
> > >
> > > +       if (test_bit(BTRFS_ROOT_DELETING, &root->state))
> > > +               WARN(1, KERN_CRIT "cow'ing blocks on a fs root thats being dropped\n");
> > 
> > Please use btrfs_warn(), it makes sure we use a consistent message
> > style, identifies the fs, etc.
> > Also, "thats" should be "that is" or "that's".
> > 
> 
> Ah yeah, I was following the other convention in there but we should probably
> convert all of those to btrfs_warn.  I'll fix the grammer thing as well, just a
> leftover from the much less code of conduct friendly message I originally had
> there.  Thanks,

Committed with the following fixup:

-               WARN(1, KERN_CRIT "cow'ing blocks on a fs root thats being dropped\n");
+               btrfs_error(fs_info,
+                       "COW'ing blocks on a fs root that's being dropped");


^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2018-12-06 18:05 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-11-30 16:52 [PATCH 0/2] Fix aborts when dropping snapshots Josef Bacik
2018-11-30 16:52 ` [PATCH 1/2] btrfs: catch cow on deleting snapshots Josef Bacik
2018-11-30 17:14   ` Filipe Manana
2018-11-30 17:19     ` Josef Bacik
2018-12-06 18:05       ` David Sterba
2018-12-01  0:19   ` Qu Wenruo
2018-12-01  3:02     ` Hans van Kranenburg
2018-11-30 16:52 ` [PATCH 2/2] btrfs: run delayed items before dropping the snapshot Josef Bacik
2018-11-30 17:12   ` Filipe Manana
2018-11-30 21:15     ` Filipe Manana

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).