* [PATCH] Btrfs: find_free_extent: Do not erroneously skip LOOP_CACHING_WAIT state
@ 2015-10-31 19:11 Chandan Rajendra
  2015-11-02  8:29 ` [PATCH V2] " Chandan Rajendra
  0 siblings, 1 reply; 6+ messages in thread
From: Chandan Rajendra @ 2015-10-31 19:11 UTC (permalink / raw)
  To: linux-btrfs; +Cc: Chandan Rajendra, clm, jbacik, chandan

When executing generic/001 in a loop on a ppc64 machine (with both sectorsize
and nodesize set to 64k), the following call trace is observed,

WARNING: at /root/repos/linux/fs/btrfs/locking.c:253
Modules linked in:
CPU: 2 PID: 8353 Comm: umount Not tainted 4.3.0-rc5-13676-ga5e681d #54
task: c0000000f2b1f560 ti: c0000000f6008000 task.ti: c0000000f6008000
NIP: c000000000520c88 LR: c0000000004a3b34 CTR: 0000000000000000
REGS: c0000000f600a820 TRAP: 0700   Not tainted  (4.3.0-rc5-13676-ga5e681d)
MSR: 8000000102029032 <SF,VEC,EE,ME,IR,DR,RI>  CR: 24444884  XER: 00000000
CFAR: c0000000004a3b30 SOFTE: 1
GPR00: c0000000004a3b34 c0000000f600aaa0 c00000000108ac00 c0000000f5a808c0
GPR04: 0000000000000000 c0000000f600ae60 0000000000000000 0000000000000005
GPR08: 00000000000020a1 0000000000000001 c0000000f2b1f560 0000000000000030
GPR12: 0000000084842882 c00000000fdc0900 c0000000f600ae60 c0000000f070b800
GPR16: 0000000000000000 c0000000f3c8a000 0000000000000000 0000000000000049
GPR20: 0000000000000001 0000000000000001 c0000000f5aa01f8 0000000000000000
GPR24: 0f83e0f83e0f83e1 c0000000f5a808c0 c0000000f3c8d000 c000000000000000
GPR28: c0000000f600ae74 0000000000000001 c0000000f3c8d000 c0000000f5a808c0
NIP [c000000000520c88] .btrfs_tree_lock+0x48/0x2a0
LR [c0000000004a3b34] .btrfs_lock_root_node+0x44/0x80
Call Trace:
[c0000000f600aaa0] [c0000000f600ab80] 0xc0000000f600ab80 (unreliable)
[c0000000f600ab80] [c0000000004a3b34] .btrfs_lock_root_node+0x44/0x80
[c0000000f600ac00] [c0000000004a99dc] .btrfs_search_slot+0xa8c/0xc00
[c0000000f600ad40] [c0000000004ab878] .btrfs_insert_empty_items+0x98/0x120
[c0000000f600adf0] [c00000000050da44] .btrfs_finish_chunk_alloc+0x1d4/0x620
[c0000000f600af20] [c0000000004be854] .btrfs_create_pending_block_groups+0x1d4/0x2c0
[c0000000f600b020] [c0000000004bf188] .do_chunk_alloc+0x3c8/0x420
[c0000000f600b100] [c0000000004c27cc] .find_free_extent+0xbfc/0x1030
[c0000000f600b260] [c0000000004c2ce8] .btrfs_reserve_extent+0xe8/0x250
[c0000000f600b330] [c0000000004c2f90] .btrfs_alloc_tree_block+0x140/0x590
[c0000000f600b440] [c0000000004a47b4] .__btrfs_cow_block+0x124/0x780
[c0000000f600b530] [c0000000004a4fc0] .btrfs_cow_block+0xf0/0x250
[c0000000f600b5e0] [c0000000004a917c] .btrfs_search_slot+0x22c/0xc00
[c0000000f600b720] [c00000000050aa40] .btrfs_remove_chunk+0x1b0/0x9f0
[c0000000f600b850] [c0000000004c4e04] .btrfs_delete_unused_bgs+0x434/0x570
[c0000000f600b950] [c0000000004d3cb8] .close_ctree+0x2e8/0x3b0
[c0000000f600ba20] [c00000000049d178] .btrfs_put_super+0x18/0x30
[c0000000f600ba90] [c000000000243cd4] .generic_shutdown_super+0xa4/0x1a0
[c0000000f600bb10] [c0000000002441d8] .kill_anon_super+0x18/0x30
[c0000000f600bb90] [c00000000049c898] .btrfs_kill_super+0x18/0xc0
[c0000000f600bc10] [c0000000002444f8] .deactivate_locked_super+0x98/0xe0
[c0000000f600bc90] [c000000000269f94] .cleanup_mnt+0x54/0xa0
[c0000000f600bd10] [c0000000000bd744] .task_work_run+0xc4/0x100
[c0000000f600bdb0] [c000000000016334] .do_notify_resume+0x74/0x80
[c0000000f600be30] [c0000000000098b8] .ret_from_except_lite+0x64/0x68
Instruction dump:
fba1ffe8 fbc1fff0 fbe1fff8 7c791b78 f8010010 f821ff21 e94d0290 81030040
812a04e8 7d094a78 7d290034 5529d97e <0b090000> 3b400000 3be30050 3bc3004c

The above call trace is also seen on x86_64, albeit very rarely, and only
with nodesize set to 64k and the nospace_cache mount option in use.

The cause of the above call trace is:

btrfs_remove_chunk
  check_system_chunk
    Allocate a chunk if required
  For each physical stripe on the underlying device,
    btrfs_free_dev_extent
    ...
      Take a lock on the device tree's root node
      btrfs_cow_block("dev tree's root node");
        btrfs_reserve_extent
          find_free_extent
            index = BTRFS_RAID_DUP;
            have_caching_bg = false;

            While in the LOOP_CACHING_NOWAIT state, assume we find a block
            group that is being cached; have_caching_bg is hence set to true.

            When repeating the search for the next RAID index, we reset
            have_caching_bg to false.

Hence, right after completing the LOOP_CACHING_NOWAIT state, we incorrectly
skip the LOOP_CACHING_WAIT state and move to the LOOP_ALLOC_CHUNK state,
where we allocate a chunk and try to add entries corresponding to the chunk's
physical stripes to the device tree. In doing so, the task deadlocks against
itself, waiting for the blocking lock on the root node of the device tree.

This commit fixes the issue by introducing a new local variable that
indicates whether a block group of any RAID type is being cached.
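[Editor's note] The state-transition bug can be modelled outside the kernel. The following standalone sketch is hypothetical (its names mirror, but are not, the kernel code): it shows how resetting have_caching_bg for each RAID index loses the earlier observation, and how a sticky orig_have_caching_bg preserves it.

```c
#include <stdbool.h>

/* Standalone model (not kernel code) of find_free_extent()'s loop-state
 * decision.  bg_caching[i] records whether a caching block group was
 * seen while scanning RAID index i; the kernel resets have_caching_bg
 * at the start of each index's scan, so only the last index's
 * observation survives the pass. */
enum loop_state { LOOP_CACHING_NOWAIT, LOOP_CACHING_WAIT, LOOP_ALLOC_CHUNK };

static enum loop_state next_state(const bool *bg_caching, int nr_indexes,
                                  bool full_search, bool fixed)
{
    bool have_caching_bg = false;
    bool orig_have_caching_bg = false; /* the fix: sticky across indexes */

    for (int i = 0; i < nr_indexes; i++) {
        have_caching_bg = bg_caching[i]; /* overwritten per RAID index */
        if (have_caching_bg)
            orig_have_caching_bg = true; /* never reset once true */
    }

    /* End of the LOOP_CACHING_NOWAIT pass: decide the next state. */
    bool caching = fixed ? orig_have_caching_bg : have_caching_bg;
    if (caching || !full_search)
        return LOOP_CACHING_WAIT;  /* wait for block-group caching */
    return LOOP_ALLOC_CHUNK;       /* chunk alloc -> possible deadlock */
}
```

With a caching block group seen only at an earlier RAID index, the unfixed decision falls through to LOOP_ALLOC_CHUNK, which is exactly the path that deadlocks on the device tree's root node.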

Signed-off-by: Chandan Rajendra <chandan@linux.vnet.ibm.com>
---
 fs/btrfs/extent-tree.c | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
index 92fdbc6..cac9da8 100644
--- a/fs/btrfs/extent-tree.c
+++ b/fs/btrfs/extent-tree.c
@@ -7027,6 +7027,7 @@ static noinline int find_free_extent(struct btrfs_root *orig_root,
 	bool failed_alloc = false;
 	bool use_cluster = true;
 	bool have_caching_bg = false;
+	bool orig_have_caching_bg = false;
 	bool full_search = false;
 
 	WARN_ON(num_bytes < root->sectorsize);
@@ -7376,6 +7377,9 @@ loop:
 	}
 	up_read(&space_info->groups_sem);
 
+	if ((loop == LOOP_CACHING_NOWAIT) && have_caching_bg && !orig_have_caching_bg)
+		orig_have_caching_bg = true;
+
 	if (!ins->objectid && loop >= LOOP_CACHING_WAIT && have_caching_bg)
 		goto search;
 
@@ -7398,7 +7402,7 @@ loop:
 			 * don't have any unached bgs and we've alrelady done a
 			 * full search through.
 			 */
-			if (have_caching_bg || !full_search)
+			if (orig_have_caching_bg || !full_search)
 				loop = LOOP_CACHING_WAIT;
 			else
 				loop = LOOP_ALLOC_CHUNK;
-- 
2.1.0

--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply related	[flat|nested] 6+ messages in thread

* [PATCH V2] Btrfs: find_free_extent: Do not erroneously skip LOOP_CACHING_WAIT state
  2015-10-31 19:11 [PATCH] Btrfs: find_free_extent: Do not erroneously skip LOOP_CACHING_WAIT state Chandan Rajendra
@ 2015-11-02  8:29 ` Chandan Rajendra
  2015-11-02 15:46   ` Josef Bacik
  2015-11-02 16:52   ` Chris Mason
  0 siblings, 2 replies; 6+ messages in thread
From: Chandan Rajendra @ 2015-11-02  8:29 UTC (permalink / raw)
  To: linux-btrfs; +Cc: Chandan Rajendra, clm, jbacik, chandan

When executing generic/001 in a loop on a ppc64 machine (with both sectorsize
and nodesize set to 64k), the following call trace is observed,

WARNING: at /root/repos/linux/fs/btrfs/locking.c:253
Modules linked in:
CPU: 2 PID: 8353 Comm: umount Not tainted 4.3.0-rc5-13676-ga5e681d #54
task: c0000000f2b1f560 ti: c0000000f6008000 task.ti: c0000000f6008000
NIP: c000000000520c88 LR: c0000000004a3b34 CTR: 0000000000000000
REGS: c0000000f600a820 TRAP: 0700   Not tainted  (4.3.0-rc5-13676-ga5e681d)
MSR: 8000000102029032 <SF,VEC,EE,ME,IR,DR,RI>  CR: 24444884  XER: 00000000
CFAR: c0000000004a3b30 SOFTE: 1
GPR00: c0000000004a3b34 c0000000f600aaa0 c00000000108ac00 c0000000f5a808c0
GPR04: 0000000000000000 c0000000f600ae60 0000000000000000 0000000000000005
GPR08: 00000000000020a1 0000000000000001 c0000000f2b1f560 0000000000000030
GPR12: 0000000084842882 c00000000fdc0900 c0000000f600ae60 c0000000f070b800
GPR16: 0000000000000000 c0000000f3c8a000 0000000000000000 0000000000000049
GPR20: 0000000000000001 0000000000000001 c0000000f5aa01f8 0000000000000000
GPR24: 0f83e0f83e0f83e1 c0000000f5a808c0 c0000000f3c8d000 c000000000000000
GPR28: c0000000f600ae74 0000000000000001 c0000000f3c8d000 c0000000f5a808c0
NIP [c000000000520c88] .btrfs_tree_lock+0x48/0x2a0
LR [c0000000004a3b34] .btrfs_lock_root_node+0x44/0x80
Call Trace:
[c0000000f600aaa0] [c0000000f600ab80] 0xc0000000f600ab80 (unreliable)
[c0000000f600ab80] [c0000000004a3b34] .btrfs_lock_root_node+0x44/0x80
[c0000000f600ac00] [c0000000004a99dc] .btrfs_search_slot+0xa8c/0xc00
[c0000000f600ad40] [c0000000004ab878] .btrfs_insert_empty_items+0x98/0x120
[c0000000f600adf0] [c00000000050da44] .btrfs_finish_chunk_alloc+0x1d4/0x620
[c0000000f600af20] [c0000000004be854] .btrfs_create_pending_block_groups+0x1d4/0x2c0
[c0000000f600b020] [c0000000004bf188] .do_chunk_alloc+0x3c8/0x420
[c0000000f600b100] [c0000000004c27cc] .find_free_extent+0xbfc/0x1030
[c0000000f600b260] [c0000000004c2ce8] .btrfs_reserve_extent+0xe8/0x250
[c0000000f600b330] [c0000000004c2f90] .btrfs_alloc_tree_block+0x140/0x590
[c0000000f600b440] [c0000000004a47b4] .__btrfs_cow_block+0x124/0x780
[c0000000f600b530] [c0000000004a4fc0] .btrfs_cow_block+0xf0/0x250
[c0000000f600b5e0] [c0000000004a917c] .btrfs_search_slot+0x22c/0xc00
[c0000000f600b720] [c00000000050aa40] .btrfs_remove_chunk+0x1b0/0x9f0
[c0000000f600b850] [c0000000004c4e04] .btrfs_delete_unused_bgs+0x434/0x570
[c0000000f600b950] [c0000000004d3cb8] .close_ctree+0x2e8/0x3b0
[c0000000f600ba20] [c00000000049d178] .btrfs_put_super+0x18/0x30
[c0000000f600ba90] [c000000000243cd4] .generic_shutdown_super+0xa4/0x1a0
[c0000000f600bb10] [c0000000002441d8] .kill_anon_super+0x18/0x30
[c0000000f600bb90] [c00000000049c898] .btrfs_kill_super+0x18/0xc0
[c0000000f600bc10] [c0000000002444f8] .deactivate_locked_super+0x98/0xe0
[c0000000f600bc90] [c000000000269f94] .cleanup_mnt+0x54/0xa0
[c0000000f600bd10] [c0000000000bd744] .task_work_run+0xc4/0x100
[c0000000f600bdb0] [c000000000016334] .do_notify_resume+0x74/0x80
[c0000000f600be30] [c0000000000098b8] .ret_from_except_lite+0x64/0x68
Instruction dump:
fba1ffe8 fbc1fff0 fbe1fff8 7c791b78 f8010010 f821ff21 e94d0290 81030040
812a04e8 7d094a78 7d290034 5529d97e <0b090000> 3b400000 3be30050 3bc3004c

The above call trace is also seen on x86_64, albeit very rarely, and only
with nodesize set to 64k and the nospace_cache mount option in use.

The cause of the above call trace is:

btrfs_remove_chunk
  check_system_chunk
    Allocate a chunk if required
  For each physical stripe on the underlying device,
    btrfs_free_dev_extent
      ...
      Take a lock on the device tree's root node
      btrfs_cow_block("dev tree's root node");
        btrfs_reserve_extent
          find_free_extent
            index = BTRFS_RAID_DUP;
            have_caching_bg = false;

            While in the LOOP_CACHING_NOWAIT state, assume we find a block
            group that is being cached; have_caching_bg is hence set to true.

            When repeating the search for the next RAID index, we reset
            have_caching_bg to false.

Hence, right after completing the LOOP_CACHING_NOWAIT state, we incorrectly
skip the LOOP_CACHING_WAIT state and move to the LOOP_ALLOC_CHUNK state,
where we allocate a chunk and try to add entries corresponding to the chunk's
physical stripes to the device tree. In doing so, the task deadlocks against
itself, waiting for the blocking lock on the root node of the device tree.

This commit fixes the issue by introducing a new local variable that
indicates whether a block group of any RAID type is being cached.

Signed-off-by: Chandan Rajendra <chandan@linux.vnet.ibm.com>
---
Changelog:
v1->v2: Honor 80 column restriction.

 fs/btrfs/extent-tree.c | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
index f50c7c2..99a8e57 100644
--- a/fs/btrfs/extent-tree.c
+++ b/fs/btrfs/extent-tree.c
@@ -7029,6 +7029,7 @@ static noinline int find_free_extent(struct btrfs_root *orig_root,
 	bool failed_alloc = false;
 	bool use_cluster = true;
 	bool have_caching_bg = false;
+	bool orig_have_caching_bg = false;
 	bool full_search = false;
 
 	WARN_ON(num_bytes < root->sectorsize);
@@ -7378,6 +7379,10 @@ loop:
 	}
 	up_read(&space_info->groups_sem);
 
+	if ((loop == LOOP_CACHING_NOWAIT) && have_caching_bg
+		&& !orig_have_caching_bg)
+		orig_have_caching_bg = true;
+
 	if (!ins->objectid && loop >= LOOP_CACHING_WAIT && have_caching_bg)
 		goto search;
 
@@ -7400,7 +7405,7 @@ loop:
 			 * don't have any unached bgs and we've alrelady done a
 			 * full search through.
 			 */
-			if (have_caching_bg || !full_search)
+			if (orig_have_caching_bg || !full_search)
 				loop = LOOP_CACHING_WAIT;
 			else
 				loop = LOOP_ALLOC_CHUNK;
-- 
2.1.0


^ permalink raw reply related	[flat|nested] 6+ messages in thread

* Re: [PATCH V2] Btrfs: find_free_extent: Do not erroneously skip LOOP_CACHING_WAIT state
  2015-11-02  8:29 ` [PATCH V2] " Chandan Rajendra
@ 2015-11-02 15:46   ` Josef Bacik
  2015-11-02 16:52   ` Chris Mason
  1 sibling, 0 replies; 6+ messages in thread
From: Josef Bacik @ 2015-11-02 15:46 UTC (permalink / raw)
  To: Chandan Rajendra, linux-btrfs; +Cc: clm, chandan

On 11/02/2015 03:29 AM, Chandan Rajendra wrote:
> When executing generic/001 in a loop on a ppc64 machine (with both sectorsize
> and nodesize set to 64k), the following call trace is observed,
>
> [snip -- full patch quoted above]

Ah good catch, thanks Chandan,

Reviewed-by: Josef Bacik <jbacik@fb.com>

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH V2] Btrfs: find_free_extent: Do not erroneously skip LOOP_CACHING_WAIT state
  2015-11-02  8:29 ` [PATCH V2] " Chandan Rajendra
  2015-11-02 15:46   ` Josef Bacik
@ 2015-11-02 16:52   ` Chris Mason
       [not found]     ` <CAOcd+r3ku8mhfK_ZvZRL2pRSZ=Y1Q2maNF_ob+8fEA=2Etm1Lg@mail.gmail.com>
  1 sibling, 1 reply; 6+ messages in thread
From: Chris Mason @ 2015-11-02 16:52 UTC (permalink / raw)
  To: Chandan Rajendra; +Cc: linux-btrfs, jbacik, chandan

On Mon, Nov 02, 2015 at 01:59:46PM +0530, Chandan Rajendra wrote:
> When executing generic/001 in a loop on a ppc64 machine (with both sectorsize
> and nodesize set to 64k), the following call trace is observed,

Thanks Chandan, I hit this same trace on x86-64 with 16K nodes.

-chris

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH V2] Btrfs: find_free_extent: Do not erroneously skip LOOP_CACHING_WAIT state
       [not found]     ` <CAOcd+r3ku8mhfK_ZvZRL2pRSZ=Y1Q2maNF_ob+8fEA=2Etm1Lg@mail.gmail.com>
@ 2015-12-13 10:18       ` Alex Lyakas
  2015-12-14  7:59         ` Chandan Rajendra
  0 siblings, 1 reply; 6+ messages in thread
From: Alex Lyakas @ 2015-12-13 10:18 UTC (permalink / raw)
  To: Chris Mason, Chandan Rajendra, linux-btrfs, Josef Bacik, chandan

[Resending in plain text, apologies.]

Hi Chandan, Josef, Chris,

I am not sure I understand the fix to the problem.

It may happen that, when updating the device tree, we need to allocate a new
chunk via do_chunk_alloc() (while we are holding the device tree's root node
locked). This is a legitimate thing for find_free_extent() to do. And the
do_chunk_alloc() call may lead to a call to
btrfs_create_pending_block_groups(), which will try to update the device
tree. This may happen through the direct call to
btrfs_create_pending_block_groups() in do_chunk_alloc(), or perhaps through
the __btrfs_end_transaction() that find_free_extent() performs after
completing the chunk allocation (although in that case it will use the
transaction that already exists in current->journal_info).
So may the deadlock still happen?

Thanks,
 Alex.

>
>
> On Mon, Nov 2, 2015 at 6:52 PM, Chris Mason <clm@fb.com> wrote:
>>
>> On Mon, Nov 02, 2015 at 01:59:46PM +0530, Chandan Rajendra wrote:
>> > When executing generic/001 in a loop on a ppc64 machine (with both
>> > sectorsize
>> > and nodesize set to 64k), the following call trace is observed,
>>
>> Thanks Chandan, I hit this same trace on x86-64 with 16K nodes.
>>
>> -chris
>
>

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH V2] Btrfs: find_free_extent: Do not erroneously skip LOOP_CACHING_WAIT state
  2015-12-13 10:18       ` Alex Lyakas
@ 2015-12-14  7:59         ` Chandan Rajendra
  0 siblings, 0 replies; 6+ messages in thread
From: Chandan Rajendra @ 2015-12-14  7:59 UTC (permalink / raw)
  To: Alex Lyakas; +Cc: Chris Mason, linux-btrfs, Josef Bacik, chandan

On Sunday 13 Dec 2015 12:18:55 Alex Lyakas wrote:
> [Resending in plain text, apologies.]
> 
> Hi Chandan, Josef, Chris,
> 
> I am not sure I understand the fix to the problem.
> 
> It may happen that when updating the device tree, we need to allocate a new
> chunk via do_chunk_alloc (while we are holding the device tree root node
> locked). This is a legitimate thing for find_free_extent() to do. And
> do_chunk_alloc() call may lead to call to
> btrfs_create_pending_block_groups(), which will try to update the device
> tree. This may happen due to direct call to
> btrfs_create_pending_block_groups() that exists in do_chunk_alloc(), or
> perhaps by __btrfs_end_transaction() that find_free_extent() does after it
> completed chunk allocation (although in this case it will use the
> transaction that already exists in current->journal_info).
> So the deadlock still may happen?

Hello Alex,

The "global block reservation" (see btrfs_fs_info->global_block_rsv) aims to
solve this problem. I don't claim to understand the behaviour of
global_block_rsv completely. However, the global block reservation makes sure
that we have enough free space reserved (see update_global_block_rsv()) for
future operations on:
- the extent tree,
- the checksum tree,
- the device tree,
- the tree root tree, and
- the quota tree.

Tasks changing the device tree should have their space requirements satisfied
from the global block reservation. Hence, such changes to the device tree
should not end up forcing find_free_extent() to allocate a new chunk.
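[Editor's note] The reservation idea can be illustrated with a toy model. All names below are hypothetical, not the kernel's actual block-reservation API: space for critical tree updates is set aside up front, so the update never has to allocate a fresh chunk while holding tree locks.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy model of a pre-reserved pool for critical metadata updates.
 * Names are illustrative only; this is not the kernel's API. */
struct toy_block_rsv {
    size_t reserved; /* bytes set aside ahead of time */
};

/* Satisfy a critical allocation from the pool.  Returning false means
 * the pool was undersized and the caller would be forced into a fresh
 * chunk allocation -- the unsafe path the reservation exists to avoid. */
static bool toy_use_rsv(struct toy_block_rsv *rsv, size_t bytes)
{
    if (rsv->reserved < bytes)
        return false;
    rsv->reserved -= bytes;
    return true;
}
```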

-- 
chandan


^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2015-12-14  7:59 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-10-31 19:11 [PATCH] Btrfs: find_free_extent: Do not erroneously skip LOOP_CACHING_WAIT state Chandan Rajendra
2015-11-02  8:29 ` [PATCH V2] " Chandan Rajendra
2015-11-02 15:46   ` Josef Bacik
2015-11-02 16:52   ` Chris Mason
     [not found]     ` <CAOcd+r3ku8mhfK_ZvZRL2pRSZ=Y1Q2maNF_ob+8fEA=2Etm1Lg@mail.gmail.com>
2015-12-13 10:18       ` Alex Lyakas
2015-12-14  7:59         ` Chandan Rajendra
