linux-btrfs.vger.kernel.org archive mirror
* [PATCH v2 0/2] Free space tree space reservation fixes
@ 2021-12-02 20:34 Josef Bacik
  2021-12-02 20:34 ` [PATCH v2 1/2] btrfs: include the free space tree in the global rsv minimum calculation Josef Bacik
                   ` (4 more replies)
  0 siblings, 5 replies; 9+ messages in thread
From: Josef Bacik @ 2021-12-02 20:34 UTC
  To: linux-btrfs, kernel-team

v1->v2:
- Updated the changelog for "btrfs: reserve extra space for the free space tree"
  to make it clear why we're doubling the space reservation per Nikolay's request.

--- Original email ---
Hello,

Filipe reported a problem where he was getting an ENOSPC abort when running
delayed refs for generic/619.  There are two reasons for this.  First,
generic/619 creates a very small file system, and our global block rsv
calculation doesn't take into account the size of the free space tree.  Thus we
could get into a situation where the global block rsv was not enough to handle
the overflow.

Second, we simply do not reserve space for the free space tree modifications.
Fix this by making sure any free space tree root has its block rsv set to the
delayed refs rsv, and then make sure that if we have the free space tree enabled
we're reserving extra space for those operations.

With these patches the problem Filipe was hitting went away.  Thanks,

Josef

Josef Bacik (2):
  btrfs: include the free space tree in the global rsv minimum
    calculation
  btrfs: reserve extra space for the free space tree

 fs/btrfs/block-rsv.c   | 31 ++++++++++++++++++-------------
 fs/btrfs/delayed-ref.c | 22 ++++++++++++++++++++++
 2 files changed, 40 insertions(+), 13 deletions(-)

-- 
2.26.3



* [PATCH v2 1/2] btrfs: include the free space tree in the global rsv minimum calculation
  2021-12-02 20:34 [PATCH v2 0/2] Free space tree space reservation fixes Josef Bacik
@ 2021-12-02 20:34 ` Josef Bacik
  2021-12-02 20:34 ` [PATCH v2 2/2] btrfs: reserve extra space for the free space tree Josef Bacik
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 9+ messages in thread
From: Josef Bacik @ 2021-12-02 20:34 UTC
  To: linux-btrfs, kernel-team

Filipe reported a problem where generic/619 was failing with an ENOSPC
abort while running delayed refs, like the following:

------------[ cut here ]------------
BTRFS: Transaction aborted (error -28)
WARNING: CPU: 3 PID: 522920 at fs/btrfs/free-space-tree.c:1049 add_to_free_space_tree+0xe5/0x110 [btrfs]
CPU: 3 PID: 522920 Comm: kworker/u16:19 Tainted: G        W         5.16.0-rc2-btrfs-next-106 #1
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.14.0-0-g155821a1990b-prebuilt.qemu.org 04/01/2014
Workqueue: events_unbound btrfs_async_reclaim_metadata_space [btrfs]
RIP: 0010:add_to_free_space_tree+0xe5/0x110 [btrfs]
RSP: 0000:ffffa65087fb7b20 EFLAGS: 00010282
RAX: 0000000000000000 RBX: 0000000000001000 RCX: 0000000000000000
RDX: 0000000000000001 RSI: ffffffff9131eeaa RDI: 00000000ffffffff
RBP: ffff8d62e26481b8 R08: ffffffff9ad97ce0 R09: 0000000000000001
R10: 0000000000000000 R11: 0000000000000001 R12: 00000000ffffffe4
R13: ffff8d61c25fe688 R14: ffff8d61ebd88800 R15: ffff8d61ebd88a90
FS:  0000000000000000(0000) GS:ffff8d64ed400000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007fa46a8b1000 CR3: 0000000148d18003 CR4: 0000000000370ee0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
 <TASK>
 __btrfs_free_extent+0x516/0x950 [btrfs]
 __btrfs_run_delayed_refs+0x2b1/0x1250 [btrfs]
 btrfs_run_delayed_refs+0x86/0x210 [btrfs]
 flush_space+0x403/0x630 [btrfs]
 ? call_rcu_tasks_generic+0x50/0x80
 ? lock_release+0x223/0x4a0
 ? btrfs_get_alloc_profile+0xb5/0x290 [btrfs]
 ? do_raw_spin_unlock+0x4b/0xa0
 btrfs_async_reclaim_metadata_space+0x139/0x320 [btrfs]
 process_one_work+0x24c/0x5b0
 worker_thread+0x55/0x3c0
 ? process_one_work+0x5b0/0x5b0
 kthread+0x17c/0x1a0
 ? set_kthread_struct+0x40/0x40
 ret_from_fork+0x22/0x30

There are a couple of reasons for this, but in generic/619's case the
main reason is that it is a very small file system, and we do not
reserve enough space for the global reserve.

With the free space tree enabled we now have an additional tree that we
need to modify when running delayed refs.  This means the global reserve
needs to take this into account when it calculates the minimum size it
needs to be.  This is especially important for very small file systems.

Fix this by adjusting the minimum global block rsv size math to include
the size of the free space tree when calculating the size.

Signed-off-by: Josef Bacik <josef@toxicpanda.com>
---
 fs/btrfs/block-rsv.c | 30 +++++++++++++++++-------------
 1 file changed, 17 insertions(+), 13 deletions(-)

diff --git a/fs/btrfs/block-rsv.c b/fs/btrfs/block-rsv.c
index 21ac60ec19f6..b3086f252ad0 100644
--- a/fs/btrfs/block-rsv.c
+++ b/fs/btrfs/block-rsv.c
@@ -352,25 +352,29 @@ void btrfs_update_global_block_rsv(struct btrfs_fs_info *fs_info)
 {
 	struct btrfs_block_rsv *block_rsv = &fs_info->global_block_rsv;
 	struct btrfs_space_info *sinfo = block_rsv->space_info;
-	struct btrfs_root *extent_root = btrfs_extent_root(fs_info, 0);
-	struct btrfs_root *csum_root = btrfs_csum_root(fs_info, 0);
-	u64 num_bytes;
-	unsigned min_items;
+	struct btrfs_root *root, *tmp;
+	u64 num_bytes = btrfs_root_used(&fs_info->tree_root->root_item);
+	unsigned int min_items = 1;
 
 	/*
 	 * The global block rsv is based on the size of the extent tree, the
 	 * checksum tree and the root tree.  If the fs is empty we want to set
 	 * it to a minimal amount for safety.
+	 *
+	 * We also are going to need to modify the minimum of the tree root and
+	 * any global roots we could touch.
 	 */
-	num_bytes = btrfs_root_used(&extent_root->root_item) +
-		btrfs_root_used(&csum_root->root_item) +
-		btrfs_root_used(&fs_info->tree_root->root_item);
-
-	/*
-	 * We at a minimum are going to modify the csum root, the tree root, and
-	 * the extent root.
-	 */
-	min_items = 3;
+	read_lock(&fs_info->global_root_lock);
+	rbtree_postorder_for_each_entry_safe(root, tmp, &fs_info->global_root_tree,
+					     rb_node) {
+		if (root->root_key.objectid == BTRFS_EXTENT_TREE_OBJECTID ||
+		    root->root_key.objectid == BTRFS_CSUM_TREE_OBJECTID ||
+		    root->root_key.objectid == BTRFS_FREE_SPACE_TREE_OBJECTID) {
+			num_bytes += btrfs_root_used(&root->root_item);
+			min_items++;
+		}
+	}
+	read_unlock(&fs_info->global_root_lock);
 
 	/*
 	 * But we also want to reserve enough space so we can do the fallback
-- 
2.26.3
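
For readers following along, the calculation this patch changes can be
sketched in userspace C roughly as below.  This is only an illustration
under stated assumptions: the names (sketch_root, sketch_minimum,
sketch_global_rsv_inputs) are hypothetical, the object ID values are
illustrative stand-ins, and a plain array replaces the rbtree walk and
locking from the diff.  It starts from the tree root's usage with one
item, then adds the bytes used and one more item for every extent, csum
or free space tree root it finds.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the kernel object IDs (illustrative values). */
enum sketch_objectid {
	SKETCH_EXTENT_TREE = 2,
	SKETCH_CSUM_TREE = 7,
	SKETCH_FREE_SPACE_TREE = 10,
	SKETCH_OTHER_TREE = 99,
};

/* Hypothetical, much reduced view of a global root. */
struct sketch_root {
	enum sketch_objectid objectid;
	uint64_t bytes_used;
};

/* The two inputs the minimum size math is built from. */
struct sketch_minimum {
	uint64_t num_bytes;
	unsigned int min_items;
};

static struct sketch_minimum
sketch_global_rsv_inputs(uint64_t tree_root_used,
			 const struct sketch_root *roots, size_t nr_roots)
{
	/* The tree root always counts: its usage plus one item. */
	struct sketch_minimum m = {
		.num_bytes = tree_root_used,
		.min_items = 1,
	};
	size_t i;

	assert(roots != NULL || nr_roots == 0);

	for (i = 0; i < nr_roots; i++) {
		switch (roots[i].objectid) {
		case SKETCH_EXTENT_TREE:
		case SKETCH_CSUM_TREE:
		case SKETCH_FREE_SPACE_TREE:
			/* Every global root we may touch adds size and an item. */
			m.num_bytes += roots[i].bytes_used;
			m.min_items++;
			break;
		default:
			break;
		}
	}
	return m;
}
```

With one extent root, one csum root and one free space tree root, the
sketch counts 4 items instead of the previous fixed 3, which is the part
that matters most on a very small file system.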



* [PATCH v2 2/2] btrfs: reserve extra space for the free space tree
  2021-12-02 20:34 [PATCH v2 0/2] Free space tree space reservation fixes Josef Bacik
  2021-12-02 20:34 ` [PATCH v2 1/2] btrfs: include the free space tree in the global rsv minimum calculation Josef Bacik
@ 2021-12-02 20:34 ` Josef Bacik
  2021-12-06 10:44   ` Filipe Manana
  2021-12-03 13:09 ` [PATCH v2 0/2] Free space tree space reservation fixes Nikolay Borisov
                   ` (2 subsequent siblings)
  4 siblings, 1 reply; 9+ messages in thread
From: Josef Bacik @ 2021-12-02 20:34 UTC
  To: linux-btrfs, kernel-team

Filipe reported a problem where sometimes he'd get an ENOSPC abort when
running delayed refs with generic/619 and the free space tree enabled.
This is partly because we do not reserve space for modifying the free
space tree, nor do we have a block rsv associated with that tree.

The delayed_refs_rsv tracks the amount of space required to run delayed
refs.  Its accounting assumes that 1 modification means 1 change to the
extent root.  With the free space tree this turns into 2 changes, because
modifying 1 extent means updating the extent tree and potentially updating
the free space tree to either remove that entry or add the free space.
Thus if we have the FST enabled, simply double the reservation size for
our modification.

Signed-off-by: Josef Bacik <josef@toxicpanda.com>
---
 fs/btrfs/block-rsv.c   |  1 +
 fs/btrfs/delayed-ref.c | 22 ++++++++++++++++++++++
 2 files changed, 23 insertions(+)

diff --git a/fs/btrfs/block-rsv.c b/fs/btrfs/block-rsv.c
index b3086f252ad0..b3ee49b0b1e8 100644
--- a/fs/btrfs/block-rsv.c
+++ b/fs/btrfs/block-rsv.c
@@ -426,6 +426,7 @@ void btrfs_init_root_block_rsv(struct btrfs_root *root)
 	switch (root->root_key.objectid) {
 	case BTRFS_CSUM_TREE_OBJECTID:
 	case BTRFS_EXTENT_TREE_OBJECTID:
+	case BTRFS_FREE_SPACE_TREE_OBJECTID:
 		root->block_rsv = &fs_info->delayed_refs_rsv;
 		break;
 	case BTRFS_ROOT_TREE_OBJECTID:
diff --git a/fs/btrfs/delayed-ref.c b/fs/btrfs/delayed-ref.c
index da9d20813147..533521be8fdf 100644
--- a/fs/btrfs/delayed-ref.c
+++ b/fs/btrfs/delayed-ref.c
@@ -84,6 +84,17 @@ void btrfs_delayed_refs_rsv_release(struct btrfs_fs_info *fs_info, int nr)
 	u64 num_bytes = btrfs_calc_insert_metadata_size(fs_info, nr);
 	u64 released = 0;
 
+	/*
+	 * We have to check the mount option here because we could be enabling
+	 * the free space tree for the first time and don't have the compat_ro
+	 * option set yet.
+	 *
+	 * We need extra reservations if we have the free space tree because
+	 * we'll have to modify that tree as well.
+	 */
+	if (btrfs_test_opt(fs_info, FREE_SPACE_TREE))
+		num_bytes <<= 1;
+
 	released = btrfs_block_rsv_release(fs_info, block_rsv, num_bytes, NULL);
 	if (released)
 		trace_btrfs_space_reservation(fs_info, "delayed_refs_rsv",
@@ -108,6 +119,17 @@ void btrfs_update_delayed_refs_rsv(struct btrfs_trans_handle *trans)
 
 	num_bytes = btrfs_calc_insert_metadata_size(fs_info,
 						    trans->delayed_ref_updates);
+	/*
+	 * We have to check the mount option here because we could be enabling
+	 * the free space tree for the first time and don't have the compat_ro
+	 * option set yet.
+	 *
+	 * We need extra reservations if we have the free space tree because
+	 * we'll have to modify that tree as well.
+	 */
+	if (btrfs_test_opt(fs_info, FREE_SPACE_TREE))
+		num_bytes <<= 1;
+
 	spin_lock(&delayed_rsv->lock);
 	delayed_rsv->size += num_bytes;
 	delayed_rsv->full = 0;
-- 
2.26.3



* Re: [PATCH v2 0/2] Free space tree space reservation fixes
  2021-12-02 20:34 [PATCH v2 0/2] Free space tree space reservation fixes Josef Bacik
  2021-12-02 20:34 ` [PATCH v2 1/2] btrfs: include the free space tree in the global rsv minimum calculation Josef Bacik
  2021-12-02 20:34 ` [PATCH v2 2/2] btrfs: reserve extra space for the free space tree Josef Bacik
@ 2021-12-03 13:09 ` Nikolay Borisov
  2021-12-06 10:42 ` Filipe Manana
  2021-12-07 18:59 ` David Sterba
  4 siblings, 0 replies; 9+ messages in thread
From: Nikolay Borisov @ 2021-12-03 13:09 UTC
  To: Josef Bacik, linux-btrfs, kernel-team



On 2.12.21 г. 22:34, Josef Bacik wrote:
> v1->v2:
> - Updated the changelog for "btrfs: reserve extra space for free space tree" to
>   make it clear why we're doubling the space reservation per Nikolay's request.
> 
> --- Original email ---
> Hello,
> 
> Filipe reported a problem where he was getting an ENOSPC abort when running
> delayed refs for generic/619.  This is because of two reasons, first generic/619
> creates a very small file system, and our global block rsv calculation doesn't
> take into account the size of the free space tree.  Thus we could get into a
> situation where the global block rsv was not enough to handle the overflow.
> 
> The second is because we simply do not reserve space for the free space tree
> modifications.  Fix this by making sure any free space tree root has their block
> rsv set to the delayed refs rsv, and then make sure if we have the free space
> tree enabled we're reserving extra space for those operations.
> 
> With these patches the problem Filipe was hitting went away.  Thanks,
> 
> Josef
> 
> Josef Bacik (2):
>   btrfs: include the free space tree in the global rsv minimum
>     calculation
>   btrfs: reserve extra space for the free space tree
> 
>  fs/btrfs/block-rsv.c   | 31 ++++++++++++++++++-------------
>  fs/btrfs/delayed-ref.c | 22 ++++++++++++++++++++++
>  2 files changed, 40 insertions(+), 13 deletions(-)
> 


For the whole series:

Reviewed-by: Nikolay Borisov <nborisov@suse.com>


* Re: [PATCH v2 0/2] Free space tree space reservation fixes
  2021-12-02 20:34 [PATCH v2 0/2] Free space tree space reservation fixes Josef Bacik
                   ` (2 preceding siblings ...)
  2021-12-03 13:09 ` [PATCH v2 0/2] Free space tree space reservation fixes Nikolay Borisov
@ 2021-12-06 10:42 ` Filipe Manana
  2021-12-06 19:54   ` Josef Bacik
  2021-12-07 18:59 ` David Sterba
  4 siblings, 1 reply; 9+ messages in thread
From: Filipe Manana @ 2021-12-06 10:42 UTC
  To: Josef Bacik; +Cc: linux-btrfs, kernel-team

On Thu, Dec 02, 2021 at 03:34:30PM -0500, Josef Bacik wrote:
> v1->v2:
> - Updated the changelog for "btrfs: reserve extra space for free space tree" to
>   make it clear why we're doubling the space reservation per Nikolay's request.
> 
> --- Original email ---
> Hello,
> 
> Filipe reported a problem where he was getting an ENOSPC abort when running
> delayed refs for generic/619.  This is because of two reasons, first generic/619
> creates a very small file system, and our global block rsv calculation doesn't
> take into account the size of the free space tree.  Thus we could get into a
> situation where the global block rsv was not enough to handle the overflow.
> 
> The second is because we simply do not reserve space for the free space tree
> modifications.  Fix this by making sure any free space tree root has their block
> rsv set to the delayed refs rsv, and then make sure if we have the free space
> tree enabled we're reserving extra space for those operations.
> 
> With these patches the problem Filipe was hitting went away.  Thanks,

It went away, but the patchset often brings some leaks.
For example, generic/648 often triggers those leaks:

[267436.763282] BTRFS info (device loop0): forced readonly
[267436.763934] BTRFS warning (device loop0): Skipping commit of aborted transaction.
[267436.764874] BTRFS: error (device loop0) in cleanup_transaction:1913: errno=-5 IO failure
[267438.978412] ------------[ cut here ]------------
[267438.979610] WARNING: CPU: 3 PID: 44901 at fs/btrfs/block-group.c:127 btrfs_put_block_group+0x77/0xb0 [btrfs]
[267438.982274] Modules linked in: overlay dm_zero dm_snapshot dm_thin_pool dm_persistent_data dm_bio_prison dm_bufio dm_log_writes dm_dust dm_flakey dm_mod loop btrfs blake2b_generic xor raid6_pq libcrc32c intel_rapl_msr intel_rapl_common bochs drm_vram_helper crct10dif_pclmul ghash_clmulni_intel drm_ttm_helper aesni_intel ttm crypto_simd ppdev cryptd drm_kms_helper sg input_leds parport_pc led_class joydev parport serio_raw evdev button pcspkr qemu_fw_cfg drm ip_tables x_tables autofs4 ext4 crc32c_generic crc16 mbcache jbd2 sd_mod t10_pi virtio_net net_failover failover virtio_scsi ata_generic ata_piix crc32_pclmul libata virtio_pci crc32c_intel virtio_pci_legacy_dev virtio_pci_modern_dev virtio_ring virtio psmouse scsi_mod i2c_piix4 scsi_common [last unloaded: scsi_debug]
[267438.994384] CPU: 3 PID: 44901 Comm: umount Not tainted 5.16.0-rc3-btrfs-next-107 #1
[267438.995545] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.14.0-0-g155821a1990b-prebuilt.qemu.org 04/01/2014
[267438.997171] RIP: 0010:btrfs_put_block_group+0x77/0xb0 [btrfs]
[267438.998031] Code: 21 48 8b bd 80 01 00 00 e8 36 a8 03 e3 48 8b bd 08 04 00 00 e8 2a a8 03 e3 48 89 ef 5d e9 21 a8 03 e3 0f 0b eb db 0f 0b eb b1 <0f> 0b eb b4 0f 0b 48 8b 45 00 48 89 ee 48 8d b8 f0 17 00 00 e8 b0
[267439.000593] RSP: 0018:ffffb06981af7dd0 EFLAGS: 00010206
[267439.001613] RAX: 0000000000000001 RBX: ffff9caa8c754000 RCX: ffff9caa5db739c8
[267439.002523] RDX: 0000000000000001 RSI: ffffffffc0afd6c7 RDI: ffff9caa5db73800
[267439.003455] RBP: ffff9caa5db73800 R08: 0000000000000000 R09: 0000000000000000
[267439.004359] R10: 0000000000000246 R11: 0000000000000000 R12: ffff9caa8c754148
[267439.005581] R13: ffff9caa8c754198 R14: ffff9caa5db73988 R15: dead000000000100
[267439.006497] FS:  00007fa77deb4800(0000) GS:ffff9cad6d400000(0000) knlGS:0000000000000000
[267439.007603] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[267439.008341] CR2: 00007fff383e4cf8 CR3: 00000002ede58001 CR4: 0000000000370ee0
[267439.009321] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[267439.010658] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[267439.011983] Call Trace:
[267439.012459]  <TASK>
[267439.012874]  btrfs_free_block_groups+0x255/0x3c0 [btrfs]
[267439.013941]  close_ctree+0x301/0x357 [btrfs]
[267439.014791]  generic_shutdown_super+0x74/0x120
[267439.015636]  kill_anon_super+0x14/0x30
[267439.016349]  btrfs_kill_super+0x12/0x20 [btrfs]
[267439.017244]  deactivate_locked_super+0x31/0xa0
[267439.018085]  cleanup_mnt+0x147/0x1c0
[267439.018767]  task_work_run+0x5c/0xa0
[267439.019448]  exit_to_user_mode_prepare+0x1e5/0x1f0
[267439.020320]  syscall_exit_to_user_mode+0x16/0x40
[267439.020911]  do_syscall_64+0x48/0xc0
[267439.021466]  entry_SYSCALL_64_after_hwframe+0x44/0xae
[267439.022127] RIP: 0033:0x7fa77e0f6a97
[267439.022601] Code: 03 0c 00 f7 d8 64 89 01 48 83 c8 ff c3 66 0f 1f 44 00 00 31 f6 e9 09 00 00 00 66 0f 1f 84 00 00 00 00 00 b8 a6 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 01 c3 48 8b 15 a1 03 0c 00 f7 d8 64 89 02 b8
[267439.024955] RSP: 002b:00007fff383e5d28 EFLAGS: 00000246 ORIG_RAX: 00000000000000a6
[267439.025954] RAX: 0000000000000000 RBX: 00007fa77e21c264 RCX: 00007fa77e0f6a97
[267439.026866] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 000055e31788edd0
[267439.027784] RBP: 000055e31788eba0 R08: 0000000000000000 R09: 00007fff383e4aa0
[267439.028702] R10: 00007fa77e17bfc0 R11: 0000000000000246 R12: 0000000000000000
[267439.029729] R13: 000055e31788edd0 R14: 000055e31788ecb0 R15: 0000000000000000
[267439.030798]  </TASK>
[267439.031130] irq event stamp: 0
[267439.031559] hardirqs last  enabled at (0): [<0000000000000000>] 0x0
[267439.032450] hardirqs last disabled at (0): [<ffffffffa3894214>] copy_process+0x934/0x2040
[267439.033540] softirqs last  enabled at (0): [<ffffffffa3894214>] copy_process+0x934/0x2040
[267439.034578] softirqs last disabled at (0): [<0000000000000000>] 0x0
[267439.035380] ---[ end trace 63cff29aa6aacf3d ]---
[267439.036050] ------------[ cut here ]------------
[267439.036653] WARNING: CPU: 3 PID: 44901 at fs/btrfs/block-group.c:3976 btrfs_free_block_groups+0x330/0x3c0 [btrfs]
[267439.038057] Modules linked in: overlay dm_zero dm_snapshot dm_thin_pool dm_persistent_data dm_bio_prison dm_bufio dm_log_writes dm_dust dm_flakey dm_mod loop btrfs blake2b_generic xor raid6_pq libcrc32c intel_rapl_msr intel_rapl_common bochs drm_vram_helper crct10di>
[267439.046505] CPU: 3 PID: 44901 Comm: umount Tainted: G        W         5.16.0-rc3-btrfs-next-107 #1
[267439.047636] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.14.0-0-g155821a1990b-prebuilt.qemu.org 04/01/2014
[267439.049099] RIP: 0010:btrfs_free_block_groups+0x330/0x3c0 [btrfs]
[267439.049930] Code: 00 00 00 ad de 49 be 22 01 00 00 00 00 ad de e8 76 f0 7c e3 48 89 df e8 4e 85 ff ff 48 8b 83 b0 12 00 00 49 39 c5 75 51 eb 7d <0f> 0b 31 c9 31 d2 4c 89 e6 48 89 df e8 8f 75 ff ff 48 83 7d 40 00
[267439.052257] RSP: 0018:ffffb06981af7de0 EFLAGS: 00010206
[267439.052933] RAX: ffff9cacc606ccb0 RBX: ffff9caa8c754000 RCX: 0000000000000000
[267439.053935] RDX: 0000000000000001 RSI: ffffffffa3b32cd7 RDI: 00000000ffffffff
[267439.054882] RBP: ffff9cacc606ccb0 R08: 0000000000000000 R09: 0000000000000000
[267439.055778] R10: 0000000000000246 R11: 0000000000000001 R12: ffff9cacc606cc00
[267439.056686] R13: ffff9caa8c7552b0 R14: dead000000000122 R15: dead000000000100
[267439.057628] FS:  00007fa77deb4800(0000) GS:ffff9cad6d400000(0000) knlGS:0000000000000000
[267439.058648] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[267439.059391] CR2: 00007fff383e4cf8 CR3: 00000002ede58001 CR4: 0000000000370ee0
[267439.060313] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[267439.061287] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[267439.062672] Call Trace:
[267439.063161]  <TASK>
[267439.063624]  close_ctree+0x301/0x357 [btrfs]
[267439.064953]  generic_shutdown_super+0x74/0x120
[267439.065842]  kill_anon_super+0x14/0x30
[267439.066581]  btrfs_kill_super+0x12/0x20 [btrfs]
[267439.067468]  deactivate_locked_super+0x31/0xa0
[267439.068311]  cleanup_mnt+0x147/0x1c0
[267439.069004]  task_work_run+0x5c/0xa0
[267439.069711]  exit_to_user_mode_prepare+0x1e5/0x1f0
[267439.070627]  syscall_exit_to_user_mode+0x16/0x40
[267439.071500]  do_syscall_64+0x48/0xc0
[267439.072179]  entry_SYSCALL_64_after_hwframe+0x44/0xae
[267439.073135] RIP: 0033:0x7fa77e0f6a97
[267439.073837] Code: 03 0c 00 f7 d8 64 89 01 48 83 c8 ff c3 66 0f 1f 44 00 00 31 f6 e9 09 00 00 00 66 0f 1f 84 00 00 00 00 00 b8 a6 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 01 c3 48 8b 15 a1 03 0c 00 f7 d8 64 89 02 b8
[267439.077297] RSP: 002b:00007fff383e5d28 EFLAGS: 00000246 ORIG_RAX: 00000000000000a6
[267439.078704] RAX: 0000000000000000 RBX: 00007fa77e21c264 RCX: 00007fa77e0f6a97
[267439.080030] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 000055e31788edd0
[267439.081379] RBP: 000055e31788eba0 R08: 0000000000000000 R09: 00007fff383e4aa0
[267439.082710] R10: 00007fa77e17bfc0 R11: 0000000000000246 R12: 0000000000000000
[267439.084039] R13: 000055e31788edd0 R14: 000055e31788ecb0 R15: 0000000000000000
[267439.085386]  </TASK>
[267439.085814] irq event stamp: 0
[267439.086398] hardirqs last  enabled at (0): [<0000000000000000>] 0x0
[267439.087326] hardirqs last disabled at (0): [<ffffffffa3894214>] copy_process+0x934/0x2040
[267439.088364] softirqs last  enabled at (0): [<ffffffffa3894214>] copy_process+0x934/0x2040
[267439.089415] softirqs last disabled at (0): [<0000000000000000>] 0x0
[267439.090215] ---[ end trace 63cff29aa6aacf3e ]---
[267439.090791] BTRFS info (device dm-0): space_info 4 has 1072562176 free, is not full
[267439.091813] BTRFS info (device dm-0): space_info total=1073741824, used=1064960, pinned=0, reserved=49152, may_use=0, readonly=65536 zone_unusable=0
[267439.093909] BTRFS info (device dm-0): global_block_rsv: size 0 reserved 0
[267439.095078] BTRFS info (device dm-0): trans_block_rsv: size 0 reserved 0
[267439.096229] BTRFS info (device dm-0): chunk_block_rsv: size 0 reserved 0
[267439.097342] BTRFS info (device dm-0): delayed_block_rsv: size 0 reserved 0
[267439.098499] BTRFS info (device dm-0): delayed_refs_rsv: size 0 reserved 0
[267439.211991] BTRFS info (device dm-0): flagging fs with big metadata feature

It never happens without this patchset applied.
With it applied, it happens very often (but not always).

Thanks.

> 
> Josef
> 
> Josef Bacik (2):
>   btrfs: include the free space tree in the global rsv minimum
>     calculation
>   btrfs: reserve extra space for the free space tree
> 
>  fs/btrfs/block-rsv.c   | 31 ++++++++++++++++++-------------
>  fs/btrfs/delayed-ref.c | 22 ++++++++++++++++++++++
>  2 files changed, 40 insertions(+), 13 deletions(-)
> 
> -- 
> 2.26.3
> 


* Re: [PATCH v2 2/2] btrfs: reserve extra space for the free space tree
  2021-12-02 20:34 ` [PATCH v2 2/2] btrfs: reserve extra space for the free space tree Josef Bacik
@ 2021-12-06 10:44   ` Filipe Manana
  2021-12-06 19:43     ` Josef Bacik
  0 siblings, 1 reply; 9+ messages in thread
From: Filipe Manana @ 2021-12-06 10:44 UTC
  To: Josef Bacik; +Cc: linux-btrfs, kernel-team

On Thu, Dec 02, 2021 at 03:34:32PM -0500, Josef Bacik wrote:
> Filipe reported a problem where sometimes he'd get an ENOSPC abort when
> running delayed refs with generic/619 and the free space tree enabled.
> This is partly because we do not reserve space for modifying the free
> space tree, nor do we have a block rsv associated with that tree.
> 
> The delayed_refs_rsv tracks the amount of space required to run delayed
> refs.  This means 1 modification means 1 change to the extent root.
> With the free space tree this turns into 2 changes, because modifying 1
> extent means updating the extent tree and potentially updating the free
> space tree to either remove that entry or add the free space.  Thus if
> we have the FST enabled, simply double the reservation size for our
> modification.
> 
> Signed-off-by: Josef Bacik <josef@toxicpanda.com>
> ---
>  fs/btrfs/block-rsv.c   |  1 +
>  fs/btrfs/delayed-ref.c | 22 ++++++++++++++++++++++
>  2 files changed, 23 insertions(+)
> 
> diff --git a/fs/btrfs/block-rsv.c b/fs/btrfs/block-rsv.c
> index b3086f252ad0..b3ee49b0b1e8 100644
> --- a/fs/btrfs/block-rsv.c
> +++ b/fs/btrfs/block-rsv.c
> @@ -426,6 +426,7 @@ void btrfs_init_root_block_rsv(struct btrfs_root *root)
>  	switch (root->root_key.objectid) {
>  	case BTRFS_CSUM_TREE_OBJECTID:
>  	case BTRFS_EXTENT_TREE_OBJECTID:
> +	case BTRFS_FREE_SPACE_TREE_OBJECTID:
>  		root->block_rsv = &fs_info->delayed_refs_rsv;
>  		break;
>  	case BTRFS_ROOT_TREE_OBJECTID:
> diff --git a/fs/btrfs/delayed-ref.c b/fs/btrfs/delayed-ref.c
> index da9d20813147..533521be8fdf 100644
> --- a/fs/btrfs/delayed-ref.c
> +++ b/fs/btrfs/delayed-ref.c
> @@ -84,6 +84,17 @@ void btrfs_delayed_refs_rsv_release(struct btrfs_fs_info *fs_info, int nr)
>  	u64 num_bytes = btrfs_calc_insert_metadata_size(fs_info, nr);
>  	u64 released = 0;
>  
> +	/*
> +	 * We have to check the mount option here because we could be enabling
> +	 * the free space tree for the first time and don't have the compat_ro
> +	 * option set yet.
> +	 *
> +	 * We need extra reservations if we have the free space tree because
> +	 * we'll have to modify that tree as well.
> +	 */
> +	if (btrfs_test_opt(fs_info, FREE_SPACE_TREE))
> +		num_bytes <<= 1;
> +
>  	released = btrfs_block_rsv_release(fs_info, block_rsv, num_bytes, NULL);
>  	if (released)
>  		trace_btrfs_space_reservation(fs_info, "delayed_refs_rsv",
> @@ -108,6 +119,17 @@ void btrfs_update_delayed_refs_rsv(struct btrfs_trans_handle *trans)
>  
>  	num_bytes = btrfs_calc_insert_metadata_size(fs_info,
>  						    trans->delayed_ref_updates);
> +	/*
> +	 * We have to check the mount option here because we could be enabling
> +	 * the free space tree for the first time and don't have the compat_ro
> +	 * option set yet.
> +	 *
> +	 * We need extra reservations if we have the free space tree because
> +	 * we'll have to modify that tree as well.
> +	 */
> +	if (btrfs_test_opt(fs_info, FREE_SPACE_TREE))
> +		num_bytes <<= 1;

Don't we need to bump the minimum (limit variable) number of bytes at
btrfs_delayed_refs_rsv_refill() as well?

I don't see why not.

Thanks.

> +
>  	spin_lock(&delayed_rsv->lock);
>  	delayed_rsv->size += num_bytes;
>  	delayed_rsv->full = 0;
> -- 
> 2.26.3
> 


* Re: [PATCH v2 2/2] btrfs: reserve extra space for the free space tree
  2021-12-06 10:44   ` Filipe Manana
@ 2021-12-06 19:43     ` Josef Bacik
  0 siblings, 0 replies; 9+ messages in thread
From: Josef Bacik @ 2021-12-06 19:43 UTC
  To: Filipe Manana; +Cc: linux-btrfs, kernel-team

On Mon, Dec 06, 2021 at 10:44:51AM +0000, Filipe Manana wrote:
> On Thu, Dec 02, 2021 at 03:34:32PM -0500, Josef Bacik wrote:
> > Filipe reported a problem where sometimes he'd get an ENOSPC abort when
> > running delayed refs with generic/619 and the free space tree enabled.
> > This is partly because we do not reserve space for modifying the free
> > space tree, nor do we have a block rsv associated with that tree.
> > 
> > The delayed_refs_rsv tracks the amount of space required to run delayed
> > refs.  This means 1 modification means 1 change to the extent root.
> > With the free space tree this turns into 2 changes, because modifying 1
> > extent means updating the extent tree and potentially updating the free
> > space tree to either remove that entry or add the free space.  Thus if
> > we have the FST enabled, simply double the reservation size for our
> > modification.
> > 
> > Signed-off-by: Josef Bacik <josef@toxicpanda.com>
> > ---
> >  fs/btrfs/block-rsv.c   |  1 +
> >  fs/btrfs/delayed-ref.c | 22 ++++++++++++++++++++++
> >  2 files changed, 23 insertions(+)
> > 
> > diff --git a/fs/btrfs/block-rsv.c b/fs/btrfs/block-rsv.c
> > index b3086f252ad0..b3ee49b0b1e8 100644
> > --- a/fs/btrfs/block-rsv.c
> > +++ b/fs/btrfs/block-rsv.c
> > @@ -426,6 +426,7 @@ void btrfs_init_root_block_rsv(struct btrfs_root *root)
> >  	switch (root->root_key.objectid) {
> >  	case BTRFS_CSUM_TREE_OBJECTID:
> >  	case BTRFS_EXTENT_TREE_OBJECTID:
> > +	case BTRFS_FREE_SPACE_TREE_OBJECTID:
> >  		root->block_rsv = &fs_info->delayed_refs_rsv;
> >  		break;
> >  	case BTRFS_ROOT_TREE_OBJECTID:
> > diff --git a/fs/btrfs/delayed-ref.c b/fs/btrfs/delayed-ref.c
> > index da9d20813147..533521be8fdf 100644
> > --- a/fs/btrfs/delayed-ref.c
> > +++ b/fs/btrfs/delayed-ref.c
> > @@ -84,6 +84,17 @@ void btrfs_delayed_refs_rsv_release(struct btrfs_fs_info *fs_info, int nr)
> >  	u64 num_bytes = btrfs_calc_insert_metadata_size(fs_info, nr);
> >  	u64 released = 0;
> >  
> > +	/*
> > +	 * We have to check the mount option here because we could be enabling
> > +	 * the free space tree for the first time and don't have the compat_ro
> > +	 * option set yet.
> > +	 *
> > +	 * We need extra reservations if we have the free space tree because
> > +	 * we'll have to modify that tree as well.
> > +	 */
> > +	if (btrfs_test_opt(fs_info, FREE_SPACE_TREE))
> > +		num_bytes <<= 1;
> > +
> >  	released = btrfs_block_rsv_release(fs_info, block_rsv, num_bytes, NULL);
> >  	if (released)
> >  		trace_btrfs_space_reservation(fs_info, "delayed_refs_rsv",
> > @@ -108,6 +119,17 @@ void btrfs_update_delayed_refs_rsv(struct btrfs_trans_handle *trans)
> >  
> >  	num_bytes = btrfs_calc_insert_metadata_size(fs_info,
> >  						    trans->delayed_ref_updates);
> > +	/*
> > +	 * We have to check the mount option here because we could be enabling
> > +	 * the free space tree for the first time and don't have the compat_ro
> > +	 * option set yet.
> > +	 *
> > +	 * We need extra reservations if we have the free space tree because
> > +	 * we'll have to modify that tree as well.
> > +	 */
> > +	if (btrfs_test_opt(fs_info, FREE_SPACE_TREE))
> > +		num_bytes <<= 1;
> 
> Don't we need to bump the minimum (limit variable) number of bytes at
> btrfs_delayed_refs_rsv_refill() as well?
> 
> I don't see why not.
> 

Because refill is about adding more space to keep up with usage.  We're not
changing ->size at that point.  These things here are to make sure ->size is
correct.  Refill is about making sure ->reserved == ->size.  In this case we're
just trying to add the smallest unit possible, min(1 item's worth of
modifications, ->size - ->reserved).  Thanks,

Josef


* Re: [PATCH v2 0/2] Free space tree space reservation fixes
  2021-12-06 10:42 ` Filipe Manana
@ 2021-12-06 19:54   ` Josef Bacik
  0 siblings, 0 replies; 9+ messages in thread
From: Josef Bacik @ 2021-12-06 19:54 UTC
  To: Filipe Manana; +Cc: linux-btrfs, kernel-team

On Mon, Dec 06, 2021 at 10:42:30AM +0000, Filipe Manana wrote:
> On Thu, Dec 02, 2021 at 03:34:30PM -0500, Josef Bacik wrote:
> > v1->v2:
> > - Updated the changelog for "btrfs: reserve extra space for free space tree" to
> >   make it clear why we're doubling the space reservation per Nikolay's request.
> > 
> > --- Original email ---
> > Hello,
> > 
> > Filipe reported a problem where he was getting an ENOSPC abort when running
> > delayed refs for generic/619.  This is because of two reasons, first generic/619
> > creates a very small file system, and our global block rsv calculation doesn't
> > take into account the size of the free space tree.  Thus we could get into a
> > situation where the global block rsv was not enough to handle the overflow.
> > 
> > The second is because we simply do not reserve space for the free space tree
> > modifications.  Fix this by making sure any free space tree root has their block
> > rsv set to the delayed refs rsv, and then make sure if we have the free space
> > tree enabled we're reserving extra space for those operations.
> > 
> > With these patches the problem Filipe was hitting went away.  Thanks,
> 
> It went away, but it often brings some leaks.
> For example, generic/648 triggers those leaks often:
> 
> [267436.763282] BTRFS info (device loop0): forced readonly
> [267436.763934] BTRFS warning (device loop0): Skipping commit of aborted transaction.
> [267436.764874] BTRFS: error (device loop0) in cleanup_transaction:1913: errno=-5 IO failure
> [267438.978412] ------------[ cut here ]------------
> [267438.979610] WARNING: CPU: 3 PID: 44901 at fs/btrfs/block-group.c:127 btrfs_put_block_group+0x77/0xb0 [btrfs]
> [267438.982274] Modules linked in: overlay dm_zero dm_snapshot dm_thin_pool dm_persistent_data dm_bio_prison dm_bufio dm_log_writes dm_dust dm_flakey dm_mod loop btrfs blake2b_generic xor raid6_pq libcrc32c intel_rapl_msr intel_rapl_common bochs drm_vram_helper crct10dif_pclmul ghash_clmulni_intel drm_ttm_helper aesni_intel ttm crypto_simd ppdev cryptd drm_kms_helper sg input_leds parport_pc led_class joydev parport serio_raw evdev button pcspkr qemu_fw_cfg drm ip_tables x_tables autofs4 ext4 crc32c_generic crc16 mbcache jbd2 sd_mod t10_pi virtio_net net_failover failover virtio_scsi ata_generic ata_piix crc32_pclmul libata virtio_pci crc32c_intel virtio_pci_legacy_dev virtio_pci_modern_dev virtio_ring virtio psmouse scsi_mod i2c_piix4 scsi_common [last unloaded: scsi_debug]
> [267438.994384] CPU: 3 PID: 44901 Comm: umount Not tainted 5.16.0-rc3-btrfs-next-107 #1
> [267438.995545] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.14.0-0-g155821a1990b-prebuilt.qemu.org 04/01/2014
> [267438.997171] RIP: 0010:btrfs_put_block_group+0x77/0xb0 [btrfs]
> [267438.998031] Code: 21 48 8b bd 80 01 00 00 e8 36 a8 03 e3 48 8b bd 08 04 00 00 e8 2a a8 03 e3 48 89 ef 5d e9 21 a8 03 e3 0f 0b eb db 0f 0b eb b1 <0f> 0b eb b4 0f 0b 48 8b 45 00 48 89 ee 48 8d b8 f0 17 00 00 e8 b0
> [267439.000593] RSP: 0018:ffffb06981af7dd0 EFLAGS: 00010206
> [267439.001613] RAX: 0000000000000001 RBX: ffff9caa8c754000 RCX: ffff9caa5db739c8
> [267439.002523] RDX: 0000000000000001 RSI: ffffffffc0afd6c7 RDI: ffff9caa5db73800
> [267439.003455] RBP: ffff9caa5db73800 R08: 0000000000000000 R09: 0000000000000000
> [267439.004359] R10: 0000000000000246 R11: 0000000000000000 R12: ffff9caa8c754148
> [267439.005581] R13: ffff9caa8c754198 R14: ffff9caa5db73988 R15: dead000000000100
> [267439.006497] FS:  00007fa77deb4800(0000) GS:ffff9cad6d400000(0000) knlGS:0000000000000000
> [267439.007603] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [267439.008341] CR2: 00007fff383e4cf8 CR3: 00000002ede58001 CR4: 0000000000370ee0
> [267439.009321] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> [267439.010658] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> [267439.011983] Call Trace:
> [267439.012459]  <TASK>
> [267439.012874]  btrfs_free_block_groups+0x255/0x3c0 [btrfs]
> [267439.013941]  close_ctree+0x301/0x357 [btrfs]
> [267439.014791]  generic_shutdown_super+0x74/0x120
> [267439.015636]  kill_anon_super+0x14/0x30
> [267439.016349]  btrfs_kill_super+0x12/0x20 [btrfs]
> [267439.017244]  deactivate_locked_super+0x31/0xa0
> [267439.018085]  cleanup_mnt+0x147/0x1c0
> [267439.018767]  task_work_run+0x5c/0xa0
> [267439.019448]  exit_to_user_mode_prepare+0x1e5/0x1f0
> [267439.020320]  syscall_exit_to_user_mode+0x16/0x40
> [267439.020911]  do_syscall_64+0x48/0xc0
> [267439.021466]  entry_SYSCALL_64_after_hwframe+0x44/0xae
> [267439.022127] RIP: 0033:0x7fa77e0f6a97
> [267439.022601] Code: 03 0c 00 f7 d8 64 89 01 48 83 c8 ff c3 66 0f 1f 44 00 00 31 f6 e9 09 00 00 00 66 0f 1f 84 00 00 00 00 00 b8 a6 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 01 c3 48 8b 15 a1 03 0c 00 f7 d8 64 89 02 b8
> [267439.024955] RSP: 002b:00007fff383e5d28 EFLAGS: 00000246 ORIG_RAX: 00000000000000a6
> [267439.025954] RAX: 0000000000000000 RBX: 00007fa77e21c264 RCX: 00007fa77e0f6a97
> [267439.026866] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 000055e31788edd0
> [267439.027784] RBP: 000055e31788eba0 R08: 0000000000000000 R09: 00007fff383e4aa0
> [267439.028702] R10: 00007fa77e17bfc0 R11: 0000000000000246 R12: 0000000000000000
> [267439.029729] R13: 000055e31788edd0 R14: 000055e31788ecb0 R15: 0000000000000000
> [267439.030798]  </TASK>
> [267439.031130] irq event stamp: 0
> [267439.031559] hardirqs last  enabled at (0): [<0000000000000000>] 0x0
> [267439.032450] hardirqs last disabled at (0): [<ffffffffa3894214>] copy_process+0x934/0x2040
> [267439.033540] softirqs last  enabled at (0): [<ffffffffa3894214>] copy_process+0x934/0x2040
> [267439.034578] softirqs last disabled at (0): [<0000000000000000>] 0x0
> [267439.035380] ---[ end trace 63cff29aa6aacf3d ]---
> [267439.036050] ------------[ cut here ]------------
> [267439.036653] WARNING: CPU: 3 PID: 44901 at fs/btrfs/block-group.c:3976 btrfs_free_block_groups+0x330/0x3c0 [btrfs]
> [267439.038057] Modules linked in: overlay dm_zero dm_snapshot dm_thin_pool dm_persistent_data dm_bio_prison dm_bufio dm_log_writes dm_dust dm_flakey dm_mod loop btrfs blake2b_generic xor raid6_pq libcrc32c intel_rapl_msr intel_rapl_common bochs drm_vram_helper crct10di>
> [267439.046505] CPU: 3 PID: 44901 Comm: umount Tainted: G        W         5.16.0-rc3-btrfs-next-107 #1
> [267439.047636] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.14.0-0-g155821a1990b-prebuilt.qemu.org 04/01/2014
> [267439.049099] RIP: 0010:btrfs_free_block_groups+0x330/0x3c0 [btrfs]
> [267439.049930] Code: 00 00 00 ad de 49 be 22 01 00 00 00 00 ad de e8 76 f0 7c e3 48 89 df e8 4e 85 ff ff 48 8b 83 b0 12 00 00 49 39 c5 75 51 eb 7d <0f> 0b 31 c9 31 d2 4c 89 e6 48 89 df e8 8f 75 ff ff 48 83 7d 40 00
> [267439.052257] RSP: 0018:ffffb06981af7de0 EFLAGS: 00010206
> [267439.052933] RAX: ffff9cacc606ccb0 RBX: ffff9caa8c754000 RCX: 0000000000000000
> [267439.053935] RDX: 0000000000000001 RSI: ffffffffa3b32cd7 RDI: 00000000ffffffff
> [267439.054882] RBP: ffff9cacc606ccb0 R08: 0000000000000000 R09: 0000000000000000
> [267439.055778] R10: 0000000000000246 R11: 0000000000000001 R12: ffff9cacc606cc00
> [267439.056686] R13: ffff9caa8c7552b0 R14: dead000000000122 R15: dead000000000100
> [267439.057628] FS:  00007fa77deb4800(0000) GS:ffff9cad6d400000(0000) knlGS:0000000000000000
> [267439.058648] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [267439.059391] CR2: 00007fff383e4cf8 CR3: 00000002ede58001 CR4: 0000000000370ee0
> [267439.060313] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> [267439.061287] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> [267439.062672] Call Trace:
> [267439.063161]  <TASK>
> [267439.063624]  close_ctree+0x301/0x357 [btrfs]
> [267439.064953]  generic_shutdown_super+0x74/0x120
> [267439.065842]  kill_anon_super+0x14/0x30
> [267439.066581]  btrfs_kill_super+0x12/0x20 [btrfs]
> [267439.067468]  deactivate_locked_super+0x31/0xa0
> [267439.068311]  cleanup_mnt+0x147/0x1c0
> [267439.069004]  task_work_run+0x5c/0xa0
> [267439.069711]  exit_to_user_mode_prepare+0x1e5/0x1f0
> [267439.070627]  syscall_exit_to_user_mode+0x16/0x40
> [267439.071500]  do_syscall_64+0x48/0xc0
> [267439.072179]  entry_SYSCALL_64_after_hwframe+0x44/0xae
> [267439.073135] RIP: 0033:0x7fa77e0f6a97
> [267439.073837] Code: 03 0c 00 f7 d8 64 89 01 48 83 c8 ff c3 66 0f 1f 44 00 00 31 f6 e9 09 00 00 00 66 0f 1f 84 00 00 00 00 00 b8 a6 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 01 c3 48 8b 15 a1 03 0c 00 f7 d8 64 89 02 b8
> [267439.077297] RSP: 002b:00007fff383e5d28 EFLAGS: 00000246 ORIG_RAX: 00000000000000a6
> [267439.078704] RAX: 0000000000000000 RBX: 00007fa77e21c264 RCX: 00007fa77e0f6a97
> [267439.080030] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 000055e31788edd0
> [267439.081379] RBP: 000055e31788eba0 R08: 0000000000000000 R09: 00007fff383e4aa0
> [267439.082710] R10: 00007fa77e17bfc0 R11: 0000000000000246 R12: 0000000000000000
> [267439.084039] R13: 000055e31788edd0 R14: 000055e31788ecb0 R15: 0000000000000000
> [267439.085386]  </TASK>
> [267439.085814] irq event stamp: 0
> [267439.086398] hardirqs last  enabled at (0): [<0000000000000000>] 0x0
> [267439.087326] hardirqs last disabled at (0): [<ffffffffa3894214>] copy_process+0x934/0x2040
> [267439.088364] softirqs last  enabled at (0): [<ffffffffa3894214>] copy_process+0x934/0x2040
> [267439.089415] softirqs last disabled at (0): [<0000000000000000>] 0x0
> [267439.090215] ---[ end trace 63cff29aa6aacf3e ]---
> [267439.090791] BTRFS info (device dm-0): space_info 4 has 1072562176 free, is not full
> [267439.091813] BTRFS info (device dm-0): space_info total=1073741824, used=1064960, pinned=0, reserved=49152, may_use=0, readonly=65536 zone_unusable=0
> [267439.093909] BTRFS info (device dm-0): global_block_rsv: size 0 reserved 0
> [267439.095078] BTRFS info (device dm-0): trans_block_rsv: size 0 reserved 0
> [267439.096229] BTRFS info (device dm-0): chunk_block_rsv: size 0 reserved 0
> [267439.097342] BTRFS info (device dm-0): delayed_block_rsv: size 0 reserved 0
> [267439.098499] BTRFS info (device dm-0): delayed_refs_rsv: size 0 reserved 0
> [267439.211991] BTRFS info (device dm-0): flagging fs with big metadata feature
> 
> It never happens without this patchset applied.
> With it applied, it happens very often (but not always).
> 

This is the reserved leak; I saw it last week with generic/485 on our nightly
tests.  I've tasked Rohit with running it down.  It's not related to my
changes; it seems my changes just made it easier to hit.  Thanks,

Josef
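
The numbers in the space_info dump quoted above can be cross-checked with a
small userspace sketch in C.  The struct and helper names are hypothetical,
and the "free" formula (total minus every listed counter) is an assumption
that happens to reproduce the logged value, not a statement about how the
kernel computes it.  The sketch also flags the condition the thread calls a
reserved leak: bytes still accounted as reserved at unmount.

```c
#include <assert.h>
#include <stdint.h>

/* Counters copied from the "space_info total=..." line in the quoted log. */
struct space_info_dump {
	uint64_t total, used, pinned, reserved, may_use, readonly, zone_unusable;
};

static const struct space_info_dump logged_dump = {
	.total = 1073741824ULL,
	.used = 1064960ULL,
	.pinned = 0,
	.reserved = 49152ULL,
	.may_use = 0,
	.readonly = 65536ULL,
	.zone_unusable = 0,
};

/* Assumed formula: free bytes are what remains after all counters. */
static uint64_t dump_free_bytes(const struct space_info_dump *d)
{
	uint64_t accounted = d->used + d->pinned + d->reserved +
			     d->may_use + d->readonly + d->zone_unusable;

	assert(accounted <= d->total);
	return d->total - accounted;
}

/* Non-zero if counters expected to be drained at unmount still hold bytes. */
static int dump_has_leak(const struct space_info_dump *d)
{
	return d->pinned != 0 || d->reserved != 0 || d->may_use != 0;
}
```

With the logged values, dump_free_bytes(&logged_dump) gives 1072562176, which
matches the "has 1072562176 free" line, and dump_has_leak(&logged_dump) is
true solely because of reserved=49152.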


* Re: [PATCH v2 0/2] Free space tree space reservation fixes
  2021-12-02 20:34 [PATCH v2 0/2] Free space tree space reservation fixes Josef Bacik
                   ` (3 preceding siblings ...)
  2021-12-06 10:42 ` Filipe Manana
@ 2021-12-07 18:59 ` David Sterba
  4 siblings, 0 replies; 9+ messages in thread
From: David Sterba @ 2021-12-07 18:59 UTC (permalink / raw)
  To: Josef Bacik; +Cc: linux-btrfs, kernel-team

On Thu, Dec 02, 2021 at 03:34:30PM -0500, Josef Bacik wrote:
> v1->v2:
> - Updated the changelog for "btrfs: reserve extra space for free space tree" to
>   make it clear why we're doubling the space reservation per Nikolay's request.
> 
> --- Original email ---
> Hello,
> 
> Filipe reported a problem where he was getting an ENOSPC abort when running
> delayed refs for generic/619.  This is because of two reasons, first generic/619
> creates a very small file system, and our global block rsv calculation doesn't
> take into account the size of the free space tree.  Thus we could get into a
> situation where the global block rsv was not enough to handle the overflow.
> 
> The second is because we simply do not reserve space for the free space tree
> modifications.  Fix this by making sure any free space tree root has their block
> rsv set to the delayed refs rsv, and then make sure if we have the free space
> tree enabled we're reserving extra space for those operations.
> 
> With these patches the problem Filipe was hitting went away.  Thanks,
> 
> Josef
> 
> Josef Bacik (2):
>   btrfs: include the free space tree in the global rsv minimum
>     calculation
>   btrfs: reserve extra space for the free space tree

Added to misc-next, thanks. Filipe had a question in patch 2 which I
believe has been answered, but if there's anything to add/update please
let me know.


end of thread, other threads:[~2021-12-07 18:59 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-12-02 20:34 [PATCH v2 0/2] Free space tree space reservation fixes Josef Bacik
2021-12-02 20:34 ` [PATCH v2 1/2] btrfs: include the free space tree in the global rsv minimum calculation Josef Bacik
2021-12-02 20:34 ` [PATCH v2 2/2] btrfs: reserve extra space for the free space tree Josef Bacik
2021-12-06 10:44   ` Filipe Manana
2021-12-06 19:43     ` Josef Bacik
2021-12-03 13:09 ` [PATCH v2 0/2] Free space tree space reservation fixes Nikolay Borisov
2021-12-06 10:42 ` Filipe Manana
2021-12-06 19:54   ` Josef Bacik
2021-12-07 18:59 ` David Sterba
