From: Johannes Thumshirn <Johannes.Thumshirn@wdc.com>
To: "fdmanana@gmail.com" <fdmanana@gmail.com>
Cc: David Sterba <dsterba@suse.com>,
	linux-btrfs <linux-btrfs@vger.kernel.org>,
	Josef Bacik <josef@toxicpanda.com>,
	Naohiro Aota <Naohiro.Aota@wdc.com>,
	Filipe Manana <fdmanana@suse.com>,
	Anand Jain <anand.jain@oracle.com>
Subject: Re: [PATCH v3 1/3] btrfs: discard relocated block groups
Date: Tue, 13 Apr 2021 17:48:06 +0000
Message-ID: <PH0PR04MB74167FB19522DBEB1F70E80D9B4F9@PH0PR04MB7416.namprd04.prod.outlook.com>
In-Reply-To: CAL3q7H5xZLhHrBPJb5jwe8ZxAv=XfFC05kcw5-WqBySQP4uTBg@mail.gmail.com

On 13/04/2021 14:57, Filipe Manana wrote:
> And what about the other mechanism that triggers discards on pinned
> extents, after the transaction commits the super blocks?
> Why isn't that happening (with -o discard=sync)? We create the delayed
> references to drop extents from the relocated block group, which
> results in pinning extents.
> This is the case that surprised me that it isn't working for you.

I think this is the case. I would have expected to end up in this
part of btrfs_finish_extent_commit():

                                              
        /*                                                                       
         * Transaction is finished.  We don't need the lock anymore.  We         
         * do need to clean up the block groups in case of a transaction         
         * abort.                                                                
         */                                                                      
        deleted_bgs = &trans->transaction->deleted_bgs;                          
        list_for_each_entry_safe(block_group, tmp, deleted_bgs, bg_list) {       
                u64 trimmed = 0;                                                 
                                                                                 
                ret = -EROFS;                                                    
                if (!TRANS_ABORTED(trans))                                       
                        ret = btrfs_discard_extent(fs_info,                      
                                                   block_group->start,           
                                                   block_group->length,          
                                                   &trimmed);                    
                                                                                 
                list_del_init(&block_group->bg_list);                            
                btrfs_unfreeze_block_group(block_group);                         
                btrfs_put_block_group(block_group);                              
                                                                                 
                if (ret) {                                                       
                        const char *errstr = btrfs_decode_error(ret);            
                        btrfs_warn(fs_info,                                      
                           "discard failed while removing blockgroup: errno=%d %s",
                                   ret, errstr);                                 
                }                                                                
        }                                    

and the btrfs_discard_extent() over the whole block group would then trigger a
REQ_OP_ZONE_RESET operation, resetting the device's zone.
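
To illustrate why only a discard over the whole block group can become a
zone reset, here is a rough user-space sketch. It is a toy model, not the
kernel's actual check: toy_can_zone_reset() and its parameters are made up
for this example, and it assumes that a reset is only possible when both
the start and the length of the range are multiples of the zone size.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Toy model (hypothetical helper, not kernel code): a discard range can be
 * turned into a zone reset only if it covers whole zones, i.e. both its
 * start and its length are multiples of the zone size.
 */
bool toy_can_zone_reset(uint64_t start, uint64_t len, uint64_t zone_size)
{
	assert(zone_size != 0);

	/* An empty range never resets anything. */
	if (len == 0)
		return false;

	return (start % zone_size == 0) && (len % zone_size == 0);
}
```

Under this assumption, a range covering one full 256MiB zone qualifies,
while a single 64KiB extent somewhere inside that zone does not.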

But as btrfs_delete_unused_bgs() doesn't add the block group to the
->deleted_bgs list, we never reach the code above. I /think/ (i.e. verification
pending) the -o discard=sync case works for regular block devices, as each extent
is discarded on its own by this loop (also in btrfs_finish_extent_commit()):

        while (!TRANS_ABORTED(trans)) {                                          
                struct extent_state *cached_state = NULL;                        
                                                                                 
                mutex_lock(&fs_info->unused_bg_unpin_mutex);                     
                ret = find_first_extent_bit(unpin, 0, &start, &end,              
                                            EXTENT_DIRTY, &cached_state);        
                if (ret) {                                                       
                        mutex_unlock(&fs_info->unused_bg_unpin_mutex);           
                        break;                                                   
                }                                                                
                                                                                 
                if (btrfs_test_opt(fs_info, DISCARD_SYNC))                       
                        ret = btrfs_discard_extent(fs_info, start,               
                                                   end + 1 - start, NULL);       
                                                                                 
                clear_extent_dirty(unpin, start, end, &cached_state);            
                unpin_extent_range(fs_info, start, end, true);                   
                mutex_unlock(&fs_info->unused_bg_unpin_mutex);                   
                free_extent_state(cached_state);                                 
                cond_resched();                                                  
        }

If this is the case, my patch will essentially discard the data twice on a
non-zoned block device, which is certainly not ideal. So the correct fix would
be to get the block group into the 'trans->transaction->deleted_bgs' list
after relocation, which would work if we didn't check for block_group->ro in
btrfs_delete_unused_bgs(), but I suppose this check is there for a reason.
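
The double-discard concern can be made concrete with a small user-space
sketch. Again this is only a toy model with made-up names (struct
toy_extent, toy_discarded_bytes()); it assumes, as suspected above but not
yet verified, that with -o discard=sync every pinned extent is discarded
individually, and it simply adds a whole-block-group discard on top.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a pinned extent; not a kernel structure. */
struct toy_extent {
	uint64_t start;
	uint64_t len;
};

/*
 * Toy model: number of bytes a discard is issued for when a relocated
 * block group of bg_len bytes is removed.
 */
uint64_t toy_discarded_bytes(const struct toy_extent *pinned, size_t n,
			     uint64_t bg_len, bool discard_sync,
			     bool whole_bg_discard)
{
	uint64_t total = 0;
	size_t i;

	assert(pinned != NULL || n == 0);

	/* Per-extent discards, as in the unpin loop with discard=sync. */
	if (discard_sync) {
		for (i = 0; i < n; i++)
			total += pinned[i].len;
	}

	/* Additional discard over the whole block group. */
	if (whole_bg_discard)
		total += bg_len;

	return total;
}
```

With two pinned extents of 4KiB and 8KiB in a 1GiB block group, the model
issues discards for 12KiB with discard=sync alone, and for 1GiB + 12KiB
once the whole-block-group discard is added, so the pinned ranges are
covered twice.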

How about changing the patch to the following:

diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
index 6d9b2369f17a..ba13b2ea3c6f 100644
--- a/fs/btrfs/volumes.c
+++ b/fs/btrfs/volumes.c
@@ -3103,6 +3103,9 @@ static int btrfs_relocate_chunk(struct btrfs_fs_info *fs_info, u64 chunk_offset)
        struct btrfs_root *root = fs_info->chunk_root;
        struct btrfs_trans_handle *trans;
        struct btrfs_block_group *block_group;
+       u64 length;
        int ret;
 
        /*
@@ -3130,8 +3133,16 @@ static int btrfs_relocate_chunk(struct btrfs_fs_info *fs_info, u64 chunk_offset)
        if (!block_group)
                return -ENOENT;
        btrfs_discard_cancel_work(&fs_info->discard_ctl, block_group);
+       length = block_group->length;
        btrfs_put_block_group(block_group);

+       /*
+        * For a zoned filesystem we need to discard/zone-reset here, as the
+        * discard code won't discard the whole block-group, but only single
+        * extents.
+        */
+       if (btrfs_is_zoned(fs_info)) {
+               ret = btrfs_discard_extent(fs_info, chunk_offset, length, NULL);
+               if (ret) /* Non working discard is not fatal */
+                       btrfs_warn(fs_info, "discarding chunk %llu failed",
+                                  chunk_offset);
+       }
+
        trans = btrfs_start_trans_remove_block_group(root->fs_info,
                                                     chunk_offset);
        if (IS_ERR(trans)) {
