All of lore.kernel.org
 help / color / mirror / Atom feed
From: David Sterba <dsterba@suse.cz>
To: Qu Wenruo <wqu@suse.com>
Cc: linux-btrfs@vger.kernel.org, Greed Rong <greedrong@gmail.com>
Subject: Re: [PATCH 3/4] btrfs: preallocate anon_dev for subvolume and snapshot creation
Date: Tue, 16 Jun 2020 17:10:05 +0200	[thread overview]
Message-ID: <20200616151004.GE27795@twin.jikos.cz> (raw)
In-Reply-To: <20200616021737.44617-4-wqu@suse.com>

On Tue, Jun 16, 2020 at 10:17:36AM +0800, Qu Wenruo wrote:
> [BUG]
> When a lot of subvolumes are created, there is a user report about
> transaction aborted:
> 
>   ------------[ cut here ]------------
>   BTRFS: Transaction aborted (error -24)
>   WARNING: CPU: 17 PID: 17041 at fs/btrfs/transaction.c:1576 create_pending_snapshot+0xbc4/0xd10 [btrfs]
>   RIP: 0010:create_pending_snapshot+0xbc4/0xd10 [btrfs]
>   Call Trace:
>    create_pending_snapshots+0x82/0xa0 [btrfs]
>    btrfs_commit_transaction+0x275/0x8c0 [btrfs]
>    btrfs_mksubvol+0x4b9/0x500 [btrfs]
>    btrfs_ioctl_snap_create_transid+0x174/0x180 [btrfs]
>    btrfs_ioctl_snap_create_v2+0x11c/0x180 [btrfs]
>    btrfs_ioctl+0x11a4/0x2da0 [btrfs]
>    do_vfs_ioctl+0xa9/0x640
>    ksys_ioctl+0x67/0x90
>    __x64_sys_ioctl+0x1a/0x20
>    do_syscall_64+0x5a/0x110
>    entry_SYSCALL_64_after_hwframe+0x44/0xa9
>   ---[ end trace 33f2f83f3d5250e9 ]---
>   BTRFS: error (device sda1) in create_pending_snapshot:1576: errno=-24 unknown
>   BTRFS info (device sda1): forced readonly
>   BTRFS warning (device sda1): Skipping commit of aborted transaction.
>   BTRFS: error (device sda1) in cleanup_transaction:1831: errno=-24 unknown
> 
> [CAUSE]
> When the global anonymous block device pool is exhausted, the following
> call chain will fail, and lead to transaction abort:
> 
>  btrfs_ioctl_snap_create_v2()
>  |- btrfs_ioctl_snap_create_transid()
>     |- btrfs_mksubvol()
>        |- btrfs_commit_transaction()
>           |- create_pending_snapshot()
>              |- btrfs_get_fs_root()
>                 |- btrfs_init_fs_root()
>                    |- get_anon_bdev()
> 
> [FIX]
> Although we can't enlarge the anonymous block device pool, at least we
> can preallocate anon_dev for subvolume/snapshot creation.
> So that when the pool is exhausted, user will get an error other than
> aborting transaction later.
> 
> Reported-by: Greed Rong <greedrong@gmail.com>
> Link: https://lore.kernel.org/linux-btrfs/CA+UqX+NTrZ6boGnWHhSeZmEY5J76CTqmYjO2S+=tHJX7nb9DPw@mail.gmail.com/
> Signed-off-by: Qu Wenruo <wqu@suse.com>
> ---
>  fs/btrfs/disk-io.c     | 51 ++++++++++++++++++++++++++++++++++++------
>  fs/btrfs/disk-io.h     |  2 ++
>  fs/btrfs/ioctl.c       | 21 ++++++++++++++++-
>  fs/btrfs/transaction.c |  3 ++-
>  fs/btrfs/transaction.h |  2 ++
>  5 files changed, 70 insertions(+), 9 deletions(-)
> 
> diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
> index cfc0ff288238..14fd69b71cb8 100644
> --- a/fs/btrfs/disk-io.c
> +++ b/fs/btrfs/disk-io.c
> @@ -1395,7 +1395,7 @@ struct btrfs_root *btrfs_read_tree_root(struct btrfs_root *tree_root,
>  	goto out;
>  }
>  
> -static int btrfs_init_fs_root(struct btrfs_root *root)
> +static int btrfs_init_fs_root(struct btrfs_root *root, dev_t anon_dev)
>  {
>  	int ret;
>  	unsigned int nofs_flag;
> @@ -1435,9 +1435,13 @@ static int btrfs_init_fs_root(struct btrfs_root *root)
>  	 */
>  	if (is_fstree(root->root_key.objectid) &&
>  	    btrfs_root_refs(&root->root_item)) {
> -		ret = get_anon_bdev(&root->anon_dev);
> -		if (ret)
> -			goto fail;
> +		if (!anon_dev) {
> +			ret = get_anon_bdev(&root->anon_dev);
> +			if (ret)
> +				goto fail;
> +		} else {
> +			root->anon_dev = anon_dev;
> +		}
>  	}
>  
>  	mutex_lock(&root->objectid_mutex);
> @@ -1542,8 +1546,27 @@ void btrfs_free_fs_info(struct btrfs_fs_info *fs_info)
>  }
>  
>  
> -struct btrfs_root *btrfs_get_fs_root(struct btrfs_fs_info *fs_info,
> -				     u64 objectid, bool check_ref)
> +/*
> + * Get a fs root.
> + *
> + * For essential trees like root/extent tree, we grab it from fs_info directly.
> + * For subvolume trees, we check the cached fs roots first. If miss then
> + * read it from disk and add it to cached fs roots.
> + *
> + * Caller should release the root by calling btrfs_put_root() after the usage.
> + *
> + * NOTE: Reloc and log trees can't be read by this function as they share the
> + *	 same root objectid.
> + *
> + * @objectid:	Root (subvolume) id
> + * @anon_dev:	Preallocated anonymous block device number for new roots.
> + * 		Pass 0 for automatic allocation.
> + * @check_ref:	Whether to check root refs. If true, return -ENOENT for orphan
> + * 		roots.
> + */
> +static struct btrfs_root *__get_fs_root(struct btrfs_fs_info *fs_info,
> +					u64 objectid, dev_t anon_dev,
> +					bool check_ref)


> +struct btrfs_root *btrfs_get_fs_root(struct btrfs_fs_info *fs_info,
> +				     u64 objectid, bool check_ref)
> +{
> +	return __get_fs_root(fs_info, objectid, 0, check_ref);
> +}
> +
> +struct btrfs_root *btrfs_get_new_fs_root(struct btrfs_fs_info *fs_info,
> +					 u64 objectid, dev_t anon_dev)
> +{
> +	return __get_fs_root(fs_info, objectid, anon_dev, true);
> +}

This does not look like a good API, we should keep btrfs_get_fs_root and
add the anon_bdev initialization to the callers, there are only a few.

  reply	other threads:[~2020-06-16 15:10 UTC|newest]

Thread overview: 27+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-06-16  2:17 [PATCH 0/4] btrfs: workaround exhausted anonymous block device pool Qu Wenruo
2020-06-16  2:17 ` [PATCH 1/4] btrfs: disk-io: don't allocate anonymous block device for user invisible roots Qu Wenruo
2020-06-16 19:21   ` Josef Bacik
2020-06-16  2:17 ` [PATCH 2/4] btrfs: detect uninitialized btrfs_root::anon_dev for user visible subvolumes Qu Wenruo
2020-06-16 19:25   ` Josef Bacik
2020-06-16 22:49     ` Qu Wenruo
2020-06-16 23:32       ` Josef Bacik
2020-06-16 23:49         ` Qu Wenruo
2020-06-17 11:31           ` David Sterba
2020-06-17 13:37             ` Josef Bacik
2020-06-17 23:39               ` Qu Wenruo
2020-06-16  2:17 ` [PATCH 3/4] btrfs: preallocate anon_dev for subvolume and snapshot creation Qu Wenruo
2020-06-16 15:10   ` David Sterba [this message]
2020-06-16 22:54     ` Qu Wenruo
2020-07-01  3:25     ` Qu Wenruo
2020-07-01 17:39       ` David Sterba
2020-07-01 23:56         ` Qu Wenruo
2020-07-02 16:08           ` David Sterba
2020-07-02 23:46             ` David Sterba
2020-07-03  5:19               ` Qu Wenruo
2020-07-03 12:29                 ` David Sterba
2020-07-03 12:39                   ` Qu Wenruo
2020-06-16  2:17 ` [PATCH 4/4] btrfs: free anon_dev earlier to prevent exhausting anonymous block device pool Qu Wenruo
2020-06-16 19:23   ` Josef Bacik
2020-06-16 22:48     ` David Sterba
2020-06-16 23:31       ` Josef Bacik
2020-06-30 14:14 ` [PATCH 0/4] btrfs: workaround exhausted " David Sterba

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20200616151004.GE27795@twin.jikos.cz \
    --to=dsterba@suse.cz \
    --cc=greedrong@gmail.com \
    --cc=linux-btrfs@vger.kernel.org \
    --cc=wqu@suse.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.