Linux-BTRFS Archive on lore.kernel.org
 help / color / Atom feed
From: Su Yue <Damenly_Su@gmx.com>
To: fdmanana@kernel.org, linux-btrfs@vger.kernel.org
Subject: Re: [PATCH] Btrfs: fix log context list corruption after rename exchange operation
Date: Sat, 9 Nov 2019 15:36:19 +0800
Message-ID: <7f2cfcee-4a06-d2db-7703-a26eb89cda50@gmx.com> (raw)
In-Reply-To: <20191108161156.2342-1-fdmanana@kernel.org>



On 2019/11/9 12:11 AM, fdmanana@kernel.org wrote:
> From: Filipe Manana <fdmanana@suse.com>
>
> During rename exchange we might have successfully log the new name in the
> source root's log tree, in which case we leave our log context (allocated
> on stack) in the root's list of log contextes. However we might fail to
> log the new name in the destination root, in which case we fallback to
> a transaction commit later and never sync the log of the source root,
> which causes the source root log context to remain in the list of log
> contextes. This later causes invalid memory accesses because the context
> was allocated on stack and after rename exchange finishes the stack gets
> reused and overwritten for other purposes.
>
> The kernel's linked list corruption detector (CONFIG_DEBUG_LIST=y) can
> detect this and report something like the following:
>
>    [  691.489929] ------------[ cut here ]------------
>    [  691.489947] list_add corruption. prev->next should be next (ffff88819c944530), but was ffff8881c23f7be4. (prev=ffff8881c23f7a38).
>    [  691.489967] WARNING: CPU: 2 PID: 28933 at lib/list_debug.c:28 __list_add_valid+0x95/0xe0
>    (...)
>    [  691.489998] CPU: 2 PID: 28933 Comm: fsstress Not tainted 5.4.0-rc6-btrfs-next-62 #1
>    [  691.490001] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.12.0-0-ga698c8995f-prebuilt.qemu.org 04/01/2014
>    [  691.490003] RIP: 0010:__list_add_valid+0x95/0xe0
>    (...)
>    [  691.490007] RSP: 0018:ffff8881f0b3faf8 EFLAGS: 00010282
>    [  691.490010] RAX: 0000000000000000 RBX: ffff88819c944530 RCX: 0000000000000000
>    [  691.490011] RDX: 0000000000000001 RSI: 0000000000000008 RDI: ffffffffa2c497e0
>    [  691.490013] RBP: ffff8881f0b3fe68 R08: ffffed103eaa4115 R09: ffffed103eaa4114
>    [  691.490015] R10: ffff88819c944000 R11: ffffed103eaa4115 R12: 7fffffffffffffff
>    [  691.490016] R13: ffff8881b4035610 R14: ffff8881e7b84728 R15: 1ffff1103e167f7b
>    [  691.490019] FS:  00007f4b25ea2e80(0000) GS:ffff8881f5500000(0000) knlGS:0000000000000000
>    [  691.490021] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>    [  691.490022] CR2: 00007fffbb2d4eec CR3: 00000001f2a4a004 CR4: 00000000003606e0
>    [  691.490025] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
>    [  691.490027] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
>    [  691.490029] Call Trace:
>    [  691.490058]  btrfs_log_inode_parent+0x667/0x2730 [btrfs]
>    [  691.490083]  ? join_transaction+0x24a/0xce0 [btrfs]
>    [  691.490107]  ? btrfs_end_log_trans+0x80/0x80 [btrfs]
>    [  691.490111]  ? dget_parent+0xb8/0x460
>    [  691.490116]  ? lock_downgrade+0x6b0/0x6b0
>    [  691.490121]  ? rwlock_bug.part.0+0x90/0x90
>    [  691.490127]  ? do_raw_spin_unlock+0x142/0x220
>    [  691.490151]  btrfs_log_dentry_safe+0x65/0x90 [btrfs]
>    [  691.490172]  btrfs_sync_file+0x9f1/0xc00 [btrfs]
>    [  691.490195]  ? btrfs_file_write_iter+0x1800/0x1800 [btrfs]
>    [  691.490198]  ? rcu_read_lock_any_held.part.11+0x20/0x20
>    [  691.490204]  ? __do_sys_newstat+0x88/0xd0
>    [  691.490207]  ? cp_new_stat+0x5d0/0x5d0
>    [  691.490218]  ? do_fsync+0x38/0x60
>    [  691.490220]  do_fsync+0x38/0x60
>    [  691.490224]  __x64_sys_fdatasync+0x32/0x40
>    [  691.490228]  do_syscall_64+0x9f/0x540
>    [  691.490233]  entry_SYSCALL_64_after_hwframe+0x49/0xbe
>    [  691.490235] RIP: 0033:0x7f4b253ad5f0
>    (...)
>    [  691.490239] RSP: 002b:00007fffbb2d6078 EFLAGS: 00000246 ORIG_RAX: 000000000000004b
>    [  691.490242] RAX: ffffffffffffffda RBX: 0000000000000003 RCX: 00007f4b253ad5f0
>    [  691.490244] RDX: 00007fffbb2d5fe0 RSI: 00007fffbb2d5fe0 RDI: 0000000000000003
>    [  691.490245] RBP: 000000000000000d R08: 0000000000000001 R09: 00007fffbb2d608c
>    [  691.490247] R10: 00000000000002e8 R11: 0000000000000246 R12: 00000000000001f4
>    [  691.490248] R13: 0000000051eb851f R14: 00007fffbb2d6120 R15: 00005635a498bda0
>
> This started happening recently when running some test cases from fstests
> like btrfs/004 for example, because support for rename exchange was added
> last week to fsstress from fstests.
>
> So fix this by deleting the log context for the source root from the list
> if we have logged the new name in the source root.
>
> Reported-by: Su Yue <Damenly_Su@gmx.com>
> Fixes: d4682ba03ef618 ("Btrfs: sync log after logging new name")
> CC: stable@vger.kernel.org # 4.19+
> Signed-off-by: Filipe Manana <fdmanana@suse.com>

Great, thanks for the fix.

Tested-by: Su Yue <Damenly_Su@gmx.com>
> ---
>   fs/btrfs/inode.c | 15 +++++++++++++++
>   1 file changed, 15 insertions(+)
>
> diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
> index 2750ae6e907d..28080f0803a1 100644
> --- a/fs/btrfs/inode.c
> +++ b/fs/btrfs/inode.c
> @@ -9734,6 +9734,18 @@ static int btrfs_rename_exchange(struct inode *old_dir,
>   			commit_transaction = true;
>   	}
>   	if (commit_transaction) {
> +		/*
> +		 * We may have set commit_transaction when logging the new name
> +		 * in the destination root, in which case we left the source
> +		 * root context in the list of log contextes. So make sure we
> +		 * remove it to avoid invalid memory accesses, since the context
> +		 * was allocated in our stack frame.
> +		 */
> +		if (sync_log_root) {
> +			mutex_lock(&root->log_mutex);
> +			list_del_init(&ctx_root.list);
> +			mutex_unlock(&root->log_mutex);
> +		}
>   		ret = btrfs_commit_transaction(trans);
>   	} else {
>   		int ret2;
> @@ -9747,6 +9759,9 @@ static int btrfs_rename_exchange(struct inode *old_dir,
>   	if (old_ino == BTRFS_FIRST_FREE_OBJECTID)
>   		up_read(&fs_info->subvol_sem);
>
> +	ASSERT(list_empty(&ctx_root.list));
> +	ASSERT(list_empty(&ctx_dest.list));
> +
>   	return ret;
>   }
>
>

  reply index

Thread overview: 3+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-11-08 16:11 fdmanana
2019-11-09  7:36 ` Su Yue [this message]
2019-11-11 18:45 ` David Sterba

Reply instructions:

You may reply publically to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=7f2cfcee-4a06-d2db-7703-a26eb89cda50@gmx.com \
    --to=damenly_su@gmx.com \
    --cc=fdmanana@kernel.org \
    --cc=linux-btrfs@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Linux-BTRFS Archive on lore.kernel.org

Archives are clonable:
	git clone --mirror https://lore.kernel.org/linux-btrfs/0 linux-btrfs/git/0.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 linux-btrfs linux-btrfs/ https://lore.kernel.org/linux-btrfs \
		linux-btrfs@vger.kernel.org
	public-inbox-index linux-btrfs

Example config snippet for mirrors

Newsgroup available over NNTP:
	nntp://nntp.lore.kernel.org/org.kernel.vger.linux-btrfs


AGPL code for this site: git clone https://public-inbox.org/public-inbox.git