stable.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: linux-kernel@vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	stable@vger.kernel.org, Chao Yu <yuchao0@huawei.com>,
	Jaegeuk Kim <jaegeuk@kernel.org>
Subject: [PATCH 4.19 36/45] f2fs: fix to avoid deadlock of atomic file operations
Date: Tue, 26 Mar 2019 15:30:19 +0900	[thread overview]
Message-ID: <20190326042704.566582951@linuxfoundation.org> (raw)
In-Reply-To: <20190326042702.565683325@linuxfoundation.org>

4.19-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Chao Yu <yuchao0@huawei.com>

commit 48432984d718c95cf13e26d487c2d1b697c3c01f upstream.

Thread A				Thread B
- __fput
 - f2fs_release_file
  - drop_inmem_pages
   - mutex_lock(&fi->inmem_lock)
   - __revoke_inmem_pages
    - lock_page(page)
					- open
					- f2fs_setattr
					- truncate_setsize
					 - truncate_inode_pages_range
					  - lock_page(page)
					  - truncate_cleanup_page
					   - f2fs_invalidate_page
					    - drop_inmem_page
					    - mutex_lock(&fi->inmem_lock);

We may encounter above ABBA deadlock as reported by Kyungtae Kim:

I'm reporting a bug in linux-4.17.19: "INFO: task hung in
drop_inmem_page" (no reproducer)

I think this might be somehow related to the following:
https://groups.google.com/forum/#!searchin/syzkaller-bugs/INFO$3A$20task$20hung$20in$20%7Csort:date/syzkaller-bugs/c6soBTrdaIo/AjAzPeIzCgAJ

=========================================
INFO: task syz-executor7:10822 blocked for more than 120 seconds.
      Not tainted 4.17.19 #1
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
syz-executor7   D27024 10822   6346 0x00000004
Call Trace:
 context_switch kernel/sched/core.c:2867 [inline]
 __schedule+0x721/0x1e60 kernel/sched/core.c:3515
 schedule+0x88/0x1c0 kernel/sched/core.c:3559
 schedule_preempt_disabled+0x18/0x30 kernel/sched/core.c:3617
 __mutex_lock_common kernel/locking/mutex.c:833 [inline]
 __mutex_lock+0x5bd/0x1410 kernel/locking/mutex.c:893
 mutex_lock_nested+0x1b/0x20 kernel/locking/mutex.c:908
 drop_inmem_page+0xcb/0x810 fs/f2fs/segment.c:327
 f2fs_invalidate_page+0x337/0x5e0 fs/f2fs/data.c:2401
 do_invalidatepage mm/truncate.c:165 [inline]
 truncate_cleanup_page+0x261/0x330 mm/truncate.c:187
 truncate_inode_pages_range+0x552/0x1610 mm/truncate.c:367
 truncate_inode_pages mm/truncate.c:478 [inline]
 truncate_pagecache+0x6d/0x90 mm/truncate.c:801
 truncate_setsize+0x81/0xa0 mm/truncate.c:826
 f2fs_setattr+0x44f/0x1270 fs/f2fs/file.c:781
 notify_change+0xa62/0xe80 fs/attr.c:313
 do_truncate+0x12e/0x1e0 fs/open.c:63
 do_last fs/namei.c:2955 [inline]
 path_openat+0x2042/0x29f0 fs/namei.c:3505
 do_filp_open+0x1bd/0x2c0 fs/namei.c:3540
 do_sys_open+0x35e/0x4e0 fs/open.c:1101
 __do_sys_open fs/open.c:1119 [inline]
 __se_sys_open fs/open.c:1114 [inline]
 __x64_sys_open+0x89/0xc0 fs/open.c:1114
 do_syscall_64+0xc4/0x4e0 arch/x86/entry/common.c:287
 entry_SYSCALL_64_after_hwframe+0x49/0xbe
RIP: 0033:0x4497b9
RSP: 002b:00007f734e459c68 EFLAGS: 00000246 ORIG_RAX: 0000000000000002
RAX: ffffffffffffffda RBX: 00007f734e45a6cc RCX: 00000000004497b9
RDX: 0000000000000104 RSI: 00000000000a8280 RDI: 0000000020000080
RBP: 000000000071bea0 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 00000000ffffffff
R13: 0000000000007230 R14: 00000000006f02d0 R15: 00007f734e45a700
INFO: task syz-executor7:10858 blocked for more than 120 seconds.
      Not tainted 4.17.19 #1
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
syz-executor7   D28880 10858   6346 0x00000004
Call Trace:
 context_switch kernel/sched/core.c:2867 [inline]
 __schedule+0x721/0x1e60 kernel/sched/core.c:3515
 schedule+0x88/0x1c0 kernel/sched/core.c:3559
 __rwsem_down_write_failed_common kernel/locking/rwsem-xadd.c:565 [inline]
 rwsem_down_write_failed+0x5e6/0xc90 kernel/locking/rwsem-xadd.c:594
 call_rwsem_down_write_failed+0x17/0x30 arch/x86/lib/rwsem.S:117
 __down_write arch/x86/include/asm/rwsem.h:142 [inline]
 down_write+0x58/0xa0 kernel/locking/rwsem.c:72
 inode_lock include/linux/fs.h:713 [inline]
 do_truncate+0x120/0x1e0 fs/open.c:61
 do_last fs/namei.c:2955 [inline]
 path_openat+0x2042/0x29f0 fs/namei.c:3505
 do_filp_open+0x1bd/0x2c0 fs/namei.c:3540
 do_sys_open+0x35e/0x4e0 fs/open.c:1101
 __do_sys_open fs/open.c:1119 [inline]
 __se_sys_open fs/open.c:1114 [inline]
 __x64_sys_open+0x89/0xc0 fs/open.c:1114
 do_syscall_64+0xc4/0x4e0 arch/x86/entry/common.c:287
 entry_SYSCALL_64_after_hwframe+0x49/0xbe
RIP: 0033:0x4497b9
RSP: 002b:00007f734e3b4c68 EFLAGS: 00000246 ORIG_RAX: 0000000000000002
RAX: ffffffffffffffda RBX: 00007f734e3b56cc RCX: 00000000004497b9
RDX: 0000000000000104 RSI: 00000000000a8280 RDI: 0000000020000080
RBP: 000000000071c238 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 00000000ffffffff
R13: 0000000000007230 R14: 00000000006f02d0 R15: 00007f734e3b5700
INFO: task syz-executor5:10829 blocked for more than 120 seconds.
      Not tainted 4.17.19 #1
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
syz-executor5   D28760 10829   6308 0x80000002
Call Trace:
 context_switch kernel/sched/core.c:2867 [inline]
 __schedule+0x721/0x1e60 kernel/sched/core.c:3515
 schedule+0x88/0x1c0 kernel/sched/core.c:3559
 io_schedule+0x21/0x80 kernel/sched/core.c:5179
 wait_on_page_bit_common mm/filemap.c:1100 [inline]
 __lock_page+0x2b5/0x390 mm/filemap.c:1273
 lock_page include/linux/pagemap.h:483 [inline]
 __revoke_inmem_pages+0xb35/0x11c0 fs/f2fs/segment.c:231
 drop_inmem_pages+0xa3/0x3e0 fs/f2fs/segment.c:306
 f2fs_release_file+0x2c7/0x330 fs/f2fs/file.c:1556
 __fput+0x2c7/0x780 fs/file_table.c:209
 ____fput+0x1a/0x20 fs/file_table.c:243
 task_work_run+0x151/0x1d0 kernel/task_work.c:113
 exit_task_work include/linux/task_work.h:22 [inline]
 do_exit+0x8ba/0x30a0 kernel/exit.c:865
 do_group_exit+0x13b/0x3a0 kernel/exit.c:968
 get_signal+0x6bb/0x1650 kernel/signal.c:2482
 do_signal+0x84/0x1b70 arch/x86/kernel/signal.c:810
 exit_to_usermode_loop+0x155/0x190 arch/x86/entry/common.c:162
 prepare_exit_to_usermode arch/x86/entry/common.c:196 [inline]
 syscall_return_slowpath arch/x86/entry/common.c:265 [inline]
 do_syscall_64+0x445/0x4e0 arch/x86/entry/common.c:290
 entry_SYSCALL_64_after_hwframe+0x49/0xbe
RIP: 0033:0x4497b9
RSP: 002b:00007f1c68e74ce8 EFLAGS: 00000246 ORIG_RAX: 00000000000000ca
RAX: fffffffffffffe00 RBX: 000000000071bf80 RCX: 00000000004497b9
RDX: 0000000000000000 RSI: 0000000000000000 RDI: 000000000071bf80
RBP: 000000000071bf80 R08: 0000000000000000 R09: 000000000071bf58
R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
R13: 0000000000000000 R14: 00007f1c68e759c0 R15: 00007f1c68e75700

This patch tries to use trylock_page to mitigate such deadlock condition
for fix.

Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 fs/f2fs/segment.c |   43 +++++++++++++++++++++++++++++++------------
 1 file changed, 31 insertions(+), 12 deletions(-)

--- a/fs/f2fs/segment.c
+++ b/fs/f2fs/segment.c
@@ -216,7 +216,8 @@ void f2fs_register_inmem_page(struct ino
 }
 
 static int __revoke_inmem_pages(struct inode *inode,
-				struct list_head *head, bool drop, bool recover)
+				struct list_head *head, bool drop, bool recover,
+				bool trylock)
 {
 	struct f2fs_sb_info *sbi = F2FS_I_SB(inode);
 	struct inmem_pages *cur, *tmp;
@@ -228,7 +229,16 @@ static int __revoke_inmem_pages(struct i
 		if (drop)
 			trace_f2fs_commit_inmem_page(page, INMEM_DROP);
 
-		lock_page(page);
+		if (trylock) {
+			/*
+			 * to avoid deadlock in between page lock and
+			 * inmem_lock.
+			 */
+			if (!trylock_page(page))
+				continue;
+		} else {
+			lock_page(page);
+		}
 
 		f2fs_wait_on_page_writeback(page, DATA, true);
 
@@ -317,13 +327,19 @@ void f2fs_drop_inmem_pages(struct inode
 	struct f2fs_sb_info *sbi = F2FS_I_SB(inode);
 	struct f2fs_inode_info *fi = F2FS_I(inode);
 
-	mutex_lock(&fi->inmem_lock);
-	__revoke_inmem_pages(inode, &fi->inmem_pages, true, false);
-	spin_lock(&sbi->inode_lock[ATOMIC_FILE]);
-	if (!list_empty(&fi->inmem_ilist))
-		list_del_init(&fi->inmem_ilist);
-	spin_unlock(&sbi->inode_lock[ATOMIC_FILE]);
-	mutex_unlock(&fi->inmem_lock);
+	while (!list_empty(&fi->inmem_pages)) {
+		mutex_lock(&fi->inmem_lock);
+		__revoke_inmem_pages(inode, &fi->inmem_pages,
+						true, false, true);
+
+		if (list_empty(&fi->inmem_pages)) {
+			spin_lock(&sbi->inode_lock[ATOMIC_FILE]);
+			if (!list_empty(&fi->inmem_ilist))
+				list_del_init(&fi->inmem_ilist);
+			spin_unlock(&sbi->inode_lock[ATOMIC_FILE]);
+		}
+		mutex_unlock(&fi->inmem_lock);
+	}
 
 	clear_inode_flag(inode, FI_ATOMIC_FILE);
 	fi->i_gc_failures[GC_FAILURE_ATOMIC] = 0;
@@ -427,12 +443,15 @@ retry:
 		 * recovery or rewrite & commit last transaction. For other
 		 * error number, revoking was done by filesystem itself.
 		 */
-		err = __revoke_inmem_pages(inode, &revoke_list, false, true);
+		err = __revoke_inmem_pages(inode, &revoke_list,
+						false, true, false);
 
 		/* drop all uncommitted pages */
-		__revoke_inmem_pages(inode, &fi->inmem_pages, true, false);
+		__revoke_inmem_pages(inode, &fi->inmem_pages,
+						true, false, false);
 	} else {
-		__revoke_inmem_pages(inode, &revoke_list, false, false);
+		__revoke_inmem_pages(inode, &revoke_list,
+						false, false, false);
 	}
 
 	return err;



  parent reply	other threads:[~2019-03-26  6:43 UTC|newest]

Thread overview: 51+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-03-26  6:29 [PATCH 4.19 00/45] 4.19.32-stable review Greg Kroah-Hartman
2019-03-26  6:29 ` [PATCH 4.19 01/45] ALSA: hda - add Lenovo IdeaCentre B550 to the power_save_blacklist Greg Kroah-Hartman
2019-03-26  6:29 ` [PATCH 4.19 02/45] ALSA: firewire-motu: use version field of unit directory to identify model Greg Kroah-Hartman
2019-03-26  6:29 ` [PATCH 4.19 03/45] mmc: pxamci: fix enum type confusion Greg Kroah-Hartman
2019-03-26  6:29 ` [PATCH 4.19 04/45] mmc: mxcmmc: "Revert mmc: mxcmmc: handle highmem pages" Greg Kroah-Hartman
2019-03-26  6:29 ` [PATCH 4.19 05/45] mmc: renesas_sdhi: limit block count to 16 bit for old revisions Greg Kroah-Hartman
2019-03-26  6:29 ` [PATCH 4.19 06/45] drm/vmwgfx: Dont double-free the mode stored in par->set_mode Greg Kroah-Hartman
2019-03-26  6:29 ` [PATCH 4.19 07/45] drm/vmwgfx: Return 0 when gmrid::get_node runs out of IDs Greg Kroah-Hartman
2019-03-26  6:29 ` [PATCH 4.19 08/45] iommu/amd: fix sg->dma_address for sg->offset bigger than PAGE_SIZE Greg Kroah-Hartman
2019-03-26  6:29 ` [PATCH 4.19 09/45] libceph: wait for latest osdmap in ceph_monc_blacklist_add() Greg Kroah-Hartman
2019-03-26  6:29 ` [PATCH 4.19 10/45] udf: Fix crash on IO error during truncate Greg Kroah-Hartman
2019-03-26  6:29 ` [PATCH 4.19 11/45] mips: loongson64: lemote-2f: Add IRQF_NO_SUSPEND to "cascade" irqaction Greg Kroah-Hartman
2019-03-26  6:29 ` [PATCH 4.19 12/45] MIPS: Ensure ELF appended dtb is relocated Greg Kroah-Hartman
2019-03-26  6:29 ` [PATCH 4.19 13/45] MIPS: Fix kernel crash for R6 in jump label branch function Greg Kroah-Hartman
2019-03-26  6:29 ` [PATCH 4.19 14/45] powerpc/vdso64: Fix CLOCK_MONOTONIC inconsistencies across Y2038 Greg Kroah-Hartman
2019-03-26  6:29 ` [PATCH 4.19 15/45] scsi: ibmvscsi: Protect ibmvscsi_head from concurrent modificaiton Greg Kroah-Hartman
2019-03-26  6:29 ` [PATCH 4.19 16/45] scsi: ibmvscsi: Fix empty event pool access during host removal Greg Kroah-Hartman
2019-03-26  6:30 ` [PATCH 4.19 17/45] futex: Ensure that futex address is aligned in handle_futex_death() Greg Kroah-Hartman
2019-03-26  6:30 ` [PATCH 4.19 18/45] cifs: allow guest mounts to work for smb3.11 Greg Kroah-Hartman
2019-03-26  6:30 ` [PATCH 4.19 19/45] perf probe: Fix getting the kernel map Greg Kroah-Hartman
2019-03-26  6:30 ` [PATCH 4.19 20/45] objtool: Move objtool_file struct off the stack Greg Kroah-Hartman
2019-03-26  6:30 ` [PATCH 4.19 21/45] irqchip/gic-v3-its: Fix comparison logic in lpi_range_cmp Greg Kroah-Hartman
2019-03-26  6:30 ` [PATCH 4.19 22/45] SMB3: Fix SMB3.1.1 guest mounts to Samba Greg Kroah-Hartman
2019-03-26  6:30 ` [PATCH 4.19 23/45] ALSA: x86: Fix runtime PM for hdmi-lpe-audio Greg Kroah-Hartman
2019-03-26  6:30 ` [PATCH 4.19 24/45] ALSA: hda/ca0132 - make pci_iounmap() call conditional Greg Kroah-Hartman
2019-03-26  6:30 ` [PATCH 4.19 25/45] ALSA: ac97: Fix of-node refcount unbalance Greg Kroah-Hartman
2019-03-26  6:30 ` [PATCH 4.19 26/45] ext4: fix NULL pointer dereference while journal is aborted Greg Kroah-Hartman
2019-03-26  6:30 ` [PATCH 4.19 27/45] ext4: fix data corruption caused by unaligned direct AIO Greg Kroah-Hartman
2019-03-26  6:30 ` [PATCH 4.19 28/45] ext4: brelse all indirect buffer in ext4_ind_remove_space() Greg Kroah-Hartman
2019-03-26  6:30 ` [PATCH 4.19 29/45] media: v4l2-ctrls.c/uvc: zero v4l2_event Greg Kroah-Hartman
2019-03-26  6:30 ` [PATCH 4.19 30/45] Bluetooth: hci_uart: Check if socket buffer is ERR_PTR in h4_recv_buf() Greg Kroah-Hartman
2019-03-26  6:30 ` [PATCH 4.19 31/45] Bluetooth: Fix decrementing reference count twice in releasing socket Greg Kroah-Hartman
2019-03-26  6:30 ` [PATCH 4.19 32/45] Bluetooth: hci_ldisc: Initialize hci_dev before open() Greg Kroah-Hartman
2019-03-26  6:30 ` [PATCH 4.19 33/45] Bluetooth: hci_ldisc: Postpone HCI_UART_PROTO_READY bit set in hci_uart_set_proto() Greg Kroah-Hartman
2019-03-26  6:30 ` [PATCH 4.19 34/45] drm: Reorder set_property_atomic to avoid returning with an active ww_ctx Greg Kroah-Hartman
2019-03-26  6:30 ` [PATCH 4.19 35/45] RDMA/cma: Rollback source IP address if failing to acquire device Greg Kroah-Hartman
2019-03-26  6:30 ` Greg Kroah-Hartman [this message]
2019-03-26  6:30 ` [PATCH 4.19 37/45] netfilter: ebtables: remove BUGPRINT messages Greg Kroah-Hartman
2019-03-26  6:30 ` [PATCH 4.19 38/45] loop: access lo_backing_file only when the loop device is Lo_bound Greg Kroah-Hartman
2019-03-26  6:30 ` [PATCH 4.19 39/45] x86/unwind: Handle NULL pointer calls better in frame unwinder Greg Kroah-Hartman
2019-03-26  6:30 ` [PATCH 4.19 40/45] x86/unwind: Add hardcoded ORC entry for NULL Greg Kroah-Hartman
2019-03-26  6:30 ` [PATCH 4.19 41/45] locking/lockdep: Add debug_locks check in __lock_downgrade() Greg Kroah-Hartman
2019-03-26  6:30 ` [PATCH 4.19 42/45] mm, mempolicy: fix uninit memory access Greg Kroah-Hartman
2019-03-26  6:30 ` [PATCH 4.19 43/45] ALSA: hda - Record the current power state before suspend/resume calls Greg Kroah-Hartman
2019-03-26  6:30 ` [PATCH 4.19 44/45] ALSA: hda - Enforces runtime_resume after S3 and S4 for each codec Greg Kroah-Hartman
2019-03-26  6:30 ` [PATCH 4.19 45/45] power: supply: charger-manager: Fix incorrect return value Greg Kroah-Hartman
2019-03-26 10:23 ` [PATCH 4.19 00/45] 4.19.32-stable review kernelci.org bot
2019-03-26 15:19 ` Jon Hunter
2019-03-26 17:49 ` Guenter Roeck
2019-03-26 23:17 ` shuah
2019-03-27  4:04 ` Naresh Kamboju

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20190326042704.566582951@linuxfoundation.org \
    --to=gregkh@linuxfoundation.org \
    --cc=jaegeuk@kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=stable@vger.kernel.org \
    --cc=yuchao0@huawei.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).