All of lore.kernel.org
 help / color / mirror / Atom feed
From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: linux-kernel@vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	stable@vger.kernel.org, Shuning Zhang <sunny.s.zhang@oracle.com>,
	Joseph Qi <jiangqi903@gmail.com>, Mark Fasheh <mark@fasheh.com>,
	Joel Becker <jlbec@evilplan.org>,
	Junxiao Bi <junxiao.bi@oracle.com>,
	Changwei Ge <gechangwei@live.cn>, piaojun <piaojun@huawei.com>,
	"Gang He" <ghe@suse.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	Linus Torvalds <torvalds@linux-foundation.org>
Subject: [PATCH 4.9 23/44] ocfs2: fix ocfs2 read inode data panic in ocfs2_iget
Date: Mon, 20 May 2019 14:14:12 +0200	[thread overview]
Message-ID: <20190520115233.553924605@linuxfoundation.org> (raw)
In-Reply-To: <20190520115230.720347034@linuxfoundation.org>

From: Shuning Zhang <sunny.s.zhang@oracle.com>

commit e091eab028f9253eac5c04f9141bbc9d170acab3 upstream.

In some cases, ocfs2_iget() reads the data of inode, which has been
deleted for some reason.  That will make the system panic.  So We should
judge whether this inode has been deleted, and tell the caller that the
inode is a bad inode.

For example, the ocfs2 is used as the backed of nfs, and the client is
nfsv3.  This issue can be reproduced by the following steps.

on the nfs server side,
..../patha/pathb

Step 1: The process A was scheduled before calling the function fh_verify.

Step 2: The process B is removing the 'pathb', and just completed the call
to function dput.  Then the dentry of 'pathb' has been deleted from the
dcache, and all ancestors have been deleted also.  The relationship of
dentry and inode was deleted through the function hlist_del_init.  The
following is the call stack.
dentry_iput->hlist_del_init(&dentry->d_u.d_alias)

At this time, the inode is still in the dcache.

Step 3: The process A call the function ocfs2_get_dentry, which get the
inode from dcache.  Then the refcount of inode is 1.  The following is the
call stack.
nfsd3_proc_getacl->fh_verify->exportfs_decode_fh->fh_to_dentry(ocfs2_get_dentry)

Step 4: Dirty pages are flushed by bdi threads.  So the inode of 'patha'
is evicted, and this directory was deleted.  But the inode of 'pathb'
can't be evicted, because the refcount of the inode was 1.

Step 5: The process A keep running, and call the function
reconnect_path(in exportfs_decode_fh), which call function
ocfs2_get_parent of ocfs2.  Get the block number of parent
directory(patha) by the name of ...  Then read the data from disk by the
block number.  But this inode has been deleted, so the system panic.

Process A                                             Process B
1. in nfsd3_proc_getacl                   |
2.                                        |        dput
3. fh_to_dentry(ocfs2_get_dentry)         |
4. bdi flush dirty cache                  |
5. ocfs2_iget                             |

[283465.542049] OCFS2: ERROR (device sdp): ocfs2_validate_inode_block:
Invalid dinode #580640: OCFS2_VALID_FL not set

[283465.545490] Kernel panic - not syncing: OCFS2: (device sdp): panic forced
after error

[283465.546889] CPU: 5 PID: 12416 Comm: nfsd Tainted: G        W
4.1.12-124.18.6.el6uek.bug28762940v3.x86_64 #2
[283465.548382] Hardware name: VMware, Inc. VMware Virtual Platform/440BX
Desktop Reference Platform, BIOS 6.00 09/21/2015
[283465.549657]  0000000000000000 ffff8800a56fb7b8 ffffffff816e839c
ffffffffa0514758
[283465.550392]  000000000008dc20 ffff8800a56fb838 ffffffff816e62d3
0000000000000008
[283465.551056]  ffff880000000010 ffff8800a56fb848 ffff8800a56fb7e8
ffff88005df9f000
[283465.551710] Call Trace:
[283465.552516]  [<ffffffff816e839c>] dump_stack+0x63/0x81
[283465.553291]  [<ffffffff816e62d3>] panic+0xcb/0x21b
[283465.554037]  [<ffffffffa04e66b0>] ocfs2_handle_error+0xf0/0xf0 [ocfs2]
[283465.554882]  [<ffffffffa04e7737>] __ocfs2_error+0x67/0x70 [ocfs2]
[283465.555768]  [<ffffffffa049c0f9>] ocfs2_validate_inode_block+0x229/0x230
[ocfs2]
[283465.556683]  [<ffffffffa047bcbc>] ocfs2_read_blocks+0x46c/0x7b0 [ocfs2]
[283465.557408]  [<ffffffffa049bed0>] ? ocfs2_inode_cache_io_unlock+0x20/0x20
[ocfs2]
[283465.557973]  [<ffffffffa049f0eb>] ocfs2_read_inode_block_full+0x3b/0x60
[ocfs2]
[283465.558525]  [<ffffffffa049f5ba>] ocfs2_iget+0x4aa/0x880 [ocfs2]
[283465.559082]  [<ffffffffa049146e>] ocfs2_get_parent+0x9e/0x220 [ocfs2]
[283465.559622]  [<ffffffff81297c05>] reconnect_path+0xb5/0x300
[283465.560156]  [<ffffffff81297f46>] exportfs_decode_fh+0xf6/0x2b0
[283465.560708]  [<ffffffffa062faf0>] ? nfsd_proc_getattr+0xa0/0xa0 [nfsd]
[283465.561262]  [<ffffffff810a8196>] ? prepare_creds+0x26/0x110
[283465.561932]  [<ffffffffa0630860>] fh_verify+0x350/0x660 [nfsd]
[283465.562862]  [<ffffffffa0637804>] ? nfsd_cache_lookup+0x44/0x630 [nfsd]
[283465.563697]  [<ffffffffa063a8b9>] nfsd3_proc_getattr+0x69/0xf0 [nfsd]
[283465.564510]  [<ffffffffa062cf60>] nfsd_dispatch+0xe0/0x290 [nfsd]
[283465.565358]  [<ffffffffa05eb892>] ? svc_tcp_adjust_wspace+0x12/0x30
[sunrpc]
[283465.566272]  [<ffffffffa05ea652>] svc_process_common+0x412/0x6a0 [sunrpc]
[283465.567155]  [<ffffffffa05eaa03>] svc_process+0x123/0x210 [sunrpc]
[283465.568020]  [<ffffffffa062c90f>] nfsd+0xff/0x170 [nfsd]
[283465.568962]  [<ffffffffa062c810>] ? nfsd_destroy+0x80/0x80 [nfsd]
[283465.570112]  [<ffffffff810a622b>] kthread+0xcb/0xf0
[283465.571099]  [<ffffffff810a6160>] ? kthread_create_on_node+0x180/0x180
[283465.572114]  [<ffffffff816f11b8>] ret_from_fork+0x58/0x90
[283465.573156]  [<ffffffff810a6160>] ? kthread_create_on_node+0x180/0x180

Link: http://lkml.kernel.org/r/1554185919-3010-1-git-send-email-sunny.s.zhang@oracle.com
Signed-off-by: Shuning Zhang <sunny.s.zhang@oracle.com>
Reviewed-by: Joseph Qi <jiangqi903@gmail.com>
Cc: Mark Fasheh <mark@fasheh.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Junxiao Bi <junxiao.bi@oracle.com>
Cc: Changwei Ge <gechangwei@live.cn>
Cc: piaojun <piaojun@huawei.com>
Cc: "Gang He" <ghe@suse.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 fs/ocfs2/export.c |   30 +++++++++++++++++++++++++++++-
 1 file changed, 29 insertions(+), 1 deletion(-)

--- a/fs/ocfs2/export.c
+++ b/fs/ocfs2/export.c
@@ -148,16 +148,24 @@ static struct dentry *ocfs2_get_parent(s
 	u64 blkno;
 	struct dentry *parent;
 	struct inode *dir = d_inode(child);
+	int set;
 
 	trace_ocfs2_get_parent(child, child->d_name.len, child->d_name.name,
 			       (unsigned long long)OCFS2_I(dir)->ip_blkno);
 
+	status = ocfs2_nfs_sync_lock(OCFS2_SB(dir->i_sb), 1);
+	if (status < 0) {
+		mlog(ML_ERROR, "getting nfs sync lock(EX) failed %d\n", status);
+		parent = ERR_PTR(status);
+		goto bail;
+	}
+
 	status = ocfs2_inode_lock(dir, NULL, 0);
 	if (status < 0) {
 		if (status != -ENOENT)
 			mlog_errno(status);
 		parent = ERR_PTR(status);
-		goto bail;
+		goto unlock_nfs_sync;
 	}
 
 	status = ocfs2_lookup_ino_from_name(dir, "..", 2, &blkno);
@@ -166,11 +174,31 @@ static struct dentry *ocfs2_get_parent(s
 		goto bail_unlock;
 	}
 
+	status = ocfs2_test_inode_bit(OCFS2_SB(dir->i_sb), blkno, &set);
+	if (status < 0) {
+		if (status == -EINVAL) {
+			status = -ESTALE;
+		} else
+			mlog(ML_ERROR, "test inode bit failed %d\n", status);
+		parent = ERR_PTR(status);
+		goto bail_unlock;
+	}
+
+	trace_ocfs2_get_dentry_test_bit(status, set);
+	if (!set) {
+		status = -ESTALE;
+		parent = ERR_PTR(status);
+		goto bail_unlock;
+	}
+
 	parent = d_obtain_alias(ocfs2_iget(OCFS2_SB(dir->i_sb), blkno, 0, 0));
 
 bail_unlock:
 	ocfs2_inode_unlock(dir, 0);
 
+unlock_nfs_sync:
+	ocfs2_nfs_sync_unlock(OCFS2_SB(dir->i_sb), 1);
+
 bail:
 	trace_ocfs2_get_parent_end(parent);
 



  parent reply	other threads:[~2019-05-20 12:53 UTC|newest]

Thread overview: 51+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-05-20 12:13 [PATCH 4.9 00/44] 4.9.178-stable review Greg Kroah-Hartman
2019-05-20 12:13 ` [PATCH 4.9 01/44] net: core: another layer of lists, around PF_MEMALLOC skb handling Greg Kroah-Hartman
2019-05-20 12:13 ` [PATCH 4.9 02/44] locking/rwsem: Prevent decrement of reader count before increment Greg Kroah-Hartman
2019-05-20 12:13 ` [PATCH 4.9 03/44] PCI: hv: Fix a memory leak in hv_eject_device_work() Greg Kroah-Hartman
2019-05-20 12:13 ` [PATCH 4.9 04/44] x86/speculation/mds: Revert CPU buffer clear on double fault exit Greg Kroah-Hartman
2019-05-20 12:13 ` [PATCH 4.9 05/44] x86/speculation/mds: Improve CPU buffer clear documentation Greg Kroah-Hartman
2019-05-20 12:13 ` [PATCH 4.9 06/44] objtool: Fix function fallthrough detection Greg Kroah-Hartman
2019-05-20 12:13 ` [PATCH 4.9 07/44] ARM: exynos: Fix a leaked reference by adding missing of_node_put Greg Kroah-Hartman
2019-05-20 12:13 ` [PATCH 4.9 08/44] power: supply: axp288_charger: Fix unchecked return value Greg Kroah-Hartman
2019-05-20 12:13 ` [PATCH 4.9 09/44] arm64: compat: Reduce address limit Greg Kroah-Hartman
2019-05-20 12:13 ` [PATCH 4.9 10/44] arm64: Clear OSDLR_EL1 on CPU boot Greg Kroah-Hartman
2019-05-20 12:14 ` [PATCH 4.9 11/44] sched/x86: Save [ER]FLAGS on context switch Greg Kroah-Hartman
2019-05-20 12:14 ` [PATCH 4.9 12/44] crypto: chacha20poly1305 - set cra_name correctly Greg Kroah-Hartman
2019-05-20 12:14 ` [PATCH 4.9 13/44] crypto: vmx - fix copy-paste error in CTR mode Greg Kroah-Hartman
2019-05-20 12:14 ` [PATCH 4.9 14/44] crypto: crct10dif-generic - fix use via crypto_shash_digest() Greg Kroah-Hartman
2019-05-20 12:14 ` [PATCH 4.9 15/44] crypto: x86/crct10dif-pcl " Greg Kroah-Hartman
2019-05-20 12:14 ` [PATCH 4.9 16/44] ALSA: usb-audio: Fix a memory leak bug Greg Kroah-Hartman
2019-05-20 12:14 ` [PATCH 4.9 17/44] ALSA: hda/hdmi - Read the pin sense from register when repolling Greg Kroah-Hartman
2019-05-20 12:14 ` [PATCH 4.9 18/44] ALSA: hda/hdmi - Consider eld_valid when reporting jack event Greg Kroah-Hartman
2019-05-20 12:14 ` [PATCH 4.9 19/44] ALSA: hda/realtek - EAPD turn on later Greg Kroah-Hartman
2019-05-20 12:14 ` [PATCH 4.9 20/44] ASoC: max98090: Fix restore of DAPM Muxes Greg Kroah-Hartman
2019-05-20 12:14 ` [PATCH 4.9 21/44] ASoC: RT5677-SPI: Disable 16Bit SPI Transfers Greg Kroah-Hartman
2019-05-20 12:14 ` [PATCH 4.9 22/44] mm/mincore.c: make mincore() more conservative Greg Kroah-Hartman
2019-05-20 12:14 ` Greg Kroah-Hartman [this message]
2019-05-20 12:14 ` [PATCH 4.9 24/44] mfd: da9063: Fix OTP control register names to match datasheets for DA9063/63L Greg Kroah-Hartman
2019-05-20 12:14 ` [PATCH 4.9 25/44] mfd: max77620: Fix swapped FPS_PERIOD_MAX_US values Greg Kroah-Hartman
2019-05-20 12:14 ` [PATCH 4.9 26/44] tty/vt: fix write/write race in ioctl(KDSKBSENT) handler Greg Kroah-Hartman
2019-05-20 12:14 ` [PATCH 4.9 27/44] jbd2: check superblock mapped prior to committing Greg Kroah-Hartman
2019-05-20 12:14 ` [PATCH 4.9 28/44] ext4: actually request zeroing of inode table after grow Greg Kroah-Hartman
2019-05-20 12:14 ` [PATCH 4.9 29/44] ext4: fix ext4_show_options for file systems w/o journal Greg Kroah-Hartman
2019-05-20 12:14 ` [PATCH 4.9 30/44] Btrfs: do not start a transaction at iterate_extent_inodes() Greg Kroah-Hartman
2019-05-20 12:14 ` [PATCH 4.9 31/44] bcache: fix a race between cache register and cacheset unregister Greg Kroah-Hartman
2019-05-20 12:14 ` [PATCH 4.9 32/44] bcache: never set KEY_PTRS of journal key to 0 in journal_reclaim() Greg Kroah-Hartman
2019-05-20 12:14 ` [PATCH 4.9 33/44] ipmi:ssif: compare block number correctly for multi-part return messages Greg Kroah-Hartman
2019-05-20 12:14 ` [PATCH 4.9 34/44] crypto: gcm - Fix error return code in crypto_gcm_create_common() Greg Kroah-Hartman
2019-05-20 12:14 ` [PATCH 4.9 35/44] crypto: gcm - fix incompatibility between "gcm" and "gcm_base" Greg Kroah-Hartman
2019-05-20 12:14 ` [PATCH 4.9 36/44] crypto: salsa20 - dont access already-freed walk.iv Greg Kroah-Hartman
2019-05-20 12:14 ` [PATCH 4.9 37/44] crypto: arm/aes-neonbs " Greg Kroah-Hartman
2019-05-20 12:14 ` [PATCH 4.9 38/44] fib_rules: fix error in backport of e9919a24d302 ("fib_rules: return 0...") Greg Kroah-Hartman
2019-05-20 12:14 ` [PATCH 4.9 39/44] writeback: synchronize sync(2) against cgroup writeback membership switches Greg Kroah-Hartman
2019-05-20 12:14 ` [PATCH 4.9 40/44] fs/writeback.c: use rcu_barrier() to wait for inflight wb switches going into workqueue when umount Greg Kroah-Hartman
2019-05-20 12:14 ` [PATCH 4.9 41/44] ext4: zero out the unused memory region in the extent tree block Greg Kroah-Hartman
2019-05-20 12:14 ` [PATCH 4.9 42/44] ext4: fix data corruption caused by overlapping unaligned and aligned IO Greg Kroah-Hartman
2019-05-20 12:14 ` [PATCH 4.9 43/44] ALSA: hda/realtek - Fix for Lenovo B50-70 inverted internal microphone bug Greg Kroah-Hartman
2019-05-20 12:14 ` [PATCH 4.9 44/44] KVM: x86: Skip EFER vs. guest CPUID checks for host-initiated writes Greg Kroah-Hartman
2019-05-20 19:28 ` [PATCH 4.9 00/44] 4.9.178-stable review kernelci.org bot
2019-05-21  8:51 ` Jon Hunter
2019-05-21  8:51   ` Jon Hunter
2019-05-21 10:34 ` Naresh Kamboju
2019-05-21 16:46   ` Greg Kroah-Hartman
2019-05-21 21:39 ` shuah

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20190520115233.553924605@linuxfoundation.org \
    --to=gregkh@linuxfoundation.org \
    --cc=akpm@linux-foundation.org \
    --cc=gechangwei@live.cn \
    --cc=ghe@suse.com \
    --cc=jiangqi903@gmail.com \
    --cc=jlbec@evilplan.org \
    --cc=junxiao.bi@oracle.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mark@fasheh.com \
    --cc=piaojun@huawei.com \
    --cc=stable@vger.kernel.org \
    --cc=sunny.s.zhang@oracle.com \
    --cc=torvalds@linux-foundation.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.