linux-fsdevel.vger.kernel.org archive mirror
From: "zhengbin (A)" <zhengbin13@huawei.com>
To: <jack@suse.cz>, Al Viro <viro@ZenIV.linux.org.uk>,
	<akpm@linux-foundation.org>, <linux-fsdevel@vger.kernel.org>
Cc: "zhangyi (F)" <yi.zhang@huawei.com>, <zhengbin13@huawei.com>
Subject: Possible FS race condition between iterate_dir and d_alloc_parallel
Date: Tue, 3 Sep 2019 22:44:32 +0800
Message-ID: <fd00be2c-257a-8e1f-eb1e-943a40c71c9a@huawei.com>

We recently encountered an oops (the filesystem is tmpfs):
crash> bt
PID: 108367  TASK: ffff8020d28eda00  CPU: 123  COMMAND: "du"
 #0 [ffff0000ae77b7e0] machine_kexec at ffff00006709d674
 #1 [ffff0000ae77b830] __crash_kexec at ffff000067150354
 #2 [ffff0000ae77b9c0] panic at ffff0000670a9358
 #3 [ffff0000ae77baa0] die at ffff00006708ec98
 #4 [ffff0000ae77bae0] die_kernel_fault at ffff0000670a1c6c
 #5 [ffff0000ae77bb10] __do_kernel_fault at ffff0000670a1924
 #6 [ffff0000ae77bb40] do_translation_fault at ffff0000676bb754
 #7 [ffff0000ae77bb50] do_mem_abort at ffff0000670812e0
 #8 [ffff0000ae77bd50] el1_ia at ffff000067083214
     PC: ffff0000672954c0  [dcache_readdir+216]
     LR: ffff0000672954f8  [dcache_readdir+272]
     SP: ffff0000ae77bd60  PSTATE: 60400009
    X29: ffff0000ae77bd60  X28: ffff8020d28eda00  X27: 0000000000000000
    X26: 0000000000000000  X25: 0000000056000000  X24: ffff80215c854000
    X23: 0000000000000001  X22: ffff8021f2f03290  X21: ffff803f74359698
    X20: ffff803f74359960  X19: ffff0000ae77be30  X18: 0000000000000000
    X17: 0000000000000000  X16: 0000000000000000  X15: 0000000000000000
    X14: 0000000000000000  X13: 0000000000000000  X12: 0000000000000000
    X11: 0000000000000000  X10: ffff8020fee99b18   X9: ffff8020fee99878
     X8: 0000000000a1f3aa   X7: 0000000000000000   X6: ffff00006727d760
     X5: ffffffffffff0073   X4: 0000000315d1d1c6   X3: 000000000000001b
     X2: 00000000ffff803f   X1: 656d616e00676f6c   X0: ffff0000ae77be30
 #9 [ffff0000ae77bd60] dcache_readdir at ffff0000672954bc

The cause appears to be as follows:
Process 1 runs "cat test" in directory A, where "test" does not exist; process 2 runs "cat test" in directory A as well.
Process 3 creates a new file in directory B, and process 4 runs "ls" on directory A.

process 1(dirA)                  |process 2(dirA)                            |process 3(dirB)                       |process 4(dirA)
do_last                          |do_last                                    |do_last                               |iterate_dir
  inode_lock_shared              |  inode_lock_shared                        |  inode_lock(dirB)                    |  inode_lock_shared
  lookup_open                    |  lookup_open                              |  lookup_open                         |
    d_alloc_parallel             |    d_alloc_parallel                       |    d_alloc_parallel                  |
      d_alloc(add dtry1 to dirA) |                                           |                                      |
      hlist_bl_lock              |      d_alloc(add dtry2 to dirA)           |                                      |
      hlist_bl_add_head_rcu      |                                           |                                      |  dcache_readdir
      hlist_bl_unlock            |                                           |                                      |    p = &dentry->d_subdirs
                                 |      hlist_bl_lock                        |                                      |    next_positive(dentry, p, 1)
                                 |        hlist_bl_for_each_entry            |                                      |      p = from->next(p is dtry2)
                                 |        hlist_bl_unlock                    |                                      |
                                 |        dput                               |                                      |
                                 |          retain_dentry(dentry) false      |                                      |
                                 |          dentry_kill                      |                                      |
                                 |            spin_trylock(&parent)          |                                      |
                                 |              __dentry_kill                |                                      |
                                 |                dentry_unlist              |                                      |
                                 |                dentry_free(dtry2)         |                                      |
                                 |                                           |      d_alloc(add dtry2 to dirB)      |
                                 |                                           |      hlist_bl_add_head_rcu           |
                                 |                                           |    dir_inode->i_op->create(new inode)|
                                 |                                           |                                      |      d = list_entry(p, struct dentry, d_child)
                                 |                                           |                                      |      if (!simple_positive(d))-->d belongs to dirB now
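
The interleaving above can be replayed deterministically in userspace. The following is only a simplified model, not kernel code: the struct layouts, dir_init() and replay_race() are hypothetical, and reuse of the freed dentry's memory is modelled by reusing one caller-provided slot rather than by a real free and allocation.

```c
#include <stddef.h>

struct list_head { struct list_head *next, *prev; };

/* Hypothetical, heavily reduced stand-in for struct dentry. */
struct dentry {
	struct dentry *d_parent;
	struct list_head d_child;	/* link in parent's d_subdirs */
	struct list_head d_subdirs;	/* head of our children */
};

#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

static void list_add(struct list_head *n, struct list_head *head)
{
	n->next = head->next;
	n->prev = head;
	head->next->prev = n;
	head->next = n;
}

static void list_del(struct list_head *n)
{
	n->prev->next = n->next;
	n->next->prev = n->prev;
}

static void dir_init(struct dentry *dir)
{
	dir->d_parent = dir;
	dir->d_subdirs.next = dir->d_subdirs.prev = &dir->d_subdirs;
}

/*
 * Replays the timeline in one thread and returns the parent that the
 * directory reader ends up looking at. "slot" plays the memory that is
 * first dtry2 in dirA and is later handed out again for dirB.
 */
static struct dentry *replay_race(struct dentry *dirA, struct dentry *dirB,
				  struct dentry *slot)
{
	/* process 2: d_alloc() adds dtry2 to dirA */
	slot->d_parent = dirA;
	list_add(&slot->d_child, &dirA->d_subdirs);

	/* process 4: next_positive() samples the pointer without d_lock */
	struct list_head *p = dirA->d_subdirs.next;

	/* process 2: dput() -> __dentry_kill(): unlist, then free */
	list_del(&slot->d_child);

	/* process 3: the same memory is reused and added to dirB */
	slot->d_parent = dirB;
	list_add(&slot->d_child, &dirB->d_subdirs);

	/* process 4 resumes with its stale pointer */
	struct dentry *d = container_of(p, struct dentry, d_child);
	return d->d_parent;
}
```

In this model the reader that started in dirA ends up holding an entry whose parent is dirB, which matches the last line of the timeline.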

lookup_open-->d_in_lookup-->simple_lookup(shmem_dir_inode_operations)-->dentry->d_op = simple_dentry_operations
const struct dentry_operations simple_dentry_operations = {
	.d_delete = always_delete_dentry,
};
Because .d_delete always asks for deletion, retain_dentry() returns false, so dput() kills and frees the dentry immediately.
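
As a rough illustration of that decision, here is a userspace sketch. retain_dentry_model() and the reduced struct dentry (with a plain "hashed" flag) are assumptions made for the example and omit most of what the real retain_dentry() checks; only the d_delete part mirrors the description above.

```c
#include <stdbool.h>
#include <stddef.h>

struct dentry;

struct dentry_operations {
	int (*d_delete)(const struct dentry *);
};

/* Hypothetical reduced dentry: just what the model needs. */
struct dentry {
	const struct dentry_operations *d_op;
	bool hashed;
};

/* Same behaviour as the helper named in simple_dentry_operations. */
static int always_delete_dentry(const struct dentry *dentry)
{
	(void)dentry;
	return 1;
}

static const struct dentry_operations simple_dentry_operations = {
	.d_delete = always_delete_dentry,
};

/* Model only: should dput() keep this dentry cached after the last ref? */
static bool retain_dentry_model(const struct dentry *dentry)
{
	if (!dentry->hashed)
		return false;	/* unhashed dentries are never kept */
	if (dentry->d_op && dentry->d_op->d_delete &&
	    dentry->d_op->d_delete(dentry))
		return false;	/* filesystem asked for immediate deletion */
	return true;
}
```

With simple_dentry_operations installed, the model never retains a dentry, even a hashed one.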


Should we take spin_lock(&parent->d_lock) in next_positive()? Commit ebaaa80e8f20 ("lockless next_positive()") removed the spin_lock; was that purely a performance optimization?
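
To make the first suggestion concrete, here is a userspace sketch of a child-list walk done entirely under the parent's lock. It is an assumption-laden model, not the kernel function: the C11 atomic_flag stands in for d_lock, and model_spin_lock(), the "positive" flag and next_positive_locked() are hypothetical names.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

struct list_head { struct list_head *next, *prev; };

/* Hypothetical reduced dentry; atomic_flag stands in for d_lock. */
struct dentry {
	atomic_flag d_lock;
	bool positive;
	struct list_head d_child;
	struct list_head d_subdirs;
};

#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

static void model_spin_lock(atomic_flag *lock)
{
	while (atomic_flag_test_and_set_explicit(lock, memory_order_acquire))
		;	/* spin until the holder clears the flag */
}

static void model_spin_unlock(atomic_flag *lock)
{
	atomic_flag_clear_explicit(lock, memory_order_release);
}

static void dentry_init(struct dentry *d, bool positive)
{
	atomic_flag_clear(&d->d_lock);
	d->positive = positive;
	d->d_child.next = d->d_child.prev = &d->d_child;
	d->d_subdirs.next = d->d_subdirs.prev = &d->d_subdirs;
}

/* Writers modify the child list only while holding the parent's lock. */
static void add_child_tail(struct dentry *parent, struct dentry *child)
{
	struct list_head *head = &parent->d_subdirs, *n = &child->d_child;

	model_spin_lock(&parent->d_lock);
	n->prev = head->prev;
	n->next = head;
	head->prev->next = n;
	head->prev = n;
	model_spin_unlock(&parent->d_lock);
}

/* Return the first positive child after "from", or NULL if none. */
static struct dentry *next_positive_locked(struct dentry *parent,
					   struct list_head *from)
{
	struct dentry *found = NULL;

	model_spin_lock(&parent->d_lock);
	for (struct list_head *p = from->next; p != &parent->d_subdirs;
	     p = p->next) {
		struct dentry *d = container_of(p, struct dentry, d_child);

		if (d->positive) {
			found = d;
			break;
		}
	}
	model_spin_unlock(&parent->d_lock);
	return found;
}
```

Since every list change in the model happens under the same lock, the walker cannot observe an entry being unlinked halfway through a step. The cost is that concurrent readers of one directory serialize on that lock.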

Or, in the paths that may dput() a dentry, should we take inode_lock instead of inode_lock_shared?


