linux-fsdevel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Al Viro <viro@zeniv.linux.org.uk>
To: Linus Torvalds <torvalds@linux-foundation.org>
Cc: zhengbin <zhengbin13@huawei.com>, Jan Kara <jack@suse.cz>,
	Andrew Morton <akpm@linux-foundation.org>,
	linux-fsdevel <linux-fsdevel@vger.kernel.org>,
	"zhangyi (F)" <yi.zhang@huawei.com>,
	renxudong1@huawei.com, Hou Tao <houtao1@huawei.com>,
	linux-btrfs@vger.kernel.org, "Yan, Zheng" <zyan@redhat.com>,
	linux-cifs@vger.kernel.org, Steve French <sfrench@samba.org>
Subject: Re: [PATCH] Re: Possible FS race condition between iterate_dir and d_alloc_parallel
Date: Sun, 29 Sep 2019 06:29:34 +0100	[thread overview]
Message-ID: <20190929052934.GY26530@ZenIV.linux.org.uk> (raw)
In-Reply-To: <CAHk-=wiJ1eY7y6r_cFNRPCqD+BJZS7eJeQFO6OrXxRFjDAipsQ@mail.gmail.com>

On Tue, Sep 24, 2019 at 09:55:06AM -0700, Linus Torvalds wrote:
> [ Sorry for html, I'm driving around ]
> 
> On Mon, Sep 23, 2019, 19:52 Al Viro <viro@zeniv.linux.org.uk> wrote:
> 
> >
> > Argh...  The things turned interesting.  The tricky part is
> > where do we handle switching cursors away from something
> > that gets moved.
> >
> 
> I forget why we do that. Remind me?
> 
> Anyway, I think to a first approximation, we should probably first just see
> if the "remove cursors from the end" change already effectively makes most
> of the regression go away.
> 
> Because with that change, if you just readdir() things with a sufficiently
> big buffer, you'll never have the cursor hang around over several system
> calls.
> 
> Before, you'd do a readdir, add the cursor, return to user space, then do
> another readdir just to see the EOF, and then do the close().
> 
> So it used to leave the cursor in place over those two final system calls,
> and now it's all gone.
> 
> I'm sure that on the big systems you can still trigger the whole d_lock
> contention on the parent, but I wonder if it's all that much of a problem
> any more in practice with your other change..

If nothing else, it's DoSable right now.  And getting rid of that would
reduce the memory footprint, while we are at it.

In any case, it looks like btrfs really wants an empty directory there,
i.e. the right thing to do would be simple_lookup() for ->lookup.

CIFS is potentially trickier.  AFAICS, what's going on is
	* Windows has a strange export, called IPC$.  Looks like it
was (still is?) the place where they export their named pipes.  From
what I'd been able to figure out, it's always there and allows for
some other uses - it can be used to get the list of exports.  Whether
the actual named pipes are seen there these days... no idea.
	* there seems to be nothing to prevent a server (modified
samba, for example) from exporting whatever it wants under that
name.
	* IF it can be non-empty, mounting it will end up with
root directory where we can do lookups for whatever is there.
getdents() on that root will show what's currently in dcache
(== had been looked up and still has not been evicted by
memory pressure).  Mainline can get seriously buggered if
dcache_readdir() steps into something that is going away.  With the
patches in this series that's no longer a problem.  HOWEVER, if
lookup in one subdirectory picks an alias for another subdirectory
of root, we will be really screwed - shared lock on parent
won't stop d_splice_alias() from moving the alias, and that can
bloody well lead to freeing the original name.  From under
copy_to_user()...  And grabbing a reference to dentry obviously
doesn't prevent that - dentry itself won't get freed, but
external name very well might be.

Again, I don't know CIFS well enough to tell how IPC$ is really
used.  If it doesn't normally contain anything but named pipes,
we can simply have cifs_lookup() refuse to return subdirectories
on such mounts, with solves any problems with d_splice_alias().
If it's always empty - better yet.  If the impressions above
(and that's all those are) are correct, we might have a problem
in mainline with bogus/malicious servers.

That's independent from what we do with the cursors.  I asked
around a bit; no luck so far.  It would be really nice if
some CIFS folks could give a braindump on that thing...

  parent reply	other threads:[~2019-09-29  5:30 UTC|newest]

Thread overview: 49+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-09-03 14:44 Possible FS race condition between iterate_dir and d_alloc_parallel zhengbin (A)
2019-09-03 15:40 ` Al Viro
2019-09-03 15:41   ` Al Viro
2019-09-04  6:15     ` zhengbin (A)
2019-09-05 17:47       ` Al Viro
2019-09-06  0:55         ` Jun Li
2019-09-06  2:00           ` Al Viro
2019-09-06  2:32         ` zhengbin (A)
2019-09-09 14:10       ` zhengbin (A)
2019-09-09 14:59         ` Al Viro
2019-09-09 15:10           ` zhengbin (A)
     [not found]             ` <7e32cda5-dc89-719d-9651-cf2bd06ae728@huawei.com>
2019-09-10 21:53               ` Al Viro
2019-09-10 22:17                 ` Al Viro
2019-09-14 16:16                 ` [PATCH] " Al Viro
2019-09-14 16:49                   ` Linus Torvalds
2019-09-14 17:01                     ` Al Viro
2019-09-14 17:15                       ` Linus Torvalds
2019-09-14 20:04                         ` Al Viro
2019-09-14 22:57                           ` Linus Torvalds
2019-09-15  0:50                             ` Al Viro
2019-09-15  1:41                               ` Linus Torvalds
2019-09-15 16:02                                 ` Al Viro
2019-09-15 17:58                                   ` Linus Torvalds
2019-09-21 14:07                                     ` Al Viro
2019-09-21 16:21                                       ` Linus Torvalds
2019-09-21 17:18                                         ` Al Viro
2019-09-21 17:38                                           ` Linus Torvalds
2019-09-24  2:52                                       ` Al Viro
2019-09-24 13:30                                         ` Josef Bacik
2019-09-24 14:51                                           ` Al Viro
2019-09-24 15:01                                             ` Josef Bacik
2019-09-24 15:11                                               ` Al Viro
2019-09-24 15:26                                                 ` Josef Bacik
2019-09-24 16:33                                                   ` Al Viro
     [not found]                                         ` <CAHk-=wiJ1eY7y6r_cFNRPCqD+BJZS7eJeQFO6OrXxRFjDAipsQ@mail.gmail.com>
2019-09-29  5:29                                           ` Al Viro [this message]
2019-09-25 11:59                                       ` Amir Goldstein
2019-09-25 12:22                                         ` Al Viro
2019-09-25 12:34                                           ` Amir Goldstein
2019-09-22 21:29                     ` Al Viro
2019-09-23  3:32                       ` zhengbin (A)
2019-09-23  5:08                         ` Al Viro
     [not found]                   ` <20190916020434.tutzwipgs4f6o3di@inn2.lkp.intel.com>
2019-09-16  2:58                     ` 266a9a8b41: WARNING:possible_recursive_locking_detected Al Viro
2019-09-16  3:03                       ` Al Viro
2019-09-16  3:44                         ` Linus Torvalds
2019-09-16 17:16                           ` Al Viro
2019-09-16 17:29                             ` Al Viro
     [not found]                             ` <bd707e64-9650-e9ed-a820-e2cabd02eaf8@huawei.com>
2019-09-17 12:01                               ` Al Viro
2019-09-19  3:36                                 ` zhengbin (A)
2019-09-19  3:55                                   ` Al Viro

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20190929052934.GY26530@ZenIV.linux.org.uk \
    --to=viro@zeniv.linux.org.uk \
    --cc=akpm@linux-foundation.org \
    --cc=houtao1@huawei.com \
    --cc=jack@suse.cz \
    --cc=linux-btrfs@vger.kernel.org \
    --cc=linux-cifs@vger.kernel.org \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=renxudong1@huawei.com \
    --cc=sfrench@samba.org \
    --cc=torvalds@linux-foundation.org \
    --cc=yi.zhang@huawei.com \
    --cc=zhengbin13@huawei.com \
    --cc=zyan@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).