All of lore.kernel.org
 help / color / mirror / Atom feed
From: Amir Goldstein <amir73il@gmail.com>
To: Miklos Szeredi <miklos@szeredi.hu>
Cc: Jan Kara <jack@suse.cz>,
	linux-fsdevel <linux-fsdevel@vger.kernel.org>,
	linux-kernel <linux-kernel@vger.kernel.org>,
	syzkaller-bugs@googlegroups.com,
	Al Viro <viro@zeniv.linux.org.uk>,
	syzbot <syzbot+31d8b84465a7cbfd8515@syzkaller.appspotmail.com>,
	overlayfs <linux-unionfs@vger.kernel.org>
Subject: Re: possible deadlock in pipe_lock (2)
Date: Mon, 11 Feb 2019 19:04:09 +0200	[thread overview]
Message-ID: <CAOQ4uxiby9B8X5zOTRqTtg_oNX2ageiY5ee_GhQovwnqpHUyEA@mail.gmail.com> (raw)
In-Reply-To: <CAJfpegstnt2tyhZuQ+DnR3rfn7B6xo9JhcBnb=GJV11i8013gA@mail.gmail.com>

On Mon, Feb 11, 2019 at 5:40 PM Miklos Szeredi <miklos@szeredi.hu> wrote:
>
> On Mon, Feb 11, 2019 at 4:07 PM Amir Goldstein <amir73il@gmail.com> wrote:
> >
> > On Mon, Feb 11, 2019 at 4:33 PM Miklos Szeredi <miklos@szeredi.hu> wrote:
> > >
> > > On Mon, Feb 11, 2019 at 2:08 PM Amir Goldstein <amir73il@gmail.com> wrote:
> > > >
> > > > On Mon, Feb 11, 2019 at 2:37 PM Miklos Szeredi <miklos@szeredi.hu> wrote:
> > > > >
> > > > > On Mon, Feb 11, 2019 at 1:06 PM Miklos Szeredi <miklos@szeredi.hu> wrote:
> > > > > >
> > > > > > On Mon, Feb 11, 2019 at 8:38 AM Amir Goldstein <amir73il@gmail.com> wrote:
> > > > > > >
> > > > > > > On Sun, Feb 10, 2019 at 8:23 PM syzbot
> > > > > > > <syzbot+31d8b84465a7cbfd8515@syzkaller.appspotmail.com> wrote:
> > > > > >
> > > > > > > > -> #1 (&ovl_i_mutex_key[depth]){+.+.}:
> > > > > > > >         down_write+0x38/0x90 kernel/locking/rwsem.c:70
> > > > > > > >         inode_lock include/linux/fs.h:757 [inline]
> > > > > > > >         ovl_write_iter+0x148/0xc20 fs/overlayfs/file.c:231
> > > > > > > >         call_write_iter include/linux/fs.h:1863 [inline]
> > > > > > > >         new_sync_write fs/read_write.c:474 [inline]
> > > > > > > >         __vfs_write+0x613/0x8e0 fs/read_write.c:487
> > > > > > > > kobject: 'loop4' (000000009e2b886d): kobject_uevent_env
> > > > > > > >         __kernel_write+0x110/0x3b0 fs/read_write.c:506
> > > > > > > >         write_pipe_buf+0x15d/0x1f0 fs/splice.c:797
> > > > > > > >         splice_from_pipe_feed fs/splice.c:503 [inline]
> > > > > > > >         __splice_from_pipe+0x39a/0x7e0 fs/splice.c:627
> > > > > > > >         splice_from_pipe+0x108/0x170 fs/splice.c:662
> > > > > > > >         default_file_splice_write+0x3c/0x90 fs/splice.c:809
> > > > > >
> > > > > > Irrelevant to the lockdep splat, but why isn't there an
> > > > > > ovl_splice_write() that just recurses into realfile->splice_write()?
> > > > > > Sounds like a much more efficient way to handle splice read and
> > > > > > write...
> > > > > >
> > > > > > [...]
> > > > > >
> > > > > > > Miklos,
> > > > > > >
> > > > > > > Its good that this report popped up again, because I went to
> > > > > > > look back at my notes from previous report [1].
> > > > > > > If I was right in my previous analysis then we must have a real
> > > > > > > deadlock in current "lazy copy up" WIP patches. Right?
> > > > > >
> > > > > > Hmm, AFAICS this circular dependency translated into layman's terms:
> > > > > >
> > > > > > pipe lock -> ovl inode lock  (splice to ovl file)
> > > > > >
> > > > > > ovl inode lock -> upper freeze lock (truncate of ovl file)
> > > > > >
> > > > > > upper freeze lock -> pipe lock (splice to upper file)
> > > > >
> > > > > So what can we do with this?
> > > > >
> > > > > The "freeze lock -> inode lock" dependency is fixed.   This is
> > > > > reversed in overlay to "ovl inode lock -> upper freeze lock", which is
> > > > > okay, because this is a nesting that cannot be reversed.   But in
> > > > > splice the pipe locks comes in between: "freeze lock -> pipe lock ->
> > > > > inode lock" which breaks this nesting direction and creates a true
> > > > > reverse dependency between ovl inode lock and upper freeze lock.
> > > > >
> > > > > The only way I see this could be fixed is to move the freeze lock
> > > > > inside the pipe lock.  But that would mean splice/sendfile/etc could
> > > > > be frozen with the pipe lock held.  It doesn't look nice.
> > > > >
> > > > > Any other ideas?
> > > > >
> > > >
> > > > [CC Jan]
> > > >
> > > > I think we are allowed to file_start_write_trylock(upper)
> > > > before ovl_inode_lock(). This in ONLY needed to cover the corner
> > > > case of upper being frozen in between "upper freeze lock -> pipe lock"
> > > > and thread B being in between "ovl inode lock -> upper freeze lock".
> > > > Is it OK to return a transient error in this corner copy up case?
> > >
> > > This case shouldn't happen assuming adherence to the "upper shall not
> > > be modified while part of the overlay" rule.
> > >
> >
> > Right. And unfreezing upper should untangle this deadlock,
> > because both thread A and B are taking a shared sb_writers lock.
>
> I don't think that'll work.   The deadlock would involve freezing for
> sure, otherwise sb_start_write() won't block.  But there's no way to
> cancel sb_wait_write() once it's called, so the deadlock is permanent.
>

Of course.

> > > Side note: I don't see that it has anything to do with copy-up, but I
> > > may be missing something.
> > >
> >
> > You are right. I was confusing your "ovl inode lock" with ovl_inode_lock(),
> > but the latter is taken after upper freeze lock, so irrelevant.
> >
> > > My other thought is that perhaps sb_start_write() should invoke
> > > s_ops->start_write() so that overlay can do the freeze protection on
> > > the upper early.
> > >
> >
> > Sorry, I don't see how that solves anything expect for the lockdep
> > warning. In both cases threads A and B would block until upper
> > in unfrozen, only without a lockdep warning.
> > Also, I am quite sure that taking upper freeze lock early will generate
> > many new lockdep warnings.
>
> My thinking was to make the lock order:
>
>   ovl freeze lock -> upper freeze lock -> ovl inode lock -> upper inode lock
>

That sounds reasonable for the common cases,
but limits us in doing things like update origin/impure xattr on lookup
and remove impure xattr on readdir.
I guess these best effort cases could use upper freeze trylock?
I sense there may be other issues lurking with page cache implementation..

>
> > Anyway, what about the recursive do_splice_direct() issue
> > with lazy copy up, do you have an easy solution for that?
>
> Not sure.  I think splice_direct_to_actor() should be able to deal
> with it.  E.g.
>
>     pipe = current->splice_pipe;
>     if (unlikely(!pipe)) {
>         pipe = alloc_pipe_info();
>         ...
>     } else {
>         current->splice_pipe = NULL;
>     }
>
>     ... do the actual splicing ...
>
>     if (!current->splice_pipe) {
>         current->splice_pipe = pipe
>     } else {
>         free_pipe_info(pipe);
>     }
>

OK, I wonder how to make lockdep aware of that nesting.
limit nesting level to FILESYSTEM_MAX_STACK_DEPTH?...

Thanks,
Amir.

  reply	other threads:[~2019-02-11 17:04 UTC|newest]

Thread overview: 13+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-02-10 18:23 possible deadlock in pipe_lock (2) syzbot
2019-02-11  7:38 ` Amir Goldstein
2019-02-11 12:06   ` Miklos Szeredi
2019-02-11 12:37     ` Miklos Szeredi
2019-02-11 13:08       ` Amir Goldstein
2019-02-11 14:33         ` Miklos Szeredi
2019-02-11 15:06           ` Amir Goldstein
2019-02-11 15:40             ` Miklos Szeredi
2019-02-11 17:04               ` Amir Goldstein [this message]
2019-02-12 11:14           ` Jan Kara
2019-02-12 13:39             ` Amir Goldstein
2019-02-12 14:19               ` Miklos Szeredi
2019-02-12 16:11               ` Jan Kara

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=CAOQ4uxiby9B8X5zOTRqTtg_oNX2ageiY5ee_GhQovwnqpHUyEA@mail.gmail.com \
    --to=amir73il@gmail.com \
    --cc=jack@suse.cz \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-unionfs@vger.kernel.org \
    --cc=miklos@szeredi.hu \
    --cc=syzbot+31d8b84465a7cbfd8515@syzkaller.appspotmail.com \
    --cc=syzkaller-bugs@googlegroups.com \
    --cc=viro@zeniv.linux.org.uk \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.