From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <linux-fsdevel-owner@vger.kernel.org>
Received: from mail-oi0-f65.google.com ([209.85.218.65]:46780 "EHLO
        mail-oi0-f65.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
        with ESMTP id S2388076AbeGWOGs (ORCPT
        <rfc822;linux-fsdevel@vger.kernel.org>);
        Mon, 23 Jul 2018 10:06:48 -0400
Received: by mail-oi0-f65.google.com with SMTP id y207-v6so917424oie.13
        for <linux-fsdevel@vger.kernel.org>; Mon, 23 Jul 2018 06:05:39 -0700 (PDT)
MIME-Version: 1.0
In-Reply-To: <CACT4Y+bSnJjtgeLdusj6czbH8080XfRs2b8L0V4R0TAixqxX6Q@mail.gmail.com>
References: <000000000000bc17b60571a60434@google.com> <CACT4Y+bKU8f4jVENYHX=fzNVd95A4vce2F=UCV12paVNFv-LNg@mail.gmail.com>
 <CAJfpegsKWGZ4LVeQXrrCr47+Bch4yfOWcWMFSniQsRzjRof=RQ@mail.gmail.com>
 <CACT4Y+ZbRi=0kRiR-j-SkngsB_QuALfnOX5nGF4agQD-weFsew@mail.gmail.com>
 <CAJfpegs0by5OJ7iqtg6L3T1w2RrFRiU6yRufVNbt=tNpJCbf2A@mail.gmail.com> <CACT4Y+bSnJjtgeLdusj6czbH8080XfRs2b8L0V4R0TAixqxX6Q@mail.gmail.com>
From: Miklos Szeredi <miklos@szeredi.hu>
Date: Mon, 23 Jul 2018 15:05:38 +0200
Message-ID: <CAJfpegvYHYUMdo14J_of10rmg0krjd_eiATJ3D4+XqaNys9bqQ@mail.gmail.com>
Subject: Re: INFO: task hung in fuse_reverse_inval_entry
To: Dmitry Vyukov <dvyukov@google.com>
Cc: linux-fsdevel <linux-fsdevel@vger.kernel.org>,
        LKML <linux-kernel@vger.kernel.org>,
        syzkaller-bugs <syzkaller-bugs@googlegroups.com>,
        syzbot <syzbot+bb6d800770577a083f8c@syzkaller.appspotmail.com>
Content-Type: text/plain; charset="UTF-8"
Sender: linux-fsdevel-owner@vger.kernel.org
List-ID: <linux-fsdevel.vger.kernel.org>

On Mon, Jul 23, 2018 at 2:46 PM, Dmitry Vyukov <dvyukov@google.com> wrote:
> On Mon, Jul 23, 2018 at 2:33 PM, Miklos Szeredi <miklos@szeredi.hu> wrote:
>>>>> On Mon, Jul 23, 2018 at 9:59 AM, syzbot
>>>>> <syzbot+bb6d800770577a083f8c@syzkaller.appspotmail.com> wrote:
>>>>>> Hello,
>>>>>>
>>>>>> syzbot found the following crash on:
>>>>>>
>>>>>> HEAD commit:    d72e90f33aa4 Linux 4.18-rc6
>>>>>> git tree:       upstream
>>>>>> console output: https://syzkaller.appspot.com/x/log.txt?x=1324f794400000
>>>>>> kernel config:  https://syzkaller.appspot.com/x/.config?x=68af3495408deac5
>>>>>> dashboard link: https://syzkaller.appspot.com/bug?extid=bb6d800770577a083f8c
>>>>>> compiler:       gcc (GCC) 8.0.1 20180413 (experimental)
>>>>>> syzkaller repro:https://syzkaller.appspot.com/x/repro.syz?x=11564d1c400000
>>>>>> C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=16fc570c400000
>>>>>
>>>>>
>>>>> Hi fuse maintainers,
>>>>>
>>>>> We are seeing a bunch of such deadlocks in fuse on syzbot. As far as I
>>>>> understand this is mostly working-as-intended (parts about deadlocks
>>>>> in Documentation/filesystems/fuse.txt). The intended way to resolve
>>>>> this is aborting connections via fusectl, right?
>>>>
>>>> Yes.  Alternative is with "umount -f".
>>>>
>>>>> The doc says "Under
>>>>> the fuse control filesystem each connection has a directory named by a
>>>>> unique number". The question is: if I start a process and this process
>>>>> can mount fuse, how do I kill it? I mean: totally and certainly get
>>>>> rid of it right away? How do I find these unique numbers for the
>>>>> mounts it created?
>>>>
>>>> It is the device number found in st_dev for the mount.  Other than
>>>> doing stat(2) it is possible to find out the device number by reading
>>>> /proc/$PID/mountinfo  (third field).
>>>
>>> Thanks. I will try to figure out fusectl connection numbers and see if
>>> it's possible to integrate aborting into syzkaller.
>>>
>>>>> Taking into account that there is usually no
>>>>> operator attached to each server, I wonder if kernel could somehow
>>>>> auto-abort fuse on kill?
>>>>
>>>> Depends on what the fuse server is sleeping on.   If it's trying to
>>>> acquire an inode lock (e.g. unlink(2)), which is the classical way to
>>>> deadlock a fuse filesystem, then it will go into an uninterruptible
>>>> sleep.  There's no way in which that process can be killed except to
>>>> force a release of the offending lock, which can only be done by
>>>> aborting the request that is being performed while holding that lock.
>>>
>>> I understand that it is not killed today, but I am asking if we can
>>> make it killable. It's all code that we can change, and if a human
>>> operator can do it, it can be done purely programmatically on kill too,
>>> right?
>>
>> Hmm, you mean if a process is in an uninterruptible sleep trying to
>> acquire a lock on a fuse filesystem and is killed, then the fuse
>> filesystem should be aborted?
>>
>> Even if we'd manage to implement that, it's a large backward
>> incompatibility risk.
>>
>> I don't dispute that it can be done, but I would definitely question
>> *whether* it should be done.
>
>
> I understand that we should abort only if we are sure that it's
> actually deadlocked and there is no other way.
> So if a fuse-user process is blocked on a fuse lock, then we probably
> should do nothing. However, if the fuse-server is killed, then perhaps
> we could abort the connection at that point. Namely, if a process that
> has a fuse fd open is killed and it is the only process that shared
> this fd, then we could abort the connection on arrival of the kill
> signal (rather than wait until all its threads finish and then start
> closing all fd's, this is where we get the deadlock -- some of its
> threads won't finish). I don't know if such synchronous kill hook is
> available, though. If several processes shared the same fuse fd, then
> we could close the fd in each process on SIGKILL arrival, then when
> all of these processes are killed, fuse fd will be closed and we can
> abort the connection, which will un-deadlock all of these processes.
> Does this look at all reasonable?

Biggest conceptual problem: your definition of fuse-server is weak.
Take the following example: process A is holding the fuse device fd
and is forwarding requests and replies to/from process B via a pipe.
So basically A is just a proxy that does nothing interesting, the
"real" server is B.  But according to your definition B is not a
server, only A is.

And this is just a simple example; parts of the server might be on
different machines, etc.  It's impossible to automatically detect
whether a process is acting as a fuse server or not.

We could let the fuse server itself notify the kernel that it's a fuse
server.  That might help in the cases where the deadlock is
accidental, but obviously not in the case when done by a malicious
agent.  I'm not sure it's worth the effort.  Also I have no idea how
the respective maintainers would take the idea of "kill hooks"...  It
would probably be a lot of work for little gain.
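
For the userspace abort route discussed earlier in the thread, here is
a rough sketch of the lookup: read the third field of mountinfo for
each fuse mount and turn it into the path of the fusectl "abort" file.
The function names are made up for illustration.  Using the minor
number as the connection directory name, skipping "fuseblk" mounts,
and having the control filesystem mounted at /sys/fs/fuse/connections
are assumptions of the sketch, not guarantees.

```python
def fuse_connection_ids(mountinfo_text):
    """Map mount point -> assumed fusectl connection number.

    Expects text in the format of /proc/$PID/mountinfo.
    """
    conns = {}
    for line in mountinfo_text.splitlines():
        fields = line.split()
        # Optional fields end with a lone "-"; the fs type follows it.
        if "-" not in fields[6:]:
            continue
        sep = fields.index("-", 6)
        fstype = fields[sep + 1]
        # Plain fuse mounts only ("fuse" or "fuse.<subtype>").
        # fusectl itself and fuseblk are deliberately skipped here.
        if fstype != "fuse" and not fstype.startswith("fuse."):
            continue
        # Third field is the device number as "major:minor".
        _major, minor = fields[2].split(":")
        # Assumption: the connection directory is named by the minor.
        conns[fields[4]] = int(minor)
    return conns


def abort_path(conn_id, ctl_root="/sys/fs/fuse/connections"):
    """Path of the file to write to in order to abort a connection."""
    return f"{ctl_root}/{conn_id}/abort"
```

A test harness could feed the contents of /proc/$PID/mountinfo for the
process it started into fuse_connection_ids() and then write anything
into abort_path(id) for each result.  This only helps when the harness
can still read mountinfo and has write permission on the abort file.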

Thanks,
Miklos