Linux-Fsdevel Archive on lore.kernel.org
 help / color / Atom feed
From: Michael Stapelberg <michael+lkml@stapelberg.ch>
To: Miklos Szeredi <miklos@szeredi.hu>
Cc: fuse-devel <fuse-devel@lists.sourceforge.net>,
	Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	Kyle Sanderson <kyle.leet@gmail.com>,
	linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: Still a pretty bad time on 5.4.6 with fuse_request_end.
Date: Wed, 12 Feb 2020 08:58:47 +0100
Message-ID: <CANnVG6kYh6M30mwBHcGeFf=fhqKmWKPeUj2GYbvNgtq0hm=gXQ@mail.gmail.com> (raw)
In-Reply-To: <CAJfpegv4iL=bW3TXP3F9w1z6-LUox8KiBmw7UBcWE-0jiK0YsA@mail.gmail.com>

Sure: here’s a backtrace and req->args printed:

(gdb) bt full
#0  0xffffffff82000ff3 in __x86_indirect_thunk_r8 () at
arch/x86/lib/retpoline.S:40
No locals.
#1  0xffffffff8137ec68 in fuse_request_end (fc=0xffff88813a57be00,
req=0xffff88813a5a3770) at fs/fuse/dev.c:329
        fiq = 0xffff88813a57be48
        async = true
#2  0xffffffff81382d88 in fuse_dev_do_write (fud=0xffff88813a57be00,
cs=0xffffc90002fafa00, nbytes=4294967294) at fs/fuse/dev.c:1912
        err = 0
        fc = 0xffff88813a57be00
        fpq = 0xffff888132815f48
        req = 0xffff88813a5a3770
        oh = {
          len = 16,
          error = -2,
          unique = 66942
        }
#3  0xffffffff81382e69 in fuse_dev_write (iocb=0xffffc900008cbe48,
from=0xffffc900008cbe20) at fs/fuse/dev.c:1934
        cs = {
          write = 0,
          req = 0xffff88813a5a3770,
          iter = 0xffffc900008cbe20,
          pipebufs = 0x0 <fixed_percpu_data>,
          currbuf = 0x0 <fixed_percpu_data>,
          pipe = 0x0 <fixed_percpu_data>,
          nr_segs = 0,
          pg = 0x0 <fixed_percpu_data>,
          len = 0,
          offset = 24,
          move_pages = 0
        }
        fud = 0xffff888132815f40
#4  0xffffffff8120122e in call_write_iter (file=<optimized out>,
iter=<optimized out>, kio=<optimized out>) at
./include/linux/fs.h:1901
No locals.
#5  new_sync_write (filp=0xffff888119886b00, buf=<optimized out>,
len=<optimized out>, ppos=0xffffc900008cbee8) at fs/read_write.c:483
        iov = {
          iov_base = 0xc0008ec008,
          iov_len = 16
        }
        kiocb = {
          ki_filp = 0xffff888119886b00,
          ki_pos = 0,
          ki_complete = 0x0 <fixed_percpu_data>,
          private = 0x0 <fixed_percpu_data>,
          ki_flags = 0,
          ki_hint = 0,
          ki_ioprio = 0,
          ki_cookie = 0
        }
        iter = {
          type = 5,
          iov_offset = 0,
          count = 0,
          {
            iov = 0xffffc900008cbe20,
            kvec = 0xffffc900008cbe20,
            bvec = 0xffffc900008cbe20,
            pipe = 0xffffc900008cbe20
          },
          {
            nr_segs = 0,
            {
              head = 0,
              start_head = 0
            }
          }
        }
        ret = <optimized out>
#6  0xffffffff812012e4 in __vfs_write (file=<optimized out>,
p=<optimized out>, count=<optimized out>, pos=<optimized out>) at
fs/read_write.c:496
No locals.
#7  0xffffffff81203f04 in vfs_write (pos=<optimized out>, count=16,
buf=<optimized out>, file=<optimized out>) at fs/read_write.c:558
        ret = 16
        ret = <optimized out>
#8  vfs_write (file=0xffff888119886b00, buf=0xc0008ec008 "\020",
count=16, pos=0xffffc900008cbee8) at fs/read_write.c:542
        ret = 16
#9  0xffffffff812041b2 in ksys_write (fd=<optimized out>,
buf=0xc0008ec008 "\020", count=16) at fs/read_write.c:611
        pos = 0
        ppos = <optimized out>
        f = <optimized out>
        ret = 824643076104
#10 0xffffffff81204245 in __do_sys_write (count=<optimized out>,
buf=<optimized out>, fd=<optimized out>) at fs/read_write.c:623
No locals.
#11 __se_sys_write (count=<optimized out>, buf=<optimized out>,
fd=<optimized out>) at fs/read_write.c:620
        ret = <optimized out>
        ret = <optimized out>
#12 __x64_sys_write (regs=<optimized out>) at fs/read_write.c:620
No locals.
#13 0xffffffff810028a8 in do_syscall_64 (nr=<optimized out>,
regs=0xffffc900008cbf58) at arch/x86/entry/common.c:294
        ti = <optimized out>
#14 0xffffffff81e0007c in entry_SYSCALL_64 () at arch/x86/entry/entry_64.S:175
No locals.
#15 0x0000000000000000 in ?? ()
No symbol table info available.

(gdb) p *req->args
$5 = {
  nodeid = 18446683600620026424,
  opcode = 2167928246,
  in_numargs = 65535,
  out_numargs = 65535,
  force = false,
  noreply = false,
  nocreds = false,
  in_pages = false,
  out_pages = false,
  out_argvar = true,
  page_zeroing = true,
  page_replace = false,
  in_args = {{
      size = 978828800,
      value = 0x2fafce0
    }, {
      size = 978992728,
      value = 0xffffffff8138efaa <fuse_alloc_forget+26>
    }, {
      size = 50002688,
      value = 0xffffffff8138635f <fuse_lookup_name+255>
    }},
  out_args = {{
      size = 570,
      value = 0xffffc90002fafb10
    }, {
      size = 6876,
      value = 0x3000000001adc
    }},
  end = 0x1000100000001
}

Independently, as a separate test, I have also modified the source like this:

bool async;
bool async_early = req->args->end;

if (test_and_set_bit(FR_FINISHED, &req->flags))
goto put_request;

async = req->args->end;

…and printed the value of async and async_early. async is true,
async_early is false.

Perhaps some other routine is modifying the request, and checking
req->args->end early enough masks that bug?

On Tue, Feb 11, 2020 at 11:55 AM Miklos Szeredi <miklos@szeredi.hu> wrote:
>
> On Sun, Feb 9, 2020 at 9:09 AM <michael+lkml@stapelberg.ch> wrote:
> >
> > From: Michael Stapelberg <michael+lkml@stapelberg.ch>
> >
> > Hey,
> >
> > I recently ran into this, too. The symptom for me is that processes using the
> > affected FUSE file system hang indefinitely, sync(2) system calls hang
> > indefinitely, and even triggering an abort via echo 1 >
> > /sys/fs/fuse/connections/*/abort does not get the file system unstuck (there is
> > always 1 request still pending). Only removing power will get the machine
> > unstuck.
> >
> > I’m triggering this when building packages for https://distr1.org/, which uses a
> > FUSE daemon (written in Go using the jacobsa/fuse package) to provide package
> > contents.
> >
> > I bisected the issue to commit
> > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=2b319d1f6f92a4ced9897678113d176ee16ae85d
> >
> > With that commit, I run into a kernel oops within ≈1 minute after starting my
> > batch build. With the commit before, I can batch build for many minutes without
> > issues.
>
> Pretty weird.   I'm not seeing how this could change behavior, as the
> args->end value is not changed after being initialized, and so moving
> the test later should not make a difference.
>
> Could you print out the complete contents of req->args?
>
> Thanks,
> Miklos

  reply index

Thread overview: 15+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-01-05  1:15 Kyle Sanderson
2020-01-05  8:08 ` Greg KH
2020-01-16 11:06 ` Miklos Szeredi
2020-02-09  8:09   ` michael+lkml
2020-02-11 10:55     ` Miklos Szeredi
2020-02-12  7:58       ` Michael Stapelberg [this message]
2020-02-12  8:34         ` Miklos Szeredi
2020-02-12  9:38           ` Michael Stapelberg
2020-02-12 19:36             ` Miklos Szeredi
2020-02-13  7:23               ` Michael Stapelberg
2020-03-10 18:07               ` Vivek Goyal
2020-02-22 19:29 Kyle Sanderson
2020-02-23  9:58 ` Greg KH
2020-03-06  9:31 Bruno Thomsen
2020-03-09 19:03 ` Greg KH

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='CANnVG6kYh6M30mwBHcGeFf=fhqKmWKPeUj2GYbvNgtq0hm=gXQ@mail.gmail.com' \
    --to=michael+lkml@stapelberg.ch \
    --cc=fuse-devel@lists.sourceforge.net \
    --cc=gregkh@linuxfoundation.org \
    --cc=kyle.leet@gmail.com \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=miklos@szeredi.hu \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Linux-Fsdevel Archive on lore.kernel.org

Archives are clonable:
	git clone --mirror https://lore.kernel.org/linux-fsdevel/0 linux-fsdevel/git/0.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 linux-fsdevel linux-fsdevel/ https://lore.kernel.org/linux-fsdevel \
		linux-fsdevel@vger.kernel.org
	public-inbox-index linux-fsdevel

Example config snippet for mirrors

Newsgroup available over NNTP:
	nntp://nntp.lore.kernel.org/org.kernel.vger.linux-fsdevel


AGPL code for this site: git clone https://public-inbox.org/public-inbox.git