linux-fsdevel.vger.kernel.org archive mirror
* Reading from fuse pipe fails with EBADFD
@ 2018-11-27 20:26 Nikolaus Rath
  2018-12-04  9:03 ` Nikolaus Rath
  0 siblings, 1 reply; 7+ messages in thread
From: Nikolaus Rath @ 2018-11-27 20:26 UTC (permalink / raw)
  To: fuse-devel, Miklos Szeredi, linux-fsdevel

Hi,

When testing FUSE under heavy load, I am occasionally getting EBADFD
errors when reading from the fuse pipe.

Does anyone have an idea what might cause this, and how to debug it
further?

Unfortunately I can't tell if this happens upon read() or upon
splice(). I've extended the code now, so the next time it happens I will
be able to tell.

I am testing with kernel 4.15 from Ubuntu Bionic.

So far this has happened twice in about 3 hours of testing. For the
testing, I am repeatedly running the same 3 workloads, without
remounting. Each workload takes ~ 15-30 seconds.

Best,
-Nikolaus

-- 
GPG Fingerprint: ED31 791B 2C5C 1613 AF38 8B8A D113 FCAC 3C4E 599F

             »Time flies like an arrow, fruit flies like a Banana.«


* Re: Reading from fuse pipe fails with EBADFD
  2018-11-27 20:26 Reading from fuse pipe fails with EBADFD Nikolaus Rath
@ 2018-12-04  9:03 ` Nikolaus Rath
  2018-12-04  9:10   ` Miklos Szeredi
  0 siblings, 1 reply; 7+ messages in thread
From: Nikolaus Rath @ 2018-12-04  9:03 UTC (permalink / raw)
  To: fuse-devel, Miklos Szeredi, linux-fsdevel, Sahitya Tummala,
	David Sheets, Tahsin Erdogan, Al Viro

Hi,

Does really no one have any suggestions for debugging this?

(Adding some more people who recently worked on fs/fuse)

Best,
-Nikolaus

On Nov 27 2018, Nikolaus Rath <Nikolaus@rath.org> wrote:
> Hi,
>
> When testing FUSE under heavy load, I am occasionally getting EBADFD
> errors when reading from the fuse pipe.
>
> Does anyone have an idea what might cause this, and how to debug it
> further?
>
> Unfortunately I can't tell if this happens upon read() or upon
> splice(). I've extended the code now, so the next time it happens I will
> be able to tell.
>
> I am testing with kernel 4.15 from Ubuntu Bionic.
>
> So far this has happened twice in about 3 hours of testing. For the
> testing, I am repeatedly running the same 3 workloads, without
> remounting. Each workload takes ~ 15-30 seconds.
>
> Best,
> -Nikolaus
>
> -- 
> GPG Fingerprint: ED31 791B 2C5C 1613 AF38 8B8A D113 FCAC 3C4E 599F
>
>              »Time flies like an arrow, fruit flies like a Banana.«


-- 
GPG Fingerprint: ED31 791B 2C5C 1613 AF38 8B8A D113 FCAC 3C4E 599F

             »Time flies like an arrow, fruit flies like a Banana.«


* Re: Reading from fuse pipe fails with EBADFD
  2018-12-04  9:03 ` Nikolaus Rath
@ 2018-12-04  9:10   ` Miklos Szeredi
  2018-12-04 19:02     ` [fuse-devel] " Nikolaus Rath
  0 siblings, 1 reply; 7+ messages in thread
From: Miklos Szeredi @ 2018-12-04  9:10 UTC (permalink / raw)
  To: fuse-devel, linux-fsdevel, stummala, david.sheets, tahsin, viro

On Tue, Dec 4, 2018 at 10:03 AM Nikolaus Rath <Nikolaus@rath.org> wrote:
>
> Hi,
>
> Does really no one have any suggestions for debugging this?
>
> (Adding some more people who recently worked on fs/fuse)
>
> Best,
> -Nikolaus
>
> On Nov 27 2018, Nikolaus Rath <Nikolaus@rath.org> wrote:
> > Hi,
> >
> > When testing FUSE under heavy load, I am occasionally getting EBADFD
> > errors when reading from the fuse pipe.

EBADFD or EBADF?

Thanks,
Miklos


* Re: [fuse-devel] Reading from fuse pipe fails with EBADFD
  2018-12-04  9:10   ` Miklos Szeredi
@ 2018-12-04 19:02     ` Nikolaus Rath
  2018-12-10  9:19       ` Miklos Szeredi
  0 siblings, 1 reply; 7+ messages in thread
From: Nikolaus Rath @ 2018-12-04 19:02 UTC (permalink / raw)
  To: Miklos Szeredi
  Cc: fuse-devel, linux-fsdevel, stummala, david.sheets, tahsin, viro

On Dec 04 2018, Miklos Szeredi <mszeredi@redhat.com> wrote:
> On Tue, Dec 4, 2018 at 10:03 AM Nikolaus Rath <Nikolaus@rath.org> wrote:
>>
>> Hi,
>>
>> Does really no one have any suggestions for debugging this?
>>
>> (Adding some more people who recently worked on fs/fuse)
>>
>> Best,
>> -Nikolaus
>>
>> On Nov 27 2018, Nikolaus Rath <Nikolaus@rath.org> wrote:
>> > Hi,
>> >
>> > When testing FUSE under heavy load, I am occasionally getting EBADFD
>> > errors when reading from the fuse pipe.
>
> EBADFD or EBADF?

EBADF ("Bad file descriptor"), sorry.


Best,
-Nikolaus

-- 
GPG Fingerprint: ED31 791B 2C5C 1613 AF38 8B8A D113 FCAC 3C4E 599F

             »Time flies like an arrow, fruit flies like a Banana.«


* Re: [fuse-devel] Reading from fuse pipe fails with EBADFD
  2018-12-04 19:02     ` [fuse-devel] " Nikolaus Rath
@ 2018-12-10  9:19       ` Miklos Szeredi
  2018-12-26 22:30         ` Nikolaus Rath
  0 siblings, 1 reply; 7+ messages in thread
From: Miklos Szeredi @ 2018-12-10  9:19 UTC (permalink / raw)
  To: Miklos Szeredi, fuse-devel, linux-fsdevel, Sahitya Tummala,
	David Sheets, Tahsin Erdogan, Al Viro

On Tue, Dec 4, 2018 at 8:02 PM Nikolaus Rath <Nikolaus@rath.org> wrote:
>
> On Dec 04 2018, Miklos Szeredi <mszeredi@redhat.com> wrote:
> > On Tue, Dec 4, 2018 at 10:03 AM Nikolaus Rath <Nikolaus@rath.org> wrote:
> >>
> >> Hi,
> >>
> >> Does really no one have any suggestions for debugging this?
> >>
> >> (Adding some more people who recently worked on fs/fuse)
> >>
> >> Best,
> >> -Nikolaus
> >>
> >> On Nov 27 2018, Nikolaus Rath <Nikolaus@rath.org> wrote:
> >> > Hi,
> >> >
> >> > When testing FUSE under heavy load, I am occasionally getting EBADFD
> >> > errors when reading from the fuse pipe.
> >
> > EBADFD or EBADF?
>
> EBADF ("Bad file descriptor"), sorry.

Can you run the thing with "strace -f  ..."?

Thanks,
Miklos


* Re: [fuse-devel] Reading from fuse pipe fails with EBADFD
  2018-12-10  9:19       ` Miklos Szeredi
@ 2018-12-26 22:30         ` Nikolaus Rath
  2018-12-31 10:56           ` Nikolaus Rath
  0 siblings, 1 reply; 7+ messages in thread
From: Nikolaus Rath @ 2018-12-26 22:30 UTC (permalink / raw)
  To: Miklos Szeredi
  Cc: Miklos Szeredi, fuse-devel, linux-fsdevel, Sahitya Tummala,
	David Sheets, Tahsin Erdogan, Al Viro

On Dec 10 2018, Miklos Szeredi <miklos@szeredi.hu> wrote:
> On Tue, Dec 4, 2018 at 8:02 PM Nikolaus Rath <Nikolaus@rath.org> wrote:
>>
>> On Dec 04 2018, Miklos Szeredi <mszeredi@redhat.com> wrote:
>> > On Tue, Dec 4, 2018 at 10:03 AM Nikolaus Rath <Nikolaus@rath.org> wrote:
>> >>
>> >> Hi,
>> >>
>> >> Does really no one have any suggestions for debugging this?
>> >>
>> >> (Adding some more people who recently worked on fs/fuse)
>> >>
>> >> Best,
>> >> -Nikolaus
>> >>
>> >> On Nov 27 2018, Nikolaus Rath <Nikolaus@rath.org> wrote:
>> >> > Hi,
>> >> >
>> >> > When testing FUSE under heavy load, I am occasionally getting EBADFD
>> >> > errors when reading from the fuse pipe.
>> >
>> > EBADFD or EBADF?
>>
>> EBADF ("Bad file descriptor"), sorry.
>
> Can you run the thing with "strace -f  ..."?

Apologies for the delayed response. I have been trying to reproduce this
but have instead run into another problem: the *client* getting spurious
EBADF warnings. I am not sure if this is related or unrelated, and it
has been hard to debug because it does not happen under strace:

$ find mnt > /dev/null
find: ‘mnt/modules/4.18.0-0.bpo.3-amd64/kernel/net/8021q’: Bad file descriptor
find: ‘mnt/modules/4.18.0-0.bpo.3-amd64/kernel/drivers/net/ethernet/natsemi’: Bad file descriptor

This happens in roughly 1 in 2 runs, and the affected directory entries
are always different. Disabling readdirplus does not make a difference.

$ strace -o log find mnt > /dev/null

never gave an error in roughly 20 attempts.

Similarly, enabling fuse debug logging also makes the problem go away.

I also found some odd warnings in dmesg:

[24472.435256] fuse: trying to steal weird page
[24472.435261]   page=00000000b5b89670 index=0 flags=17fffc0000000ad, count=1, mapcount=0, map

..happens a lot (and I just wrote another email about it), and 

[24473.170110] VFS: Lookup of '4.18.0-0.bpo.1-amd64' in fuse fuse would have caused loop

happened only a few times (and the `fuse` string is indeed
duplicated). When this happens, `find` complains that:

$ find mnt > /dev/null
find: File system loop detected; ‘mnt/modules/4.18.0-0.bpo.1-amd64/kernel/drivers/w1’ is part of the same file system loop as ‘mnt/modules’


Does this ring any bells with anyone?


Best,
-Nikolaus

-- 
GPG Fingerprint: ED31 791B 2C5C 1613 AF38 8B8A D113 FCAC 3C4E 599F

             »Time flies like an arrow, fruit flies like a Banana.«


* Re: [fuse-devel] Reading from fuse pipe fails with EBADFD
  2018-12-26 22:30         ` Nikolaus Rath
@ 2018-12-31 10:56           ` Nikolaus Rath
  0 siblings, 0 replies; 7+ messages in thread
From: Nikolaus Rath @ 2018-12-31 10:56 UTC (permalink / raw)
  To: Miklos Szeredi
  Cc: Miklos Szeredi, fuse-devel, linux-fsdevel, Sahitya Tummala, Al Viro

On Dec 26 2018, Nikolaus Rath <Nikolaus@rath.org> wrote:
>>> >> > When testing FUSE under heavy load, I am occasionally getting EBADFD
>>> >> > errors when reading from the fuse pipe.
>>> >
>>> > EBADFD or EBADF?
>>>
>>> EBADF ("Bad file descriptor"), sorry.
>>
>> Can you run the thing with "strace -f  ..."?
>
> Apologies for the delayed response. I have been trying to reproduce this
> but have instead run into another problem: the *client* getting spurious
> EBADF warnings. I am not sure if this is related or unrelated, and it
> has been hard to debug because it does not happen under strace:
[..]

I believe I have figured it out. I was mistakenly assuming that the bad
file descriptor was the fuse pipe - but it was actually the target file
descriptor (I managed to completely forget that splice works on two file
descriptors).
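
To illustrate the point about splice() taking two descriptors, here is a
minimal sketch. It is not taken from the actual filesystem code, the
function name is made up, and an ordinary pipe stands in for the fuse
pipe. The source descriptor is perfectly valid, yet splice() should still
fail with EBADF because the target descriptor has already been closed:

```c
#define _GNU_SOURCE            /* needed for splice() */
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: returns the errno reported by splice() when the
 * source pipe is valid but the target descriptor is stale, 0 if splice()
 * unexpectedly succeeds, or -1 if the setup itself failed. */
int splice_errno_with_closed_target(void)
{
    int src[2], dst[2];

    if (pipe(src) != 0)
        return -1;
    if (pipe(dst) != 0) {
        close(src[0]);
        close(src[1]);
        return -1;
    }

    /* Put one byte into the source pipe so there is something to move. */
    if (write(src[1], "x", 1) != 1) {
        close(src[0]); close(src[1]);
        close(dst[0]); close(dst[1]);
        return -1;
    }

    /* Keep the number around, but close the target descriptor. */
    int stale_target = dst[1];
    close(dst[1]);

    errno = 0;
    ssize_t n = splice(src[0], NULL, stale_target, NULL, 1, 0);
    int err = (n < 0) ? errno : 0;

    close(src[0]);
    close(src[1]);
    close(dst[0]);
    return err;
}
```

Calling splice_errno_with_closed_target() should return EBADF, which on
its own says nothing about which of the two descriptors was the bad one.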

The root cause was mistakenly closing a file descriptor twice and not
checking the return value. Because of the missing return value check,
the error went unnoticed almost all the time - except when a different
thread managed to re-use the fd for different purposes between the first
and second close.
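
The sequence can be replayed in a single thread. The following is only a
sketch under stated assumptions, not the real code: both function names
are hypothetical, /dev/null stands in for the real target, and the "other
thread" is simulated by an open() between the two close() calls. It relies
on open() returning the lowest free descriptor number, so the new
descriptor gets the number that was just released:

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: close *fd, return close()'s result, and set *fd
 * to -1 so that a second call is reported instead of silently closing
 * whatever now owns that number. */
int close_checked(int *fd)
{
    if (*fd < 0) {
        errno = EBADF;
        return -1;
    }
    int rc = close(*fd);
    *fd = -1;
    return rc;
}

/* Replays the double close. Returns the errno seen by the victim's
 * write(), 0 if the write succeeded, or -1 if the scenario could not
 * be set up. */
int replay_double_close(void)
{
    int a = open("/dev/null", O_WRONLY);
    if (a < 0)
        return -1;
    close(a);                              /* first close: legitimate */

    /* "Other thread": the number that was just freed is handed out again. */
    int b = open("/dev/null", O_WRONLY);
    if (b < 0)
        return -1;
    if (b != a) {                          /* no reuse, nothing to show */
        close(b);
        return -1;
    }

    close(a);            /* second close, unchecked: it succeeds and
                          * closes the other user's descriptor b */

    errno = 0;
    ssize_t n = write(b, "x", 1);          /* victim now sees EBADF */
    return (n < 0) ? errno : 0;
}
```

Without the intervening open(), the second close() would simply return -1
with EBADF, which is the case that went unnoticed because the return
value was not checked. A wrapper along the lines of close_checked() makes
that case visible and avoids touching a reused number.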

Apologies for taking everyone's time for what was actually a bug in the
filesystem itself, not in the kernel.


Best,
-Nikolaus

-- 
GPG Fingerprint: ED31 791B 2C5C 1613 AF38 8B8A D113 FCAC 3C4E 599F

             »Time flies like an arrow, fruit flies like a Banana.«

