linux-fsdevel.vger.kernel.org archive mirror
* fuse: incorrect attribute caching with writeback cache disabled
From: Frank Dinoff @ 2022-08-11 21:05 UTC (permalink / raw)
  To: Miklos Szeredi, linux-fsdevel

I have a binary running on a fuse filesystem which is generating a zip file. I
don't know what syscalls are involved since the binary segfaults when run with
strace.

After doing a binary search,
https://github.com/torvalds/linux/commit/fa5eee57e33e79b71b40e6950c29cc46f5cc5cb7
is the commit that seems to have introduced the error. It still seems to be
failing with a much newer kernel.

Reverting the fuse_invalidate_attr_mask in fuse_perform_write to
fuse_invalidate_attr makes every other run of the binary produce the correct
output.

I found that enabling the writeback cache makes the binary always produce the
right output. Running the fuse daemon in single threaded mode also works.

Is there anything that sticks out to you that is wrong with the above commit?

Thanks,
Frank Dinoff


* Re: fuse: incorrect attribute caching with writeback cache disabled
From: Miklos Szeredi @ 2022-08-12  9:33 UTC (permalink / raw)
  To: Frank Dinoff; +Cc: linux-fsdevel

On Thu, 11 Aug 2022 at 23:05, Frank Dinoff <fdinoff@google.com> wrote:
>
> I have a binary running on a fuse filesystem which is generating a zip file. I
> don't know what syscalls are involved since the binary segfaults when run with
> strace.

You could strace the fuse filesystem.

> After doing a binary search,
> https://github.com/torvalds/linux/commit/fa5eee57e33e79b71b40e6950c29cc46f5cc5cb7
> is the commit that seems to have introduced the error. It still seems to be
> failing with a much newer kernel.

How is it failing?

> Reverting the fuse_invalidate_attr_mask in fuse_perform_write to
> fuse_invalidate_attr makes every other run of the binary produce the correct
> output.

What do you mean?  Is it succeeding half the time?

>
> I found that enabling the writeback cache makes the binary always produce the
> right output. Running the fuse daemon in single threaded mode also works.
>
> Is there anything that sticks out to you that is wrong with the above commit?

Could you try adding STATX_MODE to the invalidated mask?   Can't
imagine any other attribute being relevant.

Thanks,
Miklos


* Re: fuse: incorrect attribute caching with writeback cache disabled
From: Frank Dinoff @ 2022-08-12 22:58 UTC (permalink / raw)
  To: Miklos Szeredi; +Cc: linux-fsdevel

On Fri, Aug 12, 2022 at 5:33 AM Miklos Szeredi <miklos@szeredi.hu> wrote:
>
> On Thu, 11 Aug 2022 at 23:05, Frank Dinoff <fdinoff@google.com> wrote:
> >
> > I have a binary running on a fuse filesystem which is generating a zip file. I
> > don't know what syscalls are involved since the binary segfaults when run with
> > strace.
>
> You could strace the fuse filesystem.

I'll try doing this later. Printing large amounts of debug logs did not
turn up anything useful.

>
> > After doing a binary search,
> > https://github.com/torvalds/linux/commit/fa5eee57e33e79b71b40e6950c29cc46f5cc5cb7
> > is the commit that seems to have introduced the error. It still seems to be
> > failing with a much newer kernel.
>
> How is it failing?

Oops, sorry, I thought I included that. The file can't be unzipped:
unzip -t reports "error:  invalid compressed data to inflate".

> > Reverting the fuse_invalidate_attr_mask in fuse_perform_write to
> > fuse_invalidate_attr makes every other run of the binary produce the correct
> > output.
>
> What do you mean?  Is it succeeding half the time?

When I run the binary multiple times in a row, about 50% of runs produce
the correct file and 50% produce a corrupt file.

Running the test multiple times before fa5eee57, I'm seeing about 10% of
runs producing a corrupt file. (I did not realize this had a chance of
failure on the old kernel.) After fa5eee57, 100% of runs produce the
corrupt file.
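
For anyone wanting to put numbers like these on a more repeatable footing, here is a minimal sketch of an integrity check over a batch of generated archives. It uses Python's zipfile.testzip(), which reads every member and verifies its CRC, as a stand-in for unzip -t. The helper names zip_is_intact and corruption_rate are hypothetical and not part of the original test setup.

```python
import zipfile
import zlib


def zip_is_intact(path):
    """Return True if the archive opens and every member passes its CRC check."""
    try:
        with zipfile.ZipFile(path) as zf:
            # testzip() returns the name of the first bad member, or None.
            return zf.testzip() is None
    except (zipfile.BadZipFile, zlib.error, EOFError, OSError):
        # Undecodable deflate data or a damaged directory counts as corrupt.
        return False


def corruption_rate(paths):
    """Return the fraction of the given archives that fail the check."""
    paths = list(paths)
    if not paths:
        raise ValueError("no archives to check")
    bad = sum(1 for p in paths if not zip_is_intact(p))
    return bad / len(paths)
```

The idea would be to have each run of the binary write to a distinct output path on the FUSE mount and then pass the collected paths to corruption_rate().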

>
> >
> > I found that enabling the writeback cache makes the binary always produce the
> > right output. Running the fuse daemon in single threaded mode also works.
> >
> > Is there anything that sticks out to you that is wrong with the above commit?
>
> Could you try adding STATX_MODE to the invalidated mask?   Can't
> imagine any other attribute being relevant.

Adding STATX_MODE to FUSE_STATX_MODIFY does make the binary produce the
correct file about 75% of the time. The last bit of flakiness may be
some concurrency issue in the binary?

>
> Thanks,
> Miklos


* Re: fuse: incorrect attribute caching with writeback cache disabled
From: Frank Dinoff @ 2022-08-22 22:54 UTC (permalink / raw)
  To: Miklos Szeredi; +Cc: linux-fsdevel

On Fri, Aug 12, 2022 at 6:58 PM Frank Dinoff <fdinoff@google.com> wrote:
>
> On Fri, Aug 12, 2022 at 5:33 AM Miklos Szeredi <miklos@szeredi.hu> wrote:
> >
> > On Thu, 11 Aug 2022 at 23:05, Frank Dinoff <fdinoff@google.com> wrote:
> > >
> > > I have a binary running on a fuse filesystem which is generating a zip file. I
> > > don't know what syscalls are involved since the binary segfaults when run with
> > > strace.
> >
> > You could strace the fuse filesystem.
>
> I'll try doing this later, I was unsuccessful in finding anything
> useful printing large amounts
> of debug logs.

I got strace working on the program. It looks like it is doing something like this:

open(O_RDWR) = 9
multiple write(...) calls such that the lseek below is before end of file.
lseek(9, 2514944, SEEK_SET)             = 2514944
read(9, "", 8192)                       = 0 // Should have read 5770 bytes
lseek(9, 5770, SEEK_CUR)                = 2520714 // should be end.
write(...)
close(9)
open(O_RDWR) = 9
lseek(9, 2514944, SEEK_SET)             = 2514944
read(9, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"...,
6042) = 6042
...

The first read returns no data, and I'm not sure why. It is as if the
kernel page cache has gotten out of sync and thinks the whole file
should be zeros.
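
A minimal sketch of the first half of that syscall pattern, in case it helps with reproducing this without the original binary. Only the offset 2514944, the expected tail length 5770, and the 8192-byte read size come from the trace; the helper name, payload byte, and write chunk size are hypothetical. It has not been confirmed that this sequence alone triggers the problem, so treat it as a starting point rather than a reproducer.

```python
import os

SEEK_OFFSET = 2514944  # lseek target taken from the strace output
TAIL_LEN = 5770        # bytes expected between SEEK_OFFSET and end of file


def write_then_read_back(path):
    """Write a file, seek back to a point before EOF, and return what read() yields.

    Hypothetical helper: it mimics the open/write/lseek/read sequence from
    the trace, not the actual zip-generating binary.
    """
    payload = memoryview(b"\xab" * (SEEK_OFFSET + TAIL_LEN))
    chunk = 64 * 1024  # arbitrary; the real write sizes are not in the trace
    fd = os.open(path, os.O_RDWR | os.O_CREAT | os.O_TRUNC, 0o644)
    try:
        # Multiple write() calls, so the lseek below lands before end of file.
        pos = 0
        while pos < len(payload):
            pos += os.write(fd, payload[pos:pos + chunk])
        os.lseek(fd, SEEK_OFFSET, os.SEEK_SET)
        # Same request size as the failing read in the trace.
        return os.read(fd, 8192)
    finally:
        os.close(fd)
```

Calling write_then_read_back() with a path on the FUSE mount should return the 5770 bytes just written; an empty result would match the first read() in the trace.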

