Linux-Fsdevel Archive on lore.kernel.org
 help / Atom feed
* fuse: trying to steal weird page
@ 2018-12-26 21:43 Nikolaus Rath
  2019-01-07  8:28 ` [fuse-devel] " Miklos Szeredi
  0 siblings, 1 reply; 11+ messages in thread
From: Nikolaus Rath @ 2018-12-26 21:43 UTC (permalink / raw)
  To: fuse-devel, Miklos Szeredi, linux-fsdevel

Hi,

I am seeing relatively regular occurences of

$ sudo dmesg | tail
[21929.138815] fuse: trying to steal weird page
[21929.138821]   page=00000000a7dd2617 index=64 flags=17fffc0000000ad, count=1, mapcount=0, mapping=          (null)
[21930.647338] fuse: trying to steal weird page
[21930.647345]   page=00000000a07f32af index=2848 flags=17fffc0000000ad, count=1, mapcount=0, mapping=          (null)
[21932.338873] fuse: trying to steal weird page
[21932.338879]   page=0000000067e3a012 index=64 flags=17fffc0000000ad, count=1, mapcount=0, mapping=          (null)
[21933.930703] fuse: trying to steal weird page
[21933.930710]   page=00000000046feb25 index=845 flags=17fffc0000000ad, count=1, mapcount=0, mapping=          (null)
[21936.163174] fuse: trying to steal weird page
[21936.163180]   page=00000000fb80fe27 index=0 flags=17fffc0000000ad, count=1, mapcount=0, mapping=          (null)

They do not seem to correlate with userspace errors.

(I am debugging a mysterious EBADF error that I get from an nftw() walk
and that disappears when running under strace(), but it its occurence
does not correlate).


What are the implications of the above kernel message? Is there a way to
provide more debugging information?

$ uname -a
Linux vostro.rath.org 4.18.0-0.bpo.1-amd64 #1 SMP Debian 4.18.6-1~bpo9+1 (2018-09-13) x86_64 GNU/Linux

Best,
-Nikolaus

-- 
GPG Fingerprint: ED31 791B 2C5C 1613 AF38 8B8A D113 FCAC 3C4E 599F

             »Time flies like an arrow, fruit flies like a Banana.«

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [fuse-devel] fuse: trying to steal weird page
  2018-12-26 21:43 fuse: trying to steal weird page Nikolaus Rath
@ 2019-01-07  8:28 ` " Miklos Szeredi
  2019-01-07 21:05   ` Nikolaus Rath
  0 siblings, 1 reply; 11+ messages in thread
From: Miklos Szeredi @ 2019-01-07  8:28 UTC (permalink / raw)
  To: fuse-devel, Miklos Szeredi, linux-fsdevel

On Wed, Dec 26, 2018 at 10:44 PM Nikolaus Rath <Nikolaus@rath.org> wrote:
>
> Hi,
>
> I am seeing relatively regular occurences of
>
> $ sudo dmesg | tail
> [21929.138815] fuse: trying to steal weird page
> [21929.138821]   page=00000000a7dd2617 index=64 flags=17fffc0000000ad, count=1, mapcount=0, mapping=          (null)
> [21930.647338] fuse: trying to steal weird page
> [21930.647345]   page=00000000a07f32af index=2848 flags=17fffc0000000ad, count=1, mapcount=0, mapping=          (null)
> [21932.338873] fuse: trying to steal weird page
> [21932.338879]   page=0000000067e3a012 index=64 flags=17fffc0000000ad, count=1, mapcount=0, mapping=          (null)
> [21933.930703] fuse: trying to steal weird page
> [21933.930710]   page=00000000046feb25 index=845 flags=17fffc0000000ad, count=1, mapcount=0, mapping=          (null)
> [21936.163174] fuse: trying to steal weird page
> [21936.163180]   page=00000000fb80fe27 index=0 flags=17fffc0000000ad, count=1, mapcount=0, mapping=          (null)

The page has the PG_dity and PG_waiters flags set which are
incompatible with stealing.  page_cache_pipe_buf_steal() does
apparently filter out dirty ones, so it's not a regular file that we
are trying to streal the page from.  So the question is: what is the
source of the splice()?

Thanks,
Miklos


>
> They do not seem to correlate with userspace errors.
>
> (I am debugging a mysterious EBADF error that I get from an nftw() walk
> and that disappears when running under strace(), but it its occurence
> does not correlate).
>
>
> What are the implications of the above kernel message? Is there a way to
> provide more debugging information?
>
> $ uname -a
> Linux vostro.rath.org 4.18.0-0.bpo.1-amd64 #1 SMP Debian 4.18.6-1~bpo9+1 (2018-09-13) x86_64 GNU/Linux
>
> Best,
> -Nikolaus
>
> --
> GPG Fingerprint: ED31 791B 2C5C 1613 AF38 8B8A D113 FCAC 3C4E 599F
>
>              »Time flies like an arrow, fruit flies like a Banana.«
>
>
> --
> fuse-devel mailing list
> To unsubscribe or subscribe, visit https://lists.sourceforge.net/lists/listinfo/fuse-devel

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [fuse-devel] fuse: trying to steal weird page
  2019-01-07  8:28 ` [fuse-devel] " Miklos Szeredi
@ 2019-01-07 21:05   ` Nikolaus Rath
  2019-01-08  8:27     ` Miklos Szeredi
  0 siblings, 1 reply; 11+ messages in thread
From: Nikolaus Rath @ 2019-01-07 21:05 UTC (permalink / raw)
  To: Miklos Szeredi; +Cc: fuse-devel, Miklos Szeredi, linux-fsdevel

On Jan 07 2019, Miklos Szeredi <miklos@szeredi.hu> wrote:
> On Wed, Dec 26, 2018 at 10:44 PM Nikolaus Rath <Nikolaus@rath.org> wrote:
>>
>> Hi,
>>
>> I am seeing relatively regular occurences of
>>
>> $ sudo dmesg | tail
>> [21929.138815] fuse: trying to steal weird page
>> [21929.138821] page=00000000a7dd2617 index=64 flags=17fffc0000000ad,
>> count=1, mapcount=0, mapping= (null)
>> [21930.647338] fuse: trying to steal weird page
>> [21930.647345] page=00000000a07f32af index=2848
>> flags=17fffc0000000ad, count=1, mapcount=0, mapping= (null)
>> [21932.338873] fuse: trying to steal weird page
>> [21932.338879] page=0000000067e3a012 index=64 flags=17fffc0000000ad,
>> count=1, mapcount=0, mapping= (null)
>> [21933.930703] fuse: trying to steal weird page
>> [21933.930710] page=00000000046feb25 index=845
>> flags=17fffc0000000ad, count=1, mapcount=0, mapping= (null)
>> [21936.163174] fuse: trying to steal weird page
>> [21936.163180] page=00000000fb80fe27 index=0 flags=17fffc0000000ad,
>> count=1, mapcount=0, mapping= (null)
>
> The page has the PG_dity and PG_waiters flags set which are
> incompatible with stealing.  page_cache_pipe_buf_steal() does
> apparently filter out dirty ones, so it's not a regular file that we
> are trying to streal the page from.  So the question is: what is the
> source of the splice()?
 
Hmm. I think it has to be a regular file. But as I mentioned in my other
email, I did have a race condition where fd's were closed
incorrectly. Is it possible that this also triggered the above,
i.e. that the fd was closed sometime during splice?


Best,
-Nikolaus

-- 
GPG Fingerprint: ED31 791B 2C5C 1613 AF38 8B8A D113 FCAC 3C4E 599F

             »Time flies like an arrow, fruit flies like a Banana.«

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [fuse-devel] fuse: trying to steal weird page
  2019-01-07 21:05   ` Nikolaus Rath
@ 2019-01-08  8:27     ` Miklos Szeredi
  2019-01-08 10:35       ` Nikolaus Rath
  0 siblings, 1 reply; 11+ messages in thread
From: Miklos Szeredi @ 2019-01-08  8:27 UTC (permalink / raw)
  To: fuse-devel; +Cc: Miklos Szeredi, linux-fsdevel

On Mon, Jan 7, 2019 at 10:05 PM Nikolaus Rath <Nikolaus@rath.org> wrote:
>
> On Jan 07 2019, Miklos Szeredi <miklos@szeredi.hu> wrote:
> > On Wed, Dec 26, 2018 at 10:44 PM Nikolaus Rath <Nikolaus@rath.org> wrote:
> >>
> >> Hi,
> >>
> >> I am seeing relatively regular occurences of
> >>
> >> $ sudo dmesg | tail
> >> [21929.138815] fuse: trying to steal weird page
> >> [21929.138821] page=00000000a7dd2617 index=64 flags=17fffc0000000ad,
> >> count=1, mapcount=0, mapping= (null)
> >> [21930.647338] fuse: trying to steal weird page
> >> [21930.647345] page=00000000a07f32af index=2848
> >> flags=17fffc0000000ad, count=1, mapcount=0, mapping= (null)
> >> [21932.338873] fuse: trying to steal weird page
> >> [21932.338879] page=0000000067e3a012 index=64 flags=17fffc0000000ad,
> >> count=1, mapcount=0, mapping= (null)
> >> [21933.930703] fuse: trying to steal weird page
> >> [21933.930710] page=00000000046feb25 index=845
> >> flags=17fffc0000000ad, count=1, mapcount=0, mapping= (null)
> >> [21936.163174] fuse: trying to steal weird page
> >> [21936.163180] page=00000000fb80fe27 index=0 flags=17fffc0000000ad,
> >> count=1, mapcount=0, mapping= (null)
> >
> > The page has the PG_dity and PG_waiters flags set which are
> > incompatible with stealing.  page_cache_pipe_buf_steal() does
> > apparently filter out dirty ones, so it's not a regular file that we
> > are trying to streal the page from.  So the question is: what is the
> > source of the splice()?
>
> Hmm. I think it has to be a regular file. But as I mentioned in my other
> email, I did have a race condition where fd's were closed
> incorrectly. Is it possible that this also triggered the above,
> i.e. that the fd was closed sometime during splice?

Close during a syscall that uses the fd is not an issue, because a ref
to the file is acquired.  So the race is between the close() and the
internal fget(); if the close() wins then fget() will fail and the
syscall will return EBADF.  If the fget() wins, then the syscall can
run normally despite the fact that the fd was closed.

Can you tell me what filesystem is the regular file (the one being
spliced into fuse) is on?

It actually has to be a regular file, since AFAIK nothing else has
dirty pages.   It could be using something other than
page_cache_pipe_buf_steal(), or there's some other mechanism that lets
the page be dirtied after being unmapped, though that looks
impossible...

Thanks,
Miklos

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [fuse-devel] fuse: trying to steal weird page
  2019-01-08  8:27     ` Miklos Szeredi
@ 2019-01-08 10:35       ` Nikolaus Rath
  2019-01-09  8:07         ` Miklos Szeredi
  0 siblings, 1 reply; 11+ messages in thread
From: Nikolaus Rath @ 2019-01-08 10:35 UTC (permalink / raw)
  To: Miklos Szeredi; +Cc: fuse-devel, linux-fsdevel, Miklos Szeredi

On Jan 08 2019, Miklos Szeredi <miklos@szeredi.hu> wrote:
> On Mon, Jan 7, 2019 at 10:05 PM Nikolaus Rath <Nikolaus@rath.org> wrote:
>>
>> On Jan 07 2019, Miklos Szeredi <miklos@szeredi.hu> wrote:
>> > On Wed, Dec 26, 2018 at 10:44 PM Nikolaus Rath <Nikolaus@rath.org> wrote:
>> >>
>> >> Hi,
>> >>
>> >> I am seeing relatively regular occurences of
>> >>
>> >> $ sudo dmesg | tail
>> >> [21929.138815] fuse: trying to steal weird page
>> >> [21929.138821] page=00000000a7dd2617 index=64 flags=17fffc0000000ad,
>> >> count=1, mapcount=0, mapping= (null)
>> >> [21930.647338] fuse: trying to steal weird page
>> >> [21930.647345] page=00000000a07f32af index=2848
>> >> flags=17fffc0000000ad, count=1, mapcount=0, mapping= (null)
>> >> [21932.338873] fuse: trying to steal weird page
>> >> [21932.338879] page=0000000067e3a012 index=64 flags=17fffc0000000ad,
>> >> count=1, mapcount=0, mapping= (null)
>> >> [21933.930703] fuse: trying to steal weird page
>> >> [21933.930710] page=00000000046feb25 index=845
>> >> flags=17fffc0000000ad, count=1, mapcount=0, mapping= (null)
>> >> [21936.163174] fuse: trying to steal weird page
>> >> [21936.163180] page=00000000fb80fe27 index=0 flags=17fffc0000000ad,
>> >> count=1, mapcount=0, mapping= (null)
>> >
>> > The page has the PG_dity and PG_waiters flags set which are
>> > incompatible with stealing.  page_cache_pipe_buf_steal() does
>> > apparently filter out dirty ones, so it's not a regular file that we
>> > are trying to streal the page from.  So the question is: what is the
>> > source of the splice()?
>>
>> Hmm. I think it has to be a regular file. But as I mentioned in my other
>> email, I did have a race condition where fd's were closed
>> incorrectly. Is it possible that this also triggered the above,
>> i.e. that the fd was closed sometime during splice?
>
> Close during a syscall that uses the fd is not an issue, because a ref
> to the file is acquired.  So the race is between the close() and the
> internal fget(); if the close() wins then fget() will fail and the
> syscall will return EBADF.  If the fget() wins, then the syscall can
> run normally despite the fact that the fd was closed.
>
> Can you tell me what filesystem is the regular file (the one being
> spliced into fuse) is on?

It's ext4.

> It actually has to be a regular file, since AFAIK nothing else has
> dirty pages.   It could be using something other than
> page_cache_pipe_buf_steal(), or there's some other mechanism that lets
> the page be dirtied after being unmapped, though that looks
> impossible...

I can't meaningfully comment on that, sorry...

Best,
-Nikolaus

-- 
GPG Fingerprint: ED31 791B 2C5C 1613 AF38 8B8A D113 FCAC 3C4E 599F

             »Time flies like an arrow, fruit flies like a Banana.«

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [fuse-devel] fuse: trying to steal weird page
  2019-01-08 10:35       ` Nikolaus Rath
@ 2019-01-09  8:07         ` Miklos Szeredi
       [not found]           ` <CAJfpegtiXDgSBWN8MRubpAdJFxy95X21nO_yycCZhpvKLVePRA-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  0 siblings, 1 reply; 11+ messages in thread
From: Miklos Szeredi @ 2019-01-09  8:07 UTC (permalink / raw)
  To: fuse-devel; +Cc: linux-fsdevel, Miklos Szeredi

On Tue, Jan 8, 2019 at 11:35 AM Nikolaus Rath <Nikolaus@rath.org> wrote:
>
> On Jan 08 2019, Miklos Szeredi <miklos@szeredi.hu> wrote:
> > On Mon, Jan 7, 2019 at 10:05 PM Nikolaus Rath <Nikolaus@rath.org> wrote:
> >>
> >> On Jan 07 2019, Miklos Szeredi <miklos@szeredi.hu> wrote:
> >> > On Wed, Dec 26, 2018 at 10:44 PM Nikolaus Rath <Nikolaus@rath.org> wrote:
> >> >>
> >> >> Hi,
> >> >>
> >> >> I am seeing relatively regular occurences of
> >> >>
> >> >> $ sudo dmesg | tail
> >> >> [21929.138815] fuse: trying to steal weird page
> >> >> [21929.138821] page=00000000a7dd2617 index=64 flags=17fffc0000000ad,
> >> >> count=1, mapcount=0, mapping= (null)
> >> >> [21930.647338] fuse: trying to steal weird page
> >> >> [21930.647345] page=00000000a07f32af index=2848
> >> >> flags=17fffc0000000ad, count=1, mapcount=0, mapping= (null)
> >> >> [21932.338873] fuse: trying to steal weird page
> >> >> [21932.338879] page=0000000067e3a012 index=64 flags=17fffc0000000ad,
> >> >> count=1, mapcount=0, mapping= (null)
> >> >> [21933.930703] fuse: trying to steal weird page
> >> >> [21933.930710] page=00000000046feb25 index=845
> >> >> flags=17fffc0000000ad, count=1, mapcount=0, mapping= (null)
> >> >> [21936.163174] fuse: trying to steal weird page
> >> >> [21936.163180] page=00000000fb80fe27 index=0 flags=17fffc0000000ad,
> >> >> count=1, mapcount=0, mapping= (null)
> >> >
> >> > The page has the PG_dity and PG_waiters flags set which are
> >> > incompatible with stealing.  page_cache_pipe_buf_steal() does
> >> > apparently filter out dirty ones, so it's not a regular file that we
> >> > are trying to streal the page from.  So the question is: what is the
> >> > source of the splice()?
> >>
> >> Hmm. I think it has to be a regular file. But as I mentioned in my other
> >> email, I did have a race condition where fd's were closed
> >> incorrectly. Is it possible that this also triggered the above,
> >> i.e. that the fd was closed sometime during splice?
> >
> > Close during a syscall that uses the fd is not an issue, because a ref
> > to the file is acquired.  So the race is between the close() and the
> > internal fget(); if the close() wins then fget() will fail and the
> > syscall will return EBADF.  If the fget() wins, then the syscall can
> > run normally despite the fact that the fd was closed.
> >
> > Can you tell me what filesystem is the regular file (the one being
> > spliced into fuse) is on?
>
> It's ext4.

Next question: is file opened with O_DIRECT or is filesystem mounted
with DAX, or anything fancy?

Thanks,
Miklos

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: fuse: trying to steal weird page
       [not found]           ` <CAJfpegtiXDgSBWN8MRubpAdJFxy95X21nO_yycCZhpvKLVePRA-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2019-01-11 15:39             ` Nikolaus Rath
  2019-01-11 15:39               ` [fuse-devel] " Nikolaus Rath
  2019-02-10 22:05               ` Nikolaus Rath
  0 siblings, 2 replies; 11+ messages in thread
From: Nikolaus Rath @ 2019-01-11 15:39 UTC (permalink / raw)
  To: Miklos Szeredi; +Cc: fuse-devel, linux-fsdevel, Miklos Szeredi

On Jan 09 2019, Miklos Szeredi <miklos@szeredi.hu> wrote:
> On Tue, Jan 8, 2019 at 11:35 AM Nikolaus Rath <Nikolaus@rath.org> wrote:
>>
>> On Jan 08 2019, Miklos Szeredi <miklos@szeredi.hu> wrote:
>> > On Mon, Jan 7, 2019 at 10:05 PM Nikolaus Rath <Nikolaus@rath.org> wrote:
>> >>
>> >> On Jan 07 2019, Miklos Szeredi <miklos@szeredi.hu> wrote:
>> >> > On Wed, Dec 26, 2018 at 10:44 PM Nikolaus Rath <Nikolaus@rath.org> wrote:
>> >> >>
>> >> >> Hi,
>> >> >>
>> >> >> I am seeing relatively regular occurences of
>> >> >>
>> >> >> $ sudo dmesg | tail
>> >> >> [21929.138815] fuse: trying to steal weird page
>> >> >> [21929.138821] page=00000000a7dd2617 index=64 flags=17fffc0000000ad,
>> >> >> count=1, mapcount=0, mapping= (null)
>> >> >> [21930.647338] fuse: trying to steal weird page
>> >> >> [21930.647345] page=00000000a07f32af index=2848
>> >> >> flags=17fffc0000000ad, count=1, mapcount=0, mapping= (null)
>> >> >> [21932.338873] fuse: trying to steal weird page
>> >> >> [21932.338879] page=0000000067e3a012 index=64 flags=17fffc0000000ad,
>> >> >> count=1, mapcount=0, mapping= (null)
>> >> >> [21933.930703] fuse: trying to steal weird page
>> >> >> [21933.930710] page=00000000046feb25 index=845
>> >> >> flags=17fffc0000000ad, count=1, mapcount=0, mapping= (null)
>> >> >> [21936.163174] fuse: trying to steal weird page
>> >> >> [21936.163180] page=00000000fb80fe27 index=0 flags=17fffc0000000ad,
>> >> >> count=1, mapcount=0, mapping= (null)
>> >> >
>> >> > The page has the PG_dity and PG_waiters flags set which are
>> >> > incompatible with stealing.  page_cache_pipe_buf_steal() does
>> >> > apparently filter out dirty ones, so it's not a regular file that we
>> >> > are trying to streal the page from.  So the question is: what is the
>> >> > source of the splice()?
>> >>
>> >> Hmm. I think it has to be a regular file. But as I mentioned in my other
>> >> email, I did have a race condition where fd's were closed
>> >> incorrectly. Is it possible that this also triggered the above,
>> >> i.e. that the fd was closed sometime during splice?
>> >
>> > Close during a syscall that uses the fd is not an issue, because a ref
>> > to the file is acquired.  So the race is between the close() and the
>> > internal fget(); if the close() wins then fget() will fail and the
>> > syscall will return EBADF.  If the fget() wins, then the syscall can
>> > run normally despite the fact that the fd was closed.
>> >
>> > Can you tell me what filesystem is the regular file (the one being
>> > spliced into fuse) is on?
>>
>> It's ext4.
>
> Next question: is file opened with O_DIRECT or is filesystem mounted
> with DAX, or anything fancy?

Neither. But thinking about this, I guess that (because of the race) the
fd could have been closed and re-opened before the ref was acquired. So
it may have turned into a directory fd.

To be honest, I don't think it's worth investigating this unless I see
it happen again now that the race in my code is fixed.


Best,
-Nikolaus

-- 
GPG Fingerprint: ED31 791B 2C5C 1613 AF38 8B8A D113 FCAC 3C4E 599F

             »Time flies like an arrow, fruit flies like a Banana.«


-- 
fuse-devel mailing list
To unsubscribe or subscribe, visit https://lists.sourceforge.net/lists/listinfo/fuse-devel

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [fuse-devel] fuse: trying to steal weird page
  2019-01-11 15:39             ` Nikolaus Rath
@ 2019-01-11 15:39               ` " Nikolaus Rath
  2019-02-10 22:05               ` Nikolaus Rath
  1 sibling, 0 replies; 11+ messages in thread
From: Nikolaus Rath @ 2019-01-11 15:39 UTC (permalink / raw)
  To: Miklos Szeredi; +Cc: fuse-devel, linux-fsdevel, Miklos Szeredi

On Jan 09 2019, Miklos Szeredi <miklos@szeredi.hu> wrote:
> On Tue, Jan 8, 2019 at 11:35 AM Nikolaus Rath <Nikolaus@rath.org> wrote:
>>
>> On Jan 08 2019, Miklos Szeredi <miklos@szeredi.hu> wrote:
>> > On Mon, Jan 7, 2019 at 10:05 PM Nikolaus Rath <Nikolaus@rath.org> wrote:
>> >>
>> >> On Jan 07 2019, Miklos Szeredi <miklos@szeredi.hu> wrote:
>> >> > On Wed, Dec 26, 2018 at 10:44 PM Nikolaus Rath <Nikolaus@rath.org> wrote:
>> >> >>
>> >> >> Hi,
>> >> >>
>> >> >> I am seeing relatively regular occurences of
>> >> >>
>> >> >> $ sudo dmesg | tail
>> >> >> [21929.138815] fuse: trying to steal weird page
>> >> >> [21929.138821] page=00000000a7dd2617 index=64 flags=17fffc0000000ad,
>> >> >> count=1, mapcount=0, mapping= (null)
>> >> >> [21930.647338] fuse: trying to steal weird page
>> >> >> [21930.647345] page=00000000a07f32af index=2848
>> >> >> flags=17fffc0000000ad, count=1, mapcount=0, mapping= (null)
>> >> >> [21932.338873] fuse: trying to steal weird page
>> >> >> [21932.338879] page=0000000067e3a012 index=64 flags=17fffc0000000ad,
>> >> >> count=1, mapcount=0, mapping= (null)
>> >> >> [21933.930703] fuse: trying to steal weird page
>> >> >> [21933.930710] page=00000000046feb25 index=845
>> >> >> flags=17fffc0000000ad, count=1, mapcount=0, mapping= (null)
>> >> >> [21936.163174] fuse: trying to steal weird page
>> >> >> [21936.163180] page=00000000fb80fe27 index=0 flags=17fffc0000000ad,
>> >> >> count=1, mapcount=0, mapping= (null)
>> >> >
>> >> > The page has the PG_dity and PG_waiters flags set which are
>> >> > incompatible with stealing.  page_cache_pipe_buf_steal() does
>> >> > apparently filter out dirty ones, so it's not a regular file that we
>> >> > are trying to streal the page from.  So the question is: what is the
>> >> > source of the splice()?
>> >>
>> >> Hmm. I think it has to be a regular file. But as I mentioned in my other
>> >> email, I did have a race condition where fd's were closed
>> >> incorrectly. Is it possible that this also triggered the above,
>> >> i.e. that the fd was closed sometime during splice?
>> >
>> > Close during a syscall that uses the fd is not an issue, because a ref
>> > to the file is acquired.  So the race is between the close() and the
>> > internal fget(); if the close() wins then fget() will fail and the
>> > syscall will return EBADF.  If the fget() wins, then the syscall can
>> > run normally despite the fact that the fd was closed.
>> >
>> > Can you tell me what filesystem is the regular file (the one being
>> > spliced into fuse) is on?
>>
>> It's ext4.
>
> Next question: is file opened with O_DIRECT or is filesystem mounted
> with DAX, or anything fancy?

Neither. But thinking about this, I guess that (because of the race) the
fd could have been closed and re-opened before the ref was acquired. So
it may have turned into a directory fd.

To be honest, I don't think it's worth investigating this unless I see
it happen again now that the race in my code is fixed.


Best,
-Nikolaus

-- 
GPG Fingerprint: ED31 791B 2C5C 1613 AF38 8B8A D113 FCAC 3C4E 599F

             »Time flies like an arrow, fruit flies like a Banana.«

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [fuse-devel] fuse: trying to steal weird page
  2019-01-11 15:39             ` Nikolaus Rath
  2019-01-11 15:39               ` [fuse-devel] " Nikolaus Rath
@ 2019-02-10 22:05               ` Nikolaus Rath
  2019-02-12 14:57                 ` Miklos Szeredi
  1 sibling, 1 reply; 11+ messages in thread
From: Nikolaus Rath @ 2019-02-10 22:05 UTC (permalink / raw)
  To: Miklos Szeredi; +Cc: fuse-devel, linux-fsdevel

On Jan 11 2019, Nikolaus Rath <Nikolaus@rath.org> wrote:
> On Jan 09 2019, Miklos Szeredi <miklos@szeredi.hu> wrote:
>> On Tue, Jan 8, 2019 at 11:35 AM Nikolaus Rath <Nikolaus@rath.org> wrote:
>>>
>>> On Jan 08 2019, Miklos Szeredi <miklos@szeredi.hu> wrote:
>>> > On Mon, Jan 7, 2019 at 10:05 PM Nikolaus Rath <Nikolaus@rath.org> wrote:
>>> >>
>>> >> On Jan 07 2019, Miklos Szeredi <miklos@szeredi.hu> wrote:
>>> >> > On Wed, Dec 26, 2018 at 10:44 PM Nikolaus Rath <Nikolaus@rath.org> wrote:
>>> >> >>
>>> >> >> Hi,
>>> >> >>
>>> >> >> I am seeing relatively regular occurences of
>>> >> >>
>>> >> >> $ sudo dmesg | tail
>>> >> >> [21929.138815] fuse: trying to steal weird page
>>> >> >> [21929.138821] page=00000000a7dd2617 index=64 flags=17fffc0000000ad,
>>> >> >> count=1, mapcount=0, mapping= (null)
>>> >> >> [21930.647338] fuse: trying to steal weird page
>>> >> >> [21930.647345] page=00000000a07f32af index=2848
>>> >> >> flags=17fffc0000000ad, count=1, mapcount=0, mapping= (null)
>>> >> >> [21932.338873] fuse: trying to steal weird page
>>> >> >> [21932.338879] page=0000000067e3a012 index=64 flags=17fffc0000000ad,
>>> >> >> count=1, mapcount=0, mapping= (null)
>>> >> >> [21933.930703] fuse: trying to steal weird page
>>> >> >> [21933.930710] page=00000000046feb25 index=845
>>> >> >> flags=17fffc0000000ad, count=1, mapcount=0, mapping= (null)
>>> >> >> [21936.163174] fuse: trying to steal weird page
>>> >> >> [21936.163180] page=00000000fb80fe27 index=0 flags=17fffc0000000ad,
>>> >> >> count=1, mapcount=0, mapping= (null)
>>> >> >
>>> >> > The page has the PG_dity and PG_waiters flags set which are
>>> >> > incompatible with stealing.  page_cache_pipe_buf_steal() does
>>> >> > apparently filter out dirty ones, so it's not a regular file that we
>>> >> > are trying to streal the page from.  So the question is: what is the
>>> >> > source of the splice()?
>>> >>
>>> >> Hmm. I think it has to be a regular file. But as I mentioned in my other
>>> >> email, I did have a race condition where fd's were closed
>>> >> incorrectly. Is it possible that this also triggered the above,
>>> >> i.e. that the fd was closed sometime during splice?
>>> >
>>> > Close during a syscall that uses the fd is not an issue, because a ref
>>> > to the file is acquired.  So the race is between the close() and the
>>> > internal fget(); if the close() wins then fget() will fail and the
>>> > syscall will return EBADF.  If the fget() wins, then the syscall can
>>> > run normally despite the fact that the fd was closed.
>>> >
>>> > Can you tell me what filesystem is the regular file (the one being
>>> > spliced into fuse) is on?
>>>
>>> It's ext4.
>>
>> Next question: is file opened with O_DIRECT or is filesystem mounted
>> with DAX, or anything fancy?
>
> Neither. But thinking about this, I guess that (because of the race) the
> fd could have been closed and re-opened before the ref was acquired. So
> it may have turned into a directory fd.
>
> To be honest, I don't think it's worth investigating this unless I see
> it happen again now that the race in my code is fixed.

Bad news. I can now reliably reproduce the issue again.

I have no clue why it disappeared for a while. This is exactly the same
filesystem code, but I can't rule out that there has been some routine upgrade
in the base system (compiler or kernel).

Any suggestions for debugging this?

As I said before, splice() source is a regular file on ext4, opened
without O_DIRECT or DAX. If I disable FUSE_CAP_SPLICE_WRITE, the error
message no longer occurs.


Best,
-Nikolaus

-- 
GPG Fingerprint: ED31 791B 2C5C 1613 AF38 8B8A D113 FCAC 3C4E 599F

             »Time flies like an arrow, fruit flies like a Banana.«

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [fuse-devel] fuse: trying to steal weird page
  2019-02-10 22:05               ` Nikolaus Rath
@ 2019-02-12 14:57                 ` Miklos Szeredi
  2019-02-12 21:28                   ` Nikolaus Rath
  0 siblings, 1 reply; 11+ messages in thread
From: Miklos Szeredi @ 2019-02-12 14:57 UTC (permalink / raw)
  To: Miklos Szeredi, fuse-devel, linux-fsdevel

On Sun, Feb 10, 2019 at 11:05 PM Nikolaus Rath <Nikolaus@rath.org> wrote:

> Bad news. I can now reliably reproduce the issue again.

A reliable reproducer is always good news.   Are the messages exactly
the same as last time (value of flags, etc)?

Thanks,
Miklos

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [fuse-devel] fuse: trying to steal weird page
  2019-02-12 14:57                 ` Miklos Szeredi
@ 2019-02-12 21:28                   ` Nikolaus Rath
  0 siblings, 0 replies; 11+ messages in thread
From: Nikolaus Rath @ 2019-02-12 21:28 UTC (permalink / raw)
  To: Miklos Szeredi; +Cc: fuse-devel, linux-fsdevel

On Feb 12 2019, Miklos Szeredi <miklos@szeredi.hu> wrote:
> On Sun, Feb 10, 2019 at 11:05 PM Nikolaus Rath <Nikolaus@rath.org> wrote:
>
>> Bad news. I can now reliably reproduce the issue again.
>
> A reliable reproducer is always good news.   Are the messages exactly
> the same as last time (value of flags, etc)?

The flags, count, mapcount and mapping values are always the same. The
page and index is varying. So the general format is:

[ 2562.773181] fuse: trying to steal weird page
[ 2562.773187]   page=<something> index=<something> flags=17ffffc00000ad, count=1, mapcount=0, mapping=          (null)

Best,
-Nikolaus

-- 
GPG Fingerprint: ED31 791B 2C5C 1613 AF38 8B8A D113 FCAC 3C4E 599F

             »Time flies like an arrow, fruit flies like a Banana.«

^ permalink raw reply	[flat|nested] 11+ messages in thread

end of thread, back to index

Thread overview: 11+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-12-26 21:43 fuse: trying to steal weird page Nikolaus Rath
2019-01-07  8:28 ` [fuse-devel] " Miklos Szeredi
2019-01-07 21:05   ` Nikolaus Rath
2019-01-08  8:27     ` Miklos Szeredi
2019-01-08 10:35       ` Nikolaus Rath
2019-01-09  8:07         ` Miklos Szeredi
     [not found]           ` <CAJfpegtiXDgSBWN8MRubpAdJFxy95X21nO_yycCZhpvKLVePRA-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2019-01-11 15:39             ` Nikolaus Rath
2019-01-11 15:39               ` [fuse-devel] " Nikolaus Rath
2019-02-10 22:05               ` Nikolaus Rath
2019-02-12 14:57                 ` Miklos Szeredi
2019-02-12 21:28                   ` Nikolaus Rath

Linux-Fsdevel Archive on lore.kernel.org

Archives are clonable:
	git clone --mirror https://lore.kernel.org/linux-fsdevel/0 linux-fsdevel/git/0.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 linux-fsdevel linux-fsdevel/ https://lore.kernel.org/linux-fsdevel \
		linux-fsdevel@vger.kernel.org linux-fsdevel@archiver.kernel.org
	public-inbox-index linux-fsdevel


Newsgroup available over NNTP:
	nntp://nntp.lore.kernel.org/org.kernel.vger.linux-fsdevel


AGPL code for this site: git clone https://public-inbox.org/ public-inbox