linux-fsdevel.vger.kernel.org archive mirror
* [fuse] Unexpectedly large number of getattr() and lookup requests
@ 2018-11-29 21:20 Nikolaus Rath
  2018-11-30  7:58 ` Miklos Szeredi
  0 siblings, 1 reply; 8+ messages in thread
From: Nikolaus Rath @ 2018-11-29 21:20 UTC
  To: fuse-devel, linux-fsdevel, Miklos Szeredi

Hello,

I am seeing an unexpectedly large number of getattr() and lookup()
requests being sent to userspace fuse. I am setting a very large
attr_timeout and entry_timeout, so I would have expected that the
maximum number of getattr() and lookup() requests is capped by the
number of distinct files in the filesystem plus the number of forget
requests.

However, actual numbers are much higher. For example, when running tests
on a filesystem with 2960 directory entries, I am getting scenarios
with 203447 lookup requests, 12970 getattr requests, and zero forget
requests.

Did I misunderstand something about how dentry and attribute caching
works?

Thanks,
-Nikolaus
-- 
GPG Fingerprint: ED31 791B 2C5C 1613 AF38 8B8A D113 FCAC 3C4E 599F

             »Time flies like an arrow, fruit flies like a Banana.«


* Re: [fuse] Unexpectedly large number of getattr() and lookup requests
  2018-11-29 21:20 [fuse] Unexpectedly large number of getattr() and lookup requests Nikolaus Rath
@ 2018-11-30  7:58 ` Miklos Szeredi
  2018-12-01 10:00   ` [fuse-devel] " Nikolaus Rath
  0 siblings, 1 reply; 8+ messages in thread
From: Miklos Szeredi @ 2018-11-30  7:58 UTC
  To: fuse-devel, linux-fsdevel

On Thu, Nov 29, 2018 at 10:20 PM Nikolaus Rath <Nikolaus@rath.org> wrote:
>
> Hello,
>
> I am seeing an unexpectedly large number of getattr() and lookup()
> requests being sent to userspace fuse. I am setting a very large
> attr_timeout and entry_timeout, so I would have expected that the
> maximum number of getattr() and lookup() requests is capped by the
> number of distinct files in the filesystem plus the number of forget
> requests.
>
> However, actual numbers are much higher. For example, when running tests
> on a filesystem with 2960 directory entries, I am getting scenarios
> with 203447 lookup requests, 12970 getattr requests, and zero forget
> requests.
>
> Did I misunderstand something about how dentry and attribute caching
> works?

Debug log might be useful.

Thanks,
Miklos


* Re: [fuse-devel] [fuse] Unexpectedly large number of getattr() and lookup requests
  2018-11-30  7:58 ` Miklos Szeredi
@ 2018-12-01 10:00   ` Nikolaus Rath
  2018-12-04  9:36     ` Miklos Szeredi
  0 siblings, 1 reply; 8+ messages in thread
From: Nikolaus Rath @ 2018-12-01 10:00 UTC
  To: Miklos Szeredi; +Cc: fuse-devel, linux-fsdevel

On Nov 30 2018, Miklos Szeredi <miklos@szeredi.hu> wrote:
> On Thu, Nov 29, 2018 at 10:20 PM Nikolaus Rath <Nikolaus@rath.org> wrote:
>>
>> Hello,
>>
>> I am seeing an unexpectedly large number of getattr() and lookup()
>> requests being sent to userspace fuse. I am setting a very large
>> attr_timeout and entry_timeout, so I would have expected that the
>> maximum number of getattr() and lookup() requests is capped by the
>> number of distinct files in the filesystem plus the number of forget
>> requests.
>>
>> However, actual numbers are much higher. For example, when running tests
>> on a filesystem with 2960 directory entries, I am getting scenarios
>> with 203447 lookup requests, 12970 getattr requests, and zero forget
>> requests.
>>
>> Did I misunderstand something about how dentry and attribute caching
>> works?
>
> Debug log might be useful.

Here you go!

https://www.dropbox.com/s/tg4vyshz4g18sub/fuse-debug.log.xz?dl=1


Best,
-Nikolaus

-- 
GPG Fingerprint: ED31 791B 2C5C 1613 AF38 8B8A D113 FCAC 3C4E 599F

             »Time flies like an arrow, fruit flies like a Banana.«


* Re: [fuse-devel] [fuse] Unexpectedly large number of getattr() and lookup requests
  2018-12-01 10:00   ` [fuse-devel] " Nikolaus Rath
@ 2018-12-04  9:36     ` Miklos Szeredi
  2018-12-04 19:04       ` Nikolaus Rath
  0 siblings, 1 reply; 8+ messages in thread
From: Miklos Szeredi @ 2018-12-04  9:36 UTC
  To: fuse-devel, linux-fsdevel

On Sat, Dec 1, 2018 at 11:00 AM Nikolaus Rath <Nikolaus@rath.org> wrote:
>
> On Nov 30 2018, Miklos Szeredi <miklos@szeredi.hu> wrote:
> > On Thu, Nov 29, 2018 at 10:20 PM Nikolaus Rath <Nikolaus@rath.org> wrote:
> >>
> >> Hello,
> >>
> >> I am seeing an unexpectedly large number of getattr() and lookup()
> >> requests being sent to userspace fuse. I am setting a very large
> >> attr_timeout and entry_timeout, so I would have expected that the
> >> maximum number of getattr() and lookup() requests is capped by the
> >> number of distinct files in the filesystem plus the number of forget
> >> requests.
> >>
> >> However, actual numbers are much higher. For example, when running tests
> >> on a filesystem with 2960 directory entries, I am getting scenarios
> >> with 203447 lookup requests, 12970 getattr requests, and zero forget
> >> requests.
> >>
> >> Did I misunderstand something about how dentry and attribute caching
> >> works?
> >
> > Debug log might be useful.
>
> Here you go!
>
> https://www.dropbox.com/s/tg4vyshz4g18sub/fuse-debug.log.xz?dl=1

$ grep LOOKUP fuse-debug.log | wc -l
20786
$ grep -A2 LOOKUP fuse-debug.log |grep "No such file or directory"  | wc -l
20116

Since it's a lowlevel log, we can't see what the negative lookups are
for, but returning ENOENT guarantees that the negative lookup can't be
cached.  If you call fuse_reply_entry() with a zero nodeid instead, the
negative entry can be cached as well, which will probably reduce the
number of lookups.  See fuse_lib_lookup():

        if (err == -ENOENT && f->conf.negative_timeout != 0.0) {
            e.ino = 0;
            e.entry_timeout = f->conf.negative_timeout;
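
To make the difference concrete, here is a minimal, self-contained toy
model of the effect described above.  It is not kernel or libfuse code:
every name in it (neg_cache, reply_mode, count_lookups, ...) is
hypothetical, and it only assumes that an ENOENT error reply leaves
nothing to cache, while an entry reply with a zero nodeid and a
non-zero timeout lets the negative entry be reused until it expires.

```c
#include <stdbool.h>

/* Toy model of the cache state for ONE missing name.
 * Hypothetical names; this is an illustration, not kernel or libfuse code. */
struct neg_cache {
    bool cached;                  /* a negative entry is currently stored */
    double valid_until;           /* time (seconds) at which it expires */
    unsigned long server_lookups; /* LOOKUP requests that reached the server */
};

/* How the simulated server answers a LOOKUP for a missing name. */
enum reply_mode {
    REPLY_ENOENT,   /* error reply: nothing that could be cached */
    REPLY_ZERO_INO  /* entry reply with ino == 0 plus a timeout */
};

/* Simulate one lookup of the missing name at time `now`. */
static void lookup_missing(struct neg_cache *c, enum reply_mode mode,
                           double negative_timeout, double now)
{
    if (c->cached && now < c->valid_until)
        return;                   /* answered from the cached negative entry */

    c->server_lookups++;          /* request has to go to the server */

    if (mode == REPLY_ZERO_INO && negative_timeout > 0.0) {
        c->cached = true;
        c->valid_until = now + negative_timeout;
    } else {
        c->cached = false;        /* ENOENT: next lookup goes out again */
    }
}

/* Number of server-side LOOKUPs for `n` lookups of the same missing
 * name, issued one per second starting at t = 0. */
unsigned long count_lookups(enum reply_mode mode, double negative_timeout,
                            unsigned n)
{
    struct neg_cache c = { false, 0.0, 0 };

    for (unsigned i = 0; i < n; i++)
        lookup_missing(&c, mode, negative_timeout, (double)i);
    return c.server_lookups;
}
```

In this model, 1000 lookups of one missing name cost 1000 server
requests with REPLY_ENOENT, but only one with REPLY_ZERO_INO and a
timeout of an hour.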

The GETATTR requests are due to atime invalidations.  There's no
"noatime" mode yet, but it might make sense to add one.

Thanks,
Miklos


* Re: [fuse-devel] [fuse] Unexpectedly large number of getattr() and lookup requests
  2018-12-04  9:36     ` Miklos Szeredi
@ 2018-12-04 19:04       ` Nikolaus Rath
  2018-12-05  9:25         ` Miklos Szeredi
  0 siblings, 1 reply; 8+ messages in thread
From: Nikolaus Rath @ 2018-12-04 19:04 UTC
  To: Miklos Szeredi; +Cc: fuse-devel, linux-fsdevel

On Dec 04 2018, Miklos Szeredi <miklos@szeredi.hu> wrote:
> On Sat, Dec 1, 2018 at 11:00 AM Nikolaus Rath <Nikolaus@rath.org> wrote:
>>
>> On Nov 30 2018, Miklos Szeredi <miklos@szeredi.hu> wrote:
>> > On Thu, Nov 29, 2018 at 10:20 PM Nikolaus Rath <Nikolaus@rath.org> wrote:
>> >>
>> >> Hello,
>> >>
>> >> I am seeing an unexpectedly large number of getattr() and lookup()
>> >> requests being sent to userspace fuse. I am setting a very large
>> >> attr_timeout and entry_timeout, so I would have expected that the
>> >> maximum number of getattr() and lookup() requests is capped by the
>> >> number of distinct files in the filesystem plus the number of forget
>> >> requests.
>> >>
>> >> However, actual numbers are much higher. For example, when running tests
>> >> on a filesystem with 2960 directory entries, I am getting scenarios
>> >> with 203447 lookup requests, 12970 getattr requests, and zero forget
>> >> requests.
>> >>
>> >> Did I misunderstand something about how dentry and attribute caching
>> >> works?
>> >
>> > Debug log might be useful.
>>
>> Here you go!
>>
>> https://www.dropbox.com/s/tg4vyshz4g18sub/fuse-debug.log.xz?dl=1
>
> $ grep LOOKUP fuse-debug.log | wc -l
> 20786
> $ grep -A2 LOOKUP fuse-debug.log |grep "No such file or directory"  | wc -l
> 20116
>
> Since it's a lowlevel log, we can't see what the negative lookups are
> for, but by returning ENOENT it is guaranteed that the negative lookup
> can't be cached.  Calling fuse_reply_entry() instead with a zero
> nodeid the negative entry can also be cached, which probably helps to
> reduce the number of lookups.

Uh, right. I forgot about that. Thanks!

> The GETATTR requests are due to atime invalidations.

Could you elaborate? I'm not sure what that means here. What is an atime
invalidation?


Best,
-Nikolaus

-- 
GPG Fingerprint: ED31 791B 2C5C 1613 AF38 8B8A D113 FCAC 3C4E 599F

             »Time flies like an arrow, fruit flies like a Banana.«


* Re: [fuse-devel] [fuse] Unexpectedly large number of getattr() and lookup requests
  2018-12-04 19:04       ` Nikolaus Rath
@ 2018-12-05  9:25         ` Miklos Szeredi
  2018-12-05 18:06           ` Nikolaus Rath
  0 siblings, 1 reply; 8+ messages in thread
From: Miklos Szeredi @ 2018-12-05  9:25 UTC
  To: fuse-devel, linux-fsdevel

On Tue, Dec 4, 2018 at 8:04 PM Nikolaus Rath <Nikolaus@rath.org> wrote:
>
> On Dec 04 2018, Miklos Szeredi <miklos@szeredi.hu> wrote:

> > The GETATTR requests are due to atime invalidations.
>
> Could you elaborate? I'm not sure what that means here. What is an atime
> invalidation?

POSIX states that on read(2), readdir(3), readlink(2), etc, the
st_atime of the file/directory/symlink needs to be updated to the
current time.  So when e.g. a READ request is sent, fuse will
invalidate the cache in the belief that the server will update the
atime.  In this case, a subsequent stat(2) will find the invalid cache
and will issue a GETATTR to the server, which will reply with an
updated atime value.
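
The sequence above can be sketched as a small, self-contained toy
model.  It is not the kernel implementation: the names (attr_cache,
sim_read, sim_stat, count_getattr) are hypothetical, and it assumes an
attr_timeout so large that cached attributes never expire on their own,
so the only thing that can trigger a GETATTR is an invalidation.

```c
#include <stdbool.h>

/* Toy model of the cached attributes of ONE inode.
 * Hypothetical names; this is an illustration, not kernel code. */
struct attr_cache {
    bool valid;                     /* cached attributes may be used */
    unsigned long getattr_requests; /* GETATTR requests sent to the server */
};

/* A READ request is sent to the server.  If invalidation is enabled,
 * the cached attributes are dropped because atime may have changed. */
static void sim_read(struct attr_cache *c, bool invalidate_on_read)
{
    if (invalidate_on_read)
        c->valid = false;
}

/* stat(2): use the cache if it is valid, otherwise ask the server. */
static void sim_stat(struct attr_cache *c)
{
    if (!c->valid) {
        c->getattr_requests++;
        c->valid = true;
    }
}

/* GETATTR count for one initial stat followed by `rounds` iterations
 * of "read, then stat" on the same file. */
unsigned long count_getattr(unsigned rounds, bool invalidate_on_read)
{
    struct attr_cache c = { false, 0 };

    sim_stat(&c);                   /* first stat always fills the cache */
    for (unsigned i = 0; i < rounds; i++) {
        sim_read(&c, invalidate_on_read);
        sim_stat(&c);
    }
    return c.getattr_requests;
}
```

With invalidation on every READ, 100 read/stat rounds produce 101
GETATTR requests in this model; without it, the single initial request
is enough.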

The problem with the above is that atime is used by very few
applications, and by default it isn't even updated like POSIX
requires, due to the performance penalty it imposes on normal
workloads.

So fuse could also be more relatime/noatime friendly and not
invalidate the cached attributes when not necessary.  This is on my
todo list, but patches are welcome, as always.

Thanks,
Miklos


* Re: [fuse-devel] [fuse] Unexpectedly large number of getattr() and lookup requests
  2018-12-05  9:25         ` Miklos Szeredi
@ 2018-12-05 18:06           ` Nikolaus Rath
  2018-12-06  9:26             ` Miklos Szeredi
  0 siblings, 1 reply; 8+ messages in thread
From: Nikolaus Rath @ 2018-12-05 18:06 UTC
  To: Miklos Szeredi; +Cc: fuse-devel, linux-fsdevel

On Dec 05 2018, Miklos Szeredi <miklos@szeredi.hu> wrote:
> On Tue, Dec 4, 2018 at 8:04 PM Nikolaus Rath <Nikolaus@rath.org> wrote:
>>
>> On Dec 04 2018, Miklos Szeredi <miklos@szeredi.hu> wrote:
>
>> > The GETATTR requests are due to atime invalidations.
>>
>> Could you elaborate? I'm not sure what that means here. What is an atime
>> invalidation?
>
> POSIX states that on read(2), readdir(3), readlink(2), etc, the
> st_atime of the file/directory/symlink needs to be updated to the
> current time.  So when e.g. a READ request is sent, fuse will
> invalidate the cache in the belief that the server will update the
> atime.  In this case, a subsequent stat(2) will find the invalid cache
> and will issue a GETATTR to the server, which will reply with an
> updated atime value.

This sounds like it should not happen when writeback is active, because
in that case userspace doesn't know the right attributes either.

Or is there special code that only prevents invalidation if there
already is dirty data for the inode? If so, is there a reason for not
updating atime in the kernel whenever writeback is active?

Best,
-Nikolaus

-- 
GPG Fingerprint: ED31 791B 2C5C 1613 AF38 8B8A D113 FCAC 3C4E 599F

             »Time flies like an arrow, fruit flies like a Banana.«


* Re: [fuse-devel] [fuse] Unexpectedly large number of getattr() and lookup requests
  2018-12-05 18:06           ` Nikolaus Rath
@ 2018-12-06  9:26             ` Miklos Szeredi
  0 siblings, 0 replies; 8+ messages in thread
From: Miklos Szeredi @ 2018-12-06  9:26 UTC
  To: fuse-devel, linux-fsdevel

On Wed, Dec 5, 2018 at 7:06 PM Nikolaus Rath <Nikolaus@rath.org> wrote:
>
> On Dec 05 2018, Miklos Szeredi <miklos@szeredi.hu> wrote:
> > On Tue, Dec 4, 2018 at 8:04 PM Nikolaus Rath <Nikolaus@rath.org> wrote:
> >>
> >> On Dec 04 2018, Miklos Szeredi <miklos@szeredi.hu> wrote:
> >
> >> > The GETATTR requests are due to atime invalidations.
> >>
> >> Could you elaborate? I'm not sure what that means here. What is an atime
> >> invalidation?
> >
> > POSIX states that on read(2), readdir(3), readlink(2), etc, the
> > st_atime of the file/directory/symlink needs to be updated to the
> > current time.  So when e.g. a READ request is sent, fuse will
> > invalidate the cache in the belief that the server will update the
> > atime.  In this case, a subsequent stat(2) will find the invalid cache
> > and will issue a GETATTR to the server, which will reply with an
> > updated atime value.
>
> This sounds like it should not happen when writeback is active, because
> in that case userspace doesn't know the right attributes either.
>
> Or is there special code that only prevents invalidation if there
> already is dirty data for the inode? If so, is there a reason for not
> updating atime in the kernel whenever writeback is active?

Atime handling is a mess anyway.  E.g. reads that are cached should
also generate atime updates if strictatime, but they don't (regardless
of writeback mode).

I'd rather fix this properly, i.e. respect relevant mount options:
strictatime, noatime, relatime.  It shouldn't be complicated, but I
haven't looked very deeply.
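
As a rough illustration of what respecting those options could mean,
the sketch below decides whether an access requires an atime update
(and hence whether cached attributes would need invalidating).  It is
not FUSE code and all names are hypothetical.  The relatime branch is
an assumption modelled on the usual Linux rule: update only if atime is
not newer than mtime or ctime, or if atime is at least a day old.

```c
#include <stdbool.h>

/* Hypothetical names; this is an illustration, not FUSE or VFS code. */
enum atime_mode { ATIME_STRICT, ATIME_RELATIME, ATIME_NONE };

#define ATIME_RELATIME_MAX_AGE (24LL * 60 * 60) /* one day, in seconds */

/* Return true if an access at `now_s` should update atime.
 * All timestamps are whole seconds since the epoch. */
bool atime_needs_update(enum atime_mode mode, long long atime_s,
                        long long mtime_s, long long ctime_s,
                        long long now_s)
{
    switch (mode) {
    case ATIME_NONE:
        return false;               /* noatime: never update */
    case ATIME_STRICT:
        return true;                /* strictatime: update on every access */
    case ATIME_RELATIME:
        /* File was modified or changed since it was last accessed. */
        if (mtime_s >= atime_s || ctime_s >= atime_s)
            return true;
        /* Otherwise refresh atime at most once per day. */
        return now_s - atime_s >= ATIME_RELATIME_MAX_AGE;
    }
    return true;                    /* unknown mode: be conservative */
}
```

Under such a rule, only the strictatime case would need the cached
attributes to be dropped on every READ; with noatime they could always
be kept.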

Thanks,
Miklos

