* [fuse] Speeding up readdir()
@ 2018-12-06 20:00 Nikolaus Rath
2018-12-07 9:21 ` Miklos Szeredi
0 siblings, 1 reply; 9+ messages in thread
From: Nikolaus Rath @ 2018-12-06 20:00 UTC
To: Miklos Szeredi, linux-fsdevel, fuse-devel
Hello,
I am trying to improve the performance of readdir() requests. I have a
client application that issues a lot of readdir() requests, and a FUSE
filesystem that makes extensive use of kernel caching for inode entry
attributes because retrieving the attributes from the backend is
relatively expensive.
Unfortunately, it seems to me that currently there is no way to avoid
having to retrieve the attributes from the backend for every entry that
is returned by readdir - on every call:
If I am using readdirplus, I have to include the full attributes even if
the kernel already has them cached.
If I disable readdirplus, I can return just the entry name and its inode
- but I believe because this doesn't result in a lookup count increase
of the inode, the kernel can't match this with the existing cached data
for the inode (is that correct?) and I'm getting a separate lookup()
request for each entry that I've returned.
I could implement a readdir cache in the filesystem, but that means I
have to take care of cache invalidation and I'm basically wasting
memory.
Is there a reason why readdirplus() couldn't return just the name and
inode, together with a special flag that tells the kernel to "just use
the attributes that are already cached"?
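To make the proposal concrete, here is a toy Python model of the suggested semantics. It is only a sketch of the idea, not of any existing FUSE interface: the `DirEntry` type, the use of `attrs=None` to stand in for the proposed "use cached attributes" flag, and the `fetch_attrs` fallback are all hypothetical names introduced for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Tuple


@dataclass
class DirEntry:
    name: str
    ino: int
    # None stands in for the proposed "just use the cached attributes" flag.
    attrs: Optional[dict] = None


def apply_readdirplus(
    entries: List[DirEntry],
    attr_cache: Dict[int, dict],
    fetch_attrs: Callable[[int], dict],
) -> Tuple[Dict[str, dict], int]:
    """Resolve attributes for each entry; return them plus the number of
    fallback fetches that were needed because nothing was cached."""
    fetches = 0
    resolved = {}
    for entry in entries:
        if entry.attrs is not None:
            # Full attributes supplied: refresh the cache (today's behaviour).
            attr_cache[entry.ino] = entry.attrs
        elif entry.ino not in attr_cache:
            # Flag set but nothing cached: a separate fetch is unavoidable.
            attr_cache[entry.ino] = fetch_attrs(entry.ino)
            fetches += 1
        resolved[entry.name] = attr_cache[entry.ino]
    return resolved, fetches


cache = {1: {"size": 10}}
entries = [DirEntry("a", 1), DirEntry("b", 2)]
resolved, fetches = apply_readdirplus(entries, cache, lambda ino: {"size": 0})
print(resolved, fetches)
```

In this model only entry "b" costs a backend fetch, because inode 1 was already in the cache.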
Best,
-Nikolaus
--
GPG Fingerprint: ED31 791B 2C5C 1613 AF38 8B8A D113 FCAC 3C4E 599F
»Time flies like an arrow, fruit flies like a Banana.«
* Re: [fuse] Speeding up readdir()
2018-12-06 20:00 [fuse] Speeding up readdir() Nikolaus Rath
@ 2018-12-07 9:21 ` Miklos Szeredi
2018-12-07 12:55 ` [fuse-devel] " Nikolaus Rath
2018-12-07 13:38 ` Nikolaus Rath
0 siblings, 2 replies; 9+ messages in thread
From: Miklos Szeredi @ 2018-12-07 9:21 UTC
To: linux-fsdevel, fuse-devel
On Thu, Dec 6, 2018 at 9:01 PM Nikolaus Rath <Nikolaus@rath.org> wrote:
>
> Hello,
>
> I am trying to improve the performance of readdir() requests. I have a
> client application that issues a lot of readdir() requests, and a FUSE
> filesystem that makes extensive use of kernel caching for inode entry
> attributes because retrieving the attributes from the backend is
> relatively expensive.
>
> Unfortunately, it seems to me that currently there is no way to avoid
> having to retrieve the attributes from the backend for every entry that
> is returned by readdir - on every call:
>
> If I am using readdirplus, I have to include the full attributes even if
> the kernel already has them cached.
>
> If I disable readdirplus, I can return just the entry name and its inode
> - but I believe because this doesn't result in a lookup count increase
> of the inode, the kernel can't match this with the existing cached data
> for the inode (is that correct?) and I'm getting a separate lookup()
> request for each entry that I've returned.
Was the entry timed out? If not, then there shouldn't've been a lookup.
Thanks,
Miklos
* Re: [fuse-devel] [fuse] Speeding up readdir()
2018-12-07 9:21 ` Miklos Szeredi
@ 2018-12-07 12:55 ` Nikolaus Rath
2018-12-07 13:04 ` Miklos Szeredi
2018-12-07 13:38 ` Nikolaus Rath
1 sibling, 1 reply; 9+ messages in thread
From: Nikolaus Rath @ 2018-12-07 12:55 UTC
To: Miklos Szeredi; +Cc: linux-fsdevel, fuse-devel
On Dec 07 2018, Miklos Szeredi <miklos@szeredi.hu> wrote:
> On Thu, Dec 6, 2018 at 9:01 PM Nikolaus Rath <Nikolaus@rath.org> wrote:
>>
>> Hello,
>>
>> I am trying to improve the performance of readdir() requests. I have a
>> client application that issues a lot of readdir() requests, and a FUSE
>> filesystem that makes extensive use of kernel caching for inode entry
>> attributes because retrieving the attributes from the backend is
>> relatively expensive.
>>
>> Unfortunately, it seems to me that currently there is no way to avoid
>> having to retrieve the attributes from the backend for every entry that
>> is returned by readdir - on every call:
>>
>> If I am using readdirplus, I have to include the full attributes even if
>> the kernel already has them cached.
>>
>> If I disable readdirplus, I can return just the entry name and its inode
>> - but I believe because this doesn't result in a lookup count increase
>> of the inode, the kernel can't match this with the existing cached data
>> for the inode (is that correct?) and I'm getting a separate lookup()
>> request for each entry that I've returned.
>
> Was the entry timed out? If not, then there shouldn't've been a
> lookup.
I am not 100% sure because of the atime invalidation issue. Apart from
that, it definitely was not timed out.
Are you saying that I should not be seeing lookup() requests after
(non-plus) readdir() if the dentry is already cached?
Best,
-Nikolaus
--
GPG Fingerprint: ED31 791B 2C5C 1613 AF38 8B8A D113 FCAC 3C4E 599F
»Time flies like an arrow, fruit flies like a Banana.«
* Re: [fuse-devel] [fuse] Speeding up readdir()
2018-12-07 12:55 ` [fuse-devel] " Nikolaus Rath
@ 2018-12-07 13:04 ` Miklos Szeredi
2018-12-07 13:13 ` Nikolaus Rath
2018-12-07 13:39 ` Nikolaus Rath
0 siblings, 2 replies; 9+ messages in thread
From: Miklos Szeredi @ 2018-12-07 13:04 UTC
To: linux-fsdevel, fuse-devel
On Fri, Dec 7, 2018 at 1:55 PM Nikolaus Rath <Nikolaus@rath.org> wrote:
> I am not 100% sure because of the atime invalidation issue. Apart from
> that, it definitely was not timed out.
The atime invalidation is different because that never results in
LOOKUP requests being generated.
> Are you saying that I should not be seeing lookup() requests after
> (non-plus) readdir() if the dentry is already cached?
If the dentry is already cached, and the timeout has not expired, then
you shouldn't see LOOKUP requests for that dentry.
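The distinction between entry validity and attribute validity can be illustrated with a small Python model. This is a deliberately simplified sketch, not kernel code: the class name, the two timeout parameters, and the `invalidate_attrs` helper are assumptions made for illustration. It only shows that an expired entry leads to LOOKUP, while an attribute invalidation on a still-valid entry leads to GETATTR.

```python
class KernelCacheModel:
    """Toy model of per-name entry and attribute validity."""

    def __init__(self, entry_timeout: float, attr_timeout: float):
        self.entry_timeout = entry_timeout
        self.attr_timeout = attr_timeout
        self.entry_valid_until = {}  # name -> expiry time of the cached entry
        self.attr_valid_until = {}   # name -> expiry time of the cached attributes
        self.requests = []           # requests that would reach the filesystem

    def stat(self, name: str, now: float) -> None:
        if self.entry_valid_until.get(name, float("-inf")) <= now:
            # Entry missing or timed out: LOOKUP, whose reply carries
            # fresh attributes as well.
            self.requests.append(("LOOKUP", name))
            self.entry_valid_until[name] = now + self.entry_timeout
            self.attr_valid_until[name] = now + self.attr_timeout
        elif self.attr_valid_until.get(name, float("-inf")) <= now:
            # Entry still valid, attributes stale: GETATTR only.
            self.requests.append(("GETATTR", name))
            self.attr_valid_until[name] = now + self.attr_timeout

    def invalidate_attrs(self, name: str) -> None:
        # Stands in for something like an atime invalidation: it touches
        # the attribute validity only, never the entry.
        self.attr_valid_until[name] = float("-inf")


model = KernelCacheModel(entry_timeout=3600, attr_timeout=3600)
model.stat("f", now=0)       # first access: LOOKUP
model.stat("f", now=1)       # everything cached: no request
model.invalidate_attrs("f")
model.stat("f", now=2)       # attributes stale: GETATTR, not LOOKUP
print(model.requests)
```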
Hmm, I see some strange entry invalidation calls in NFS export. Do
you know if there's NFS export or open_by_handle(2) calls?
Thanks,
Miklos
* Re: [fuse-devel] [fuse] Speeding up readdir()
2018-12-07 13:04 ` Miklos Szeredi
@ 2018-12-07 13:13 ` Nikolaus Rath
2018-12-07 13:18 ` Miklos Szeredi
2018-12-07 13:39 ` Nikolaus Rath
1 sibling, 1 reply; 9+ messages in thread
From: Nikolaus Rath @ 2018-12-07 13:13 UTC
To: Miklos Szeredi; +Cc: linux-fsdevel, fuse-devel
On Dec 07 2018, Miklos Szeredi <miklos@szeredi.hu> wrote:
> On Fri, Dec 7, 2018 at 1:55 PM Nikolaus Rath <Nikolaus@rath.org> wrote:
>
>> I am not 100% sure because of the atime invalidation issue. Apart from
>> that, it definitely was not timed out.
>
> The atime invalidation is different because that never results in
> LOOKUP requests being generated.
>
>> Are you saying that I should not be seeing lookup() requests after
>> (non-plus) readdir() if the dentry is already cached?
>
> If the dentry is already cached, and the timeout has not expired, then
> you shouldn't see LOOKUP requests for that dentry.
>
> Hmm, I see some strange entry invalidation calls in NFS export. Do
> you know if there's NFS export or open_by_handle(2) calls?
Definitely no NFS export. I can't say for sure about open_by_handle, I
do not have the source code of the client application.
Best,
-Nikolaus
--
GPG Fingerprint: ED31 791B 2C5C 1613 AF38 8B8A D113 FCAC 3C4E 599F
»Time flies like an arrow, fruit flies like a Banana.«
* Re: [fuse-devel] [fuse] Speeding up readdir()
2018-12-07 13:13 ` Nikolaus Rath
@ 2018-12-07 13:18 ` Miklos Szeredi
0 siblings, 0 replies; 9+ messages in thread
From: Miklos Szeredi @ 2018-12-07 13:18 UTC
To: linux-fsdevel, fuse-devel
On Fri, Dec 7, 2018 at 2:13 PM Nikolaus Rath <Nikolaus@rath.org> wrote:
>
> On Dec 07 2018, Miklos Szeredi <miklos@szeredi.hu> wrote:
> > On Fri, Dec 7, 2018 at 1:55 PM Nikolaus Rath <Nikolaus@rath.org> wrote:
> >
> >> I am not 100% sure because of the atime invalidation issue. Apart from
> >> that, it definitely was not timed out.
> >
> > The atime invalidation is different because that never results in
> > LOOKUP requests being generated.
> >
> >> Are you saying that I should not be seeing lookup() requests after
> >> (non-plus) readdir() if the dentry is already cached?
> >
> > If the dentry is already cached, and the timeout has not expired, then
> > you shouldn't see LOOKUP requests for that dentry.
> >
> > Hmm, I see some strange entry invalidation calls in NFS export. Do
> > you know if there's NFS export or open_by_handle(2) calls?
>
> Definitely no NFS export. I can't say for sure about open_by_handle, I
> do not have the source code of the client application.
Anyway, a debug log might give some ideas; could you please send one
showing the weird behavior?
Thanks,
Miklos
* Re: [fuse-devel] [fuse] Speeding up readdir()
2018-12-07 9:21 ` Miklos Szeredi
2018-12-07 12:55 ` [fuse-devel] " Nikolaus Rath
@ 2018-12-07 13:38 ` Nikolaus Rath
1 sibling, 0 replies; 9+ messages in thread
From: Nikolaus Rath @ 2018-12-07 13:38 UTC
To: Miklos Szeredi; +Cc: linux-fsdevel, fuse-devel
On Dec 07 2018, Miklos Szeredi <miklos@szeredi.hu> wrote:
> On Thu, Dec 6, 2018 at 9:01 PM Nikolaus Rath <Nikolaus@rath.org> wrote:
>>
>> Hello,
>>
>> I am trying to improve the performance of readdir() requests. I have a
>> client application that issues a lot of readdir() requests, and a FUSE
>> filesystem that makes extensive use of kernel caching for inode entry
>> attributes because retrieving the attributes from the backend is
>> relatively expensive.
>>
>> Unfortunately, it seems to me that currently there is no way to avoid
>> having to retrieve the attributes from the backend for every entry that
>> is returned by readdir - on every call:
>>
>> If I am using readdirplus, I have to include the full attributes even if
>> the kernel already has them cached.
>>
>> If I disable readdirplus, I can return just the entry name and its inode
>> - but I believe because this doesn't result in a lookup count increase
>> of the inode, the kernel can't match this with the existing cached data
>> for the inode (is that correct?) and I'm getting a separate lookup()
>> request for each entry that I've returned.
>
> Was the entry timed out? If not, then there shouldn't've been a lookup.
And indeed there is no lookup; what I am getting is additional getattr()
requests. Sorry for the confusion.
Here is some more concrete data:
When enabling readdirplus(), I have:
Operation counts:
lookup       1196   create            4   flush      1034
lookup_new      0   fsync             0   fsyncdir      0
getattr     12818   link              2   unlink        4
mknod           0   mkdir             0   forget        2
open         1001   opendir       19635   read       3999
readdir     19393   readdirplus   22654   readlink      0
release      1005   releasedir    19168   rename        2
rmdir           0   setattr           6   statfs      142
symlink         0   write             9   max_fd     1438
When disabling readdirplus, I get:
Operation counts:
lookup       1298   create            0   flush      1031
lookup_new      0   fsync             0   fsyncdir      0
getattr     22346   link              0   unlink        0
mknod           0   mkdir             0   forget        0
open         1002   opendir       18955   read       3916
readdir     36162   readdirplus       0   readlink      0
release      1002   releasedir    18378   rename        0
rmdir           0   setattr           6   statfs      142
symlink         0   write             5
i.e. ~10k additional getattr() requests. That is a lot less than the
number of extra readdir requests (~22k), especially considering that
each readdir() returns a number of entries. However, it is still more
than expected given that there are only 1298 distinct directory entries
that are all cached.
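The per-operation differences between the two runs can be computed directly from the counters reported above. A minimal Python sketch; the dictionaries copy only the directory-related counters from the two listings, and the variable names are illustrative:

```python
# Counters copied from the two "Operation counts" listings above.
with_readdirplus = {
    "lookup": 1196,
    "getattr": 12818,
    "opendir": 19635,
    "readdir": 19393,
    "readdirplus": 22654,
}
without_readdirplus = {
    "lookup": 1298,
    "getattr": 22346,
    "opendir": 18955,
    "readdir": 36162,
    "readdirplus": 0,
}

# Positive numbers mean more requests when readdirplus is disabled.
delta = {
    op: without_readdirplus[op] - with_readdirplus[op]
    for op in with_readdirplus
}

for op, diff in delta.items():
    print(f"{op:<12}{diff:>+8}")
```

This gives +9528 for getattr (the "~10k" above), +16769 for readdir, -22654 for readdirplus, and +102 for lookup.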
Nevertheless, the overall performance is unchanged (variation within
measurement noise of 5%). This is odd because my readdirplus()
implementation internally calls the getattr() handler so I would have
expected big savings from using plain readdir instead.
I tried to run the whole filesystem under oprofile, but found that
essentially all the time was spent outside the filesystem code:
$ opreport --symbols -g | head
Using /home/nikratio/readdirplus/oprofile_data/samples/ for samples directory.
CPU: AMD64 generic, speed 3500 MHz (estimated)
Counted CPU_CLK_UNHALTED events (CPU Clocks not Halted) with a unit mask of 0x00 (No unit mask) count 100000
samples  %        linenr info                conf image name   app name  symbol name
Any suggestions?
Best,
-Nikolaus
--
GPG Fingerprint: ED31 791B 2C5C 1613 AF38 8B8A D113 FCAC 3C4E 599F
»Time flies like an arrow, fruit flies like a Banana.«
* Re: [fuse-devel] [fuse] Speeding up readdir()
2018-12-07 13:04 ` Miklos Szeredi
2018-12-07 13:13 ` Nikolaus Rath
@ 2018-12-07 13:39 ` Nikolaus Rath
2018-12-10 9:31 ` Miklos Szeredi
1 sibling, 1 reply; 9+ messages in thread
From: Nikolaus Rath @ 2018-12-07 13:39 UTC
To: Miklos Szeredi; +Cc: linux-fsdevel, fuse-devel
On Dec 07 2018, Miklos Szeredi <miklos@szeredi.hu> wrote:
> On Fri, Dec 7, 2018 at 1:55 PM Nikolaus Rath <Nikolaus@rath.org> wrote:
>
>> I am not 100% sure because of the atime invalidation issue. Apart from
>> that, it definitely was not timed out.
>
> The atime invalidation is different because that never results in
> LOOKUP requests being generated.
I was actually confusing LOOKUP requests and GETATTR requests. I assume
the GETATTR requests would be explained by atime invalidation?
Best,
-Nikolaus
--
GPG Fingerprint: ED31 791B 2C5C 1613 AF38 8B8A D113 FCAC 3C4E 599F
»Time flies like an arrow, fruit flies like a Banana.«
* Re: [fuse-devel] [fuse] Speeding up readdir()
2018-12-07 13:39 ` Nikolaus Rath
@ 2018-12-10 9:31 ` Miklos Szeredi
0 siblings, 0 replies; 9+ messages in thread
From: Miklos Szeredi @ 2018-12-10 9:31 UTC
To: linux-fsdevel, fuse-devel
On Fri, Dec 7, 2018 at 2:39 PM Nikolaus Rath <Nikolaus@rath.org> wrote:
>
> On Dec 07 2018, Miklos Szeredi <miklos@szeredi.hu> wrote:
> > On Fri, Dec 7, 2018 at 1:55 PM Nikolaus Rath <Nikolaus@rath.org> wrote:
> >
> >> I am not 100% sure because of the atime invalidation issue. Apart from
> >> that, it definitely was not timed out.
> >
> > The atime invalidation is different because that never results in
> > LOOKUP requests being generated.
>
> I was actually confusing LOOKUP requests and GETATTR requests. I assume
> the GETATTR requests would be explained by atime invalidation?
Probably yes.
Thanks,
Miklos
end of thread, other threads:[~2018-12-10 9:31 UTC | newest]
Thread overview: 9+ messages
2018-12-06 20:00 [fuse] Speeding up readdir() Nikolaus Rath
2018-12-07  9:21 ` Miklos Szeredi
2018-12-07 12:55   ` [fuse-devel] " Nikolaus Rath
2018-12-07 13:04     ` Miklos Szeredi
2018-12-07 13:13       ` Nikolaus Rath
2018-12-07 13:18         ` Miklos Szeredi
2018-12-07 13:39       ` Nikolaus Rath
2018-12-10  9:31         ` Miklos Szeredi
2018-12-07 13:38   ` Nikolaus Rath