* [fuse] Speeding up readdir()
@ 2018-12-06 20:00 Nikolaus Rath
  2018-12-07  9:21 ` Miklos Szeredi
  0 siblings, 1 reply; 9+ messages in thread
From: Nikolaus Rath @ 2018-12-06 20:00 UTC
  To: Miklos Szeredi, linux-fsdevel, fuse-devel

Hello,

I am trying to improve the performance of readdir() requests. I have a
client application that issues a lot of readdir() requests, and a FUSE
filesystem that makes extensive use of kernel caching for inode entry
attributes because retrieving the attributes from the backend is
relatively expensive.

Unfortunately, it seems to me that currently there is no way to avoid
having to retrieve the attributes from the backend for every entry that
is returned by readdir - on every call:

If I am using readdirplus, I have to include the full attributes even if
the kernel already has them cached.

If I disable readdirplus, I can return just the entry name and its inode
- but I believe because this doesn't result in a lookup count increase
of the inode, the kernel can't match this with the existing cached data
for the inode (is that correct?) and I'm getting a separate lookup()
request for each entry that I've returned.


I could implement a readdir cache in the filesystem, but that means I
have to take care of cache invalidation and I'm basically wasting
memory.

Is there a reason why readdirplus() couldn't return just the name and
inode, together with a special flag that tells the kernel to "just use
the attributes that are already cached"?


Best,
-Nikolaus

-- 
GPG Fingerprint: ED31 791B 2C5C 1613 AF38 8B8A D113 FCAC 3C4E 599F

             »Time flies like an arrow, fruit flies like a Banana.«


* Re: [fuse] Speeding up readdir()
  2018-12-06 20:00 [fuse] Speeding up readdir() Nikolaus Rath
@ 2018-12-07  9:21 ` Miklos Szeredi
  2018-12-07 12:55   ` [fuse-devel] " Nikolaus Rath
  2018-12-07 13:38   ` Nikolaus Rath
  0 siblings, 2 replies; 9+ messages in thread
From: Miklos Szeredi @ 2018-12-07  9:21 UTC
  To: linux-fsdevel, fuse-devel

On Thu, Dec 6, 2018 at 9:01 PM Nikolaus Rath <Nikolaus@rath.org> wrote:
>
> Hello,
>
> I am trying to improve the performance of readdir() requests. I have a
> client application that issues a lot of readdir() requests, and a FUSE
> filesystem that makes extensive use of kernel caching for inode entry
> attributes because retrieving the attributes from the backend is
> relatively expensive.
>
> Unfortunately, it seems to me that currently there is no way to avoid
> having to retrieve the attributes from the backend for every entry that
> is returned by readdir - on every call:
>
> If I am using readdirplus, I have to include the full attributes even if
> the kernel already has them cached.
>
> If I disable readdirplus, I can return just the entry name and its inode
> - but I believe because this doesn't result in a lookup count increase
> of the inode, the kernel can't match this with the existing cached data
> for the inode (is that correct?) and I'm getting a separate lookup()
> request for each entry that I've returned.

Was the entry timed out?  If not, then there shouldn't've been a lookup.

Thanks,
Miklos


* Re: [fuse-devel] [fuse] Speeding up readdir()
  2018-12-07  9:21 ` Miklos Szeredi
@ 2018-12-07 12:55   ` Nikolaus Rath
  2018-12-07 13:04     ` Miklos Szeredi
  2018-12-07 13:38   ` Nikolaus Rath
  1 sibling, 1 reply; 9+ messages in thread
From: Nikolaus Rath @ 2018-12-07 12:55 UTC
  To: Miklos Szeredi; +Cc: linux-fsdevel, fuse-devel

On Dec 07 2018, Miklos Szeredi <miklos@szeredi.hu> wrote:
> On Thu, Dec 6, 2018 at 9:01 PM Nikolaus Rath <Nikolaus@rath.org> wrote:
>>
>> Hello,
>>
>> I am trying to improve the performance of readdir() requests. I have a
>> client application that issues a lot of readdir() requests, and a FUSE
>> filesystem that makes extensive use of kernel caching for inode entry
>> attributes because retrieving the attributes from the backend is
>> relatively expensive.
>>
>> Unfortunately, it seems to me that currently there is no way to avoid
>> having to retrieve the attributes from the backend for every entry that
>> is returned by readdir - on every call:
>>
>> If I am using readdirplus, I have to include the full attributes even if
>> the kernel already has them cached.
>>
>> If I disable readdirplus, I can return just the entry name and its inode
>> - but I believe because this doesn't result in a lookup count increase
>> of the inode, the kernel can't match this with the existing cached data
>> for the inode (is that correct?) and I'm getting a separate lookup()
>> request for each entry that I've returned.
>
> Was the entry timed out?  If not, then there shouldn't've been a
> lookup.

I am not 100% sure because of the atime invalidation issue. Apart from
that, it definitely was not timed out.

Are you saying that I should not be seeing lookup() requests after
(non-plus) readdir() if the dentry is already cached?

Best,
-Nikolaus



* Re: [fuse-devel] [fuse] Speeding up readdir()
  2018-12-07 12:55   ` [fuse-devel] " Nikolaus Rath
@ 2018-12-07 13:04     ` Miklos Szeredi
  2018-12-07 13:13       ` Nikolaus Rath
  2018-12-07 13:39       ` Nikolaus Rath
  0 siblings, 2 replies; 9+ messages in thread
From: Miklos Szeredi @ 2018-12-07 13:04 UTC
  To: linux-fsdevel, fuse-devel

On Fri, Dec 7, 2018 at 1:55 PM Nikolaus Rath <Nikolaus@rath.org> wrote:

> I am not 100% sure because of the atime invalidation issue. Apart from
> that, it definitely was not timed out.

The atime invalidation is different because that never results in
LOOKUP requests being generated.

> Are you saying that I should not be seeing lookup() requests after
> (non-plus) readdir() if the dentry is already cached?

If the dentry is already cached, and the timeout has not expired, then
you shouldn't see LOOKUP requests for that dentry.
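
The rule above can be illustrated with a toy model. This is only a sketch
under stated assumptions, not the kernel's actual implementation: the class
name ToyDentryCache, its methods and the fake clock are all hypothetical.
The only detail taken from FUSE itself is that a LOOKUP reply carries an
entry timeout (entry_timeout in libfuse's struct fuse_entry_param). Within
that timeout the cached entry is reused; after it expires, the next
resolution sends a new LOOKUP.

```python
import time


class ToyDentryCache:
    """Toy model of a name cache with per-entry timeouts (not kernel code)."""

    def __init__(self, send_lookup, clock=time.monotonic):
        self._send_lookup = send_lookup  # hypothetical stand-in for a LOOKUP request
        self._clock = clock
        self._entries = {}               # name -> (inode, expiry time)
        self.lookups_sent = 0

    def resolve(self, name):
        now = self._clock()
        cached = self._entries.get(name)
        if cached is not None and now < cached[1]:
            return cached[0]             # entry still valid: no LOOKUP is sent
        self.lookups_sent += 1
        inode, entry_timeout = self._send_lookup(name)
        self._entries[name] = (inode, now + entry_timeout)
        return inode


# A fake clock keeps the example deterministic.
fake_now = [0.0]
cache = ToyDentryCache(lambda name: (42, 10.0), clock=lambda: fake_now[0])
cache.resolve("file")
cache.resolve("file")                    # served from the cache
lookups_before_expiry = cache.lookups_sent
fake_now[0] = 11.0                       # move past the 10 second timeout
cache.resolve("file")                    # timeout expired: a new LOOKUP is sent
lookups_after_expiry = cache.lookups_sent
```

The model deliberately ignores attribute caching, which has its own timeout
and is what governs GETATTR rather than LOOKUP requests.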

Hmm, I see some strange entry invalidation calls in NFS export.  Do
you know if there's NFS export or open_by_handle(2) calls?

Thanks,
Miklos


* Re: [fuse-devel] [fuse] Speeding up readdir()
  2018-12-07 13:04     ` Miklos Szeredi
@ 2018-12-07 13:13       ` Nikolaus Rath
  2018-12-07 13:18         ` Miklos Szeredi
  2018-12-07 13:39       ` Nikolaus Rath
  1 sibling, 1 reply; 9+ messages in thread
From: Nikolaus Rath @ 2018-12-07 13:13 UTC
  To: Miklos Szeredi; +Cc: linux-fsdevel, fuse-devel

On Dec 07 2018, Miklos Szeredi <miklos@szeredi.hu> wrote:
> On Fri, Dec 7, 2018 at 1:55 PM Nikolaus Rath <Nikolaus@rath.org> wrote:
>
>> I am not 100% sure because of the atime invalidation issue. Apart from
>> that, it definitely was not timed out.
>
> The atime invalidation is different because that never results in
> LOOKUP requests being generated.
>
>> Are you saying that I should not be seeing lookup() requests after
>> (non-plus) readdir() if the dentry is already cached?
>
> If the dentry is already cached, and the timeout has not expired, then
> you shouldn't see LOOKUP requests for that dentry.
>
> Hmm, I see some strange entry invalidation calls in NFS export.  Do
> you know if there's NFS export or open_by_handle(2) calls?

Definitely no NFS export. I can't say for sure about open_by_handle, I
do not have the source code of the client application.


Best,
-Nikolaus



* Re: [fuse-devel] [fuse] Speeding up readdir()
  2018-12-07 13:13       ` Nikolaus Rath
@ 2018-12-07 13:18         ` Miklos Szeredi
  0 siblings, 0 replies; 9+ messages in thread
From: Miklos Szeredi @ 2018-12-07 13:18 UTC
  To: linux-fsdevel, fuse-devel

On Fri, Dec 7, 2018 at 2:13 PM Nikolaus Rath <Nikolaus@rath.org> wrote:
>
> On Dec 07 2018, Miklos Szeredi <miklos@szeredi.hu> wrote:
> > On Fri, Dec 7, 2018 at 1:55 PM Nikolaus Rath <Nikolaus@rath.org> wrote:
> >
> >> I am not 100% sure because of the atime invalidation issue. Apart from
> >> that, it definitely was not timed out.
> >
> > The atime invalidation is different because that never results in
> > LOOKUP requests being generated.
> >
> >> Are you saying that I should not be seeing lookup() requests after
> >> (non-plus) readdir() if the dentry is already cached?
> >
> > If the dentry is already cached, and the timeout has not expired, then
> > you shouldn't see LOOKUP requests for that dentry.
> >
> > Hmm, I see some strange entry invalidation calls in NFS export.  Do
> > you know if there's NFS export or open_by_handle(2) calls?
>
> Definitely no NFS export. I can't say for sure about open_by_handle, I
> do not have the source code of the client application.

Anyway, a debug log might give some ideas; could you please send one
with the weird behavior?

Thanks,
Miklos


* Re: [fuse-devel] [fuse] Speeding up readdir()
  2018-12-07  9:21 ` Miklos Szeredi
  2018-12-07 12:55   ` [fuse-devel] " Nikolaus Rath
@ 2018-12-07 13:38   ` Nikolaus Rath
  1 sibling, 0 replies; 9+ messages in thread
From: Nikolaus Rath @ 2018-12-07 13:38 UTC
  To: Miklos Szeredi; +Cc: linux-fsdevel, fuse-devel

On Dec 07 2018, Miklos Szeredi <miklos@szeredi.hu> wrote:
> On Thu, Dec 6, 2018 at 9:01 PM Nikolaus Rath <Nikolaus@rath.org> wrote:
>>
>> Hello,
>>
>> I am trying to improve the performance of readdir() requests. I have a
>> client application that issues a lot of readdir() requests, and a FUSE
>> filesystem that makes extensive use of kernel caching for inode entry
>> attributes because retrieving the attributes from the backend is
>> relatively expensive.
>>
>> Unfortunately, it seems to me that currently there is no way to avoid
>> having to retrieve the attributes from the backend for every entry that
>> is returned by readdir - on every call:
>>
>> If I am using readdirplus, I have to include the full attributes even if
>> the kernel already has them cached.
>>
>> If I disable readdirplus, I can return just the entry name and its inode
>> - but I believe because this doesn't result in a lookup count increase
>> of the inode, the kernel can't match this with the existing cached data
>> for the inode (is that correct?) and I'm getting a separate lookup()
>> request for each entry that I've returned.
>
> Was the entry timed out?  If not, then there shouldn't've been a lookup.

And there is no lookup indeed, what I am getting is additional getattr()
requests. Sorry for the confusion.

Here is some more concrete data:

When enabling readdirplus(), I have:

Operation counts:
  lookup       1196   create           4   flush      1034
  lookup_new      0   fsync            0   fsyncdir      0
  getattr     12818   link             2   unlink        4
  mknod           0   mkdir            0   forget        2
  open         1001   opendir      19635   read       3999
  readdir     19393   readdirplus  22654   readlink      0
  release      1005   releasedir   19168   rename        2
  rmdir           0   setattr          6   statfs      142
  symlink         0   write            9   max_fd     1438

When disabling readdirplus, I get:

Operation counts:
  lookup       1298   create           0   flush      1031
  lookup_new      0   fsync            0   fsyncdir      0
  getattr     22346   link             0   unlink        0
  mknod           0   mkdir            0   forget        0
  open         1002   opendir      18955   read       3916
  readdir     36162   readdirplus      0   readlink      0
  release      1002   releasedir   18378   rename        0
  rmdir           0   setattr          6   statfs      142
  symlink         0   write            5

i.e. ~10k additional getattr() requests. That is a lot less than the
number of extra readdir requests (~22k), especially considering that
each readdir() returns a number of entries. However, it is still more
than expected given that there are only 1298 distinct directory entries
that are all cached.
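
For reference, the differences between the two runs can be recomputed from
the operation counts. The snippet below is just that arithmetic; the
variable names are made up for illustration, and using the lookup count of
the second run as the number of distinct entries follows the figure of 1298
given in the text.

```python
# Operation counts copied from the two runs (only the relevant operations).
with_plus = {"lookup": 1196, "getattr": 12818, "readdir": 19393, "readdirplus": 22654}
without_plus = {"lookup": 1298, "getattr": 22346, "readdir": 36162, "readdirplus": 0}


def delta(op):
    """Change in request count when readdirplus is disabled."""
    return without_plus[op] - with_plus[op]


extra_getattr = delta("getattr")             # 22346 - 12818
extra_readdir = delta("readdir")             # 36162 - 19393
dropped_readdirplus = -delta("readdirplus")  # readdirplus requests that went away

# Average number of getattr() requests per distinct directory entry
# in the run without readdirplus.
getattr_per_entry = without_plus["getattr"] / without_plus["lookup"]

print(extra_getattr, extra_readdir, dropped_readdirplus, round(getattr_per_entry, 1))
```

The extra getattr() requests come to 9528, i.e. the "~10k" above, and the
run without readdirplus averages roughly 17 getattr() requests per distinct
entry.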

Nevertheless, the overall performance is unchanged (variation within
measurement noise of 5%). This is odd because my readdirplus()
implementation internally calls the getattr() handler so I would have
expected big savings from using plain readdir instead.

I tried to run the whole filesystem under oprofile, but found that
essentially all the time was spent outside the filesystem code:

$ opreport --symbols -g | head
Using /home/nikratio/readdirplus/oprofile_data/samples/ for samples directory.
CPU: AMD64 generic, speed 3500 MHz (estimated)
Counted CPU_CLK_UNHALTED events (CPU Clocks not Halted) with a unit mask of 0x00 (No unit mask) count 100000
samples  %        linenr info                 image name               app name                 symbol name
110313   88.7117  (no location information)   no-vmlinux               testfs                  /no-vmlinux
2341      1.8826  vfscanf.c:277               libc-2.28.so             testfs                  _IO_vfscanf
1183      0.9513  memcpy-ssse3.S:67           libc-2.28.so             testfs                  __memcpy_ssse3
803       0.6458  cacheinfo.c:259             libc-2.28.so             testfs                  handle_intel.constprop.1
784       0.6305  vfscanf.c:277               libc-2.28.so             testfs                  _IO_vfwscanf


Any suggestions?

Best,
-Nikolaus



* Re: [fuse-devel] [fuse] Speeding up readdir()
  2018-12-07 13:04     ` Miklos Szeredi
  2018-12-07 13:13       ` Nikolaus Rath
@ 2018-12-07 13:39       ` Nikolaus Rath
  2018-12-10  9:31         ` Miklos Szeredi
  1 sibling, 1 reply; 9+ messages in thread
From: Nikolaus Rath @ 2018-12-07 13:39 UTC
  To: Miklos Szeredi; +Cc: linux-fsdevel, fuse-devel

On Dec 07 2018, Miklos Szeredi <miklos@szeredi.hu> wrote:
> On Fri, Dec 7, 2018 at 1:55 PM Nikolaus Rath <Nikolaus@rath.org> wrote:
>
>> I am not 100% sure because of the atime invalidation issue. Apart from
>> that, it definitely was not timed out.
>
> The atime invalidation is different because that never results in
> LOOKUP requests being generated.

I was actually confusing LOOKUP requests and GETATTR requests. I assume
the GETATTR requests would be explained by atime invalidation?


Best,
-Nikolaus



* Re: [fuse-devel] [fuse] Speeding up readdir()
  2018-12-07 13:39       ` Nikolaus Rath
@ 2018-12-10  9:31         ` Miklos Szeredi
  0 siblings, 0 replies; 9+ messages in thread
From: Miklos Szeredi @ 2018-12-10  9:31 UTC
  To: linux-fsdevel, fuse-devel

On Fri, Dec 7, 2018 at 2:39 PM Nikolaus Rath <Nikolaus@rath.org> wrote:
>
> On Dec 07 2018, Miklos Szeredi <miklos@szeredi.hu> wrote:
> > On Fri, Dec 7, 2018 at 1:55 PM Nikolaus Rath <Nikolaus@rath.org> wrote:
> >
> >> I am not 100% sure because of the atime invalidation issue. Apart from
> >> that, it definitely was not timed out.
> >
> > The atime invalidation is different because that never results in
> > LOOKUP requests being generated.
>
> I was actually confusing LOOKUP requests and GETATTR requests. I assume
> the GETATTR requests would be explained by atime invalidation?

Probably yes.

Thanks,
Miklos

