large directory iteration (getdents) over NFS mount resets due to stat

linux-nfs.vger.kernel.org archive mirror
 help / color / mirror / Atom feed

* large directory iteration (getdents) over NFS mount resets due to stat
@ 2019-07-15  5:56 Noam Lewis
  2019-07-17  6:44 ` Noam Lewis
  0 siblings, 1 reply; 7+ messages in thread
From: Noam Lewis @ 2019-07-15  5:56 UTC (permalink / raw)
  To: linux-nfs

I've encountered a problem while iterating large directories via an NFS mount.

Scenario:

1. Linux NFS client iterates a directory with many (millions) of
files, e.g. via getdents() until all entries are done. In my case,
READDIRPLUS is being used under the hood. Trivial reproduction is to
run: ls -la
2. At the same time, run the stat tool on a file inside that directory.

The directory on the server is not being modified anywhere (on this
client or any other client).

Result: the next or ongoing getdents will get stuck for a long time
(tens of seconds to minutes). It appears to be re-iterating some of
the work it already did, by going back to a previous NFS READDIRPLUS
cookie.

Things I've tried as workarounds:
- Mounting with nordirplus - the iteration doesn't seem to reset or at
least getdents doesn't get stuck, but now I have tons of LOOKUPs, as
expected.
- Setting actimeo=(large number) doesn't affect the behavior

Questions:
1. Why does the stat command cause this?
2. How can I avoid the reset, i.e. ensure forward progress of the dir iteration?

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: large directory iteration (getdents) over NFS mount resets due to stat
  2019-07-15  5:56 large directory iteration (getdents) over NFS mount resets due to stat Noam Lewis
@ 2019-07-17  6:44 ` Noam Lewis
  2019-07-17 12:08   ` Benjamin Coddington
  0 siblings, 1 reply; 7+ messages in thread
From: Noam Lewis @ 2019-07-17  6:44 UTC (permalink / raw)
  To: linux-nfs

I'm starting to think this is a bug. I can't see a good reason why
accessing (stat) a directory entry should cause the READDIRPLUS cookie
to be reset.

It seems that the trigger for iteration reset is accessing a directory
entry that doesn't have a valid entry in the cache. If it does has a
valid cache entry it doesn't trigger the cookie reset. Note, it
doesn't matter if the entry (or traversed dir) has actually changed:
the reset occurs even if both did not change on the server side.

Setting actimeo to a large enough value allows the dir iteration to
complete without any resets, but this is just a workaround that isn't
acceptable if the file system is being modified or if there isn't
enough memory. It's also heuristic and can lead to unexpected hiccups
if something in the environment changes.

So my questions still stand: is this expected behavior? What's the reason?

P.S. I'm using NFSv3

On Mon, Jul 15, 2019, 08:56 Noam Lewis <noam.lewis@elastifile.com> wrote:
>
> I've encountered a problem while iterating large directories via an NFS mount.
>
> Scenario:
>
> 1. Linux NFS client iterates a directory with many (millions) of
> files, e.g. via getdents() until all entries are done. In my case,
> READDIRPLUS is being used under the hood. Trivial reproduction is to
> run: ls -la
> 2. At the same time, run the stat tool on a file inside that directory.
>
> The directory on the server is not being modified anywhere (on this
> client or any other client).
>
> Result: the next or ongoing getdents will get stuck for a long time
> (tens of seconds to minutes). It appears to be re-iterating some of
> the work it already did, by going back to a previous NFS READDIRPLUS
> cookie.
>
>
> Things I've tried as workarounds:
> - Mounting with nordirplus - the iteration doesn't seem to reset or at
> least getdents doesn't get stuck, but now I have tons of LOOKUPs, as
> expected.
> - Setting actimeo=(large number) doesn't affect the behavior
>
> Questions:
> 1. Why does the stat command cause this?
> 2. How can I avoid the reset, i.e. ensure forward progress of the dir iteration?

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: large directory iteration (getdents) over NFS mount resets due to stat
  2019-07-17  6:44 ` Noam Lewis
@ 2019-07-17 12:08   ` Benjamin Coddington
  2019-07-18  7:26     ` Noam Lewis
  0 siblings, 1 reply; 7+ messages in thread
From: Benjamin Coddington @ 2019-07-17 12:08 UTC (permalink / raw)
  To: Noam Lewis; +Cc: linux-nfs

Maybe because we always drop the directory's page cache whenever we 
decide
to switch to READDIRPLUS, even if we're already doing it..  I've not 
tested
or checked very thoroughly if this is ok, but I think maybe want 
something
like this:

diff --git a/fs/nfs/dir.c b/fs/nfs/dir.c
index 57b6a45576ad..acfe47668238 100644
--- a/fs/nfs/dir.c
+++ b/fs/nfs/dir.c
@@ -438,8 +438,8 @@ void nfs_force_use_readdirplus(struct inode *dir)
         struct nfs_inode *nfsi = NFS_I(dir);

         if (nfs_server_capable(dir, NFS_CAP_READDIRPLUS) &&
-           !list_empty(&nfsi->open_files)) {
-               set_bit(NFS_INO_ADVISE_RDPLUS, &nfsi->flags);
+           !list_empty(&nfsi->open_files) &&
+               !test_and_set_bit(NFS_INO_ADVISE_RDPLUS, &nfsi->flags)) 
{
                 invalidate_mapping_pages(dir->i_mapping, 0, -1);
         }
  }

Want to give that a spin?

Ben

On 17 Jul 2019, at 2:44, Noam Lewis wrote:

> I'm starting to think this is a bug. I can't see a good reason why
> accessing (stat) a directory entry should cause the READDIRPLUS cookie
> to be reset.
>
> It seems that the trigger for iteration reset is accessing a directory
> entry that doesn't have a valid entry in the cache. If it does has a
> valid cache entry it doesn't trigger the cookie reset. Note, it
> doesn't matter if the entry (or traversed dir) has actually changed:
> the reset occurs even if both did not change on the server side.
>
> Setting actimeo to a large enough value allows the dir iteration to
> complete without any resets, but this is just a workaround that isn't
> acceptable if the file system is being modified or if there isn't
> enough memory. It's also heuristic and can lead to unexpected hiccups
> if something in the environment changes.
>
> So my questions still stand: is this expected behavior? What's the 
> reason?
>
> P.S. I'm using NFSv3
>
> On Mon, Jul 15, 2019, 08:56 Noam Lewis <noam.lewis@elastifile.com> 
> wrote:
>>
>> I've encountered a problem while iterating large directories via an 
>> NFS mount.
>>
>> Scenario:
>>
>> 1. Linux NFS client iterates a directory with many (millions) of
>> files, e.g. via getdents() until all entries are done. In my case,
>> READDIRPLUS is being used under the hood. Trivial reproduction is to
>> run: ls -la
>> 2. At the same time, run the stat tool on a file inside that 
>> directory.
>>
>> The directory on the server is not being modified anywhere (on this
>> client or any other client).
>>
>> Result: the next or ongoing getdents will get stuck for a long time
>> (tens of seconds to minutes). It appears to be re-iterating some of
>> the work it already did, by going back to a previous NFS READDIRPLUS
>> cookie.
>>
>>
>> Things I've tried as workarounds:
>> - Mounting with nordirplus - the iteration doesn't seem to reset or 
>> at
>> least getdents doesn't get stuck, but now I have tons of LOOKUPs, as
>> expected.
>> - Setting actimeo=(large number) doesn't affect the behavior
>>
>> Questions:
>> 1. Why does the stat command cause this?
>> 2. How can I avoid the reset, i.e. ensure forward progress of the dir 
>> iteration?

^ permalink raw reply related	[flat|nested] 7+ messages in thread

* Re: large directory iteration (getdents) over NFS mount resets due to stat
  2019-07-17 12:08   ` Benjamin Coddington
@ 2019-07-18  7:26     ` Noam Lewis
  2019-07-18 13:55       ` Benjamin Coddington
  0 siblings, 1 reply; 7+ messages in thread
From: Noam Lewis @ 2019-07-18  7:26 UTC (permalink / raw)
  To: Benjamin Coddington; +Cc: linux-nfs

The patch seems to work nicely. getdents() doesn't get stuck /
iteration does not restart when accessing a non-cached file.

Thanks!

Can we agree this is the more correct behavior?
What's the next step?

On Wed, Jul 17, 2019 at 3:08 PM Benjamin Coddington <bcodding@redhat.com> wrote:
>
> Maybe because we always drop the directory's page cache whenever we
> decide
> to switch to READDIRPLUS, even if we're already doing it..  I've not
> tested
> or checked very thoroughly if this is ok, but I think maybe want
> something
> like this:
>
> diff --git a/fs/nfs/dir.c b/fs/nfs/dir.c
> index 57b6a45576ad..acfe47668238 100644
> --- a/fs/nfs/dir.c
> +++ b/fs/nfs/dir.c
> @@ -438,8 +438,8 @@ void nfs_force_use_readdirplus(struct inode *dir)
>          struct nfs_inode *nfsi = NFS_I(dir);
>
>          if (nfs_server_capable(dir, NFS_CAP_READDIRPLUS) &&
> -           !list_empty(&nfsi->open_files)) {
> -               set_bit(NFS_INO_ADVISE_RDPLUS, &nfsi->flags);
> +           !list_empty(&nfsi->open_files) &&
> +               !test_and_set_bit(NFS_INO_ADVISE_RDPLUS, &nfsi->flags))
> {
>                  invalidate_mapping_pages(dir->i_mapping, 0, -1);
>          }
>   }
>
> Want to give that a spin?
>
> Ben
>
> On 17 Jul 2019, at 2:44, Noam Lewis wrote:
>
> > I'm starting to think this is a bug. I can't see a good reason why
> > accessing (stat) a directory entry should cause the READDIRPLUS cookie
> > to be reset.
> >
> > It seems that the trigger for iteration reset is accessing a directory
> > entry that doesn't have a valid entry in the cache. If it does has a
> > valid cache entry it doesn't trigger the cookie reset. Note, it
> > doesn't matter if the entry (or traversed dir) has actually changed:
> > the reset occurs even if both did not change on the server side.
> >
> > Setting actimeo to a large enough value allows the dir iteration to
> > complete without any resets, but this is just a workaround that isn't
> > acceptable if the file system is being modified or if there isn't
> > enough memory. It's also heuristic and can lead to unexpected hiccups
> > if something in the environment changes.
> >
> > So my questions still stand: is this expected behavior? What's the
> > reason?
> >
> > P.S. I'm using NFSv3
> >
> > On Mon, Jul 15, 2019, 08:56 Noam Lewis <noam.lewis@elastifile.com>
> > wrote:
> >>
> >> I've encountered a problem while iterating large directories via an
> >> NFS mount.
> >>
> >> Scenario:
> >>
> >> 1. Linux NFS client iterates a directory with many (millions) of
> >> files, e.g. via getdents() until all entries are done. In my case,
> >> READDIRPLUS is being used under the hood. Trivial reproduction is to
> >> run: ls -la
> >> 2. At the same time, run the stat tool on a file inside that
> >> directory.
> >>
> >> The directory on the server is not being modified anywhere (on this
> >> client or any other client).
> >>
> >> Result: the next or ongoing getdents will get stuck for a long time
> >> (tens of seconds to minutes). It appears to be re-iterating some of
> >> the work it already did, by going back to a previous NFS READDIRPLUS
> >> cookie.
> >>
> >>
> >> Things I've tried as workarounds:
> >> - Mounting with nordirplus - the iteration doesn't seem to reset or
> >> at
> >> least getdents doesn't get stuck, but now I have tons of LOOKUPs, as
> >> expected.
> >> - Setting actimeo=(large number) doesn't affect the behavior
> >>
> >> Questions:
> >> 1. Why does the stat command cause this?
> >> 2. How can I avoid the reset, i.e. ensure forward progress of the dir
> >> iteration?

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: large directory iteration (getdents) over NFS mount resets due to stat
  2019-07-18  7:26     ` Noam Lewis
@ 2019-07-18 13:55       ` Benjamin Coddington
  2019-07-18 14:13         ` Noam Lewis
  0 siblings, 1 reply; 7+ messages in thread
From: Benjamin Coddington @ 2019-07-18 13:55 UTC (permalink / raw)
  To: Noam Lewis; +Cc: linux-nfs

The one thing we'd need to be careful about here is if there are 
existing
cached dentries without the attributes we want, then we run through
nfs_advise_use_readdirplus(), setting the flag, then the
nfs_force_readdirplus() fails to drop all the existing dentries..

That looks like a `ls <dir>`, then a stat of one of those
cached dentries, then an `ls -la` would look like a flood of GETATTR for
every entry instead of a series of READDIRPLUS.

Any chance you still have that test system?  What happens if you do:

ls <big dir>
stat <big dir>/file
ls -la <big dir>

I'm worried that will leave you stuck in the GETATTR flood instead of 
using
the more optimal READDIRPLUS.

If I'm going to send this patch, I should also test it, so I suppose I 
might
just build and test this myself..

Ben

On 18 Jul 2019, at 3:26, Noam Lewis wrote:

> The patch seems to work nicely. getdents() doesn't get stuck /
> iteration does not restart when accessing a non-cached file.
>
> Thanks!
>
> Can we agree this is the more correct behavior?
> What's the next step?
>
> On Wed, Jul 17, 2019 at 3:08 PM Benjamin Coddington 
> <bcodding@redhat.com> wrote:
>>
>> Maybe because we always drop the directory's page cache whenever we
>> decide
>> to switch to READDIRPLUS, even if we're already doing it..  I've not
>> tested
>> or checked very thoroughly if this is ok, but I think maybe want
>> something
>> like this:
>>
>> diff --git a/fs/nfs/dir.c b/fs/nfs/dir.c
>> index 57b6a45576ad..acfe47668238 100644
>> --- a/fs/nfs/dir.c
>> +++ b/fs/nfs/dir.c
>> @@ -438,8 +438,8 @@ void nfs_force_use_readdirplus(struct inode *dir)
>>          struct nfs_inode *nfsi = NFS_I(dir);
>>
>>          if (nfs_server_capable(dir, NFS_CAP_READDIRPLUS) &&
>> -           !list_empty(&nfsi->open_files)) {
>> -               set_bit(NFS_INO_ADVISE_RDPLUS, &nfsi->flags);
>> +           !list_empty(&nfsi->open_files) &&
>> +               !test_and_set_bit(NFS_INO_ADVISE_RDPLUS, 
>> &nfsi->flags))
>> {
>>                  invalidate_mapping_pages(dir->i_mapping, 0, -1);
>>          }
>>   }
>>
>> Want to give that a spin?
>>
>> Ben
>>
>> On 17 Jul 2019, at 2:44, Noam Lewis wrote:
>>
>>> I'm starting to think this is a bug. I can't see a good reason why
>>> accessing (stat) a directory entry should cause the READDIRPLUS 
>>> cookie
>>> to be reset.
>>>
>>> It seems that the trigger for iteration reset is accessing a 
>>> directory
>>> entry that doesn't have a valid entry in the cache. If it does has a
>>> valid cache entry it doesn't trigger the cookie reset. Note, it
>>> doesn't matter if the entry (or traversed dir) has actually changed:
>>> the reset occurs even if both did not change on the server side.
>>>
>>> Setting actimeo to a large enough value allows the dir iteration to
>>> complete without any resets, but this is just a workaround that 
>>> isn't
>>> acceptable if the file system is being modified or if there isn't
>>> enough memory. It's also heuristic and can lead to unexpected 
>>> hiccups
>>> if something in the environment changes.
>>>
>>> So my questions still stand: is this expected behavior? What's the
>>> reason?
>>>
>>> P.S. I'm using NFSv3
>>>
>>> On Mon, Jul 15, 2019, 08:56 Noam Lewis <noam.lewis@elastifile.com>
>>> wrote:
>>>>
>>>> I've encountered a problem while iterating large directories via an
>>>> NFS mount.
>>>>
>>>> Scenario:
>>>>
>>>> 1. Linux NFS client iterates a directory with many (millions) of
>>>> files, e.g. via getdents() until all entries are done. In my case,
>>>> READDIRPLUS is being used under the hood. Trivial reproduction is 
>>>> to
>>>> run: ls -la
>>>> 2. At the same time, run the stat tool on a file inside that
>>>> directory.
>>>>
>>>> The directory on the server is not being modified anywhere (on this
>>>> client or any other client).
>>>>
>>>> Result: the next or ongoing getdents will get stuck for a long time
>>>> (tens of seconds to minutes). It appears to be re-iterating some of
>>>> the work it already did, by going back to a previous NFS 
>>>> READDIRPLUS
>>>> cookie.
>>>>
>>>>
>>>> Things I've tried as workarounds:
>>>> - Mounting with nordirplus - the iteration doesn't seem to reset or
>>>> at
>>>> least getdents doesn't get stuck, but now I have tons of LOOKUPs, 
>>>> as
>>>> expected.
>>>> - Setting actimeo=(large number) doesn't affect the behavior
>>>>
>>>> Questions:
>>>> 1. Why does the stat command cause this?
>>>> 2. How can I avoid the reset, i.e. ensure forward progress of the 
>>>> dir
>>>> iteration?

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: large directory iteration (getdents) over NFS mount resets due to stat
  2019-07-18 13:55       ` Benjamin Coddington
@ 2019-07-18 14:13         ` Noam Lewis
  2019-07-18 14:29           ` Benjamin Coddington
  0 siblings, 1 reply; 7+ messages in thread
From: Noam Lewis @ 2019-07-18 14:13 UTC (permalink / raw)
  To: Benjamin Coddington; +Cc: linux-nfs

I tested and you're right, it sends a GETATTR flood on the second `ls
-la`. I should have tested more than a single scenario.


On Thu, Jul 18, 2019 at 4:55 PM Benjamin Coddington <bcodding@redhat.com> wrote:
>
> The one thing we'd need to be careful about here is if there are
> existing
> cached dentries without the attributes we want, then we run through
> nfs_advise_use_readdirplus(), setting the flag, then the
> nfs_force_readdirplus() fails to drop all the existing dentries..
>
> That looks like a `ls <dir>`, then a stat of one of those
> cached dentries, then an `ls -la` would look like a flood of GETATTR for
> every entry instead of a series of READDIRPLUS.
>
> Any chance you still have that test system?  What happens if you do:
>
> ls <big dir>
> stat <big dir>/file
> ls -la <big dir>
>
> I'm worried that will leave you stuck in the GETATTR flood instead of
> using
> the more optimal READDIRPLUS.
>
> If I'm going to send this patch, I should also test it, so I suppose I
> might
> just build and test this myself..
>
> Ben
>
> On 18 Jul 2019, at 3:26, Noam Lewis wrote:
>
> > The patch seems to work nicely. getdents() doesn't get stuck /
> > iteration does not restart when accessing a non-cached file.
> >
> > Thanks!
> >
> > Can we agree this is the more correct behavior?
> > What's the next step?
> >
> > On Wed, Jul 17, 2019 at 3:08 PM Benjamin Coddington
> > <bcodding@redhat.com> wrote:
> >>
> >> Maybe because we always drop the directory's page cache whenever we
> >> decide
> >> to switch to READDIRPLUS, even if we're already doing it..  I've not
> >> tested
> >> or checked very thoroughly if this is ok, but I think maybe want
> >> something
> >> like this:
> >>
> >> diff --git a/fs/nfs/dir.c b/fs/nfs/dir.c
> >> index 57b6a45576ad..acfe47668238 100644
> >> --- a/fs/nfs/dir.c
> >> +++ b/fs/nfs/dir.c
> >> @@ -438,8 +438,8 @@ void nfs_force_use_readdirplus(struct inode *dir)
> >>          struct nfs_inode *nfsi = NFS_I(dir);
> >>
> >>          if (nfs_server_capable(dir, NFS_CAP_READDIRPLUS) &&
> >> -           !list_empty(&nfsi->open_files)) {
> >> -               set_bit(NFS_INO_ADVISE_RDPLUS, &nfsi->flags);
> >> +           !list_empty(&nfsi->open_files) &&
> >> +               !test_and_set_bit(NFS_INO_ADVISE_RDPLUS,
> >> &nfsi->flags))
> >> {
> >>                  invalidate_mapping_pages(dir->i_mapping, 0, -1);
> >>          }
> >>   }
> >>
> >> Want to give that a spin?
> >>
> >> Ben
> >>
> >> On 17 Jul 2019, at 2:44, Noam Lewis wrote:
> >>
> >>> I'm starting to think this is a bug. I can't see a good reason why
> >>> accessing (stat) a directory entry should cause the READDIRPLUS
> >>> cookie
> >>> to be reset.
> >>>
> >>> It seems that the trigger for iteration reset is accessing a
> >>> directory
> >>> entry that doesn't have a valid entry in the cache. If it does has a
> >>> valid cache entry it doesn't trigger the cookie reset. Note, it
> >>> doesn't matter if the entry (or traversed dir) has actually changed:
> >>> the reset occurs even if both did not change on the server side.
> >>>
> >>> Setting actimeo to a large enough value allows the dir iteration to
> >>> complete without any resets, but this is just a workaround that
> >>> isn't
> >>> acceptable if the file system is being modified or if there isn't
> >>> enough memory. It's also heuristic and can lead to unexpected
> >>> hiccups
> >>> if something in the environment changes.
> >>>
> >>> So my questions still stand: is this expected behavior? What's the
> >>> reason?
> >>>
> >>> P.S. I'm using NFSv3
> >>>
> >>> On Mon, Jul 15, 2019, 08:56 Noam Lewis <noam.lewis@elastifile.com>
> >>> wrote:
> >>>>
> >>>> I've encountered a problem while iterating large directories via an
> >>>> NFS mount.
> >>>>
> >>>> Scenario:
> >>>>
> >>>> 1. Linux NFS client iterates a directory with many (millions) of
> >>>> files, e.g. via getdents() until all entries are done. In my case,
> >>>> READDIRPLUS is being used under the hood. Trivial reproduction is
> >>>> to
> >>>> run: ls -la
> >>>> 2. At the same time, run the stat tool on a file inside that
> >>>> directory.
> >>>>
> >>>> The directory on the server is not being modified anywhere (on this
> >>>> client or any other client).
> >>>>
> >>>> Result: the next or ongoing getdents will get stuck for a long time
> >>>> (tens of seconds to minutes). It appears to be re-iterating some of
> >>>> the work it already did, by going back to a previous NFS
> >>>> READDIRPLUS
> >>>> cookie.
> >>>>
> >>>>
> >>>> Things I've tried as workarounds:
> >>>> - Mounting with nordirplus - the iteration doesn't seem to reset or
> >>>> at
> >>>> least getdents doesn't get stuck, but now I have tons of LOOKUPs,
> >>>> as
> >>>> expected.
> >>>> - Setting actimeo=(large number) doesn't affect the behavior
> >>>>
> >>>> Questions:
> >>>> 1. Why does the stat command cause this?
> >>>> 2. How can I avoid the reset, i.e. ensure forward progress of the
> >>>> dir
> >>>> iteration?

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: large directory iteration (getdents) over NFS mount resets due to stat
  2019-07-18 14:13         ` Noam Lewis
@ 2019-07-18 14:29           ` Benjamin Coddington
  0 siblings, 0 replies; 7+ messages in thread
From: Benjamin Coddington @ 2019-07-18 14:29 UTC (permalink / raw)
  To: Noam Lewis; +Cc: linux-nfs

Ok, ideas?  I think we could get the behavior you want by adding another
flag that tracks whether or not there are any "non-plus" pages in the
cache..

The heuristics around READDIRPLUS are tricky, and it is pretty hard to 
make
everyone happy.  Sub-directories are cheap and always fast!

Ben

(Resending this because the vger.kernel.org server just NDR-ed my 
message..
for some reason it thinks I sent a VIRUS!!1)

On 18 Jul 2019, at 10:13, Noam Lewis wrote:

> I tested and you're right, it sends a GETATTR flood on the second `ls
> -la`. I should have tested more than a single scenario.
>
>
> On Thu, Jul 18, 2019 at 4:55 PM Benjamin Coddington 
> <bcodding@redhat.com> wrote:
>>
>> The one thing we'd need to be careful about here is if there are
>> existing
>> cached dentries without the attributes we want, then we run through
>> nfs_advise_use_readdirplus(), setting the flag, then the
>> nfs_force_readdirplus() fails to drop all the existing dentries..
>>
>> That looks like a `ls <dir>`, then a stat of one of those
>> cached dentries, then an `ls -la` would look like a flood of GETATTR 
>> for
>> every entry instead of a series of READDIRPLUS.
>>
>> Any chance you still have that test system?  What happens if you do:
>>
>> ls <big dir>
>> stat <big dir>/file
>> ls -la <big dir>
>>
>> I'm worried that will leave you stuck in the GETATTR flood instead of
>> using
>> the more optimal READDIRPLUS.
>>
>> If I'm going to send this patch, I should also test it, so I suppose 
>> I
>> might
>> just build and test this myself..
>>
>> Ben
>>
>> On 18 Jul 2019, at 3:26, Noam Lewis wrote:
>>
>>> The patch seems to work nicely. getdents() doesn't get stuck /
>>> iteration does not restart when accessing a non-cached file.
>>>
>>> Thanks!
>>>
>>> Can we agree this is the more correct behavior?
>>> What's the next step?
>>>
>>> On Wed, Jul 17, 2019 at 3:08 PM Benjamin Coddington
>>> <bcodding@redhat.com> wrote:
>>>>
>>>> Maybe because we always drop the directory's page cache whenever we
>>>> decide
>>>> to switch to READDIRPLUS, even if we're already doing it..  I've 
>>>> not
>>>> tested
>>>> or checked very thoroughly if this is ok, but I think maybe want
>>>> something
>>>> like this:
>>>>
>>>> diff --git a/fs/nfs/dir.c b/fs/nfs/dir.c
>>>> index 57b6a45576ad..acfe47668238 100644
>>>> --- a/fs/nfs/dir.c
>>>> +++ b/fs/nfs/dir.c
>>>> @@ -438,8 +438,8 @@ void nfs_force_use_readdirplus(struct inode 
>>>> *dir)
>>>>          struct nfs_inode *nfsi = NFS_I(dir);
>>>>
>>>>          if (nfs_server_capable(dir, NFS_CAP_READDIRPLUS) &&
>>>> -           !list_empty(&nfsi->open_files)) {
>>>> -               set_bit(NFS_INO_ADVISE_RDPLUS, &nfsi->flags);
>>>> +           !list_empty(&nfsi->open_files) &&
>>>> +               !test_and_set_bit(NFS_INO_ADVISE_RDPLUS,
>>>> &nfsi->flags))
>>>> {
>>>>                  invalidate_mapping_pages(dir->i_mapping, 0, -1);
>>>>          }
>>>>   }
>>>>
>>>> Want to give that a spin?
>>>>
>>>> Ben
>>>>
>>>> On 17 Jul 2019, at 2:44, Noam Lewis wrote:
>>>>
>>>>> I'm starting to think this is a bug. I can't see a good reason why
>>>>> accessing (stat) a directory entry should cause the READDIRPLUS
>>>>> cookie
>>>>> to be reset.
>>>>>
>>>>> It seems that the trigger for iteration reset is accessing a
>>>>> directory
>>>>> entry that doesn't have a valid entry in the cache. If it does has 
>>>>> a
>>>>> valid cache entry it doesn't trigger the cookie reset. Note, it
>>>>> doesn't matter if the entry (or traversed dir) has actually 
>>>>> changed:
>>>>> the reset occurs even if both did not change on the server side.
>>>>>
>>>>> Setting actimeo to a large enough value allows the dir iteration 
>>>>> to
>>>>> complete without any resets, but this is just a workaround that
>>>>> isn't
>>>>> acceptable if the file system is being modified or if there isn't
>>>>> enough memory. It's also heuristic and can lead to unexpected
>>>>> hiccups
>>>>> if something in the environment changes.
>>>>>
>>>>> So my questions still stand: is this expected behavior? What's the
>>>>> reason?
>>>>>
>>>>> P.S. I'm using NFSv3
>>>>>
>>>>> On Mon, Jul 15, 2019, 08:56 Noam Lewis <noam.lewis@elastifile.com>
>>>>> wrote:
>>>>>>
>>>>>> I've encountered a problem while iterating large directories via 
>>>>>> an
>>>>>> NFS mount.
>>>>>>
>>>>>> Scenario:
>>>>>>
>>>>>> 1. Linux NFS client iterates a directory with many (millions) of
>>>>>> files, e.g. via getdents() until all entries are done. In my 
>>>>>> case,
>>>>>> READDIRPLUS is being used under the hood. Trivial reproduction is
>>>>>> to
>>>>>> run: ls -la
>>>>>> 2. At the same time, run the stat tool on a file inside that
>>>>>> directory.
>>>>>>
>>>>>> The directory on the server is not being modified anywhere (on 
>>>>>> this
>>>>>> client or any other client).
>>>>>>
>>>>>> Result: the next or ongoing getdents will get stuck for a long 
>>>>>> time
>>>>>> (tens of seconds to minutes). It appears to be re-iterating some 
>>>>>> of
>>>>>> the work it already did, by going back to a previous NFS
>>>>>> READDIRPLUS
>>>>>> cookie.
>>>>>>
>>>>>>
>>>>>> Things I've tried as workarounds:
>>>>>> - Mounting with nordirplus - the iteration doesn't seem to reset 
>>>>>> or
>>>>>> at
>>>>>> least getdents doesn't get stuck, but now I have tons of LOOKUPs,
>>>>>> as
>>>>>> expected.
>>>>>> - Setting actimeo=(large number) doesn't affect the behavior
>>>>>>
>>>>>> Questions:
>>>>>> 1. Why does the stat command cause this?
>>>>>> 2. How can I avoid the reset, i.e. ensure forward progress of the
>>>>>> dir
>>>>>> iteration?

^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2019-07-18 14:29 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-07-15  5:56 large directory iteration (getdents) over NFS mount resets due to stat Noam Lewis
2019-07-17  6:44 ` Noam Lewis
2019-07-17 12:08   ` Benjamin Coddington
2019-07-18  7:26     ` Noam Lewis
2019-07-18 13:55       ` Benjamin Coddington
2019-07-18 14:13         ` Noam Lewis
2019-07-18 14:29           ` Benjamin Coddington

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).