linux-nfs.vger.kernel.org archive mirror
From: Benjamin Coddington <bcodding@redhat.com>
To: Trond Myklebust <trondmy@hammerspace.com>
Cc: Thorsten Leemhuis <regressions@leemhuis.info>,
	Hugh Dickins <hughd@google.com>,
	Charles Edward Lever <chuck.lever@oracle.com>,
	Linux NFS Mailing List <linux-nfs@vger.kernel.org>,
	Anna Schumaker <anna@kernel.org>,
	Alexander Viro <viro@zeniv.linux.org.uk>,
	Amir Goldstein <amir73il@gmail.com>,
	linux-fsdevel@vger.kernel.org, linux-mm@kvack.org,
	Linux kernel regressions list <regressions@lists.linux.dev>
Subject: Re: git regression failures with v6.2-rc NFS client
Date: Sat, 04 Feb 2023 15:44:26 -0500
Message-ID: <05BEEF62-46DF-4FAC-99D4-4589C294F93A@redhat.com>
In-Reply-To: <031C52C0-144A-4051-9B4C-0E1E3164951E@hammerspace.com>

On 4 Feb 2023, at 11:52, Trond Myklebust wrote:
> On Feb 4, 2023, at 08:15, Benjamin Coddington <bcodding@redhat.com> wrote:
>> Ah, thanks for explaining that.
>>
>> I'd like to summarize and quantify this problem one last time for folks that
>> don't want to read everything.  If an application wants to remove all files
>> and the parent directory, and uses this pattern to do it:
>>
>> opendir
>> while (getdents)
>>    unlink dents
>> closedir
>> rmdir
>>
>> Before this commit, that would work with up to 126 dentries on NFS from
>> tmpfs export.  If the directory had 127 or more, the rmdir would fail with
>> ENOTEMPTY.
>
> For all sizes of filenames, or just the particular set that was chosen
> here? What about the choice of rsize? Both these values affect how many
> entries glibc can cache before it has to issue another getdents() call
> into the kernel. For the record, this is what glibc does in the opendir()
> code in order to choose a buffer size for the getdents syscalls:
>
>   /* The st_blksize value of the directory is used as a hint for the
>      size of the buffer which receives struct dirent values from the
>      kernel.  st_blksize is limited to max_buffer_size, in case the
>      file system provides a bogus value.  */
>   enum { max_buffer_size = 1048576 };
>
>   enum { allocation_size = 32768 };
>   _Static_assert (allocation_size >= sizeof (struct dirent64),
>                   "allocation_size < sizeof (struct dirent64)");
>
>   /* Increase allocation if requested, but not if the value appears to
>      be bogus.  It will be between 32Kb and 1Mb.  */
>   size_t allocation = MIN (MAX ((size_t) statp->st_blksize, (size_t)
>                                 allocation_size), (size_t) max_buffer_size);
>
>   DIR *dirp = (DIR *) malloc (sizeof (DIR) + allocation);
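
To make the clamp in the quoted glibc expression concrete, here is a minimal
sketch of the same arithmetic as a standalone function.  The helper name
dir_buffer_size and the upper-case constant names are hypothetical; only the
two numeric limits and the MIN/MAX shape come from the snippet quoted above.

```c
#include <assert.h>
#include <stddef.h>

/* Limits taken from the quoted glibc snippet: 32 KiB floor, 1 MiB ceiling. */
enum { ALLOCATION_SIZE = 32768, MAX_BUFFER_SIZE = 1048576 };

static_assert(ALLOCATION_SIZE <= MAX_BUFFER_SIZE,
              "floor must not exceed ceiling");

/* Hypothetical helper: clamp the st_blksize hint into [32 KiB, 1 MiB],
 * mirroring MIN (MAX (st_blksize, allocation_size), max_buffer_size). */
size_t dir_buffer_size(size_t st_blksize)
{
    size_t n = st_blksize > (size_t) ALLOCATION_SIZE
                   ? st_blksize
                   : (size_t) ALLOCATION_SIZE;
    return n < (size_t) MAX_BUFFER_SIZE ? n : (size_t) MAX_BUFFER_SIZE;
}
```

Under this sketch a 4096-byte st_blksize still yields a 32768-byte buffer, so
the block size only matters once the filesystem reports more than 32 KiB.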

The behavioral complexity is even higher with glibc in the mix, but both the
test that Chuck's using and the reproducer I've been making claims about
use SYS_getdents directly.  I'm using a static 4k buffer size which is big
enough to fit enough entries to prime the heuristic for a single call to
getdents() whether or not we return early at 17 or 126.
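
For readers who want to see the pattern spelled out, the following is a
minimal sketch of that kind of reproducer, not the actual test program from
this thread.  It assumes Linux, calls SYS_getdents64 instead of SYS_getdents
because the latter is not defined on every architecture, uses a 4k buffer,
and only handles directories that contain no subdirectories.  The names
remove_dir_while_reading and struct dent64 are made up for illustration.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdint.h>
#include <string.h>
#include <sys/stat.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Record layout filled in by SYS_getdents64 (see getdents(2)).
 * The struct name is hypothetical; userspace must declare it itself. */
struct dent64 {
    uint64_t       d_ino;
    int64_t        d_off;
    unsigned short d_reclen;
    unsigned char  d_type;
    char           d_name[];
};

/* Close fd without losing the errno of the call that failed. */
static int fail_and_close(int fd)
{
    int saved = errno;
    close(fd);
    errno = saved;
    return -1;
}

/* "Saw off the branch you're sitting on": unlink each entry while still
 * iterating the directory, then rmdir it.  Returns 0 on success, or -1
 * with errno set (for example ENOTEMPTY if entries were skipped). */
int remove_dir_while_reading(const char *path)
{
    _Alignas(8) char buf[4096];
    int fd = open(path, O_RDONLY | O_DIRECTORY);
    if (fd < 0)
        return -1;

    for (;;) {
        long n = syscall(SYS_getdents64, fd, buf, sizeof(buf));
        if (n < 0)
            return fail_and_close(fd);
        if (n == 0)
            break;                      /* end of directory */

        for (long pos = 0; pos < n;) {
            struct dent64 *d = (struct dent64 *) (buf + pos);
            assert(d->d_reclen > 0);    /* guard against a stuck loop */
            pos += d->d_reclen;

            if (strcmp(d->d_name, ".") == 0 || strcmp(d->d_name, "..") == 0)
                continue;
            if (unlinkat(fd, d->d_name, 0) < 0)
                return fail_and_close(fd);
        }
    }

    close(fd);
    return rmdir(path);
}
```

On a local filesystem this is expected to succeed regardless of directory
size; the failure discussed in this thread would show up as rmdir() returning
ENOTEMPTY when the directory lives on NFS backed by a tmpfs export.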

>> After this commit, it only works with up to 17 dentries.
>>
>> The argument that this is making things worse takes the position that there
>> are more directories in the universe with >17 dentries that want to be
>> cleaned up by this "saw off the branch you're sitting on" pattern than
>> directories with >127.  And I guess that's true if Chuck runs that testing
>> setup enough.  :)
>>
>> We can change the optimization in the commit from
>> NFS_READDIR_CACHE_MISS_THRESHOLD + 1
>> to
>> nfs_readdir_array_maxentries + 1
>>
>> This would make the regression disappear, and would also keep most of the
>> optimization.
>>
>> Ben
>>
>
> So in other words the suggestion is to optimise the number of readdir
> records that we return from NFS to whatever value that papers over the
> known telldir()/seekdir() tmpfs bug that is re-revealed by this particular
> test when run under these particular conditions?

Yes.  It's a terrible suggestion.  Its only merit may be that it meets the
letter of the no regressions law.  I hate it, and after I started popping
out patches that do it I found that they all made the behavior far more
complex due to the way we dynamically optimize dtsize.

> Anyone who tries to use tmpfs with a different number of files, different
> file name lengths, or different mount options is still SOL because that's
> not a "regression"?

Right. :P

Ben

