linux-fsdevel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Chuck Lever III <chuck.lever@oracle.com>
To: Andrew Morton <akpm@linux-foundation.org>
Cc: Chuck Lever <cel@kernel.org>,
	"hughd@google.com" <hughd@google.com>,
	"linux-mm@kvack.org" <linux-mm@kvack.org>,
	"linux-fsdevel@vger.kernel.org" <linux-fsdevel@vger.kernel.org>
Subject: Re: [PATCH v1] shmem: stable directory cookies
Date: Wed, 3 May 2023 00:43:54 +0000	[thread overview]
Message-ID: <30E5A657-4005-4126-A962-A8E6D90240AB@oracle.com> (raw)
In-Reply-To: <20230502171228.57a906a259172d39542e92fb@linux-foundation.org>



> On May 2, 2023, at 8:12 PM, Andrew Morton <akpm@linux-foundation.org> wrote:
> 
> On Mon, 17 Apr 2023 15:23:10 -0400 Chuck Lever <cel@kernel.org> wrote:
> 
>> From: Chuck Lever <chuck.lever@oracle.com>
>> 
>> The current cursor-based directory cookie mechanism doesn't work
>> when a tmpfs filesystem is exported via NFS. This is because NFS
>> clients do not open directories: each READDIR operation has to open
>> the directory on the server, read it, then close it. The cursor
>> state for that directory, being associated strictly with the opened
>> struct file, is then discarded.
>> 
>> Directory cookies are cached not only by NFS clients, but also by
>> user space libraries on those clients. Essentially there is no way
>> to invalidate those caches when directory offsets have changed on
>> an NFS server after the offset-to-dentry mapping changes.
>> 
>> The solution we've come up with is to make the directory cookie for
>> each file in a tmpfs filesystem stable for the life of the directory
>> entry it represents.
>> 
>> Add a per-directory xarray. shmem_readdir() uses this to map each
>> directory offset (an loff_t integer) to the memory address of a
>> struct dentry.
>> 
> 
> How have people survived for this long with this problem?

It's less of a problem without NFS in the picture; local
applications can hold the directory open, and that preserves
the seek cursor. But you can still trigger it.

Also, a plurality of applications are well-behaved in this
regard. It's just the more complex and more useful ones
(like git) that seem to trigger issues.

It became less bearable for NFS because of a recent change
on the Linux NFS client to optimize directory read behavior:

85aa8ddc3818 ("NFS: Trigger the "ls -l" readdir heuristic sooner")

Trond argued that tmpfs directory cookie behavior has always
been problematic (eg broken) therefore this commit does not
count as a regression. However, it does make tmpfs exports
less usable, breaking some tests that have always worked.


> It's a lot of new code -

I don't feel that this is a lot of new code:

include/linux/shmem_fs.h |    2 
mm/shmem.c               |  213 +++++++++++++++++++++++++++++++++++++++++++---
2 files changed, 201 insertions(+), 14 deletions(-)

But I agree it might look a little daunting on first review.
I am happy to try to break this single patch up or consider
other approaches.

We could, for instance, tuck a little more of this into
lib/fs. Copying the readdir and directory seeking
implementation from simplefs to tmpfs is one reason
the insertion count is worrisome.


> can we get away with simply disallowing
> exports of tmpfs?

I think the bottom line is that you /can/ trigger this
behavior without NFS, just not as quickly. The threshold
is high enough that most use cases aren't bothered by
this right now.

We'd rather not disallow exporting tmpfs. It's a very
good testing platform for us, and disallowing it would
be a noticeable regression for some folks.


> How can we maintain this?  Is it possible to come up with a test
> harness for inclusion in kernel selftests?

There is very little directory cookie testing that I know of
in the obvious place: fstests. That would be where this stuff
should be unit tested, IMO.


--
Chuck Lever



  reply	other threads:[~2023-05-03  0:44 UTC|newest]

Thread overview: 8+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-04-17 19:23 [PATCH v1] shmem: stable directory cookies Chuck Lever
2023-04-20 18:52 ` Jeff Layton
2023-04-20 20:12   ` Chuck Lever III
2023-05-03  0:12 ` Andrew Morton
2023-05-03  0:43   ` Chuck Lever III [this message]
2023-05-04 17:21     ` Jeff Layton
2023-05-04 20:21     ` Benjamin Coddington
2023-05-05  5:06 ` kernel test robot

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=30E5A657-4005-4126-A962-A8E6D90240AB@oracle.com \
    --to=chuck.lever@oracle.com \
    --cc=akpm@linux-foundation.org \
    --cc=cel@kernel.org \
    --cc=hughd@google.com \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).