From: Dominique Martinet <asmadeus@codewreck.org>
To: Luis Henriques <lhenriques@suse.de>
Cc: David Howells <dhowells@redhat.com>,
	Eric Van Hensbergen <ericvh@gmail.com>,
	Latchesar Ionkov <lucho@ionkov.net>,
	linux-fsdevel@vger.kernel.org,
	v9fs-developer@lists.sourceforge.net
Subject: Re: 9p: fscache duplicate cookie
Date: Wed, 12 May 2021 21:26:02 +0900
Message-ID: <YJvJWj/CEyEUWeIu@codewreck.org>
In-Reply-To: <87fsysyxh9.fsf@suse.de>

Luis Henriques wrote on Wed, May 12, 2021 at 12:58:58PM +0100:
> <...>-20591   [000] ...2    67.538644: fscache_cookie: GET prn c=000000003080d900 u=50 p=0000000042542ee5 Nc=48 Na=1 f=22
> <...>-20591   [000] ...1    67.538645: fscache_acquire: c=0000000011fa06b1 p=000000003080d900 pu=50 pc=49 pf=22 n=9p.inod
> <...>-20599   [003] .N.2    67.542180: 9p_fscache_cookie: v9fs_drop_inode cookie: 0000000097476aaa
> [...]
>
> So, this is... annoying, I guess.

Oh, this actually looks different from what I had in mind.

So if I'm reading this right, the dup acquire happens before the drop on
another thread, meaning iget5_locked somehow returned an inode with I_NEW
for the same i_ino as the inode that gets dropped later?...

How much trust can we actually put in trace ordering across different CPUs?
My theory really would have wanted that drop to come before the acquire :D
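
To make that interleaving concrete, here's a tiny userspace model of what
I think happens (plain C + pthreads, nothing 9p- or fscache-specific; all
the names are made up for illustration). "Thread A" stands for the task
still evicting the old inode, "thread B" for the task doing the fresh
lookup, and the flag stands for the index entry keyed on the same data
for both inodes:

/*
 * Toy model of the suspected race: the acquire for the new "inode"
 * runs before the old inode's release has had a chance to run.
 * Build with: cc -pthread race-model.c
 */
#include <pthread.h>
#include <stdio.h>
#include <unistd.h>

static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static int cookie_present;		/* index entry for this key exists? */

/* thread A: old inode still in evict(), releases its cookie late */
static void *evict_old_inode(void *arg)
{
	usleep(1000);			/* evict is slow: writeback etc. */
	pthread_mutex_lock(&lock);
	cookie_present = 0;		/* relinquish, but too late */
	pthread_mutex_unlock(&lock);
	return NULL;
}

/* thread B: fresh lookup got a new I_NEW inode for the same file */
static void *lookup_new_inode(void *arg)
{
	pthread_mutex_lock(&lock);
	if (cookie_present)		/* same key still registered */
		printf("duplicate cookie for the same key\n");
	else
		cookie_present = 1;
	pthread_mutex_unlock(&lock);
	return NULL;
}

int main(void)
{
	pthread_t a, b;

	cookie_present = 1;		/* the old inode owns the cookie */
	pthread_create(&a, NULL, evict_old_inode, NULL);
	pthread_create(&b, NULL, lookup_new_inode, NULL);
	pthread_join(a, NULL);
	pthread_join(b, NULL);
	return 0;
}

With that ordering the "duplicate" message fires; if the release ran first
it wouldn't, which is the ordering my theory would have preferred.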



Anyway, I think there's no room for doubt that it's possible to get a
new inode for the same underlying file before the evict of the old one
has finished; which leaves room for a few questions:
 - as David brought up on IRC (#linuxfs@OFTC), what about the flushing
of dirty data that happens in evict()? Wouldn't it be possible for
operations on the new inode to read stale data while the old inode is
still being flushed? I think that warrants asking someone who understands
this better than I do, as it's probably not 9p-specific even if 9p makes
it easier to get a new inode in such a racy way...

 - for 9p in particular, Christian Schoenebeck (who is helping with 9p in
qemu) brought up that we evict inodes too quickly and too often, so I
think it'd help to have some sort of inode lifetime management and keep
inodes alive for a bit.
As a network filesystem with no coherency built into the protocol I don't
think we can afford to keep inodes cached for too long, and I know some
servers have trouble if we keep too many fids open, but it would be nice
to have a few knobs to just keep inodes around a bit longer (rough sketch
after this list)... This won't solve the fundamental problem, but if the
inode isn't evicted at a point where it's likely to be used again then
this particular problem should be much harder to hit (like on other
filesystems, actually :P)
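
Just to be concrete about what I mean by a knob, here is the rough kind
of thing I have in mind (a sketch only, not a patch: V9FS_KEEP_INODES is
a made-up flag, the real v9fs_drop_inode() keys its decision on the cache
mode if I read it right, and this adds nothing beyond what the VFS inode
cache already gives us):

/* sketch: let the VFS inode cache hang on to clean 9p inodes instead of
 * evicting them on the last iput() */
static int v9fs_drop_inode(struct inode *inode)
{
	struct v9fs_session_info *v9ses = v9fs_inode2v9ses(inode);

	if (v9ses->flags & V9FS_KEEP_INODES)	/* hypothetical knob */
		return generic_drop_inode(inode); /* keep until reclaim */

	/* otherwise drop right away so attributes get refetched next time */
	return 1;
}

The obvious tradeoff is that a cached clean inode can then serve stale
attributes until we revalidate, which is exactly the coherency worry above.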

I'm not sure how that would work in detail though, and I won't have much
time to work on it short term anyway, but it's an idea :/

-- 
Dominique


Thread overview: 23+ messages
2021-05-06 10:03 9p: fscache duplicate cookie Luis Henriques
2021-05-06 10:45 ` Dominique Martinet
2021-05-06 12:18   ` Luis Henriques
2021-05-07 16:36     ` Luis Henriques
2021-05-08  0:47       ` Dominique Martinet
2021-05-10 10:54         ` Luis Henriques
2021-05-10 11:47           ` Luis Henriques
2021-05-11 12:44           ` David Howells
2021-05-12 10:10             ` Luis Henriques
2021-05-11 12:53         ` David Howells
2021-05-11 12:38 ` David Howells
2021-05-12 10:07   ` Luis Henriques
2021-05-12 11:04   ` David Howells
2021-05-12 11:58     ` Luis Henriques
2021-05-12 12:26       ` Dominique Martinet [this message]
2021-05-12 12:57       ` What sort of inode state does ->evict_inode() expect to see? [was Re: 9p: fscache duplicate cookie] David Howells
2021-05-12 13:45         ` Al Viro
2021-05-12 14:12         ` David Howells
2021-05-14 16:10           ` Luis Henriques
2021-05-14 21:16             ` Dominique Martinet
2021-05-17 15:56               ` Luis Henriques
2021-05-17 17:39               ` Aneesh Kumar K.V
2021-05-12 11:09   ` 9p: fscache duplicate cookie David Howells
