All of lore.kernel.org
 help / color / mirror / Atom feed
From: Christian Schoenebeck <qemu_oss@crudebyte.com>
To: qemu-devel@nongnu.org
Cc: Qemu-block <qemu-block@nongnu.org>, Greg Kurz <groug@kaod.org>,
	Linus Heckemann <git@sphalerite.org>
Subject: Re: [PATCH 1/1] 9pfs: avoid iterator invalidation in v9fs_mark_fids_unreclaim
Date: Tue, 27 Sep 2022 19:14:33 +0200	[thread overview]
Message-ID: <3675458.bPJFzOz80O@silver> (raw)
In-Reply-To: <ygav8p9vugm.fsf@localhost>

On Dienstag, 27. September 2022 15:05:13 CEST Linus Heckemann wrote:
> Christian Schoenebeck <qemu_oss@crudebyte.com> writes:
> > Ah, you sent this fix as a separate patch on top. I actually just meant
> > that you would take my already queued patch as the latest version (just
> > because I had made some minor changes on my end) and adjust that patch
> > further as v4.
> > 
> > Anyway, there are still some things to do here, so maybe you can send your
> > patch squashed in the next round ...
> 
> I see, will do!
> 
> >> @Christian: I still haven't been able to reproduce the issue that this
> >> commit is supposed to fix (I tried building KDE too, no problems), so
> >> it's a bit of a shot in the dark. It certainly still runs and I think it
> >> should fix the issue, but it would be great if you could test it.
> > 
> > No worries about reproduction, I will definitely test this thoroughly. ;-)
> > 
> >>  hw/9pfs/9p.c | 46 ++++++++++++++++++++++++++++++----------------
> >>  1 file changed, 30 insertions(+), 16 deletions(-)
> >> 
> >> diff --git a/hw/9pfs/9p.c b/hw/9pfs/9p.c
> >> index f4c1e37202..825c39e122 100644
> >> --- a/hw/9pfs/9p.c
> >> +++ b/hw/9pfs/9p.c
> >> @@ -522,33 +522,47 @@ static int coroutine_fn
> >> v9fs_mark_fids_unreclaim(V9fsPDU *pdu, V9fsPath *path) V9fsFidState
> >> *fidp;
> >> 
> >>      gpointer fid;
> >>      GHashTableIter iter;
> >> 
> >> +    /*
> >> +     * The most common case is probably that we have exactly one
> >> +     * fid for the given path, so preallocate exactly one.
> >> +     */
> >> +    GArray *to_reopen = g_array_sized_new(FALSE, FALSE,
> >> sizeof(V9fsFidState*), 1); +    gint i;
> > 
> > Please use `g_autoptr(GArray)` instead of `GArray *`, that avoids the need
> > for explicit calls to g_array_free() below.
> 
> Good call. I'm not familiar with glib, so I didn't know about this :)
> 
> >> -            fidp->flags |= FID_NON_RECLAIMABLE;
> > 
> > Why did you remove that? It should still be marked as FID_NON_RECLAIMABLE,
> > no?
> Indeed, that was an accident.
> 
> >> +            /*
> >> +             * Ensure the fid survives a potential clunk request during
> >> +             * v9fs_reopen_fid or put_fid.
> >> +             */
> >> +            fidp->ref++;
> > 
> > Hmm, bumping the refcount here makes sense, as the 2nd loop may be
> > interrupted and the fid otherwise disappear in between, but ...
> > 
> >> +            g_array_append_val(to_reopen, fidp);
> >> 
> >>          }
> >> 
> >> +    }
> >> 
> >> -        /* We're done with this fid */
> >> +    for (i=0; i < to_reopen->len; i++) {
> >> +        fidp = g_array_index(to_reopen, V9fsFidState*, i);
> >> +        /* reopen the file/dir if already closed */
> >> +        err = v9fs_reopen_fid(pdu, fidp);
> >> +        if (err < 0) {
> >> +            put_fid(pdu, fidp);
> >> +            g_array_free(to_reopen, TRUE);
> >> +            return err;
> > 
> > ... this return would then leak all remainder fids that you have bumped
> > the
> > refcount for above already.
> 
> You're right. I think the best way around it, though it feels ugly, is
> to add a third loop in an "out:".

Either that, or continuing the loop to the end. Not that this would become 
much prettier. I must admit I also don't really have a good idea for a clean 
solution in this case.

> > Also: I noticed that your changes in virtfs_reset() would need the same
> > 2-loop hack to avoid hash iterator invalidation, as it would also call
> > put_fid() inside the loop and be prone for hash iterator invalidation
> > otherwise.
> Good point. Will do.
> 
> One more thing has occurred to me. I think the reclaiming/reopening
> logic will misbehave in the following sequence of events:
> 
> 1. QEMU reclaims an open fid, losing the file handle
> 2. The file referred to by the fid is replaced with a different file
>    (e.g. via rename or symlink) outside QEMU
> 3. The file is accessed again by the guest, causing QEMU to reopen a
>    _different file_ from before without the guest having performed any
>    operations that should cause this to happen.
> 
> This is neither introduced nor resolved by my changes. Am I overlooking
> something that avoids this (be it documentation that directories exposed
> via 9p should not be touched by the host), or is this a real issue? I'm
> thinking one could at least detect it by saving inode numbers in
> V9fsFidState and comparing them when reopening, but recovering from such
> a situation seems difficult.

Well, in that specific scenario when rename/move happens outside of QEMU then 
yes, this might happen unfortunately. The point of this "reclaim fid" stuff is 
to deal with the fact that there is an upper limit on systems for the max. 
amount of open file descriptors a process might hold at a time. And on some 
systems like macOS I think that limit is quite low by default (like 100?).

There is also another issue pending that affects pure inner-guest behaviour; 
the infamous use-after-unlink() use pattern:
https://wiki.qemu.org/Documentation/9p#Implementation_Plans
https://gitlab.com/qemu-project/qemu/-/issues/103

It would make sense to look how other file servers deal with the max. amount 
of file descriptors limit before starting to just fight the symptoms. This 
whole reclaim fid stuff in general is PITA.

Best regards,
Christian Schoenebeck




  reply	other threads:[~2022-09-27 17:46 UTC|newest]

Thread overview: 6+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-09-26 12:42 [PATCH 1/1] 9pfs: avoid iterator invalidation in v9fs_mark_fids_unreclaim Linus Heckemann
2022-09-26 16:02 ` Christian Schoenebeck
2022-09-27 13:05   ` Linus Heckemann
2022-09-27 17:14     ` Christian Schoenebeck [this message]
2022-09-27 19:47       ` Greg Kurz
2022-09-28 17:24         ` Christian Schoenebeck

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=3675458.bPJFzOz80O@silver \
    --to=qemu_oss@crudebyte.com \
    --cc=git@sphalerite.org \
    --cc=groug@kaod.org \
    --cc=qemu-block@nongnu.org \
    --cc=qemu-devel@nongnu.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.