From: Peter Maydell <peter.maydell@linaro.org>
To: Gerd Hoffmann <kraxel@redhat.com>
Cc: "QEMU Developers" <qemu-devel@nongnu.org>,
	"Paolo Bonzini" <pbonzini@redhat.com>,
	"Alex Bennée" <alex.bennee@linaro.org>
Subject: Re: race condition in display device caused by run_on_cpu() dropping the iothread lock
Date: Mon, 15 Aug 2022 14:02:47 +0100
Message-ID: <CAFEAcA_vFUfGA+51YT-Up3GCVzQKPtpscXMncqiGd849oC9NkA@mail.gmail.com>
In-Reply-To: <20220815112239.37xm3zwbe5gd7trz@sirius.home.kraxel.org>

On Mon, 15 Aug 2022 at 12:22, Gerd Hoffmann <kraxel@redhat.com> wrote:
>
> On Mon, Aug 01, 2022 at 02:23:55PM +0100, Peter Maydell wrote:
> > I've been debugging a segfault in the raspi3b display device, and I've
> > tracked it down to a race condition, but I'm not sure what the right
> > way to fix it is...
> >
> > The race is that a vCPU thread is handling a guest register write that
> > says "resize the framebuffer", which it implements by calling
> > qemu_console_resize().
>
> [ back online after vacation ]
>
> Easiest is probably to not instantly resize the display surface but
> let the update handler do that on the next display refresh.

I feel like this will fix the immediate crash but doesn't
address the wider underlying problem. (For instance, if the
user does something with the UI at just the wrong moment, the
resulting UI event can probably still get in during the window
where we've dropped the iothread lock.)
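
For reference, a minimal sketch of the deferred-resize idea, assuming
a device model that records the requested size on register writes and
applies it from its display update handler. FbState, fb_write_size(),
fb_update_display() and fake_console_resize() are hypothetical names
for illustration, not QEMU APIs; fake_console_resize() stands in for
qemu_console_resize().

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical device state; none of these names come from QEMU. */
typedef struct FbState {
    uint32_t cur_width, cur_height; /* size the display surface has now */
    uint32_t new_width, new_height; /* size most recently requested by guest */
    bool resize_pending;            /* set by register writes, cleared on refresh */
    unsigned resize_count;          /* demo only: number of real resizes done */
} FbState;

/* Stand-in for the real resize call (qemu_console_resize() in the thread). */
static void fake_console_resize(FbState *s, uint32_t w, uint32_t h)
{
    s->cur_width = w;
    s->cur_height = h;
    s->resize_count++;
}

/* Guest register write path: only record the request, never resize here. */
static void fb_write_size(FbState *s, uint32_t w, uint32_t h)
{
    s->new_width = w;
    s->new_height = h;
    s->resize_pending = true;
}

/* Display refresh path: apply at most one resize before using the surface. */
static void fb_update_display(FbState *s)
{
    if (s->resize_pending) {
        s->resize_pending = false;
        if (s->new_width != s->cur_width || s->new_height != s->cur_height) {
            fake_console_resize(s, s->new_width, s->new_height);
        }
    }
    /* ... redraw here using cur_width / cur_height ... */
}
```

With this shape, several register writes between two refreshes
collapse into a single resize, and the resize happens in the same
code path that reads the surface. It does not by itself close the
window in which the lock is dropped.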

> Many display devices do that anyway because often multiple register
> updates are needed to perform a resize and you don't want your ui
> window run through all the temporary states ...
>
> Alternative: The DisplaySurface is backed by pixman images which are
> reference counted.  Some qemu code which depends on the backing store
> staying around while not holding the iolock works with the pixman image
> directly, because it can just take a reference to keep the image from
> being freed while it is in use.
>
> >  * memory_region_snapshot_and_clear_dirty() ends up calling run_on_cpu(),
> >    which briefly drops the iothread lock.
>
> Oh.  Is that new?

Since commit 9458a9a1df1a4 in 2018.
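
On the reference-counting alternative quoted above: a rough,
single-threaded sketch of why holding a reference keeps the old
backing store alive across a resize. Image, Surface, image_ref(),
image_unref() and surface_resize() are hypothetical stand-ins, not
pixman or QEMU code; with pixman the corresponding calls would be
pixman_image_ref() and pixman_image_unref(). The sketch assumes the
reference is taken while the lock is still held.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical refcounted image, loosely modelled on a pixman image. */
typedef struct Image {
    int refcount;
    int width, height;
    unsigned char *pixels;
} Image;

static int live_images; /* demo only: number of images not yet freed */

static Image *image_new(int width, int height)
{
    Image *img = malloc(sizeof(*img));
    if (!img) {
        return NULL;
    }
    img->pixels = calloc((size_t)width * (size_t)height, 4);
    if (!img->pixels) {
        free(img);
        return NULL;
    }
    img->refcount = 1;
    img->width = width;
    img->height = height;
    live_images++;
    return img;
}

/* Take an extra reference; the caller must drop it with image_unref(). */
static Image *image_ref(Image *img)
{
    img->refcount++;
    return img;
}

/* Drop one reference; the image is freed only when the last one goes. */
static void image_unref(Image *img)
{
    assert(img->refcount > 0);
    if (--img->refcount == 0) {
        free(img->pixels);
        free(img);
        live_images--;
    }
}

/* Hypothetical surface owning exactly one reference to its backing image. */
typedef struct Surface {
    Image *image;
} Surface;

/* Resize swaps in a new image and drops only the surface's own reference. */
static int surface_resize(Surface *s, int width, int height)
{
    Image *fresh = image_new(width, height);
    if (!fresh) {
        return -1;
    }
    image_unref(s->image);
    s->image = fresh;
    return 0;
}
```

A reader that called image_ref() before the lock was dropped still
sees a valid (if outdated) image after surface_resize(), and frees it
with its own image_unref() when done.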

> > How is this intended to work? I feel like if run_on_cpu() silently
> > drops the iothread lock this probably invalidates a lot of assumptions
> > that QEMU code makes, especially in this kind of setup where
> > the code making the assumptions is several layers in the callstack
> > above whatever it is that ends up calling run_on_cpu()...
>
> Indeed.  The display update code paths using dirty bitmap snapshots
> certainly don't expect that.

Yeah. The problem is that to fix the bug that 9458a9a1df1a4 is
trying to fix we really do have to allow guest code to run,
because we need to make sure that the TCG CPU thread has
finished writing to RAM and got out of the generated code
block, otherwise the dirty flag won't be consistent.
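
To make the hazard concrete, here is a deliberately simplified,
single-threaded model (not QEMU code, and not a proposed fix). World,
helper_that_drops_lock(), simulated_resize() and update_display() are
hypothetical names; the while_unlocked hook stands in for whatever
another thread manages to do while the lock is released.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a big lock plus state it is meant to protect. */
typedef struct World {
    bool locked;
    unsigned surface_generation;            /* bumped on every simulated resize */
    void (*while_unlocked)(struct World *); /* what "another thread" does */
} World;

/* Stand-in for a vCPU thread resizing the surface during the window. */
static void simulated_resize(World *w)
{
    w->surface_generation++;
}

/* Stand-in for a helper that briefly drops the lock, as run_on_cpu()
 * is described as doing in this thread. */
static void helper_that_drops_lock(World *w)
{
    assert(w->locked);
    w->locked = false;
    if (w->while_unlocked) {
        w->while_unlocked(w);
    }
    w->locked = true;
}

/* Caller several layers up: samples state, calls the helper, then checks
 * whether what it sampled is still current. Returns true if it is. */
static bool update_display(World *w)
{
    assert(w->locked);
    unsigned seen = w->surface_generation;
    helper_that_drops_lock(w);
    return seen == w->surface_generation;
}
```

With no hook installed update_display() returns true; with
while_unlocked set to simulated_resize it returns false, even though
the caller held the lock both before and after the helper call.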

-- PMM

