From: Gerd Hoffmann <kraxel@redhat.com>
To: Peter Maydell <peter.maydell@linaro.org>
Cc: "QEMU Developers" <qemu-devel@nongnu.org>,
	"Paolo Bonzini" <pbonzini@redhat.com>,
	"Alex Bennée" <alex.bennee@linaro.org>
Subject: Re: race condition in display device caused by run_on_cpu() dropping the iothread lock
Date: Mon, 15 Aug 2022 13:22:39 +0200
Message-ID: <20220815112239.37xm3zwbe5gd7trz@sirius.home.kraxel.org>
In-Reply-To: <CAFEAcA9odnPo2LPip295Uztri7JfoVnQbkJ=Wn+k8dQneB_ynQ@mail.gmail.com>

On Mon, Aug 01, 2022 at 02:23:55PM +0100, Peter Maydell wrote:
> I've been debugging a segfault in the raspi3b display device, and I've
> tracked it down to a race condition, but I'm not sure what the right
> way to fix it is...
> 
> The race is that a vCPU thread is handling a guest register write that
> says "resize the framebuffer", which it implements by calling
> qemu_console_resize().

[ back online after vacation ]

Easiest is probably to not instantly resize the display surface but
let the update handler do that on the next display refresh.

Many display devices do that anyway, because a resize often takes
multiple register updates and you don't want your UI window to run
through all the temporary states ...
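
A rough sketch of that pattern, as a toy model rather than real QEMU
code: FakeDisplay, fake_display_write() and fake_display_update() are
made-up names, and the spot where the geometry is applied only stands in
for where a real device would call qemu_console_resize().  The register
write path just records the new values and sets a flag; the refresh
path applies them once.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical device state; none of these names exist in QEMU. */
typedef struct FakeDisplay {
    uint32_t reg_width, reg_height;   /* latest values written by the guest */
    uint32_t surf_width, surf_height; /* geometry of the current surface */
    bool resize_pending;              /* set on register write, cleared on update */
    unsigned resize_count;            /* how many times the surface was resized */
} FakeDisplay;

enum { REG_WIDTH, REG_HEIGHT };

/* Register write path: only record state, never touch the surface. */
static void fake_display_write(FakeDisplay *d, int reg, uint32_t val)
{
    assert(d != NULL);
    switch (reg) {
    case REG_WIDTH:
        d->reg_width = val;
        break;
    case REG_HEIGHT:
        d->reg_height = val;
        break;
    default:
        return;
    }
    d->resize_pending = true;
}

/* Display refresh path: apply the pending geometry at most once. */
static void fake_display_update(FakeDisplay *d)
{
    assert(d != NULL);
    if (d->resize_pending) {
        d->resize_pending = false;
        if (d->reg_width != d->surf_width ||
            d->reg_height != d->surf_height) {
            /* A real device would resize its console surface here. */
            d->surf_width = d->reg_width;
            d->surf_height = d->reg_height;
            d->resize_count++;
        }
    }
    /* ... then redraw whatever is dirty ... */
}
```

With this shape a width write followed by a height write leads to a
single resize on the next refresh, and the intermediate geometry never
reaches the UI.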

Alternative: The DisplaySurface is backed by pixman images, which are
reference counted.  Some QEMU code that depends on the backing store
staying around while not holding the iothread lock works with the
pixman image directly, because it can simply take a reference to keep
the image from being freed while it is in use.
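
To illustrate the idea only: the sketch below is a single-threaded toy
with invented names (ToyImage, ToySurface, toy_*), not pixman or QEMU
code.  In real code the corresponding calls would be pixman_image_ref()
and pixman_image_unref(), and the reference would presumably have to be
taken while the lock is still held.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Toy stand-in for a reference-counted image. */
typedef struct ToyImage {
    int refcount;
} ToyImage;

/* Toy stand-in for a surface that owns one reference to its image. */
typedef struct ToySurface {
    ToyImage *image;
} ToySurface;

static int toy_live_images; /* bookkeeping so the effect is observable */

static ToyImage *toy_image_create(void)
{
    ToyImage *img = malloc(sizeof(*img));
    assert(img != NULL);
    img->refcount = 1;
    toy_live_images++;
    return img;
}

static ToyImage *toy_image_ref(ToyImage *img)
{
    assert(img->refcount > 0);
    img->refcount++;
    return img;
}

/* Drops one reference; returns true if the image was freed. */
static bool toy_image_unref(ToyImage *img)
{
    assert(img->refcount > 0);
    if (--img->refcount == 0) {
        free(img);
        toy_live_images--;
        return true;
    }
    return false;
}

/* Models a resize: the surface drops its old image and adopts a new one. */
static void toy_surface_replace(ToySurface *s, ToyImage *new_image)
{
    if (s->image) {
        toy_image_unref(s->image);
    }
    s->image = new_image;
}
```

A reader that called toy_image_ref() before the surface was replaced can
keep using the old image afterwards; the old image is only freed once
the reader drops its own reference.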

>  * memory_region_snapshot_and_clear_dirty() ends up calling run_on_cpu(),
>    which briefly drops the iothread lock.

Oh.  Is that new?

> How is this intended to work? I feel like if run_on_cpu() silently
> drops the iothread lock this probably invalidates a lot of assumptions
> that QEMU code makes, especially in this kind of setup where
> the code making the assumptions is several layers in the callstack
> above whatever it is that ends up calling run_on_cpu()...

Indeed.  The display update code paths using dirty bitmap snapshots
certainly don't expect that.

take care,
  Gerd


