From: Tamas K Lengyel <tamas@tklengyel.com>
To: Razvan Cojocaru <rcojocaru@bitdefender.com>
Cc: "Petre Pircalabu" <ppircalabu@bitdefender.com>,
	"Andrei LUTAS" <vlutas@bitdefender.com>,
	"Stefano Stabellini" <sstabellini@kernel.org>,
	"Julien Grall" <julien@xen.org>, "Wei Liu" <wl@xen.org>,
	"Andrew Cooper" <andrew.cooper3@citrix.com>,
	"Mihai Donțu" <mdontu@bitdefender.com>,
	"Ian Jackson" <ian.jackson@eu.citrix.com>,
	"George Dunlap" <george.dunlap@citrix.com>,
	"Jan Beulich" <jbeulich@suse.com>,
	"Alexandru Isaila" <aisaila@bitdefender.com>,
	Xen-devel <xen-devel@lists.xenproject.org>,
	"Roger Pau Monné" <roger.pau@citrix.com>
Subject: Re: [PATCH v4 for-4.14] x86/monitor: revert default behavior when monitoring register write events
Date: Mon, 8 Jun 2020 13:54:51 -0600	[thread overview]
Message-ID: <CABfawhnNC3yCuG+xNicyjA_Qo89qpvXKL-Cp9wAc4Cq=Xv8BYQ@mail.gmail.com> (raw)
In-Reply-To: <ffa44e09-a9fd-8fff-16af-e0991db3cb9b@bitdefender.com>

On Mon, Jun 8, 2020 at 12:58 PM Razvan Cojocaru
<rcojocaru@bitdefender.com> wrote:
>
> On 6/8/20 6:55 PM, Jan Beulich wrote:
> > On 03.06.2020 17:07, Roger Pau Monné wrote:
> >> On Wed, Jun 03, 2020 at 06:52:37AM -0600, Tamas K Lengyel wrote:
> >>> For the last couple of years we have received numerous reports from users of
> >>> monitor vm_events of spurious guest crashes when using events. In particular,
> >>> it has been observed that the problem occurs when vm_events are being disabled.
> >>> The nature of the guest crash varied widely and has only occurred occasionally.
> >>> This made debugging the issue particularly hard. We had discussions about this
> >>> issue even here on the xen-devel mailing list with no luck figuring it out.
> >>>
> >>> The bug has now been identified as a race-condition between register event
> >>> handling and disabling the monitor vm_event interface. The default behavior
> >>> regarding emulation of register write events is changed so that they get
> >>> postponed until the corresponding vm_event handler decides whether to allow such
> >>> write to take place. Unfortunately this can only be implemented by performing the
> >>> deny/allow step when the vCPU gets scheduled.
> >>>
> >>> Due to the postponed emulation of the event, if the user decides to pause the
> >>> VM in the vm_event handler and then disable events, the entire emulation step
> >>> is skipped the next time the vCPU is resumed. Even if the user doesn't pause
> >>> during the vm_event handling but exits immediately and disables vm_event, the
> >>> situation becomes racy as disabling vm_event may succeed before the guest's
> >>> vCPUs get scheduled with the pending emulation task. This has been particularly
> >>> the case with VMs that have several vCPUs, as after the VM is unpaused it may
> >>> actually take a long time before all vCPUs get scheduled.
> >>>
> >>> In this patch we are reverting the default behavior to always perform emulation
> >>> of register write events when the event occurs. Postponing them can be turned
> >>> on as an option. In that case the user of the interface still has to take care
> >>> of only disabling the interface when it's safe, as it remains buggy.
> >>>
> >>> Fixes: 96760e2fba10 ("vm_event: deny register writes if refused by vm_event
> >>> reply")
> >>>
> >>> Signed-off-by: Tamas K Lengyel <tamas@tklengyel.com>
> >>
> >> Thanks!
> >>
> >> Reviewed-by: Roger Pau Monné <rogerpau@citrix.com>
> >>
> >> I would like to get some input from Bitdefender really, and whether
> >> they are fine with this approach.
>
> Hello,
>
> Not really my call to make anymore, but I do have a few notes.
>
> First, IIRC the problem stems from the initial choice to have the
> vm_event data allocated on-demand when first subscribing to events. The
> proper solution (since this patch doesn't actually fix the problem),
> IMHO, would be for the vm_event data to _always_ exist, and instead of
> relying on the value of its pointer to check if there are event
> subscribers, we could just check the emulation flags individually and
> never miss a pending emulated something again. I did try to go that way
> in the beginning, but it has reasonably been objected that we should cut
> back on using hypervisor memory unnecessarily, hence we got to this point.
>
> Secondly, I see no reason why we couldn't adapt to the new default
> behaviour provided that the old behaviour continues to work _exactly_ as
> before.
>
> And last but not least, the proper sequence is: 1. unsubscribe from
> register write events, 2. process all events "still in the chamber"
> (keep checking the ring buffer for a while), 3. detach from the guest
> (disable the vm_event subsystem). Not ideal perhaps (in that it's not
> guaranteed that a VCPU won't resume after a longer period than our
> timeout), but if the sequence is followed there should be no guest hangs
> or crashes (at least none that we or our clients have observed so far).

Incorrect. That's not enough. You also have to wait for all the vCPUs
to get scheduled before disabling vm_event, otherwise the emulation is
skipped entirely. Please read the patch message. If the user decides
to disable vm_event after getting a CR3 event delivered, the CR3 never
gets updated, resulting in the guest crashing in unpredictable ways.
The same happens with all the other registers.
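The lost-write ordering can be sketched with a toy model (plain Python,
not Xen code; all names here are illustrative, not Xen's actual ones):

```python
# Toy model of the race: a register write is postponed at event-delivery
# time and only applied when the vCPU is next scheduled, but disabling
# vm_event frees that pending state first.

class VmEventState:
    def __init__(self):
        self.pending_cr3 = None  # deferred write recorded at event time

class Vcpu:
    def __init__(self):
        self.cr3 = 0x1000
        self.vm_event = VmEventState()  # freed (None) when vm_event is disabled

def on_cr3_write_event(vcpu, new_cr3):
    # Postponed default: record the write, wait for the handler's verdict.
    vcpu.vm_event.pending_cr3 = new_cr3

def vm_event_disable(vcpu):
    vcpu.vm_event = None  # the pending emulation is lost with the state

def vcpu_scheduled(vcpu):
    # The deferred write is only applied if the state still exists.
    if vcpu.vm_event and vcpu.vm_event.pending_cr3 is not None:
        vcpu.cr3 = vcpu.vm_event.pending_cr3
        vcpu.vm_event.pending_cr3 = None

v = Vcpu()
on_cr3_write_event(v, 0x2000)
vm_event_disable(v)   # wins the race against the vCPU getting scheduled...
vcpu_scheduled(v)     # ...so the CR3 update is silently skipped
print(hex(v.cr3))     # -> 0x1000, stale CR3: the guest crashes unpredictably
```

Waiting for every vCPU to be scheduled before the disable step closes
this window, which is exactly the extra condition described above.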

>
> So in short, I think there's a better fix for this by simply not
> allocating the vm_event memory on-demand anymore and never having to
> deal with lost pending emulations again. It should also decrease code
> complexity by a tiny bit. Then again, as stated at the beginning of this
> message, that's just a recommendation.
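The always-allocated alternative described above could look roughly
like this toy model (plain Python, not Xen code; names are illustrative,
loosely inspired by Xen's struct monitor_write_data):

```python
# Toy model of the suggested fix: keep the write-tracking state embedded
# in the vCPU so it always exists, and key the resume path off individual
# pending flags rather than off a pointer that may have been freed.

class Vcpu:
    def __init__(self):
        self.cr3 = 0x1000
        self.pending_cr3 = None  # lives in the vCPU itself, never freed

def on_cr3_write_event(vcpu, new_cr3):
    vcpu.pending_cr3 = new_cr3

def vm_event_disable(vcpu):
    pass  # nothing to free; pending emulations survive teardown

def vcpu_scheduled(vcpu):
    if vcpu.pending_cr3 is not None:  # flag check, no pointer involved
        vcpu.cr3 = vcpu.pending_cr3
        vcpu.pending_cr3 = None

v = Vcpu()
on_cr3_write_event(v, 0x2000)
vm_event_disable(v)     # same ordering as the buggy case...
vcpu_scheduled(v)
assert v.cr3 == 0x2000  # ...but the write is no longer lost
```

The trade-off, as noted above, is that the state is allocated for every
vCPU whether or not anyone subscribes to events.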

Since only you guys use this feature, I'm going to wait for a fix.
Until then, the default behavior should be restored so this buggy
behavior doesn't affect anyone else. You can still turn it on, it's
just not going to be on for vm_event by default. I don't particularly
care what fix is there since only you guys use it. If you don't mind
that there is this buggy behavior because you never disable vm_event
once you activate it, then that's that.

Cheers,
Tamas


Thread overview: 15+ messages
2020-06-03 12:52 [PATCH v4 for-4.14] x86/monitor: revert default behavior when monitoring register write events Tamas K Lengyel
2020-06-03 15:07 ` Roger Pau Monné
2020-06-08 15:55   ` Jan Beulich
2020-06-08 18:58     ` Razvan Cojocaru
2020-06-08 19:54       ` Tamas K Lengyel [this message]
2020-06-08 20:14         ` Razvan Cojocaru
2020-06-08 20:44           ` Tamas K Lengyel
2020-06-08 21:16             ` Razvan Cojocaru
2020-06-08 22:50               ` Tamas K Lengyel
2020-06-08 23:14                 ` Razvan Cojocaru
2020-06-08 23:41                   ` Tamas K Lengyel
2020-06-09  6:28       ` Jan Beulich
2020-06-09  9:37 ` Jan Beulich
2020-06-09  9:48   ` Paul Durrant
2020-06-10 16:35     ` Tamas K Lengyel
