linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Maxim Levitsky <mlevitsk@redhat.com>
To: Nicolas Saenz Julienne <nsaenz@amazon.com>,
	Alexander Graf <graf@amazon.com>,
	kvm@vger.kernel.org
Cc: linux-kernel@vger.kernel.org, linux-hyperv@vger.kernel.org,
	pbonzini@redhat.com, seanjc@google.com, vkuznets@redhat.com,
	anelkz@amazon.com, dwmw@amazon.co.uk, jgowans@amazon.com,
	corbert@lwn.net, kys@microsoft.com, haiyangz@microsoft.com,
	decui@microsoft.com, x86@kernel.org, linux-doc@vger.kernel.org
Subject: Re: [RFC 30/33] KVM: x86: hyper-v: Introduce KVM_REQ_HV_INJECT_INTERCEPT request
Date: Tue, 28 Nov 2023 10:19:24 +0200	[thread overview]
Message-ID: <a19e0fd07f591dc768403544f92227b1121e068d.camel@redhat.com> (raw)
In-Reply-To: <CWTH00RO3SCI.31S210JQ8XP8J@amazon.com>

On Wed, 2023-11-08 at 13:38 +0000, Nicolas Saenz Julienne wrote:
> On Wed Nov 8, 2023 at 12:45 PM UTC, Alexander Graf wrote:
> > On 08.11.23 12:18, Nicolas Saenz Julienne wrote:
> > > Introduce a new request type, KVM_REQ_HV_INJECT_INTERCEPT which allows
> > > injecting out-of-band Hyper-V secure intercepts. For now only memory
> > > access intercepts are supported. These are triggered when access a GPA
> > > protected by a higher VTL. The memory intercept metadata is filled based
> > > on the GPA provided through struct kvm_vcpu_hv_intercept_info, and
> > > injected into the guest through SynIC message.
> > > 
> > > Signed-off-by: Nicolas Saenz Julienne <nsaenz@amazon.com>
> > 
> > IMHO memory protection violations should result in a user space exit. 
> 
> It already does, it's not very explicit from the patch itself, since the
> functionality was introduced in through the "KVM: guest_memfd() and
> per-page attributes" series [1].
> 
> See this snippet in patch #27:
> 
> +	if (kvm_hv_vsm_enabled(vcpu->kvm)) {
> +		if (kvm_hv_faultin_pfn(vcpu, fault)) {
> +			kvm_mmu_prepare_memory_fault_exit(vcpu, fault);
> +			return -EFAULT;
> +		}
> +	}
> 
> Otherwise the doc in patch #33 also mentions this. :)
> 
> > User space can then validate what to do with the violation and if 
> > necessary inject an intercept.
> 
> I do agree that secure intercept injection should be moved into to
> user-space, and happen as a reaction to a user-space memory fault exit.
> I was unable to do so yet, since the intercepts require a level of
> introspection that is not yet available to QEMU. For example, providing
> the length of the instruction that caused the fault. I'll work on
> exposing the necessary information to user-space and move the whole
> intercept concept there.

All the missing information should be included in the new userspace VM exit payload.


Also I would like to share my knowledge of SYNIC and how to deal with it in userspace,
because you will have to send lots of SYNC messages from userspace if we go with
the suggested approach of doing it in the userspace.

- SYNIC has one message per channel, so there is no way to queue more than one
message in the same channel. Usually only channel 0 is used (but I haven't researched
this much).

- In-kernel STIMER emulation queues Synic messages, but it always does this in
the vCPU thread by processing the request 'KVM_REQ_HV_STIMER', and when userspace
wants to queue something with SYNIC it also does this on vCPU thread, this is how
races are avoided.

kvm_hv_process_stimers -> stimer_expiration -> stimer_send_msg(stimer);

If the delivery fails (that is if SYNIC slot already has a message pending there),
then the timer remains pending and the next KVM_REQ_HV_STIMER request will attempt to
deliver it again.
 
After the guest processes a SYNIC message, it erases it by overwriting its message type with 0,
and then the guest notifies the hypervisor about a free slot by either doing a write
to a special MSR (HV_X64_MSR_EOM) or by EOI'ing the APIC interrupt.

According to my observation windows uses the second approach (EOI),
which thankfully works even on AVIC because the Sync Interrupts happen to be level triggered,
and AVIC does intercept level triggered EOI.

Once intercepted the EOI event triggers a delivery of an another stimer message via the vCPU thread,
by raising another KVM_REQ_HV_STIMER request on it.

kvm_hv_notify_acked_sint -> stimer_mark_pending -> kvm_make_request(KVM_REQ_HV_STIMER, vcpu);


Now if the userspace faces already full SYNIC slot, it has to wait, and I don't know if
it can be notified of an EOI or it busy waits somehow.

Note that Qemu's VMBUS/SYNC implementation was never tested in production IMHO, it was once implemented
and is only used currently by a few unit tests.

It might make sense to add userspace SYNIC message queuing to the kernel,
so that userspace could queue as many messages as it wants and let the kernel
copy the first message in the queue to the actual SYNIC slot, every time
it becomes free.


Final note on SYNIC is that qemu's synic code actually installs an overlay
page over sync slots area and writes to it when it queues a message, 
but the in-kernel stimer code just writes to the GPA regardless
if there is an overlay memslot or not.

Another benefit of a proper way of queuing SYNIC messages from the userspace,
is that it might enable the kernel's STIMER to queue the SYNIC message
directly from the timer interrupt routine, which will remove about 1000 vmexits per second that
are caused by KVM_REQ_HV_STIMER on vCPU0, even when posted timer interrupts are used.

I can implement this if you think that this makes sense.

Those are my 0.2 cents.

Best regards,
	Maxim Levitsky


> 
> Nicolas
> 
> [1] https://lore.kernel.org/lkml/20231105163040.14904-1-pbonzini@redhat.com/.
> 



  reply	other threads:[~2023-11-28  8:19 UTC|newest]

Thread overview: 108+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-11-08 11:17 [RFC 0/33] KVM: x86: hyperv: Introduce VSM support Nicolas Saenz Julienne
2023-11-08 11:17 ` [RFC 01/33] KVM: x86: Decouple lapic.h from hyperv.h Nicolas Saenz Julienne
2023-11-08 16:11   ` Sean Christopherson
2023-11-08 11:17 ` [RFC 02/33] KVM: x86: Introduce KVM_CAP_APIC_ID_GROUPS Nicolas Saenz Julienne
2023-11-08 12:11   ` Alexander Graf
2023-11-08 17:47   ` Sean Christopherson
2023-11-10 18:46     ` Nicolas Saenz Julienne
2023-11-28  6:56   ` Maxim Levitsky
2023-12-01 15:25     ` Nicolas Saenz Julienne
2023-11-08 11:17 ` [RFC 03/33] KVM: x86: hyper-v: Introduce XMM output support Nicolas Saenz Julienne
2023-11-08 11:44   ` Alexander Graf
2023-11-08 12:11     ` Vitaly Kuznetsov
2023-11-08 12:16       ` Alexander Graf
2023-11-28  6:57         ` Maxim Levitsky
2023-11-08 11:17 ` [RFC 04/33] KVM: x86: hyper-v: Move hypercall page handling into separate function Nicolas Saenz Julienne
2023-11-28  7:01   ` Maxim Levitsky
2023-11-08 11:17 ` [RFC 05/33] KVM: x86: hyper-v: Introduce VTL call/return prologues in hypercall page Nicolas Saenz Julienne
2023-11-08 11:53   ` Alexander Graf
2023-11-08 14:10     ` Nicolas Saenz Julienne
2023-11-28  7:08   ` Maxim Levitsky
2023-11-28 16:33     ` Sean Christopherson
2023-12-01 16:19     ` Nicolas Saenz Julienne
2023-12-01 16:32       ` Sean Christopherson
2023-12-01 16:50         ` Nicolas Saenz Julienne
2023-12-01 17:47           ` Sean Christopherson
2023-12-01 18:15             ` Nicolas Saenz Julienne
2023-12-05 19:21               ` Sean Christopherson
2023-12-05 20:04                 ` Maxim Levitsky
2023-12-06  0:07                   ` Sean Christopherson
2023-12-06 16:19                     ` Maxim Levitsky
2023-11-08 11:17 ` [RFC 06/33] KVM: x86: hyper-v: Introduce VTL awareness to Hyper-V's PV-IPIs Nicolas Saenz Julienne
2023-11-28  7:14   ` Maxim Levitsky
2023-12-01 16:31     ` Nicolas Saenz Julienne
2023-12-05 15:02       ` Maxim Levitsky
2023-11-08 11:17 ` [RFC 07/33] KVM: x86: hyper-v: Introduce KVM_CAP_HYPERV_VSM Nicolas Saenz Julienne
2023-11-28  7:16   ` Maxim Levitsky
2023-11-08 11:17 ` [RFC 08/33] KVM: x86: Don't use hv_timer if CAP_HYPERV_VSM enabled Nicolas Saenz Julienne
2023-11-28  7:21   ` Maxim Levitsky
2023-11-08 11:17 ` [RFC 09/33] KVM: x86: hyper-v: Introduce per-VTL vcpu helpers Nicolas Saenz Julienne
2023-11-08 12:21   ` Alexander Graf
2023-11-08 14:04     ` Nicolas Saenz Julienne
2023-11-28  7:25   ` Maxim Levitsky
2023-11-08 11:17 ` [RFC 10/33] KVM: x86: hyper-v: Introduce KVM_HV_GET_VSM_STATE Nicolas Saenz Julienne
2023-11-28  7:26   ` Maxim Levitsky
2023-11-08 11:17 ` [RFC 11/33] KVM: x86: hyper-v: Handle GET/SET_VP_REGISTER hcall in user-space Nicolas Saenz Julienne
2023-11-08 12:14   ` Alexander Graf
2023-11-28  7:26     ` Maxim Levitsky
2023-11-08 11:17 ` [RFC 12/33] KVM: x86: hyper-v: Handle VSM hcalls " Nicolas Saenz Julienne
2023-11-28  7:28   ` Maxim Levitsky
2023-11-08 11:17 ` [RFC 13/33] KVM: Allow polling vCPUs for events Nicolas Saenz Julienne
2023-11-28  7:30   ` Maxim Levitsky
2023-11-08 11:17 ` [RFC 14/33] KVM: x86: Add VTL to the MMU role Nicolas Saenz Julienne
2023-11-08 17:26   ` Sean Christopherson
2023-11-10 18:52     ` Nicolas Saenz Julienne
2023-11-28  7:34       ` Maxim Levitsky
2023-11-08 11:17 ` [RFC 15/33] KVM: x86/mmu: Introduce infrastructure to handle non-executable faults Nicolas Saenz Julienne
2023-11-28  7:34   ` Maxim Levitsky
2023-11-08 11:17 ` [RFC 16/33] KVM: x86/mmu: Expose R/W/X flags during memory fault exits Nicolas Saenz Julienne
2023-11-28  7:36   ` Maxim Levitsky
2023-11-28 16:31     ` Sean Christopherson
2023-11-08 11:17 ` [RFC 17/33] KVM: x86/mmu: Allow setting memory attributes if VSM enabled Nicolas Saenz Julienne
2023-11-28  7:39   ` Maxim Levitsky
2023-11-08 11:17 ` [RFC 18/33] KVM: x86: Decouple kvm_get_memory_attributes() from struct kvm's mem_attr_array Nicolas Saenz Julienne
2023-11-08 16:59   ` Sean Christopherson
2023-11-28  7:41   ` Maxim Levitsky
2023-11-08 11:17 ` [RFC 19/33] KVM: x86: Decouple kvm_range_has_memory_attributes() " Nicolas Saenz Julienne
2023-11-28  7:42   ` Maxim Levitsky
2023-11-08 11:17 ` [RFC 20/33] KVM: x86/mmu: Decouple hugepage_has_attrs() " Nicolas Saenz Julienne
2023-11-28  7:43   ` Maxim Levitsky
2023-11-08 11:17 ` [RFC 21/33] KVM: Pass memory attribute array as a MMU notifier argument Nicolas Saenz Julienne
2023-11-08 17:08   ` Sean Christopherson
2023-11-08 11:17 ` [RFC 22/33] KVM: Decouple kvm_ioctl_set_mem_attributes() from kvm's mem_attr_array Nicolas Saenz Julienne
2023-11-08 11:17 ` [RFC 23/33] KVM: Expose memory attribute helper functions unanimously Nicolas Saenz Julienne
2023-11-08 11:17 ` [RFC 24/33] KVM: x86: hyper-v: Introduce KVM VTL device Nicolas Saenz Julienne
2023-11-08 11:17 ` [RFC 25/33] KVM: Introduce a set of new memory attributes Nicolas Saenz Julienne
2023-11-08 12:30   ` Alexander Graf
2023-11-08 16:43     ` Sean Christopherson
2023-11-08 11:17 ` [RFC 26/33] KVM: x86: hyper-vsm: Allow setting per-VTL " Nicolas Saenz Julienne
2023-11-28  7:44   ` Maxim Levitsky
2023-11-08 11:18 ` [RFC 27/33] KVM: x86/mmu/hyper-v: Validate memory faults against per-VTL memprots Nicolas Saenz Julienne
2023-11-28  7:46   ` Maxim Levitsky
2023-11-08 11:18 ` [RFC 28/33] x86/hyper-v: Introduce memory intercept message structure Nicolas Saenz Julienne
2023-11-28  7:53   ` Maxim Levitsky
2023-11-08 11:18 ` [RFC 29/33] KVM: VMX: Save instruction length on EPT violation Nicolas Saenz Julienne
2023-11-08 12:40   ` Alexander Graf
2023-11-08 16:15     ` Sean Christopherson
2023-11-08 17:11       ` Alexander Graf
2023-11-08 17:20   ` Sean Christopherson
2023-11-08 17:27     ` Alexander Graf
2023-11-08 18:19       ` Jim Mattson
2023-11-08 11:18 ` [RFC 30/33] KVM: x86: hyper-v: Introduce KVM_REQ_HV_INJECT_INTERCEPT request Nicolas Saenz Julienne
2023-11-08 12:45   ` Alexander Graf
2023-11-08 13:38     ` Nicolas Saenz Julienne
2023-11-28  8:19       ` Maxim Levitsky [this message]
2023-11-08 11:18 ` [RFC 31/33] KVM: x86: hyper-v: Inject intercept on VTL memory protection fault Nicolas Saenz Julienne
2023-11-08 11:18 ` [RFC 32/33] KVM: x86: hyper-v: Implement HVCALL_TRANSLATE_VIRTUAL_ADDRESS Nicolas Saenz Julienne
2023-11-08 12:49   ` Alexander Graf
2023-11-08 13:44     ` Nicolas Saenz Julienne
2023-11-08 11:18 ` [RFC 33/33] Documentation: KVM: Introduce "Emulating Hyper-V VSM with KVM" Nicolas Saenz Julienne
2023-11-28  8:19   ` Maxim Levitsky
2023-11-08 11:40 ` [RFC 0/33] KVM: x86: hyperv: Introduce VSM support Alexander Graf
2023-11-08 14:41   ` Nicolas Saenz Julienne
2023-11-08 16:55 ` Sean Christopherson
2023-11-08 18:33   ` Sean Christopherson
2023-11-10 17:56     ` Nicolas Saenz Julienne
2023-11-10 19:32       ` Sean Christopherson
2023-11-11 11:55         ` Nicolas Saenz Julienne
2023-11-10 19:04   ` Nicolas Saenz Julienne

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=a19e0fd07f591dc768403544f92227b1121e068d.camel@redhat.com \
    --to=mlevitsk@redhat.com \
    --cc=anelkz@amazon.com \
    --cc=corbert@lwn.net \
    --cc=decui@microsoft.com \
    --cc=dwmw@amazon.co.uk \
    --cc=graf@amazon.com \
    --cc=haiyangz@microsoft.com \
    --cc=jgowans@amazon.com \
    --cc=kvm@vger.kernel.org \
    --cc=kys@microsoft.com \
    --cc=linux-doc@vger.kernel.org \
    --cc=linux-hyperv@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=nsaenz@amazon.com \
    --cc=pbonzini@redhat.com \
    --cc=seanjc@google.com \
    --cc=vkuznets@redhat.com \
    --cc=x86@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).