From: Nicolas Saenz Julienne <nsaenz@amazon.com>
To: Maxim Levitsky <mlevitsk@redhat.com>, <kvm@vger.kernel.org>
Cc: <linux-kernel@vger.kernel.org>, <linux-hyperv@vger.kernel.org>,
	<pbonzini@redhat.com>, <seanjc@google.com>, <vkuznets@redhat.com>,
	<anelkz@amazon.com>, <graf@amazon.com>, <dwmw@amazon.co.uk>,
	<jgowans@amazon.com>, <corbert@lwn.net>, <kys@microsoft.com>,
	<haiyangz@microsoft.com>, <decui@microsoft.com>, <x86@kernel.org>,
	<linux-doc@vger.kernel.org>, Anel Orazgaliyeva <anelkz@amazon.de>
Subject: Re: [RFC 02/33] KVM: x86: Introduce KVM_CAP_APIC_ID_GROUPS
Date: Fri, 1 Dec 2023 15:25:06 +0000	[thread overview]
Message-ID: <CXD3O3XBHKZO.22U5VF0HFBTC9@amazon.com> (raw)
In-Reply-To: <98eee37ed7f4b7b9c16bccbe41737e47a116d1f1.camel@redhat.com>

Hi Maxim,

On Tue Nov 28, 2023 at 6:56 AM UTC, Maxim Levitsky wrote:
> On Wed, 2023-11-08 at 11:17 +0000, Nicolas Saenz Julienne wrote:
> > From: Anel Orazgaliyeva <anelkz@amazon.de>
> >
> > Introduce KVM_CAP_APIC_ID_GROUPS, a capability that splits the VM's APIC
> > ids into two fields. The lower bits, the physical APIC id, represent the
> > part that's exposed to the guest. The higher bits, which are private to
> > KVM, group APICs together. APICs in different groups are isolated from
> > each other, and IPIs can only be directed at APICs that share the same
> > group as their source. Furthermore, groups are only relevant to IPIs:
> > anything incoming from outside the local APIC complex (the IOAPIC, MSIs,
> > or PV-IPIs) is targeted at the default APIC group, group 0.
> >
> > When routing IPIs with physical destinations, KVM will OR the source
> > vCPU's APIC group with the ICR's destination ID and use that to resolve
> > the target LAPIC. The APIC physical map is also made group-aware in
> > order to speed up this process. For the sake of simplicity, the logical
> > map is not built while KVM_CAP_APIC_ID_GROUPS is in use, and IPI routing
> > falls back to the slower per-vCPU scan method.
> >
> > This capability serves as a building block to implement virtualisation-
> > based security features like Hyper-V's Virtual Secure Mode (VSM). VSM
> > introduces a para-virtualised switch that allows guest CPUs to jump into
> > a different execution context; the switch brings in a different CPU
> > state, LAPIC state, and set of memory protections. We model this in KVM
> > by using distinct kvm_vcpus for each context. Moreover, execution
> > contexts are hierarchical, and their APICs are meant to remain functional
> > even when the context isn't 'scheduled in'. For example, we have to keep
> > track of timer expirations, and interrupt the execution of lower-priority
> > contexts when relevant. Hence the need to alias physical APIC ids, while
> > keeping the ability to target specific execution contexts.
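
For reference, the routing described above boils down to roughly the
following standalone sketch; the split, masks and helper names are
illustrative only, not the actual KVM code or ABI:

    #include <stdint.h>

    /* APIC id = [ group bits (KVM-private) | physical id (guest-visible) ].
     * The split below is an example value, not the one configured via the
     * capability. */
    #define APIC_GROUP_SHIFT  24u
    #define APIC_PHYS_ID_MASK ((1u << APIC_GROUP_SHIFT) - 1)

    static inline uint32_t apic_group(uint32_t apic_id)
    {
            return apic_id >> APIC_GROUP_SHIFT;
    }

    /* Physical-destination IPI: OR the source vCPU's group into the ICR
     * destination id before resolving the target LAPIC. */
    static inline uint32_t resolve_phys_dest(uint32_t src_apic_id,
                                             uint32_t icr_dest_id)
    {
            return (apic_group(src_apic_id) << APIC_GROUP_SHIFT) |
                   (icr_dest_id & APIC_PHYS_ID_MASK);
    }

    /* IOAPIC/MSI/PV-IPI traffic carries no group bits and therefore
     * always resolves within the default group 0. */
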
>
>
> A few general remarks on this patch (assuming that we don't go with
> the approach of a VM per VTL, in which case this patch is not needed)
>
> -> This feature has to be done in the kernel because vCPUs sharing the same
>    VTL will have the same APIC ID.
>    (In addition to that, APIC state is private to a VTL, so each VTL
>    can even change its APIC ID.)
>
>    Because of this, KVM has to have at least some awareness of it.
>
> -> APICv/AVIC should be supported with VTL eventually:
>    This is thankfully possible by having separate physid/pid tables per VTL,
>    and will mostly just work but needs KVM awareness.
>
> -> I am somewhat against reserving bits in the APIC ID, because that will
>    limit the number of APIC ID bits available to userspace. Currently this
>    is not a problem, but it might become one in the future if, for some
>    reason, userspace wants an APIC ID with the high bits set.
>
>    Still, things change, and with this being part of KVM's ABI, it might
>    backfire. A better idea IMHO is to have 'APIC namespaces', akin to PID
>    namespaces: each namespace is isolated IPI-wise, and each vCPU belongs
>    to exactly one namespace.
>
>    In fact, Intel's PRM briefly mentions a 'hierarchical cluster' mode,
>    which roughly describes this situation: there are multiple,
>    non-interconnected APIC buses, and communication between them needs a
>    'cluster manager device'.
>
>    However, I don't think that we need explicit pairing of vCPUs or VTL
>    awareness in the kernel; all of this, I think, can be done in userspace.
>
>    TL;DR: Let's have APIC namespaces. A vCPU belongs to a single namespace,
>    and all vCPUs in a namespace can send IPIs to each other while knowing
>    nothing about vCPUs from other namespaces.
>
>    A vCPU can thankfully only send an IPI to a different VTL via a
>    hypercall, and thus this can be handled in userspace.
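
To make the namespace semantics concrete, here's a rough sketch; the types
and field names are made up for illustration, not a proposal for the actual
structs:

    #include <stdbool.h>
    #include <stdint.h>

    /* Illustrative only: each vCPU belongs to exactly one APIC namespace
     * (e.g. one namespace per VTL).  IPIs only resolve among vCPUs of the
     * same namespace; no APIC ID bits are reserved for grouping. */
    struct vcpu_apic {
            uint32_t apic_id;   /* full guest-visible APIC ID */
            uint32_t apic_ns;   /* KVM-private namespace ID   */
    };

    static inline bool ipi_visible(const struct vcpu_apic *src,
                                   const struct vcpu_apic *dst)
    {
            /* Namespace check first; the usual physical/logical
             * destination matching would follow. */
            return src->apic_ns == dst->apic_ns;
    }

    /* Cross-VTL IPIs can only be requested via a hypercall, which exits
     * to userspace, so the kernel never routes across namespaces. */
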
>
>
> Overall though, IMHO the approach of a VM per VTL is better, unless some
> showstoppers show up. If we go with a VM per VTL, we gain APIC namespaces
> for free, together with AVIC support and such.


Thanks for the thorough review! I took note of all your design comments
(here and in subsequent patches).

I agree that the way to go is the VM per VTL approach. I'll prepare a
PoC as soon as I'm back from the holidays and share my results.

Nicolas


Thread overview: 108+ messages
2023-11-08 11:17 [RFC 0/33] KVM: x86: hyperv: Introduce VSM support Nicolas Saenz Julienne
2023-11-08 11:17 ` [RFC 01/33] KVM: x86: Decouple lapic.h from hyperv.h Nicolas Saenz Julienne
2023-11-08 16:11   ` Sean Christopherson
2023-11-08 11:17 ` [RFC 02/33] KVM: x86: Introduce KVM_CAP_APIC_ID_GROUPS Nicolas Saenz Julienne
2023-11-08 12:11   ` Alexander Graf
2023-11-08 17:47   ` Sean Christopherson
2023-11-10 18:46     ` Nicolas Saenz Julienne
2023-11-28  6:56   ` Maxim Levitsky
2023-12-01 15:25     ` Nicolas Saenz Julienne [this message]
2023-11-08 11:17 ` [RFC 03/33] KVM: x86: hyper-v: Introduce XMM output support Nicolas Saenz Julienne
2023-11-08 11:44   ` Alexander Graf
2023-11-08 12:11     ` Vitaly Kuznetsov
2023-11-08 12:16       ` Alexander Graf
2023-11-28  6:57         ` Maxim Levitsky
2023-11-08 11:17 ` [RFC 04/33] KVM: x86: hyper-v: Move hypercall page handling into separate function Nicolas Saenz Julienne
2023-11-28  7:01   ` Maxim Levitsky
2023-11-08 11:17 ` [RFC 05/33] KVM: x86: hyper-v: Introduce VTL call/return prologues in hypercall page Nicolas Saenz Julienne
2023-11-08 11:53   ` Alexander Graf
2023-11-08 14:10     ` Nicolas Saenz Julienne
2023-11-28  7:08   ` Maxim Levitsky
2023-11-28 16:33     ` Sean Christopherson
2023-12-01 16:19     ` Nicolas Saenz Julienne
2023-12-01 16:32       ` Sean Christopherson
2023-12-01 16:50         ` Nicolas Saenz Julienne
2023-12-01 17:47           ` Sean Christopherson
2023-12-01 18:15             ` Nicolas Saenz Julienne
2023-12-05 19:21               ` Sean Christopherson
2023-12-05 20:04                 ` Maxim Levitsky
2023-12-06  0:07                   ` Sean Christopherson
2023-12-06 16:19                     ` Maxim Levitsky
2023-11-08 11:17 ` [RFC 06/33] KVM: x86: hyper-v: Introduce VTL awareness to Hyper-V's PV-IPIs Nicolas Saenz Julienne
2023-11-28  7:14   ` Maxim Levitsky
2023-12-01 16:31     ` Nicolas Saenz Julienne
2023-12-05 15:02       ` Maxim Levitsky
2023-11-08 11:17 ` [RFC 07/33] KVM: x86: hyper-v: Introduce KVM_CAP_HYPERV_VSM Nicolas Saenz Julienne
2023-11-28  7:16   ` Maxim Levitsky
2023-11-08 11:17 ` [RFC 08/33] KVM: x86: Don't use hv_timer if CAP_HYPERV_VSM enabled Nicolas Saenz Julienne
2023-11-28  7:21   ` Maxim Levitsky
2023-11-08 11:17 ` [RFC 09/33] KVM: x86: hyper-v: Introduce per-VTL vcpu helpers Nicolas Saenz Julienne
2023-11-08 12:21   ` Alexander Graf
2023-11-08 14:04     ` Nicolas Saenz Julienne
2023-11-28  7:25   ` Maxim Levitsky
2023-11-08 11:17 ` [RFC 10/33] KVM: x86: hyper-v: Introduce KVM_HV_GET_VSM_STATE Nicolas Saenz Julienne
2023-11-28  7:26   ` Maxim Levitsky
2023-11-08 11:17 ` [RFC 11/33] KVM: x86: hyper-v: Handle GET/SET_VP_REGISTER hcall in user-space Nicolas Saenz Julienne
2023-11-08 12:14   ` Alexander Graf
2023-11-28  7:26     ` Maxim Levitsky
2023-11-08 11:17 ` [RFC 12/33] KVM: x86: hyper-v: Handle VSM hcalls " Nicolas Saenz Julienne
2023-11-28  7:28   ` Maxim Levitsky
2023-11-08 11:17 ` [RFC 13/33] KVM: Allow polling vCPUs for events Nicolas Saenz Julienne
2023-11-28  7:30   ` Maxim Levitsky
2023-11-08 11:17 ` [RFC 14/33] KVM: x86: Add VTL to the MMU role Nicolas Saenz Julienne
2023-11-08 17:26   ` Sean Christopherson
2023-11-10 18:52     ` Nicolas Saenz Julienne
2023-11-28  7:34       ` Maxim Levitsky
2023-11-08 11:17 ` [RFC 15/33] KVM: x86/mmu: Introduce infrastructure to handle non-executable faults Nicolas Saenz Julienne
2023-11-28  7:34   ` Maxim Levitsky
2023-11-08 11:17 ` [RFC 16/33] KVM: x86/mmu: Expose R/W/X flags during memory fault exits Nicolas Saenz Julienne
2023-11-28  7:36   ` Maxim Levitsky
2023-11-28 16:31     ` Sean Christopherson
2023-11-08 11:17 ` [RFC 17/33] KVM: x86/mmu: Allow setting memory attributes if VSM enabled Nicolas Saenz Julienne
2023-11-28  7:39   ` Maxim Levitsky
2023-11-08 11:17 ` [RFC 18/33] KVM: x86: Decouple kvm_get_memory_attributes() from struct kvm's mem_attr_array Nicolas Saenz Julienne
2023-11-08 16:59   ` Sean Christopherson
2023-11-28  7:41   ` Maxim Levitsky
2023-11-08 11:17 ` [RFC 19/33] KVM: x86: Decouple kvm_range_has_memory_attributes() " Nicolas Saenz Julienne
2023-11-28  7:42   ` Maxim Levitsky
2023-11-08 11:17 ` [RFC 20/33] KVM: x86/mmu: Decouple hugepage_has_attrs() " Nicolas Saenz Julienne
2023-11-28  7:43   ` Maxim Levitsky
2023-11-08 11:17 ` [RFC 21/33] KVM: Pass memory attribute array as a MMU notifier argument Nicolas Saenz Julienne
2023-11-08 17:08   ` Sean Christopherson
2023-11-08 11:17 ` [RFC 22/33] KVM: Decouple kvm_ioctl_set_mem_attributes() from kvm's mem_attr_array Nicolas Saenz Julienne
2023-11-08 11:17 ` [RFC 23/33] KVM: Expose memory attribute helper functions unanimously Nicolas Saenz Julienne
2023-11-08 11:17 ` [RFC 24/33] KVM: x86: hyper-v: Introduce KVM VTL device Nicolas Saenz Julienne
2023-11-08 11:17 ` [RFC 25/33] KVM: Introduce a set of new memory attributes Nicolas Saenz Julienne
2023-11-08 12:30   ` Alexander Graf
2023-11-08 16:43     ` Sean Christopherson
2023-11-08 11:17 ` [RFC 26/33] KVM: x86: hyper-vsm: Allow setting per-VTL " Nicolas Saenz Julienne
2023-11-28  7:44   ` Maxim Levitsky
2023-11-08 11:18 ` [RFC 27/33] KVM: x86/mmu/hyper-v: Validate memory faults against per-VTL memprots Nicolas Saenz Julienne
2023-11-28  7:46   ` Maxim Levitsky
2023-11-08 11:18 ` [RFC 28/33] x86/hyper-v: Introduce memory intercept message structure Nicolas Saenz Julienne
2023-11-28  7:53   ` Maxim Levitsky
2023-11-08 11:18 ` [RFC 29/33] KVM: VMX: Save instruction length on EPT violation Nicolas Saenz Julienne
2023-11-08 12:40   ` Alexander Graf
2023-11-08 16:15     ` Sean Christopherson
2023-11-08 17:11       ` Alexander Graf
2023-11-08 17:20   ` Sean Christopherson
2023-11-08 17:27     ` Alexander Graf
2023-11-08 18:19       ` Jim Mattson
2023-11-08 11:18 ` [RFC 30/33] KVM: x86: hyper-v: Introduce KVM_REQ_HV_INJECT_INTERCEPT request Nicolas Saenz Julienne
2023-11-08 12:45   ` Alexander Graf
2023-11-08 13:38     ` Nicolas Saenz Julienne
2023-11-28  8:19       ` Maxim Levitsky
2023-11-08 11:18 ` [RFC 31/33] KVM: x86: hyper-v: Inject intercept on VTL memory protection fault Nicolas Saenz Julienne
2023-11-08 11:18 ` [RFC 32/33] KVM: x86: hyper-v: Implement HVCALL_TRANSLATE_VIRTUAL_ADDRESS Nicolas Saenz Julienne
2023-11-08 12:49   ` Alexander Graf
2023-11-08 13:44     ` Nicolas Saenz Julienne
2023-11-08 11:18 ` [RFC 33/33] Documentation: KVM: Introduce "Emulating Hyper-V VSM with KVM" Nicolas Saenz Julienne
2023-11-28  8:19   ` Maxim Levitsky
2023-11-08 11:40 ` [RFC 0/33] KVM: x86: hyperv: Introduce VSM support Alexander Graf
2023-11-08 14:41   ` Nicolas Saenz Julienne
2023-11-08 16:55 ` Sean Christopherson
2023-11-08 18:33   ` Sean Christopherson
2023-11-10 17:56     ` Nicolas Saenz Julienne
2023-11-10 19:32       ` Sean Christopherson
2023-11-11 11:55         ` Nicolas Saenz Julienne
2023-11-10 19:04   ` Nicolas Saenz Julienne
