From: David Matlack <dmatlack@google.com>
To: "Radim Krčmář" <rkrcmar@redhat.com>
Cc: "linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	kvm list <kvm@vger.kernel.org>,
	Paolo Bonzini <pbonzini@redhat.com>,
	"Lan, Tianyu" <tianyu.lan@intel.com>,
	Igor Mammedov <imammedo@redhat.com>,
	Jan Kiszka <jan.kiszka@web.de>, Peter Xu <peterx@redhat.com>
Subject: Re: [PATCH v1 10/11] KVM: x86: add KVM_CAP_X2APIC_API
Date: Fri, 1 Jul 2016 11:09:49 -0700
Message-ID: <CALzav=eP90VZQ-cXSkYcKeG5tMZFvQFBueCPg3hr4w_DW08EqQ@mail.gmail.com>
In-Reply-To: <20160630205429.16480-11-rkrcmar@redhat.com>

On Thu, Jun 30, 2016 at 1:54 PM, Radim Krčmář <rkrcmar@redhat.com> wrote:
> KVM_CAP_X2APIC_API can be enabled to extend APIC ID in get/set ioctl and MSI
> addresses to 32 bits.  Both are needed to support x2APIC.
>
> The capability has to be toggleable and disabled by default, because get/set
> ioctl shifted and truncated APIC ID to 8 bits by using a non-standard protocol
> inspired by xAPIC and the change is not backward-compatible.
>
> Changes to MSI addresses follow the format used by interrupt remapping unit.
> The upper address word, that used to be 0, contains upper 24 bits of the LAPIC
> address in its upper 24 bits.  Lower 8 bits are reserved as 0.
> Using the upper address word is not backward-compatible either as we didn't
> check that userspace zeroed the word.  Reserved bits are still not explicitly
> checked, but non-zero data will affect LAPIC addresses, which will cause a bug.
>
> Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
> ---
>  v1:
>  * rewritten with a toggleable capability [Paolo]
>  * dropped MSI_ADDR_EXT_DEST_ID to enforce reserved bits
>
>  Documentation/virtual/kvm/api.txt | 26 ++++++++++++++++++++++++++
>  arch/x86/include/asm/kvm_host.h   |  4 +++-
>  arch/x86/kvm/irq_comm.c           | 14 ++++++++++----
>  arch/x86/kvm/lapic.c              |  2 +-
>  arch/x86/kvm/vmx.c                |  2 +-
>  arch/x86/kvm/x86.c                | 12 ++++++++++++
>  include/uapi/linux/kvm.h          |  1 +
>  7 files changed, 54 insertions(+), 7 deletions(-)
>
> diff --git a/Documentation/virtual/kvm/api.txt b/Documentation/virtual/kvm/api.txt
> index 09efa9eb3926..0f978089a0f6 100644
> --- a/Documentation/virtual/kvm/api.txt
> +++ b/Documentation/virtual/kvm/api.txt
> @@ -1482,6 +1482,9 @@ struct kvm_irq_routing_msi {
>         __u32 pad;
>  };
>
> +If KVM_CAP_X2APIC_API is enabled, then address_hi bits 31-8 contain bits 31-8
> +of destination id and address_hi bits 7-0 must be 0.
> +
>  struct kvm_irq_routing_s390_adapter {
>         __u64 ind_addr;
>         __u64 summary_addr;
> @@ -1583,6 +1586,13 @@ struct kvm_lapic_state {
>  Reads the Local APIC registers and copies them into the input argument.  The
>  data format and layout are the same as documented in the architecture manual.
>
> +If KVM_CAP_X2APIC_API is enabled, then the format of APIC_ID register depends
> +on APIC mode (reported by MSR_IA32_APICBASE) of its VCPU.  The format follows
> +xAPIC otherwise.
> +
> +x2APIC stores APIC ID as little endian in bits 31-0 of APIC_ID register.
> +xAPIC stores bits 7-0 of APIC ID in register bits 31-24.
> +
>
>  4.58 KVM_SET_LAPIC
>
> @@ -1600,6 +1610,8 @@ struct kvm_lapic_state {
>  Copies the input argument into the Local APIC registers.  The data format
>  and layout are the same as documented in the architecture manual.
>
> +See the note about APIC_ID register in KVM_GET_LAPIC.
> +
>
>  4.59 KVM_IOEVENTFD
>
> @@ -2180,6 +2192,9 @@ struct kvm_msi {
>
>  No flags are defined so far. The corresponding field must be 0.
>
> +If KVM_CAP_X2APIC_API is enabled, then address_hi bits 31-8 contain bits 31-8
> +of destination id and address_hi bits 7-0 must be 0.
> +
>
>  4.71 KVM_CREATE_PIT2
>
> @@ -3811,6 +3826,17 @@ Allows use of runtime-instrumentation introduced with zEC12 processor.
>  Will return -EINVAL if the machine does not support runtime-instrumentation.
>  Will return -EBUSY if a VCPU has already been created.
>
> +7.7 KVM_CAP_X2APIC_API
> +
> +Architectures: x86
> +Parameters: none
> +Returns: 0 on success, -EINVAL if reserved parameters are not 0
> +
> +Enabling this capability changes the behavior of KVM_SET_GSI_ROUTING,
> +KVM_SIGNAL_MSI, KVM_SET_LAPIC, and KVM_GET_LAPIC.  See KVM_CAP_X2APIC_API
> +in their respective sections.
> +
> +
>  8. Other capabilities.
>  ----------------------
>
> diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
> index 459a789cb3da..48b0ca18066c 100644
> --- a/arch/x86/include/asm/kvm_host.h
> +++ b/arch/x86/include/asm/kvm_host.h
> @@ -782,6 +782,8 @@ struct kvm_arch {
>         u32 ldr_mode;
>         struct page *avic_logical_id_table_page;
>         struct page *avic_physical_id_table_page;
> +
> +       bool x2apic_api;
>  };
>
>  struct kvm_vm_stat {
> @@ -1365,7 +1367,7 @@ bool kvm_intr_is_single_vcpu(struct kvm *kvm, struct kvm_lapic_irq *irq,
>                              struct kvm_vcpu **dest_vcpu);
>
>  void kvm_set_msi_irq(struct kvm_kernel_irq_routing_entry *e,
> -                    struct kvm_lapic_irq *irq);
> +                    struct kvm_lapic_irq *irq, bool x2apic_api);
>
>  static inline void kvm_arch_vcpu_blocking(struct kvm_vcpu *vcpu)
>  {
> diff --git a/arch/x86/kvm/irq_comm.c b/arch/x86/kvm/irq_comm.c
> index 47ad681a33fd..4594644ab090 100644
> --- a/arch/x86/kvm/irq_comm.c
> +++ b/arch/x86/kvm/irq_comm.c
> @@ -111,12 +111,17 @@ int kvm_irq_delivery_to_apic(struct kvm *kvm, struct kvm_lapic *src,
>  }
>
>  void kvm_set_msi_irq(struct kvm_kernel_irq_routing_entry *e,
> -                    struct kvm_lapic_irq *irq)
> +                    struct kvm_lapic_irq *irq, bool x2apic_api)
>  {
>         trace_kvm_msi_set_irq(e->msi.address_lo, e->msi.data);

Now that address_hi can carry destination ID bits, this tracepoint should
report e->msi.address_hi as well.
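
For illustration, here is a rough standalone sketch of how the destination
ID ends up being assembled from both address words once the capability is
enabled, which is why a trace that omits address_hi no longer identifies the
target. The helper msi_dest_id() and the self-check function are hypothetical
names made up for this sketch. The mask and shift are redefined locally and
assumed to match the kernel's msidef.h values.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed to match the kernel's msidef.h; redefined so the sketch stands alone. */
#define MSI_ADDR_DEST_ID_SHIFT 12
#define MSI_ADDR_DEST_ID_MASK  0x000ff000u

/*
 * Hypothetical helper mirroring the dest_id computation in the quoted hunk:
 * bits 19-12 of address_lo give destination ID bits 7-0, and with the
 * capability enabled address_hi is OR-ed in unmasked (bits 31-8 of the ID,
 * bits 7-0 reserved as 0).
 */
static uint32_t msi_dest_id(uint32_t address_lo, uint32_t address_hi,
                            bool x2apic_api)
{
	uint32_t dest_id = (address_lo & MSI_ADDR_DEST_ID_MASK) >>
			   MSI_ADDR_DEST_ID_SHIFT;

	if (x2apic_api)
		dest_id |= address_hi;

	return dest_id;
}

/* Hypothetical self-check: same address_lo, different targets. */
static void msi_dest_id_examples(void)
{
	uint32_t lo = 0xfee34000u;	/* destination ID bits 7-0 = 0x34 */
	uint32_t hi = 0x00000100u;	/* destination ID bits 31-8 = 0x000001 */

	/* Capability disabled: address_hi is ignored. */
	assert(msi_dest_id(lo, hi, false) == 0x34u);
	/* Capability enabled: address_hi selects a different LAPIC. */
	assert(msi_dest_id(lo, hi, true) == 0x134u);
	/* Non-zero reserved bits 7-0 leak into the ID, as the patch intends. */
	assert(msi_dest_id(lo, 0x00000001u, true) == 0x35u);
}
```

Two MSIs with identical address_lo and data can target different VCPUs here,
so a trace line built only from those two fields would not tell them apart.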

>
>         irq->dest_id = (e->msi.address_lo &
>                         MSI_ADDR_DEST_ID_MASK) >> MSI_ADDR_DEST_ID_SHIFT;
> +       if (x2apic_api)
> +               /* MSI_ADDR_EXT_DEST_ID() is omitted to introduce bugs on
> +                * userspaces that set reserved bits 0-7.
> +                */
> +               irq->dest_id |= e->msi.address_hi;
>         irq->vector = (e->msi.data &
>                         MSI_DATA_VECTOR_MASK) >> MSI_DATA_VECTOR_SHIFT;
>         irq->dest_mode = (1 << MSI_ADDR_DEST_MODE_SHIFT) & e->msi.address_lo;
> @@ -137,7 +142,7 @@ int kvm_set_msi(struct kvm_kernel_irq_routing_entry *e,
>         if (!level)
>                 return -1;
>
> -       kvm_set_msi_irq(e, &irq);
> +       kvm_set_msi_irq(e, &irq, kvm->arch.x2apic_api);
>
>         return kvm_irq_delivery_to_apic(kvm, NULL, &irq, NULL);
>  }
> @@ -153,7 +158,7 @@ int kvm_arch_set_irq_inatomic(struct kvm_kernel_irq_routing_entry *e,
>         if (unlikely(e->type != KVM_IRQ_ROUTING_MSI))
>                 return -EWOULDBLOCK;
>
> -       kvm_set_msi_irq(e, &irq);
> +       kvm_set_msi_irq(e, &irq, kvm->arch.x2apic_api);
>
>         if (kvm_irq_delivery_to_apic_fast(kvm, NULL, &irq, &r, NULL))
>                 return r;
> @@ -393,7 +398,8 @@ void kvm_scan_ioapic_routes(struct kvm_vcpu *vcpu,
>                         if (entry->type != KVM_IRQ_ROUTING_MSI)
>                                 continue;
>
> -                       kvm_set_msi_irq(entry, &irq);
> +                       kvm_set_msi_irq(entry, &irq,
> +                                       vcpu->kvm->arch.x2apic_api);
>
>                         if (irq.level && kvm_apic_match_dest(vcpu, NULL, 0,
>                                                 irq.dest_id, irq.dest_mode))
> diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
> index 46eb71c425cf..178605635df5 100644
> --- a/arch/x86/kvm/lapic.c
> +++ b/arch/x86/kvm/lapic.c
> @@ -1983,7 +1983,7 @@ int kvm_get_apic_interrupt(struct kvm_vcpu *vcpu)
>  static void __kvm_apic_state_fixup(struct kvm_vcpu *vcpu,
>                 struct kvm_lapic_state *s, bool set)
>  {
> -       if (apic_x2apic_mode(vcpu->arch.apic)) {
> +       if (apic_x2apic_mode(vcpu->arch.apic) && !vcpu->kvm->arch.x2apic_api) {
>                 u32 *id = (u32 *)(s->regs + APIC_ID);
>                 if (set)
>                         *id >>= 24;
> diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
> index a10038258b80..ea1f439b444e 100644
> --- a/arch/x86/kvm/vmx.c
> +++ b/arch/x86/kvm/vmx.c
> @@ -11075,7 +11075,7 @@ static int vmx_update_pi_irte(struct kvm *kvm, unsigned int host_irq,
>                  * We will support full lowest-priority interrupt later.
>                  */
>
> -               kvm_set_msi_irq(e, &irq);
> +               kvm_set_msi_irq(e, &irq, kvm->arch.x2apic_api);
>                 if (!kvm_intr_is_single_vcpu(kvm, &irq, &vcpu)) {
>                         /*
>                          * Make sure the IRTE is in remapped mode if
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index 043f110f2210..16b55f09dd16 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -2576,6 +2576,7 @@ int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext)
>         case KVM_CAP_DISABLE_QUIRKS:
>         case KVM_CAP_SET_BOOT_CPU_ID:
>         case KVM_CAP_SPLIT_IRQCHIP:
> +       case KVM_CAP_X2APIC_API:
>  #ifdef CONFIG_KVM_DEVICE_ASSIGNMENT
>         case KVM_CAP_ASSIGN_DEV_IRQ:
>         case KVM_CAP_PCI_2_3:
> @@ -3799,6 +3800,17 @@ split_irqchip_unlock:
>                 mutex_unlock(&kvm->lock);
>                 break;
>         }
> +       case KVM_CAP_X2APIC_API: {
> +               struct kvm_enable_cap valid = {.cap = KVM_CAP_X2APIC_API};
> +
> +               r = -EINVAL;
> +               if (memcmp(cap, &valid, sizeof(valid)))
> +                       break;
> +
> +               kvm->arch.x2apic_api = true;
> +               r = 0;
> +               break;
> +       }
>         default:
>                 r = -EINVAL;
>                 break;
> diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
> index 05ebf475104c..43b355d6db7b 100644
> --- a/include/uapi/linux/kvm.h
> +++ b/include/uapi/linux/kvm.h
> @@ -866,6 +866,7 @@ struct kvm_ppc_smmu_info {
>  #define KVM_CAP_ARM_PMU_V3 126
>  #define KVM_CAP_VCPU_ATTRIBUTES 127
>  #define KVM_CAP_MAX_VCPU_ID 128
> +#define KVM_CAP_X2APIC_API 129
>
>  #ifdef KVM_CAP_IRQ_ROUTING
>
> --
> 2.9.0
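
As a side note on the KVM_GET_LAPIC documentation hunk above, the two APIC_ID
register formats it describes can be sketched as below. This is only an
illustration under assumptions: apic_id_to_reg(), apic_id_from_reg() and the
self-check function are hypothetical names, the register is modeled as a
plain 32-bit value, and the little-endian byte layout inside struct
kvm_lapic_state is not modeled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper: encode an APIC ID into the APIC_ID register value.
 * xAPIC keeps bits 7-0 of the ID in register bits 31-24; x2APIC uses the
 * whole register for the 32-bit ID.
 */
static uint32_t apic_id_to_reg(uint32_t apic_id, bool x2apic_mode)
{
	if (x2apic_mode)
		return apic_id;

	return (apic_id & 0xffu) << 24;
}

/* Hypothetical helper: decode the APIC ID from the register value. */
static uint32_t apic_id_from_reg(uint32_t reg, bool x2apic_mode)
{
	if (x2apic_mode)
		return reg;

	return reg >> 24;
}

/* Hypothetical self-check showing why the xAPIC format cannot carry wide IDs. */
static void apic_id_format_examples(void)
{
	/* An 8-bit ID round-trips in both formats. */
	assert(apic_id_from_reg(apic_id_to_reg(0x12u, false), false) == 0x12u);
	assert(apic_id_from_reg(apic_id_to_reg(0x12u, true), true) == 0x12u);

	/* A wider ID survives only in the x2APIC format. */
	assert(apic_id_from_reg(apic_id_to_reg(0x112u, true), true) == 0x112u);
	assert(apic_id_from_reg(apic_id_to_reg(0x112u, false), false) == 0x12u);
}
```

The last assertion shows the truncation to 8 bits that the commit message
gives as the reason the old get/set format cannot describe x2APIC IDs.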
