All of lore.kernel.org
 help / color / mirror / Atom feed
From: Eduardo Habkost <ehabkost@redhat.com>
To: Vitaly Kuznetsov <vkuznets@redhat.com>
Cc: Maxim Levitsky <mlevitsk@redhat.com>,
	kvm@vger.kernel.org, Paolo Bonzini <pbonzini@redhat.com>,
	Sean Christopherson <seanjc@google.com>,
	Wanpeng Li <wanpengli@tencent.com>,
	Jim Mattson <jmattson@google.com>,
	"Dr. David Alan Gilbert" <dgilbert@redhat.com>,
	Nitesh Narayan Lal <nitesh@redhat.com>,
	linux-kernel@vger.kernel.org
Subject: Re: [PATCH v2 4/4] KVM: x86: Fix stack-out-of-bounds memory access from ioapic_write_indirect()
Date: Thu, 26 Aug 2021 10:52:10 -0400	[thread overview]
Message-ID: <20210826145210.gpfbiagntwoswrzp@habkost.net> (raw)
In-Reply-To: <87tujcidka.fsf@vitty.brq.redhat.com>

On Thu, Aug 26, 2021 at 02:40:53PM +0200, Vitaly Kuznetsov wrote:
> Eduardo Habkost <ehabkost@redhat.com> writes:
> 
> > On Wed, Aug 25, 2021 at 5:43 AM Vitaly Kuznetsov <vkuznets@redhat.com> wrote:
> >>
> >> Maxim Levitsky <mlevitsk@redhat.com> writes:
> >>
> >> > On Wed, 2021-08-25 at 10:21 +0200, Vitaly Kuznetsov wrote:
> >> >> Maxim Levitsky <mlevitsk@redhat.com> writes:
> >> >>
> >> >> > On Tue, 2021-08-24 at 16:42 +0200, Vitaly Kuznetsov wrote:
> >> >> ...
> >> >> > Not a classical review but,
> >> >> > I did some digital archaeology with this one, trying to understand what is going on:
> >> >> >
> >> >> >
> >> >> > I think that 16 bit vcpu bitmap is due to the fact that IOAPIC spec states that
> >> >> > it can address up to 16 cpus in physical destination mode.
> >> >> >
> >> >> > In logical destination mode, assuming flat addressing and that logical id = 1 << physical id
> >> >> > which KVM hardcodes, it is also only possible to address 8 CPUs.
> >> >> >
> >> >> > However(!) in flat cluster mode, the logical apic id is split in two.
> >> >> > We have 16 clusters and each have 4 CPUs, so it is possible to address 64 CPUs,
> >> >> > and unlike the logical ID, the KVM does honour cluster ID,
> >> >> > thus one can stick say cluster ID 0 to any vCPU.
> >> >> >
> >> >> >
> >> >> > Let's look at ioapic_write_indirect.
> >> >> > It does:
> >> >> >
> >> >> >     -> bitmap_zero(&vcpu_bitmap, 16);
> >> >> >     -> kvm_bitmap_or_dest_vcpus(ioapic->kvm, &irq, &vcpu_bitmap);
> >> >> >     -> kvm_make_scan_ioapic_request_mask(ioapic->kvm, &vcpu_bitmap); // use of the above bitmap
> >> >> >
> >> >> >
> >> >> > When we call kvm_bitmap_or_dest_vcpus, we can already overflow the bitmap,
> >> >> > since we pass all 8 bit of the destination even when it is physical.
> >> >> >
> >> >> >
> >> >> > Lets examine the kvm_bitmap_or_dest_vcpus:
> >> >> >
> >> >> >   -> It calls the kvm_apic_map_get_dest_lapic which
> >> >> >
> >> >> >        -> for physical destinations, it just sets the bitmap, which can overflow
> >> >> >           if we pass it 8 bit destination (which basically includes reserved bits + 4 bit destination).
> >> >> >
> >> >> >
> >> >> >        -> For logical apic ID, it seems to truncate the result to 16 bit, which isn't correct as I explained
> >> >> >           above, but should not overflow the result.
> >> >> >
> >> >> >
> >> >> >    -> If call to kvm_apic_map_get_dest_lapic fails, it goes over all vcpus and tries to match the destination
> >> >> >        This can overflow as well.
> >> >> >
> >> >> >
> >> >> > I also don't like that ioapic_write_indirect calls the kvm_bitmap_or_dest_vcpus twice,
> >> >> > and second time with 'old_dest_id'
> >> >> >
> >> >> > I am not 100%  sure why old_dest_id/old_dest_mode are needed as I don't see anything in the
> >> >> > function changing them.
> >> >> > I think only the guest can change them, so maybe the code deals with the guest changing them
> >> >> > while the code is running from a different vcpu?
> >> >> >
> >> >> > The commit that introduced this code is 7ee30bc132c683d06a6d9e360e39e483e3990708
> >> >> > Nitesh Narayan Lal, maybe you remember something about it?
> >> >> >
> >> >>
> >> >> Before posting this patch I've contacted Nitesh privately, he's
> >> >> currently on vacation but will take a look when he gets back.
> >> >>
> >> >> > Also I worry a lot about other callers of kvm_apic_map_get_dest_lapic
> >> >> >
> >> >> > It is also called from kvm_irq_delivery_to_apic_fast, and from kvm_intr_is_single_vcpu_fast
> >> >> > and both seem to also use 'unsigned long' for bitmap, and then only use 16 bits of it.
> >> >> >
> >> >> > I haven't dug into them, but these don't seem to be IOAPIC related and I think
> >> >> > can overwrite the stack as well.
> >> >>
> >> >> I'm no expert in this code but when writing the patch I somehow
> >> >> convinced myself that a single unsigned long is always enough. I think
> >> >> that for cluster mode 'bitmap' needs 64-bits (and it is *not* a
> >> >> vcpu_bitmap, we need to convert). I may be completely wrong of course
> >> >> but in any case this is a different issue. In ioapic_write_indirect() we
> >> >> have 'vcpu_bitmap' which should certainly be longer than 64 bits.
> >> >
> >> >
> >> > This code which I mentioned in 'other callers' as far as I see is not IOAPIC related.
> >> > For regular local APIC all bets are off, any vCPU and apic ID are possible
> >> > (xapic I think limits apic id to 255 but x2apic doesn't).
> >> >
> >> > I strongly suspect that this code can overflow as well.
> >>
> >> I've probably missed something but I don't see how
> >> kvm_apic_map_get_dest_lapic() can set bits above 64 in 'bitmap'. If it
> >> can, then we have a problem indeed.
> >
> > It would be nice if the compiler took care of validating bitmap sizes
> > for us. Shouldn't we make the function prototypes explicit about the
> > bitmap sizes they expect?
> >
> > I believe some `typedef DECLARE_BITMAP(...)` or `typedef struct {
> > DECLARE_BITMAP(...) } ...` declarations would be very useful here.
> 
> The fundamental problem here is that bitmap in Linux has 'unsigned long
> *' type, it's supposed to be accompanied with 'int len' parameter but
> it's not always the case.
> 
> In KVM, we usually use 'vcpu_bitmap' (or 'dest_vcpu_bitmap') and these
> are 'KVM_MAX_VCPUS' long. Just 'bitmap' or 'mask' case is a bit more
> complicated. E.g. kvm_apic_map_get_logical_dest() uses 'u16 *mask' and
> this means that only 16 bits in the destination are supposed to be
> set. kvm_apic_map_get_dest_lapic() uses 'unsigned long *bitmap' - go
> figure.
> 
> We could've probably used a declaration like you suggest to e.g. create
> incompatible 'bitmap16','bitmap64',... types and make the compiler do
> the checking but I'm slightly hesitant to introduce such helpers to KVM
> and not the whole kernel. Alternatively, we could've just encoded the
> length in parameters name, e.g. 
> 
> @@ -918,7 +918,7 @@ static bool kvm_apic_is_broadcast_dest(struct kvm *kvm, struct kvm_lapic **src,
>  static inline bool kvm_apic_map_get_dest_lapic(struct kvm *kvm,
>                 struct kvm_lapic **src, struct kvm_lapic_irq *irq,
>                 struct kvm_apic_map *map, struct kvm_lapic ***dst,
> -               unsigned long *bitmap)
> +               unsigned long *bitmap64)

You can communicate the expected bitmap size to the compiler
without typedefs if using DECLARE_BITMAP inside the function
parameter list is acceptable coding style (is it?).

For example, the following would have allowed the compiler to
catch the bug you are fixing:

Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
---
diff --git a/arch/x86/kvm/lapic.h b/arch/x86/kvm/lapic.h
index d7c25d0c1354..e8c64747121a 100644
--- a/arch/x86/kvm/lapic.h
+++ b/arch/x86/kvm/lapic.h
@@ -236,7 +236,7 @@ bool kvm_apic_pending_eoi(struct kvm_vcpu *vcpu, int vector);
 void kvm_wait_lapic_expire(struct kvm_vcpu *vcpu);
 
 void kvm_bitmap_or_dest_vcpus(struct kvm *kvm, struct kvm_lapic_irq *irq,
-			      unsigned long *vcpu_bitmap);
+			      DECLARE_BITMAP(vcpu_bitmap, KVM_MAX_VCPUS));
 
 bool kvm_intr_is_single_vcpu_fast(struct kvm *kvm, struct kvm_lapic_irq *irq,
 			struct kvm_vcpu **dest_vcpu);
diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
index 76fb00921203..1df113894cba 100644
--- a/arch/x86/kvm/lapic.c
+++ b/arch/x86/kvm/lapic.c
@@ -1166,7 +1166,7 @@ static int __apic_accept_irq(struct kvm_lapic *apic, int delivery_mode,
  * each available vcpu to identify the same.
  */
 void kvm_bitmap_or_dest_vcpus(struct kvm *kvm, struct kvm_lapic_irq *irq,
-			      unsigned long *vcpu_bitmap)
+			      DECLARE_BITMAP(vcpu_bitmap, KVM_MAX_VCPUS))
 {
 	struct kvm_lapic **dest_vcpu = NULL;
 	struct kvm_lapic *src = NULL;

-- 
Eduardo


  reply	other threads:[~2021-08-26 17:50 UTC|newest]

Thread overview: 23+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-08-23 14:30 [PATCH v2 0/4] KVM: Various fixes and improvements around kicking vCPUs Vitaly Kuznetsov
2021-08-23 14:30 ` [PATCH v2 1/4] KVM: Clean up benign vcpu->cpu data races when " Vitaly Kuznetsov
2021-08-23 14:30 ` [PATCH v2 2/4] KVM: Guard cpusmask NULL check with CONFIG_CPUMASK_OFFSTACK Vitaly Kuznetsov
2021-08-23 14:30 ` [PATCH v2 3/4] KVM: Optimize kvm_make_vcpus_request_mask() a bit Vitaly Kuznetsov
2021-08-23 14:30 ` [PATCH v2 4/4] KVM: x86: Fix stack-out-of-bounds memory access from ioapic_write_indirect() Vitaly Kuznetsov
2021-08-23 18:58   ` Eduardo Habkost
2021-08-24  7:13     ` Vitaly Kuznetsov
2021-08-24 14:23       ` Eduardo Habkost
2021-08-24 14:42         ` Vitaly Kuznetsov
2021-08-24 16:07           ` Maxim Levitsky
2021-08-24 17:40             ` Sean Christopherson
2021-08-25  8:26               ` Vitaly Kuznetsov
2021-08-25  8:21             ` Vitaly Kuznetsov
2021-08-25  9:11               ` Maxim Levitsky
2021-08-25  9:43                 ` Vitaly Kuznetsov
2021-08-25 10:41                   ` Maxim Levitsky
2021-08-25 13:19                   ` Eduardo Habkost
2021-08-26 12:40                     ` Vitaly Kuznetsov
2021-08-26 14:52                       ` Eduardo Habkost [this message]
2021-08-26 18:01                         ` Sean Christopherson
2021-08-26 18:13                           ` Eduardo Habkost
2021-08-26 19:27             ` Eduardo Habkost
2021-08-30 19:47             ` Nitesh Lal

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20210826145210.gpfbiagntwoswrzp@habkost.net \
    --to=ehabkost@redhat.com \
    --cc=dgilbert@redhat.com \
    --cc=jmattson@google.com \
    --cc=kvm@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mlevitsk@redhat.com \
    --cc=nitesh@redhat.com \
    --cc=pbonzini@redhat.com \
    --cc=seanjc@google.com \
    --cc=vkuznets@redhat.com \
    --cc=wanpengli@tencent.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.