kvmarm.lists.cs.columbia.edu archive mirror
From: Auger Eric <eric.auger@redhat.com>
To: Marc Zyngier <maz@kernel.org>, Zenghui Yu <yuzenghui@huawei.com>,
	kvmarm@lists.cs.columbia.edu, qemu-arm@nongnu.org
Cc: zhang.zhanghailiang@huawei.com, kvm@vger.kernel.org
Subject: Re: Can we boot a 512U kvm guest?
Date: Thu, 22 Aug 2019 11:50:25 +0200
Message-ID: <681f59e8-a193-6d3e-0bcc-5e52f4203868@redhat.com>
In-Reply-To: <fbeb47df-7ea2-04ce-5fe3-a6a6a4751b8b@kernel.org>

Hi Marc,

On 8/22/19 11:29 AM, Marc Zyngier wrote:
> Hi Eric,
> 
> On 22/08/2019 10:08, Auger Eric wrote:
>> Hi Zenghui,
>>
>> On 8/13/19 10:50 AM, Zenghui Yu wrote:
>>> Hi folks,
>>>
>>> Since commit e25028c8ded0 ("KVM: arm/arm64: Bump VGIC_V3_MAX_CPUS to
>>> 512"), we seem to be allowed to boot a 512U guest.  But I failed to
>>> start it up with the latest QEMU.  I guess there are at least *two*
>>> reasons (limitations).
>>>
>>> First I got a QEMU abort:
>>>     "kvm_set_irq: Invalid argument"
>>>
>>> With trace_kvm_irq_line() enabled under debugfs, when it came to
>>> vcpu-256, I got:
>>>     "Inject UNKNOWN interrupt (3), vcpu->idx: 0, num: 23, level: 0"
>>> and kvm_vm_ioctl_irq_line() returns -EINVAL to user-space...
>>>
>>> So the thing is that we only have 8 bits for vcpu_index field ([23:16])
>>> in KVM_IRQ_LINE ioctl.  irq_type field will be corrupted if we inject a
>>> PPI to vcpu-256, whose vcpu_index will take 9 bits.
>>>
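To illustrate the overflow described above, here is a minimal sketch in C. The shifts and masks are those of the 8-bit layout quoted in this thread, and KVM_ARM_IRQ_TYPE_PPI = 2 is the value from the arm64 uapi header. The helper names (pack_irq_unmasked, irq_word_type, irq_word_vcpu, irq_word_num) are hypothetical and exist in neither KVM nor QEMU. The sketch assumes userspace ORs the fields together without masking the vcpu index.

```c
#include <assert.h>
#include <stdint.h>

/* KVM_IRQ_LINE irq field layout with an 8-bit vcpu_index ([23:16]). */
#define KVM_ARM_IRQ_TYPE_SHIFT 24
#define KVM_ARM_IRQ_TYPE_MASK  0xff
#define KVM_ARM_IRQ_VCPU_SHIFT 16
#define KVM_ARM_IRQ_VCPU_MASK  0xff
#define KVM_ARM_IRQ_NUM_SHIFT  0
#define KVM_ARM_IRQ_NUM_MASK   0xffff

#define KVM_ARM_IRQ_TYPE_PPI   2

/* Hypothetical helper: pack the three fields the way a caller would if it
 * did not check that the vcpu index fits in 8 bits. */
static uint32_t pack_irq_unmasked(uint32_t type, uint32_t vcpu, uint32_t num)
{
    assert(num <= KVM_ARM_IRQ_NUM_MASK);
    return (type << KVM_ARM_IRQ_TYPE_SHIFT) |
           (vcpu << KVM_ARM_IRQ_VCPU_SHIFT) |
           (num << KVM_ARM_IRQ_NUM_SHIFT);
}

/* Hypothetical decoders: extract each field using the shifts and masks. */
static uint32_t irq_word_type(uint32_t irq)
{
    return (irq >> KVM_ARM_IRQ_TYPE_SHIFT) & KVM_ARM_IRQ_TYPE_MASK;
}

static uint32_t irq_word_vcpu(uint32_t irq)
{
    return (irq >> KVM_ARM_IRQ_VCPU_SHIFT) & KVM_ARM_IRQ_VCPU_MASK;
}

static uint32_t irq_word_num(uint32_t irq)
{
    return (irq >> KVM_ARM_IRQ_NUM_SHIFT) & KVM_ARM_IRQ_NUM_MASK;
}
```

With these helpers, pack_irq_unmasked(KVM_ARM_IRQ_TYPE_PPI, 256, 23) decodes as type 3, vcpu 0, num 23: bit 8 of the vcpu index lands in bit 24 and is ORed into the type field, which is consistent with the "UNKNOWN interrupt (3), vcpu->idx: 0, num: 23" trace line.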
>>> I temporarily patched the KVM and QEMU with the following diff:
>>>
>>> ---8<---
>>> diff --git a/arch/arm64/include/uapi/asm/kvm.h b/arch/arm64/include/uapi/asm/kvm.h
>>> index 95516a4..39a0fb1 100644
>>> --- a/arch/arm64/include/uapi/asm/kvm.h
>>> +++ b/arch/arm64/include/uapi/asm/kvm.h
>>> @@ -325,10 +325,10 @@ struct kvm_vcpu_events {
>>>  #define   KVM_ARM_VCPU_TIMER_IRQ_PTIMER        1
>>>
>>>  /* KVM_IRQ_LINE irq field index values */
>>> -#define KVM_ARM_IRQ_TYPE_SHIFT        24
>>> -#define KVM_ARM_IRQ_TYPE_MASK        0xff
>>> +#define KVM_ARM_IRQ_TYPE_SHIFT        28
>>> +#define KVM_ARM_IRQ_TYPE_MASK        0xf
>>>  #define KVM_ARM_IRQ_VCPU_SHIFT        16
>>> -#define KVM_ARM_IRQ_VCPU_MASK        0xff
>>> +#define KVM_ARM_IRQ_VCPU_MASK        0xfff
>>>  #define KVM_ARM_IRQ_NUM_SHIFT        0
>>>  #define KVM_ARM_IRQ_NUM_MASK        0xffff
>>>
>>> ---8<---
>>>
>>> It makes things a bit better, but it also immediately BREAKs the API
>>> with old versions.
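
The API break mentioned here can be made concrete with a small sketch in C. The two layouts are taken from the removed and added lines of the diff above. The struct irq_layout type and the irq_pack/irq_type/irq_vcpu helpers are hypothetical names introduced only for illustration. The scenario is an old userspace encoding a PPI with the 8-bit layout while a patched kernel decodes it with the 12-bit layout.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical description of one KVM_IRQ_LINE field layout. */
struct irq_layout {
    unsigned type_shift;
    uint32_t type_mask;
    unsigned vcpu_shift;
    uint32_t vcpu_mask;
};

/* Values from the '-' lines of the diff (current ABI). */
static const struct irq_layout old_layout = { 24, 0xff, 16, 0xff };
/* Values from the '+' lines of the diff (temporary patch). */
static const struct irq_layout new_layout = { 28, 0xf, 16, 0xfff };

/* Pack type/vcpu/num, refusing values that do not fit the layout. */
static uint32_t irq_pack(const struct irq_layout *l,
                         uint32_t type, uint32_t vcpu, uint32_t num)
{
    assert(type <= l->type_mask);
    assert(vcpu <= l->vcpu_mask);
    assert(num <= 0xffff);
    return (type << l->type_shift) | (vcpu << l->vcpu_shift) | num;
}

/* Decode the type field under a given layout. */
static uint32_t irq_type(const struct irq_layout *l, uint32_t irq)
{
    return (irq >> l->type_shift) & l->type_mask;
}

/* Decode the vcpu index field under a given layout. */
static uint32_t irq_vcpu(const struct irq_layout *l, uint32_t irq)
{
    return (irq >> l->vcpu_shift) & l->vcpu_mask;
}
```

Under these assumptions, a PPI (type 2) for vcpu 1 packed with old_layout is 0x02010017. Decoded with new_layout, the same word reads as type 0 and vcpu 513, so unmodified userspace would be misinterpreted by a kernel carrying the patch. Packing with new_layout does round-trip vcpu 256 correctly.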
>>>
>>>
>>> Next comes one more QEMU abort (with the "fix" above):
>>>     "Failed to set device address: No space left on device"
>>>
>>> We register two io devices (rd_dev and sgi_dev) on KVM_MMIO_BUS for
>>> each redistributor. 512 vcpus take 1024 io devices, which is beyond the
>>> current kernel limit, NR_IOBUS_DEVS (1000).
>>> So we get an ENOSPC error here.
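
The arithmetic behind this limit can be sketched as follows. NR_IOBUS_DEVS (1000), the maximum of 512 vcpus, and the two io devices per redistributor come from the message above. The macro IODEVS_PER_REDIST and both function names are hypothetical. The sketch counts only redistributor devices and ignores anything else registered on the bus, so the real headroom is smaller.

```c
#include <assert.h>

#define NR_IOBUS_DEVS      1000  /* kernel limit quoted above */
#define VGIC_V3_MAX_CPUS   512   /* from commit e25028c8ded0 */
#define IODEVS_PER_REDIST  2     /* rd_dev + sgi_dev (hypothetical macro name) */

/* Number of MMIO bus devices the redistributors alone need. */
static unsigned redist_iodevs(unsigned vcpus)
{
    assert(vcpus <= VGIC_V3_MAX_CPUS);
    return vcpus * IODEVS_PER_REDIST;
}

/* Returns 1 if the redistributor devices alone fit under the limit. */
static int redist_iodevs_fit(unsigned vcpus)
{
    return redist_iodevs(vcpus) <= NR_IOBUS_DEVS;
}
```

Counting redistributors only, 500 vcpus is the largest configuration that fits (1000 devices); 512 vcpus need 1024.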
>>
>> Do you plan to send a patch for increasing the NR_IOBUS_DEVS? Otherwise
>> I can do it.
> 
> I really wonder whether that's a sensible thing to do on its own.
> 
> Looking at the implementation of kvm_io_bus_register_dev (which copies
> the whole array each time we insert a device), we have an obvious issue
> with systems that create a large number of device structures, leading to
> large transient memory usage and slow guest start.
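
A rough way to see the cost of copying the whole array on every insert is the toy model below. It is not the kernel's kvm_io_bus_register_dev. The toy_bus type and both functions are hypothetical, and the model ignores sorting and locking. It only counts how many existing entries are copied when each insert allocates a new array one element larger.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Toy stand-in for a bus holding an array of device ids. */
struct toy_bus {
    size_t count;
    int *devs;
};

/* Insert one device by allocating a larger array and copying every
 * existing entry. Returns the number of entries copied, or -1 if the
 * allocation fails. */
static long toy_bus_register(struct toy_bus *bus, int dev)
{
    int *fresh = malloc((bus->count + 1) * sizeof(*fresh));

    if (!fresh)
        return -1;
    if (bus->count)
        memcpy(fresh, bus->devs, bus->count * sizeof(*fresh));
    fresh[bus->count] = dev;
    free(bus->devs);
    bus->devs = fresh;
    return (long)bus->count++;
}

/* Register ndevs devices and return the total number of entries copied. */
static long toy_total_copies(size_t ndevs)
{
    struct toy_bus bus = { 0, NULL };
    long total = 0;

    for (size_t i = 0; i < ndevs; i++) {
        long copied = toy_bus_register(&bus, (int)i);

        assert(copied >= 0);
        total += copied;
    }
    free(bus.devs);
    return total;
}
```

In this model, registering n devices copies n(n-1)/2 entries in total, so 1024 devices cost 523776 entry copies, and each insert briefly holds both the old and the new array.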
> 
> We could also try and reduce the number of devices we insert by making
> the redistributor a single device (which it is in reality). It probably
> means we need to make the MMIO decoding more flexible.

Yes, that makes sense. If there are no objections, I can work on this, as
I am the source of the mess ;-)

Thanks

Eric
> 
> Thanks,
> 
> 	M.
> 

Thread overview: 10+ messages
2019-08-13  8:50 Can we boot a 512U kvm guest? Zenghui Yu
2019-08-13 14:17 ` Marc Zyngier
2019-08-13 21:44   ` Auger Eric
2019-08-14  6:51   ` Zenghui Yu
2019-08-22  9:08 ` Auger Eric
2019-08-22  9:29   ` Marc Zyngier
2019-08-22  9:50     ` Auger Eric [this message]
2019-08-22 10:23       ` Marc Zyngier
2019-08-23  2:21   ` Zenghui Yu
2019-08-23  7:37     ` Auger Eric
