From mboxrd@z Thu Jan  1 00:00:00 1970
Received: from eggs.gnu.org ([2001:4830:134:3::10]:59530)
	by lists.gnu.org with esmtp (Exim 4.71)
	(envelope-from <marcel@redhat.com>) id 1bEx8x-0008Uu-27
	for qemu-devel@nongnu.org; Mon, 20 Jun 2016 07:13:28 -0400
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71)
	(envelope-from <marcel@redhat.com>) id 1bEx8s-0004x3-Qo
	for qemu-devel@nongnu.org; Mon, 20 Jun 2016 07:13:26 -0400
Received: from mx1.redhat.com ([209.132.183.28]:44735)
	by eggs.gnu.org with esmtp (Exim 4.71)
	(envelope-from <marcel@redhat.com>) id 1bEx8s-0004wr-I9
	for qemu-devel@nongnu.org; Mon, 20 Jun 2016 07:13:22 -0400
Received: from int-mx09.intmail.prod.int.phx2.redhat.com
	(int-mx09.intmail.prod.int.phx2.redhat.com [10.5.11.22])
	(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
	(No client certificate requested)
	by mx1.redhat.com (Postfix) with ESMTPS id 33B69C05681D
	for <qemu-devel@nongnu.org>; Mon, 20 Jun 2016 11:13:21 +0000 (UTC)
References: <1466097133-5489-1-git-send-email-dgilbert@redhat.com>
	<1466097133-5489-5-git-send-email-dgilbert@redhat.com>
	<20160616202449.GY18662@thinpad.lan.raisama.net>
	<20160617081505.GA2273@work-vm>
	<08f8e4e0-781a-d7f2-9008-3274f8a085eb@redhat.com>
	<1466155074.18921.16.camel@redhat.com>
	<20160617115239.035fb544@nial.brq.redhat.com>
	<2cebe3e1-4d22-ef3b-d1d7-734f1b2371df@redhat.com>
	<5766C49D.2000102@redhat.com>
	<20160620124236.739498e3@igors-macbook-pro.local>
From: Marcel Apfelbaum <marcel@redhat.com>
Message-ID: <5767CFCD.6080206@redhat.com>
Date: Mon, 20 Jun 2016 14:13:17 +0300
MIME-Version: 1.0
In-Reply-To: <20160620124236.739498e3@igors-macbook-pro.local>
Content-Type: text/plain; charset=windows-1252; format=flowed
Content-Transfer-Encoding: 7bit
Subject: Re: [Qemu-devel] [PATCH 4/5] x86: Allow physical address bits to be
 set
List-Id: <qemu-devel.nongnu.org>
List-Unsubscribe: <https://lists.nongnu.org/mailman/options/qemu-devel>,
	<mailto:qemu-devel-request@nongnu.org?subject=unsubscribe>
List-Archive: <http://lists.nongnu.org/archive/html/qemu-devel/>
List-Post: <mailto:qemu-devel@nongnu.org>
List-Help: <mailto:qemu-devel-request@nongnu.org?subject=help>
List-Subscribe: <https://lists.nongnu.org/mailman/listinfo/qemu-devel>,
	<mailto:qemu-devel-request@nongnu.org?subject=subscribe>
To: Igor Mammedov <imammedo@redhat.com>
Cc: Laszlo Ersek <lersek@redhat.com>, Gerd Hoffmann <kraxel@redhat.com>, Paolo Bonzini <pbonzini@redhat.com>, aarcange@redhat.com, qemu-devel@nongnu.org, "Dr. David Alan Gilbert" <dgilbert@redhat.com>, Eduardo Habkost <ehabkost@redhat.com>

On 06/20/2016 01:42 PM, Igor Mammedov wrote:
> On Sun, 19 Jun 2016 19:13:17 +0300
> Marcel Apfelbaum <marcel@redhat.com> wrote:
>
>> On 06/17/2016 07:07 PM, Laszlo Ersek wrote:
>>> On 06/17/16 11:52, Igor Mammedov wrote:
>>>> On Fri, 17 Jun 2016 11:17:54 +0200
>>>> Gerd Hoffmann <kraxel@redhat.com> wrote:
>>>>
>>>>> On Fr, 2016-06-17 at 10:43 +0200, Paolo Bonzini wrote:
>>>>>>
>>>>>> On 17/06/2016 10:15, Dr. David Alan Gilbert wrote:
>>>>>>> Larger is a problem if the guest tries to map something to a
>>>>>>> high address that's not addressable.
>>>>>>
>>>>>> Right.  It's not a problem for most emulated PCI devices (it
>>>>>> would be a problem for those that have large RAM BARs, but even
>>>>>> our emulated video cards do not have 64-bit RAM BARs, I think;
>>>>>
>>>>> qxl can be configured to have one, try "-device
>>>>> qxl-vga,vram64_size_mb=1024"
>>>>>
>>>>>>>      2) While we have maxmem settings to tell us the top of VM
>>>>>>> RAM, do we have anything that tells us the top of IO space?
>>>>>>> What happens when we hotplug a PCI card?
>>>>>
>>>>>> (arch/x86/kernel/setup.c) but I agree that (2) is a blocker.
>>>>>
>>>>> seabios maps stuff right above ram (possibly with a hole due to
>>>>> alignment requirements).
>>>>>
>>>>> ovmf maps stuff into a 32G-aligned 32G hole.  Which lands at 32G
>>>>> and therefore is addressable with 36 bits, unless you have tons
>>>>> of ram (> 30G) assigned to your guest.  A physical host machine
>>>>> where you can plug in enough ram for such a configuration likely
>>>>> has more than 36 physical address lines too ...
>>>>>
>>>>> qemu checks where the firmware mapped 64bit bars, then adds those
>>>>> ranges to the root bus pci resources in the acpi tables
>>>>> (see /proc/iomem).
>>>>>
>>>>>> You don't know how the guest will assign PCI BAR addresses, and
>>>>>> as you said there's hotplug too.
>>>>>
>>>>> Not sure whether qemu adds some extra space for hotplug to the
>>>>> 64bit hole and if so how it calculates the size then.  But the
>>>>> guest os should stick to those ranges when configuring hotplugged
>>>>> devices.
>>>> currently firmware would assign 64-bit BARs after
>>>> reserved-memory-end (not sure about ovmf though)
>>>
>>> OVMF does the same as well. It makes sure that the 64-bit PCI MMIO
>>> aperture is located above "etc/reserved-memory-end", if the latter
>>> exists.
>>>
>>>> but QEMU on ACPI side will add 64-bit _CRS only
>>>> for firmware mapped devices (i.e. no space reserved for hotplug).
>>>> And if I recall correctly ovmf won't map BARs if it doesn't have
>>>> a driver for it
>>>
>>> Yes, that's correct, generally for all UEFI firmware.
>>>
>>> More precisely, BARs will be allocated and programmed, but the MMIO
>>> space decoding bit will not be set (permanently) in the device's
>>> command register, if there is no matching driver in the firmware
>>> (or in the device's own oprom).
>>>
>>>> so ACPI tables won't even have space for unmapped
>>>> 64-bit BARs.
>>>
>>> This used to be true, but that's not the case since
>>> <https://github.com/tianocore/edk2/commit/8f35eb92c419>.
>>>
>>> Namely, specifically for conforming to QEMU's ACPI generator, OVMF
>>> *temporarily* enables, as a platform quirk, all PCI devices present
>>> in the system, before triggering QEMU to generate the ACPI payload.
>>>
>>> Thus, nowadays 64-bit BARs work fine with OVMF, both for
>>> virtio-modern devices, and assigned physical devices. (This is very
>>> easy to test, because, unlike SeaBIOS, the edk2 stuff built into
>>> OVMF prefers to allocate 64-bit BARs outside of the 32-bit address
>>> space.)
>>>
>>> Devices behind PXBs are a different story, but Marcel's been looking
>>> into that, see
>>> <https://bugzilla.redhat.com/show_bug.cgi?id=1323976>.
>>>
>>>> There was another attempt to reserve more space in _CRS
>>>>     https://lists.nongnu.org/archive/html/qemu-devel/2016-05/msg00090.html
>>>
>>> That's actually Marcel's first own patch set for addressing
>>> RHBZ#1323976 that I mentioned above (see it linked in
>>> <https://bugzilla.redhat.com/show_bug.cgi?id=1323976#c2>).
>>>
>>> It might have wider effects, but it is entirely motivated, to my
>>> knowledge, by PXB. If you don't have extra root bridges, and/or you
>>> plug all your devices with 64-bit MMIO BARs into the
>>> "main" (default) root bridge, then (I believe) that patch set is
>>> not supposed to make any difference. (I could be wrong, it's been a
>>> while since I looked at Marcel's work!)
>>>
>>
>> Patches 3 and 4 are indeed for PXB only, but the patch 'pci: reserve
>> 64 bit MMIO range for PCI hotplug' (see
>> https://lists.nongnu.org/archive/html/qemu-devel/2016-05/msg00091.html)
>> tries to reserve the [above_4g_mem_size, max_addressable_cpu_bits]
>> range for PCI hotplug.
> it should be [reserved-memory-end, max_addressable_cpu_bits]
>

Right, thanks. The patch actually works the way you describe: it uses
reserved-memory-end as the lower bound.
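
To make the range concrete, here is a minimal sketch of the
[reserved-memory-end, max_addressable_cpu_bits] window discussed above.
It is not code from the patch: HotplugWindow and hotplug_window() are
hypothetical names, the upper bound is assumed to be 1 << phys_bits, and
alignment requirements are ignored.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical type for illustration only; not a QEMU structure. */
typedef struct {
    uint64_t base;  /* first address available for hotplugged 64-bit BARs */
    uint64_t size;  /* window length in bytes; 0 means no window */
} HotplugWindow;

/*
 * Compute the window [reserved_memory_end, 1 << phys_bits).
 * Returns an empty window if phys_bits is outside 1..63 or if
 * reserved_memory_end already lies at or above the addressable limit.
 */
static HotplugWindow hotplug_window(uint64_t reserved_memory_end,
                                    unsigned phys_bits)
{
    HotplugWindow w = { 0, 0 };
    uint64_t limit;

    if (phys_bits == 0 || phys_bits >= 64) {
        return w;
    }

    limit = UINT64_C(1) << phys_bits;
    if (reserved_memory_end >= limit) {
        return w;
    }

    w.base = reserved_memory_end;
    w.size = limit - reserved_memory_end;
    assert(w.base + w.size == limit);
    return w;
}
```

For example, with 36 physical address bits and reserved-memory-end at
33 GiB, the window is 31 GiB long; once reserved-memory-end reaches
64 GiB, nothing is left for hotplug at that width.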

Thanks,
Marcel

>>
>> The implementation is not good enough because the number of
>> addressable bits is hard-coded. However, we now have David's
>> wrapper, which I can use.
>>
>>
>> Thanks,
>> Marcel
>>
>>> Thanks
>>> Laszlo
>>>
>>
>