From: Joao Martins <joao.m.martins@oracle.com>
To: Igor Mammedov <imammedo@redhat.com>
Cc: Eduardo Habkost <ehabkost@redhat.com>,
	"Michael S . Tsirkin" <mst@redhat.com>,
	Richard Henderson <richard.henderson@linaro.org>,
	qemu-devel@nongnu.org, Daniel Jordan <daniel.m.jordan@oracle.com>,
	David Edmondson <david.edmondson@oracle.com>,
	Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>,
	Paolo Bonzini <pbonzini@redhat.com>
Subject: Re: [PATCH RFC 1/6] i386/pc: Account IOVA reserved ranges above 4G boundary
Date: Wed, 23 Jun 2021 10:51:59 +0100
Message-ID: <21f86eb7-e9db-b7ac-9014-2baa9fd44741@oracle.com>
In-Reply-To: <20210623110341.0ceca1b4@redhat.com>

On 6/23/21 10:03 AM, Igor Mammedov wrote:
> On Tue, 22 Jun 2021 16:49:00 +0100
> Joao Martins <joao.m.martins@oracle.com> wrote:
> 
>> It is assumed that the whole GPA space is available to be
>> DMA addressable, within a given address space limit. Since
>> v5.4 based that is not true, and VFIO will validate whether
>> the selected IOVA is indeed valid i.e. not reserved by IOMMU
>> on behalf of some specific devices or platform-defined.
>>
>> AMD systems with an IOMMU are examples of such platforms and
>> particularly may export only these ranges as allowed:
>>
>> 	0000000000000000 - 00000000fedfffff (0      .. 3.982G)
>> 	00000000fef00000 - 000000fcffffffff (3.983G .. 1011.9G)
>> 	0000010000000000 - ffffffffffffffff (1Tb    .. 16Pb)
>>
>> We already know of accounting for the 4G hole, albeit if the
>> guest is big enough we will fail to allocate a >1010G given
>> the ~12G hole at the 1Tb boundary, reserved for HyperTransport.
>>
>> When creating the region above 4G, take into account what
>> IOVAs are allowed by defining the known allowed ranges
>> and search for the next free IOVA ranges. When finding a
>> invalid IOVA we mark them as reserved and proceed to the
>> next allowed IOVA region.
>>
>> After accounting for the 1Tb hole on AMD hosts, mtree should
>> look like:
>>
>> 0000000100000000-000000fcffffffff (prio 0, i/o):
>> 	alias ram-above-4g @pc.ram 0000000080000000-000000fc7fffffff
>> 0000010000000000-000001037fffffff (prio 0, i/o):
>> 	alias ram-above-1t @pc.ram 000000fc80000000-000000ffffffffff
> 
> You are talking here about GPA which is guest specific thing
> and then somehow it becomes tied to host. For bystanders it's
> not clear from above commit message how both are related.
> I'd add here an explicit explanation how AMD host is related GPAs
> and clarify where you are talking about guest/host side.
> 
OK, makes sense.

Perhaps using IOVA makes it easier to understand. I said GPA because
there's a 1:1 mapping between GPA and IOVA (if you're not using vIOMMU).
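
To make the relation concrete, here is a minimal sketch (in Python, not the
actual QEMU code) of the placement described in the commit message: walk the
allowed IOVA ranges and place the RAM above 4G only where the IOVA is valid.
The function name and its parameters are hypothetical; the ranges are the
ones listed in the commit message, and the 1Tb guest with 2G below 4G is an
assumption inferred from the mtree output.

```python
GiB = 1 << 30

# Allowed IOVA ranges (inclusive bounds), copied from the commit message.
ALLOWED_IOVA = [
    (0x0000000000000000, 0x00000000FEDFFFFF),
    (0x00000000FEF00000, 0x000000FCFFFFFFFF),
    (0x0000010000000000, 0xFFFFFFFFFFFFFFFF),
]


def split_ram_above_4g(ram_size, below_4g_size, allowed=ALLOWED_IOVA):
    """Return (gpa_start, gpa_end, ram_offset) for each region above 4G.

    Hypothetical helper: RAM that does not fit before a reserved range is
    moved to the start of the next allowed range.
    """
    regions = []
    cursor = 4 * GiB                      # first GPA above the 4G boundary
    offset = below_4g_size                # offset into the backing RAM block
    remaining = ram_size - below_4g_size  # RAM still to be placed
    for start, end in allowed:
        if remaining <= 0:
            break
        base = max(cursor, start)
        if base > end:
            continue                      # range lies entirely below cursor
        size = min(remaining, end + 1 - base)
        regions.append((base, base + size - 1, offset))
        offset += size
        remaining -= size
        cursor = base + size
    return regions


if __name__ == "__main__":
    # Assumed layout: 1Tb of guest RAM, 2G of it mapped below 4G.
    for gpa_start, gpa_end, ram_off in split_ram_above_4g(1024 * GiB, 2 * GiB):
        print(f"{gpa_start:016x}-{gpa_end:016x} @pc.ram {ram_off:016x}")
```

With those assumed sizes the sketch yields two regions, ending at fcffffffff
and starting again at 10000000000, which matches the ram-above-4g and
ram-above-1t aliases in the mtree output quoted above.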

> also what about usecases:
>  * start QEMU with Intel cpu model on AMD host with intel's iommu

In principle this is less likely to occur, but you would still need
to mark the same range as reserved. The limitation applies to any DMA
(host or guest) targeting IOVAs that fall within that range, so you would
want to inform the guest that at least those addresses should be avoided.

>  * start QEMU with AMD cpu model and AMD's iommu on Intel host

Here you would probably only mark the range, solely for honoring how hardware
is usually represented. But really, on Intel, nothing stops you from exposing the
aforementioned range as RAM.

>  * start QEMU in TCG mode on AMD host (mostly form qtest point ot view)
> 
This one is tricky. Because you can hotplug a VFIO device later on,
I opted for always marking the reserved range. If you don't use VFIO you're good,
but otherwise you would still need the range reserved. But I am not sure how
qtest is used today for testing huge guests.
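
To illustrate the limitation behind the use cases above, here is a small
sketch (again Python, and not what VFIO or QEMU actually implement) of the
kind of check that makes a DMA mapping fail: the mapping is only valid if it
lies entirely within one allowed IOVA range. The function name is
hypothetical; the ranges are the ones from the commit message.

```python
# Allowed IOVA ranges (inclusive bounds), copied from the commit message.
ALLOWED_IOVA = [
    (0x0000000000000000, 0x00000000FEDFFFFF),
    (0x00000000FEF00000, 0x000000FCFFFFFFFF),
    (0x0000010000000000, 0xFFFFFFFFFFFFFFFF),
]


def iova_range_allowed(start, size, allowed=ALLOWED_IOVA):
    """Return True if [start, start + size) fits inside one allowed range."""
    if size <= 0:
        raise ValueError("size must be positive")
    last = start + size - 1
    return any(lo <= start and last <= hi for lo, hi in allowed)


if __name__ == "__main__":
    GiB = 1 << 30
    # A 4G mapping that crosses into the hole below the 1Tb boundary.
    print(iova_range_allowed(0xFC80000000, 4 * GiB))
    # The same amount of memory placed at the 1Tb boundary instead.
    print(iova_range_allowed(0x10000000000, 4 * GiB))
```

The first mapping is rejected because it ends inside the reserved range
below 1Tb; the second is accepted. This is why the range is marked as
reserved regardless of whether a VFIO device is present at boot.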

