From: Alexey Kardashevskiy <aik@ozlabs.ru>
To: Alistair Popple <alistair@popple.id.au>
Cc: Oliver O'Halloran <oohall@gmail.com>,
Alex Williamson <alex.williamson@redhat.com>,
linuxppc-dev@lists.ozlabs.org, kvm@vger.kernel.org,
David Gibson <david@gibson.dropbear.id.au>
Subject: Re: [PATCH kernel RFC 0/4] powerpc/powenv/ioda: Allow huge DMA window at 4GB
Date: Mon, 2 Dec 2019 16:58:15 +1100
Message-ID: <45175dc2-8ed4-6e96-ff69-44980f3d1951@ozlabs.ru>
In-Reply-To: <22858805.RAHADn2P79@townsend>
On 02/12/2019 16:36, Alistair Popple wrote:
> On Monday, 2 December 2019 12:59:49 PM AEDT Alexey Kardashevskiy wrote:
>> Here is an attempt to support bigger DMA space for devices
>> supporting DMA masks less than 59 bits (GPUs come into mind
>> first). POWER9 PHBs have an option to map 2 windows at 0
>> and select a windows based on DMA address being below or above
>> 4GB.
>>
>> This adds the "iommu=iommu_bypass" kernel parameter and
>
> Would it be possible to just enable this by default if the platform supports
> it? Are there any downsides?
It changes the location of the second DMA window, which QEMU currently
assumes to be at 0x800.0000.0000.0000, and I do not see an easy way to
work around this.
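To make the two locations concrete, here is a minimal sketch (not code from
the patches) of why a device with a narrow DMA mask can reach a window at
4GB but not one at 1<<59. The helper name window_reachable() and the macro
names are hypothetical; only the two base addresses come from this thread.

```c
#include <stdbool.h>
#include <stdint.h>

/* The two window bases discussed in this thread. */
#define DMA64_WIN_BASE_DEFAULT (1ULL << 59) /* 0x0800000000000000 */
#define DMA64_WIN_BASE_4GB     (1ULL << 32) /* 0x100000000 */

/*
 * Hypothetical helper: return true if a device whose DMA mask is
 * mask_bits wide can address every byte of [base, base + size).
 */
static bool window_reachable(unsigned int mask_bits, uint64_t base,
			     uint64_t size)
{
	uint64_t limit;

	if (mask_bits == 0 || size == 0)
		return false;

	/* Highest bus address the device can generate. */
	limit = mask_bits >= 64 ? UINT64_MAX : (1ULL << mask_bits) - 1;

	if (base > limit)
		return false;

	/* Compare the last byte of the window without overflowing. */
	return size - 1 <= limit - base;
}
```

With a 48-bit mask, window_reachable(48, DMA64_WIN_BASE_4GB, 1ULL << 40)
holds while the same call with DMA64_WIN_BASE_DEFAULT does not.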
For example, we start QEMU without VFIO but with an emulated XHCI which
will ask for DDW. We (QEMU) have to pick a window location and then
stick to it; if a user later hotplugs a VFIO-PCI device, the physical
IOMMU has to support the previously selected DMA window address,
otherwise the hotplug is going to fail.
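The constraint can be sketched roughly as below. This is an illustration
only, not QEMU or kernel code: hotplug_allowed() and the idea of the host
reporting a list of supported window bases are assumptions made for the
example.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical check: the guest already owns a 64-bit DMA window at
 * guest_win_base, so a hotplugged device is only usable if the host
 * IOMMU behind it can place a window at exactly that bus address.
 */
static bool hotplug_allowed(uint64_t guest_win_base,
			    const uint64_t *host_bases, size_t n)
{
	size_t i;

	for (i = 0; i < n; i++) {
		if (host_bases[i] == guest_win_base)
			return true;
	}

	/* No matching base: the window cannot be moved, so hotplug fails. */
	return false;
}
```

A guest that picked 1<<59 at boot would pass this check on a host offering
1<<59 and fail it on a host that only offers the 4GB location.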
The question is how to tell QEMU about this new offset and what we do
about migration from P8 (which, let's say, did have a VFIO device which
we unplug before the migration) to P9, with the prospect of hotplugging
a VFIO device there but this time with this GTE4GB bit set.
> Adding it as an option seems like it would make
> things harder to support and reduces the amount of testing/use it would get.
Yeah, this is why this is an RFC...
>> supports VFIO+pseries machine - current this requires telling
>> upstream+unmodified QEMU about this via
>> -global spapr-pci-host-bridge.dma64_win_addr=0x100000000
>> or per-phb property. 4/4 advertises the new option but
>> there is no automation around it in QEMU (should it be?).
>>
>> For now it is either 1<<59 or 4GB mode; dynamic switching is
>> not supported (could be via sysfs).
>>
>> This is based on sha1
>> a6ed68d6468b Linus Torvalds "Merge tag 'drm-next-2019-11-27' of git://
> anongit.freedesktop.org/drm/drm".
>
> Are you sure?
Almost. It should have been HEAD^^^^^..HEAD instead of HEAD^^^^..HEAD :)
I've posted 00/4 to the thread now, sorry about that. Thanks,
> I am getting the following rejected hunk trying to apply the
> first patch in the series:
>
> --- arch/powerpc/platforms/powernv/pci-ioda.c
> +++ arch/powerpc/platforms/powernv/pci-ioda.c
> @@ -2349,15 +2349,10 @@ static void pnv_pci_ioda2_set_bypass(struct
> pnv_ioda_pe *pe, bool enable)
> pe->tce_bypass_enabled = enable;
> }
>
> -static long pnv_pci_ioda2_create_table(struct iommu_table_group *table_group,
> - int num, __u32 page_shift, __u64 window_size, __u32 levels,
> +static long pnv_pci_ioda2_create_table(int nid, int num, __u64 bus_offset,
> + __u32 page_shift, __u64 window_size, __u32 levels,
> bool alloc_userspace_copy, struct iommu_table **ptbl)
> {
> - struct pnv_ioda_pe *pe = container_of(table_group, struct pnv_ioda_pe,
> - table_group);
> - int nid = pe->phb->hose->node;
> - __u64 bus_offset = num ?
> - pe->table_group.tce64_start : table_group->tce32_start;
> long ret;
> struct iommu_table *tbl;
>
> - Alistair
>
>> Please comment. Thanks.
>>
>>
>>
>> Alexey Kardashevskiy (4):
>> powerpc/powernv/ioda: Rework for huge DMA window at 4GB
>> powerpc/powernv/ioda: Allow smaller TCE table levels
>> powerpc/powernv/phb4: Add 4GB IOMMU bypass mode
>> vfio/spapr_tce: Advertise and allow a huge DMA windows at 4GB
>>
>> arch/powerpc/include/asm/iommu.h | 1 +
>> arch/powerpc/include/asm/opal-api.h | 11 +-
>> arch/powerpc/include/asm/opal.h | 2 +
>> arch/powerpc/platforms/powernv/pci.h | 1 +
>> include/uapi/linux/vfio.h | 2 +
>> arch/powerpc/platforms/powernv/opal-call.c | 2 +
>> arch/powerpc/platforms/powernv/pci-ioda-tce.c | 4 +-
>> arch/powerpc/platforms/powernv/pci-ioda.c | 219 ++++++++++++++----
>> drivers/vfio/vfio_iommu_spapr_tce.c | 10 +-
>> 9 files changed, 202 insertions(+), 50 deletions(-)
>>
>>
>
>
>
>
--
Alexey