From: Alexey Kardashevskiy <aik@ozlabs.ru>
To: Michael Roth <mdroth@linux.vnet.ibm.com>, Ram Pai <linuxram@us.ibm.com>
Cc: andmike@us.ibm.com, mst@redhat.com, linux-kernel@vger.kernel.org,
leonardo@linux.ibm.com, ram.n.pai@gmail.com, cai@lca.pw,
tglx@linutronix.de, sukadev@linux.vnet.ibm.com,
linuxppc-dev@lists.ozlabs.org, hch@lst.de,
bauerman@linux.ibm.com, david@gibson.dropbear.id.au
Subject: Re: [PATCH v5 1/2] powerpc/pseries/iommu: Share the per-cpu TCE page with the hypervisor.
Date: Thu, 12 Dec 2019 13:39:42 +1100 [thread overview]
Message-ID: <d2285445-b86d-b2f2-5418-b1af46523311@ozlabs.ru> (raw)
In-Reply-To: <ad63a352-bdec-08a8-2fd0-f64b9579da6c@ozlabs.ru>
On 12/12/2019 09:47, Alexey Kardashevskiy wrote:
>
>
> On 12/12/2019 07:31, Michael Roth wrote:
>> Quoting Alexey Kardashevskiy (2019-12-11 02:15:44)
>>>
>>>
>>> On 11/12/2019 02:35, Ram Pai wrote:
>>>> On Tue, Dec 10, 2019 at 04:32:10PM +1100, Alexey Kardashevskiy wrote:
>>>>>
>>>>>
>>>>> On 10/12/2019 16:12, Ram Pai wrote:
>>>>>> On Tue, Dec 10, 2019 at 02:07:36PM +1100, Alexey Kardashevskiy wrote:
>>>>>>>
>>>>>>>
>>>>>>> On 07/12/2019 12:12, Ram Pai wrote:
>>>>>>>> H_PUT_TCE_INDIRECT hcall uses a page filled with TCE entries, as one of
>>>>>>>> its parameters. On secure VMs, hypervisor cannot access the contents of
>>>>>>>> this page since it gets encrypted. Hence share the page with the
>>>>>>>> hypervisor, and unshare when done.
>>>>>>>
>>>>>>>
>>>>>>> I thought the idea was to use H_PUT_TCE and avoid sharing any extra
>>>>>>> pages. There is small problem that when DDW is enabled,
>>>>>>> FW_FEATURE_MULTITCE is ignored (easy to fix); I also noticed complains
>>>>>>> about the performance on slack but this is caused by initial cleanup of
>>>>>>> the default TCE window (which we do not use anyway) and to battle this
>>>>>>> we can simply reduce its size by adding
>>>>>>
>>>>>> something that takes hardly any time with H_PUT_TCE_INDIRECT, takes
>>>>>> 13secs per device for H_PUT_TCE approach, during boot. This is with a
>>>>>> 30GB guest. With larger guest, the time will further detoriate.
>>>>>
>>>>>
>>>>> No it will not, I checked. The time is the same for 2GB and 32GB guests-
>>>>> the delay is caused by clearing the small DMA window which is small by
>>>>> the space mapped (1GB) but quite huge in TCEs as it uses 4K pages; and
>>>>> for DDW window + emulated devices the IOMMU page size will be 2M/16M/1G
>>>>> (depends on the system) so the number of TCEs is much smaller.
>>>>
>>>> I cant get your results. What changes did you make to get it?
>>>
>>>
>>> Get what? I passed "-m 2G" and "-m 32G", got the same time - 13s spent
>>> in clearing the default window and the huge window took a fraction of a
>>> second to create and map.
>>
>> Is this if we disable FW_FEATURE_MULTITCE in the guest and force the use
>> of H_PUT_TCE everywhere?
>
>
> Yes. Well, for the DDW case FW_FEATURE_MULTITCE is ignored but even when
> fixed (I have it in my local branch), this does not make a difference.
>
>
>>
>> In theory couldn't we leave FW_FEATURE_MULTITCE in place so that
>> iommu_table_clear() can still use H_STUFF_TCE (which I guess is basically
>> instant),
>
> PAPR/LoPAPR "conveniently" do not describe what hcall-multi-tce does
> exactly. But I am pretty sure the idea is that either both H_STUFF_TCE
> and H_PUT_TCE_INDIRECT are present or neither.
>
>
>> and then force H_PUT_TCE for new mappings via something like:
>>
>> diff --git a/arch/powerpc/platforms/pseries/iommu.c b/arch/powerpc/platforms/pseries/iommu.c
>> index 6ba081dd61c9..85d092baf17d 100644
>> --- a/arch/powerpc/platforms/pseries/iommu.c
>> +++ b/arch/powerpc/platforms/pseries/iommu.c
>> @@ -194,6 +194,7 @@ static int tce_buildmulti_pSeriesLP(struct iommu_table *tbl, long tcenum,
>> unsigned long flags;
>>
>> if ((npages == 1) || !firmware_has_feature(FW_FEATURE_MULTITCE)) {
>> + if ((npages == 1) || !firmware_has_feature(FW_FEATURE_MULTITCE) || is_secure_guest()) {
>
>
> Nobody (including myself) seems to like the idea of having
> is_secure_guest() all over the place.
>
> And with KVM acceleration enabled, it is pretty fast anyway. Just now we
> do not have H_PUT_TCE in KVM/UV for secure guests but we will have to
> fix this for secure PCI passhtrough anyway.
>
>
>> return tce_build_pSeriesLP(tbl, tcenum, npages, uaddr,
>> direction, attrs);
>> }
>>
>> That seems like it would avoid the extra 13s.
>
> Or move around iommu_table_clear() which imho is just the right thing to do.
Huh. It is not the right thing as the firmware could have left mappings
there so we need cleanup. Even if I fixed SLOF, there is POWERVM which I
do not know what it does about TCEs. Thanks,
--
Alexey
next prev parent reply other threads:[~2019-12-12 2:42 UTC|newest]
Thread overview: 21+ messages / expand[flat|nested] mbox.gz Atom feed top
2019-12-07 1:12 [PATCH v5 0/2] Enable IOMMU support for pseries Secure VMs Ram Pai
2019-12-07 1:12 ` [PATCH v5 1/2] powerpc/pseries/iommu: Share the per-cpu TCE page with the hypervisor Ram Pai
2019-12-07 1:12 ` [PATCH v5 2/2] powerpc/pseries/iommu: Use dma_iommu_ops for Secure VM Ram Pai
2019-12-10 3:08 ` Alexey Kardashevskiy
2019-12-10 22:09 ` Thiago Jung Bauermann
2019-12-11 1:43 ` Michael Roth
2019-12-11 8:36 ` Alexey Kardashevskiy
2019-12-11 18:07 ` Michael Roth
2019-12-11 18:20 ` Christoph Hellwig
2019-12-12 6:45 ` Ram Pai
2019-12-13 0:19 ` Michael Roth
2019-12-10 3:07 ` [PATCH v5 1/2] powerpc/pseries/iommu: Share the per-cpu TCE page with the hypervisor Alexey Kardashevskiy
2019-12-10 5:12 ` Ram Pai
2019-12-10 5:32 ` Alexey Kardashevskiy
2019-12-10 15:35 ` Ram Pai
2019-12-11 8:15 ` Alexey Kardashevskiy
2019-12-11 20:31 ` Michael Roth
2019-12-11 22:47 ` Alexey Kardashevskiy
2019-12-12 2:39 ` Alexey Kardashevskiy [this message]
2019-12-13 0:22 ` Michael Roth
2019-12-12 4:11 ` Ram Pai
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=d2285445-b86d-b2f2-5418-b1af46523311@ozlabs.ru \
--to=aik@ozlabs.ru \
--cc=andmike@us.ibm.com \
--cc=bauerman@linux.ibm.com \
--cc=cai@lca.pw \
--cc=david@gibson.dropbear.id.au \
--cc=hch@lst.de \
--cc=leonardo@linux.ibm.com \
--cc=linux-kernel@vger.kernel.org \
--cc=linuxppc-dev@lists.ozlabs.org \
--cc=linuxram@us.ibm.com \
--cc=mdroth@linux.vnet.ibm.com \
--cc=mst@redhat.com \
--cc=ram.n.pai@gmail.com \
--cc=sukadev@linux.vnet.ibm.com \
--cc=tglx@linutronix.de \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).