All of lore.kernel.org
 help / color / mirror / Atom feed
From: Alexey Kardashevskiy <aik@ozlabs.ru>
To: Michael Roth <mdroth@linux.vnet.ibm.com>, Ram Pai <linuxram@us.ibm.com>
Cc: mpe@ellerman.id.au, linuxppc-dev@lists.ozlabs.org,
	benh@kernel.crashing.org, david@gibson.dropbear.id.au,
	paulus@ozlabs.org, hch@lst.de, andmike@us.ibm.com,
	sukadev@linux.vnet.ibm.com, mst@redhat.com, ram.n.pai@gmail.com,
	cai@lca.pw, tglx@linutronix.de, bauerman@linux.ibm.com,
	linux-kernel@vger.kernel.org, leonardo@linux.ibm.com
Subject: Re: [PATCH v5 1/2] powerpc/pseries/iommu: Share the per-cpu TCE page with the hypervisor.
Date: Thu, 12 Dec 2019 13:39:42 +1100	[thread overview]
Message-ID: <d2285445-b86d-b2f2-5418-b1af46523311@ozlabs.ru> (raw)
In-Reply-To: <ad63a352-bdec-08a8-2fd0-f64b9579da6c@ozlabs.ru>



On 12/12/2019 09:47, Alexey Kardashevskiy wrote:
> 
> 
> On 12/12/2019 07:31, Michael Roth wrote:
>> Quoting Alexey Kardashevskiy (2019-12-11 02:15:44)
>>>
>>>
>>> On 11/12/2019 02:35, Ram Pai wrote:
>>>> On Tue, Dec 10, 2019 at 04:32:10PM +1100, Alexey Kardashevskiy wrote:
>>>>>
>>>>>
>>>>> On 10/12/2019 16:12, Ram Pai wrote:
>>>>>> On Tue, Dec 10, 2019 at 02:07:36PM +1100, Alexey Kardashevskiy wrote:
>>>>>>>
>>>>>>>
>>>>>>> On 07/12/2019 12:12, Ram Pai wrote:
>>>>>>>> H_PUT_TCE_INDIRECT hcall uses a page filled with TCE entries, as one of
>>>>>>>> its parameters.  On secure VMs, hypervisor cannot access the contents of
>>>>>>>> this page since it gets encrypted.  Hence share the page with the
>>>>>>>> hypervisor, and unshare when done.
>>>>>>>
>>>>>>>
>>>>>>> I thought the idea was to use H_PUT_TCE and avoid sharing any extra
>>>>>>> pages. There is small problem that when DDW is enabled,
>>>>>>> FW_FEATURE_MULTITCE is ignored (easy to fix); I also noticed complains
>>>>>>> about the performance on slack but this is caused by initial cleanup of
>>>>>>> the default TCE window (which we do not use anyway) and to battle this
>>>>>>> we can simply reduce its size by adding
>>>>>>
>>>>>> something that takes hardly any time with H_PUT_TCE_INDIRECT,  takes
>>>>>> 13secs per device for H_PUT_TCE approach, during boot. This is with a
>>>>>> 30GB guest. With larger guest, the time will further detoriate.
>>>>>
>>>>>
>>>>> No it will not, I checked. The time is the same for 2GB and 32GB guests-
>>>>> the delay is caused by clearing the small DMA window which is small by
>>>>> the space mapped (1GB) but quite huge in TCEs as it uses 4K pages; and
>>>>> for DDW window + emulated devices the IOMMU page size will be 2M/16M/1G
>>>>> (depends on the system) so the number of TCEs is much smaller.
>>>>
>>>> I cant get your results.  What changes did you make to get it?
>>>
>>>
>>> Get what? I passed "-m 2G" and "-m 32G", got the same time - 13s spent
>>> in clearing the default window and the huge window took a fraction of a
>>> second to create and map.
>>
>> Is this if we disable FW_FEATURE_MULTITCE in the guest and force the use
>> of H_PUT_TCE everywhere?
> 
> 
> Yes. Well, for the DDW case FW_FEATURE_MULTITCE is ignored but even when
> fixed (I have it in my local branch), this does not make a difference.
> 
> 
>>
>> In theory couldn't we leave FW_FEATURE_MULTITCE in place so that
>> iommu_table_clear() can still use H_STUFF_TCE (which I guess is basically
>> instant),
> 
> PAPR/LoPAPR "conveniently" do not describe what hcall-multi-tce does
> exactly. But I am pretty sure the idea is that either both H_STUFF_TCE
> and H_PUT_TCE_INDIRECT are present or neither.
> 
> 
>> and then force H_PUT_TCE for new mappings via something like:
>>
>> diff --git a/arch/powerpc/platforms/pseries/iommu.c b/arch/powerpc/platforms/pseries/iommu.c
>> index 6ba081dd61c9..85d092baf17d 100644
>> --- a/arch/powerpc/platforms/pseries/iommu.c
>> +++ b/arch/powerpc/platforms/pseries/iommu.c
>> @@ -194,6 +194,7 @@ static int tce_buildmulti_pSeriesLP(struct iommu_table *tbl, long tcenum,
>>         unsigned long flags;
>>  
>>         if ((npages == 1) || !firmware_has_feature(FW_FEATURE_MULTITCE)) {
>> +       if ((npages == 1) || !firmware_has_feature(FW_FEATURE_MULTITCE) || is_secure_guest()) {
> 
> 
> Nobody (including myself) seems to like the idea of having
> is_secure_guest() all over the place.
> 
> And with KVM acceleration enabled, it is pretty fast anyway. Just now we
> do not have H_PUT_TCE in KVM/UV for secure guests but we will have to
> fix this for secure PCI passhtrough anyway.
> 
> 
>>                 return tce_build_pSeriesLP(tbl, tcenum, npages, uaddr,
>>                                            direction, attrs);
>>         }
>>
>> That seems like it would avoid the extra 13s.
> 
> Or move around iommu_table_clear() which imho is just the right thing to do.


Huh. It is not the right thing as the firmware could have left mappings
there so we need cleanup. Even if I fixed SLOF, there is POWERVM which I
do not know what it does about TCEs. Thanks,



-- 
Alexey

WARNING: multiple messages have this Message-ID (diff)
From: Alexey Kardashevskiy <aik@ozlabs.ru>
To: Michael Roth <mdroth@linux.vnet.ibm.com>, Ram Pai <linuxram@us.ibm.com>
Cc: andmike@us.ibm.com, mst@redhat.com, linux-kernel@vger.kernel.org,
	leonardo@linux.ibm.com, ram.n.pai@gmail.com, cai@lca.pw,
	tglx@linutronix.de, sukadev@linux.vnet.ibm.com,
	linuxppc-dev@lists.ozlabs.org, hch@lst.de,
	bauerman@linux.ibm.com, david@gibson.dropbear.id.au
Subject: Re: [PATCH v5 1/2] powerpc/pseries/iommu: Share the per-cpu TCE page with the hypervisor.
Date: Thu, 12 Dec 2019 13:39:42 +1100	[thread overview]
Message-ID: <d2285445-b86d-b2f2-5418-b1af46523311@ozlabs.ru> (raw)
In-Reply-To: <ad63a352-bdec-08a8-2fd0-f64b9579da6c@ozlabs.ru>



On 12/12/2019 09:47, Alexey Kardashevskiy wrote:
> 
> 
> On 12/12/2019 07:31, Michael Roth wrote:
>> Quoting Alexey Kardashevskiy (2019-12-11 02:15:44)
>>>
>>>
>>> On 11/12/2019 02:35, Ram Pai wrote:
>>>> On Tue, Dec 10, 2019 at 04:32:10PM +1100, Alexey Kardashevskiy wrote:
>>>>>
>>>>>
>>>>> On 10/12/2019 16:12, Ram Pai wrote:
>>>>>> On Tue, Dec 10, 2019 at 02:07:36PM +1100, Alexey Kardashevskiy wrote:
>>>>>>>
>>>>>>>
>>>>>>> On 07/12/2019 12:12, Ram Pai wrote:
>>>>>>>> H_PUT_TCE_INDIRECT hcall uses a page filled with TCE entries, as one of
>>>>>>>> its parameters.  On secure VMs, hypervisor cannot access the contents of
>>>>>>>> this page since it gets encrypted.  Hence share the page with the
>>>>>>>> hypervisor, and unshare when done.
>>>>>>>
>>>>>>>
>>>>>>> I thought the idea was to use H_PUT_TCE and avoid sharing any extra
>>>>>>> pages. There is small problem that when DDW is enabled,
>>>>>>> FW_FEATURE_MULTITCE is ignored (easy to fix); I also noticed complains
>>>>>>> about the performance on slack but this is caused by initial cleanup of
>>>>>>> the default TCE window (which we do not use anyway) and to battle this
>>>>>>> we can simply reduce its size by adding
>>>>>>
>>>>>> something that takes hardly any time with H_PUT_TCE_INDIRECT,  takes
>>>>>> 13secs per device for H_PUT_TCE approach, during boot. This is with a
>>>>>> 30GB guest. With larger guest, the time will further detoriate.
>>>>>
>>>>>
>>>>> No it will not, I checked. The time is the same for 2GB and 32GB guests-
>>>>> the delay is caused by clearing the small DMA window which is small by
>>>>> the space mapped (1GB) but quite huge in TCEs as it uses 4K pages; and
>>>>> for DDW window + emulated devices the IOMMU page size will be 2M/16M/1G
>>>>> (depends on the system) so the number of TCEs is much smaller.
>>>>
>>>> I cant get your results.  What changes did you make to get it?
>>>
>>>
>>> Get what? I passed "-m 2G" and "-m 32G", got the same time - 13s spent
>>> in clearing the default window and the huge window took a fraction of a
>>> second to create and map.
>>
>> Is this if we disable FW_FEATURE_MULTITCE in the guest and force the use
>> of H_PUT_TCE everywhere?
> 
> 
> Yes. Well, for the DDW case FW_FEATURE_MULTITCE is ignored but even when
> fixed (I have it in my local branch), this does not make a difference.
> 
> 
>>
>> In theory couldn't we leave FW_FEATURE_MULTITCE in place so that
>> iommu_table_clear() can still use H_STUFF_TCE (which I guess is basically
>> instant),
> 
> PAPR/LoPAPR "conveniently" do not describe what hcall-multi-tce does
> exactly. But I am pretty sure the idea is that either both H_STUFF_TCE
> and H_PUT_TCE_INDIRECT are present or neither.
> 
> 
>> and then force H_PUT_TCE for new mappings via something like:
>>
>> diff --git a/arch/powerpc/platforms/pseries/iommu.c b/arch/powerpc/platforms/pseries/iommu.c
>> index 6ba081dd61c9..85d092baf17d 100644
>> --- a/arch/powerpc/platforms/pseries/iommu.c
>> +++ b/arch/powerpc/platforms/pseries/iommu.c
>> @@ -194,6 +194,7 @@ static int tce_buildmulti_pSeriesLP(struct iommu_table *tbl, long tcenum,
>>         unsigned long flags;
>>  
>>         if ((npages == 1) || !firmware_has_feature(FW_FEATURE_MULTITCE)) {
>> +       if ((npages == 1) || !firmware_has_feature(FW_FEATURE_MULTITCE) || is_secure_guest()) {
> 
> 
> Nobody (including myself) seems to like the idea of having
> is_secure_guest() all over the place.
> 
> And with KVM acceleration enabled, it is pretty fast anyway. Just now we
> do not have H_PUT_TCE in KVM/UV for secure guests but we will have to
> fix this for secure PCI passhtrough anyway.
> 
> 
>>                 return tce_build_pSeriesLP(tbl, tcenum, npages, uaddr,
>>                                            direction, attrs);
>>         }
>>
>> That seems like it would avoid the extra 13s.
> 
> Or move around iommu_table_clear() which imho is just the right thing to do.


Huh. It is not the right thing as the firmware could have left mappings
there so we need cleanup. Even if I fixed SLOF, there is POWERVM which I
do not know what it does about TCEs. Thanks,



-- 
Alexey

  reply	other threads:[~2019-12-12  2:39 UTC|newest]

Thread overview: 42+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-12-07  1:12 [PATCH v5 0/2] Enable IOMMU support for pseries Secure VMs Ram Pai
2019-12-07  1:12 ` Ram Pai
2019-12-07  1:12 ` [PATCH v5 1/2] powerpc/pseries/iommu: Share the per-cpu TCE page with the hypervisor Ram Pai
2019-12-07  1:12   ` Ram Pai
2019-12-07  1:12   ` [PATCH v5 2/2] powerpc/pseries/iommu: Use dma_iommu_ops for Secure VM Ram Pai
2019-12-07  1:12     ` Ram Pai
2019-12-10  3:08     ` Alexey Kardashevskiy
2019-12-10  3:08       ` Alexey Kardashevskiy
2019-12-10 22:09     ` Thiago Jung Bauermann
2019-12-10 22:09       ` Thiago Jung Bauermann
2019-12-11  1:43     ` Michael Roth
2019-12-11  1:43       ` Michael Roth
2019-12-11  8:36       ` Alexey Kardashevskiy
2019-12-11  8:36         ` Alexey Kardashevskiy
2019-12-11 18:07         ` Michael Roth
2019-12-11 18:07           ` Michael Roth
2019-12-11 18:20           ` Christoph Hellwig
2019-12-11 18:20             ` Christoph Hellwig
2019-12-12  6:45       ` Ram Pai
2019-12-12  6:45         ` Ram Pai
2019-12-13  0:19         ` Michael Roth
2019-12-13  0:19           ` Michael Roth
2019-12-10  3:07   ` [PATCH v5 1/2] powerpc/pseries/iommu: Share the per-cpu TCE page with the hypervisor Alexey Kardashevskiy
2019-12-10  3:07     ` Alexey Kardashevskiy
2019-12-10  5:12     ` Ram Pai
2019-12-10  5:12       ` Ram Pai
2019-12-10  5:32       ` Alexey Kardashevskiy
2019-12-10  5:32         ` Alexey Kardashevskiy
2019-12-10 15:35         ` Ram Pai
2019-12-10 15:35           ` Ram Pai
2019-12-11  8:15           ` Alexey Kardashevskiy
2019-12-11  8:15             ` Alexey Kardashevskiy
2019-12-11 20:31             ` Michael Roth
2019-12-11 20:31               ` Michael Roth
2019-12-11 22:47               ` Alexey Kardashevskiy
2019-12-11 22:47                 ` Alexey Kardashevskiy
2019-12-12  2:39                 ` Alexey Kardashevskiy [this message]
2019-12-12  2:39                   ` Alexey Kardashevskiy
2019-12-13  0:22                 ` Michael Roth
2019-12-13  0:22                   ` Michael Roth
2019-12-12  4:11             ` Ram Pai
2019-12-12  4:11               ` Ram Pai

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=d2285445-b86d-b2f2-5418-b1af46523311@ozlabs.ru \
    --to=aik@ozlabs.ru \
    --cc=andmike@us.ibm.com \
    --cc=bauerman@linux.ibm.com \
    --cc=benh@kernel.crashing.org \
    --cc=cai@lca.pw \
    --cc=david@gibson.dropbear.id.au \
    --cc=hch@lst.de \
    --cc=leonardo@linux.ibm.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linuxppc-dev@lists.ozlabs.org \
    --cc=linuxram@us.ibm.com \
    --cc=mdroth@linux.vnet.ibm.com \
    --cc=mpe@ellerman.id.au \
    --cc=mst@redhat.com \
    --cc=paulus@ozlabs.org \
    --cc=ram.n.pai@gmail.com \
    --cc=sukadev@linux.vnet.ibm.com \
    --cc=tglx@linutronix.de \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.