From: Alexey Kardashevskiy <aik@ozlabs.ru>
To: Leonardo Bras <leobras.c@gmail.com>,
Michael Ellerman <mpe@ellerman.id.au>,
Benjamin Herrenschmidt <benh@kernel.crashing.org>,
Paul Mackerras <paulus@samba.org>,
Christophe Leroy <christophe.leroy@c-s.fr>,
Joel Stanley <joel@jms.id.au>,
Thiago Jung Bauermann <bauerman@linux.ibm.com>,
Ram Pai <linuxram@us.ibm.com>,
Brian King <brking@linux.vnet.ibm.com>,
Murilo Fossa Vicentini <muvic@linux.ibm.com>,
David Dai <zdai@linux.vnet.ibm.com>
Cc: linuxppc-dev@lists.ozlabs.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH v1 02/10] powerpc/kernel/iommu: Align size for IOMMU_PAGE_SIZE on iommu_*_coherent()
Date: Mon, 31 Aug 2020 10:47:22 +1000 [thread overview]
Message-ID: <81f106bd-8962-22f2-f14a-378d3486f57e@ozlabs.ru> (raw)
In-Reply-To: <2aacd45f047489642da1731c92d3555ad101e3c7.camel@gmail.com>
On 29/08/2020 06:41, Leonardo Bras wrote:
> On Fri, 2020-08-28 at 11:40 +1000, Alexey Kardashevskiy wrote:
>>> I think it would be better to keep the code as much generic as possible
>>> regarding page sizes.
>>
>> Then you need to test it. Does 4K guest even boot (it should but I would
>> not bet much on it)?
>
> Maybe testing with host 64k pagesize and IOMMU 16MB pagesize in qemu
> should be enough, is there any chance to get indirect mapping in qemu
> like this? (DDW but with smaller DMA window available)
You will have to hack the guest kernel to always do indirect mapping or
hack QEMU's rtas_ibm_query_pe_dma_window() to return a small number of
available TCEs. But you will be testing QEMU/KVM which behave quite
differently to pHyp in this particular case.
>>>> Because if we want the former (==support), then we'll have to align the
>>>> size up to the bigger page size when allocating/zeroing system pages,
>>>> etc.
>>>
>>> This part I don't understand. Why do we need to align everything to the
>>> bigger pagesize?
>>>
>>> I mean, is not that enough that the range [ret, ret + size[ is both
>>> allocated by mm and mapped on a iommu range?
>>>
>>> Suppose a iommu_alloc_coherent() of 16kB on PAGESIZE = 4k and
>>> IOMMU_PAGE_SIZE() == 64k.
>>> Why 4 * cpu_pages mapped by a 64k IOMMU page is not enough?
>>> All the space the user asked for is allocated and mapped for DMA.
>>
>> The user asked to map 16K, the rest - 48K - is used for something else
>> (may be even mapped to another device) but you are making all 64K
>> accessible by the device which only should be able to access 16K.
>>
>> In practice, if this happens, H_PUT_TCE will simply fail.
>
> I have noticed mlx5 driver getting a few bytes in a buffer, and using
> iommu_map_page(). It does map a whole page for as few bytes as the user
Whole 4K system page or whole 64K iommu page?
> wants mapped, and the other bytes get used for something else, or just
> mapped on another DMA page.
> It seems to work fine.
With 4K system page and 64K IOMMU page? In practice it would take an
effort or/and bad luck to see it crashing. Thanks,
>
>>
>>
>>>> Bigger pages are not the case here as I understand it.
>>>
>>> I did not get this part, what do you mean?
>>
>> Possible IOMMU page sizes are 4K, 64K, 2M, 16M, 256M, 1GB, and the
>> supported set of sizes is different for P8/P9 and type of IO (PHB,
>> NVLink/CAPI).
>>
>>
>>>>> Update those functions to guarantee alignment with requested size
>>>>> using IOMMU_PAGE_ALIGN() before doing iommu_alloc() / iommu_free().
>>>>>
>>>>> Also, on iommu_range_alloc(), replace ALIGN(n, 1 << tbl->it_page_shift)
>>>>> with IOMMU_PAGE_ALIGN(n, tbl), which seems easier to read.
>>>>>
>>>>> Signed-off-by: Leonardo Bras <leobras.c@gmail.com>
>>>>> ---
>>>>> arch/powerpc/kernel/iommu.c | 17 +++++++++--------
>>>>> 1 file changed, 9 insertions(+), 8 deletions(-)
>>>>>
>>>>> diff --git a/arch/powerpc/kernel/iommu.c b/arch/powerpc/kernel/iommu.c
>>>>> index 9704f3f76e63..d7086087830f 100644
>>>>> --- a/arch/powerpc/kernel/iommu.c
>>>>> +++ b/arch/powerpc/kernel/iommu.c
>>>>> @@ -237,10 +237,9 @@ static unsigned long iommu_range_alloc(struct device *dev,
>>>>> }
>>>>>
>>>>> if (dev)
>>>>> - boundary_size = ALIGN(dma_get_seg_boundary(dev) + 1,
>>>>> - 1 << tbl->it_page_shift);
>>>>> + boundary_size = IOMMU_PAGE_ALIGN(dma_get_seg_boundary(dev) + 1, tbl);
>>>>
>>>> Run checkpatch.pl, should complain about a long line.
>>>
>>> It's 86 columns long, which is less than the new limit of 100 columns
>>> Linus announced a few weeks ago. checkpatch.pl was updated too:
>>> https://www.phoronix.com/scan.php?page=news_item&px=Linux-Kernel-Deprecates-80-Col
>>
>> Yay finally :) Thanks,
>
> :)
>
>>
>>
>>>>
>>>>> else
>>>>> - boundary_size = ALIGN(1UL << 32, 1 << tbl->it_page_shift);
>>>>> + boundary_size = IOMMU_PAGE_ALIGN(1UL << 32, tbl);
>>>>> /* 4GB boundary for iseries_hv_alloc and iseries_hv_map */
>>>>>
>>>>> n = iommu_area_alloc(tbl->it_map, limit, start, npages, tbl->it_offset,
>>>>> @@ -858,6 +857,7 @@ void *iommu_alloc_coherent(struct device *dev, struct iommu_table *tbl,
>>>>> unsigned int order;
>>>>> unsigned int nio_pages, io_order;
>>>>> struct page *page;
>>>>> + size_t size_io = size;
>>>>>
>>>>> size = PAGE_ALIGN(size);
>>>>> order = get_order(size);
>>>>> @@ -884,8 +884,9 @@ void *iommu_alloc_coherent(struct device *dev, struct iommu_table *tbl,
>>>>> memset(ret, 0, size);
>>>>>
>>>>> /* Set up tces to cover the allocated range */
>>>>> - nio_pages = size >> tbl->it_page_shift;
>>>>> - io_order = get_iommu_order(size, tbl);
>>>>> + size_io = IOMMU_PAGE_ALIGN(size_io, tbl);
>>>>> + nio_pages = size_io >> tbl->it_page_shift;
>>>>> + io_order = get_iommu_order(size_io, tbl);
>>>>> mapping = iommu_alloc(dev, tbl, ret, nio_pages, DMA_BIDIRECTIONAL,
>>>>> mask >> tbl->it_page_shift, io_order, 0);
>>>>> if (mapping == DMA_MAPPING_ERROR) {
>>>>> @@ -900,11 +901,11 @@ void iommu_free_coherent(struct iommu_table *tbl, size_t size,
>>>>> void *vaddr, dma_addr_t dma_handle)
>>>>> {
>>>>> if (tbl) {
>>>>> - unsigned int nio_pages;
>>>>> + size_t size_io = IOMMU_PAGE_ALIGN(size, tbl);
>>>>> + unsigned int nio_pages = size_io >> tbl->it_page_shift;
>>>>>
>>>>> - size = PAGE_ALIGN(size);
>>>>> - nio_pages = size >> tbl->it_page_shift;
>>>>> iommu_free(tbl, dma_handle, nio_pages);
>>>>> +
>>>>
>>>> Unrelated new line.
>>>
>>> Will be removed. Thanks!
>>>
>>>>
>>>>> size = PAGE_ALIGN(size);
>>>>> free_pages((unsigned long)vaddr, get_order(size));
>>>>> }
>>>>>
>
--
Alexey
next prev parent reply other threads:[~2020-08-31 0:49 UTC|newest]
Thread overview: 52+ messages / expand[flat|nested] mbox.gz Atom feed top
2020-08-17 23:40 [PATCH v1 00/10] DDW indirect mapping Leonardo Bras
2020-08-17 23:40 ` [PATCH v1 01/10] powerpc/pseries/iommu: Replace hard-coded page shift Leonardo Bras
2020-08-22 9:33 ` Alexey Kardashevskiy
2020-08-27 15:32 ` Leonardo Bras
2020-08-28 2:27 ` Alexey Kardashevskiy
2020-08-28 19:55 ` Leonardo Bras
2020-08-31 0:06 ` Alexey Kardashevskiy
2020-08-31 1:41 ` Oliver O'Halloran
2020-08-31 3:48 ` Alexey Kardashevskiy
2020-09-01 21:38 ` Leonardo Bras
2020-09-03 4:26 ` Alexey Kardashevskiy
2020-08-17 23:40 ` [PATCH v1 02/10] powerpc/kernel/iommu: Align size for IOMMU_PAGE_SIZE on iommu_*_coherent() Leonardo Bras
2020-08-22 10:07 ` Alexey Kardashevskiy
2020-08-27 16:51 ` Leonardo Bras
2020-08-28 1:40 ` Alexey Kardashevskiy
2020-08-28 20:41 ` Leonardo Bras
2020-08-31 0:47 ` Alexey Kardashevskiy [this message]
2020-09-01 22:34 ` Leonardo Bras
2020-09-03 4:41 ` Alexey Kardashevskiy
2020-09-04 6:04 ` Leonardo Bras
2020-09-08 3:18 ` Alexey Kardashevskiy
2020-08-17 23:40 ` [PATCH v1 03/10] powerpc/kernel/iommu: Use largepool as a last resort when !largealloc Leonardo Bras
2020-08-22 10:09 ` Alexey Kardashevskiy
2020-08-27 16:58 ` Leonardo Bras
2020-08-17 23:40 ` [PATCH v1 04/10] powerpc/kernel/iommu: Add new iommu_table_in_use() helper Leonardo Bras
2020-08-22 10:34 ` Alexey Kardashevskiy
2020-08-27 18:34 ` Leonardo Bras
2020-08-28 1:51 ` Alexey Kardashevskiy
2020-08-17 23:40 ` [PATCH v1 05/10] powerpc/pseries/iommu: Add iommu_pseries_alloc_table() helper Leonardo Bras
2020-08-24 0:38 ` Alexey Kardashevskiy
2020-08-27 21:23 ` Leonardo Bras
2020-08-17 23:40 ` [PATCH v1 06/10] powerpc/pseries/iommu: Add ddw_list_add() helper Leonardo Bras
2020-08-24 3:46 ` Alexey Kardashevskiy
2020-08-27 22:11 ` Leonardo Bras
2020-08-28 1:58 ` Alexey Kardashevskiy
2020-08-28 21:28 ` Leonardo Bras
2020-08-17 23:40 ` [PATCH v1 07/10] powerpc/pseries/iommu: Allow DDW windows starting at 0x00 Leonardo Bras
2020-08-24 3:44 ` Alexey Kardashevskiy
2020-08-28 14:04 ` Leonardo Bras
2020-08-31 0:50 ` Alexey Kardashevskiy
2020-08-17 23:40 ` [PATCH v1 08/10] powerpc/pseries/iommu: Add ddw_property_create() and refactor enable_ddw() Leonardo Bras
2020-08-24 5:07 ` Alexey Kardashevskiy
2020-08-28 15:25 ` Leonardo Bras
2020-08-31 4:34 ` Alexey Kardashevskiy
2020-09-02 5:27 ` Leonardo Bras
2020-08-17 23:40 ` [PATCH v1 09/10] powerpc/pseries/iommu: Make use of DDW even if it does not map the partition Leonardo Bras
2020-08-24 5:17 ` Alexey Kardashevskiy
2020-08-28 18:36 ` Leonardo Bras
2020-08-31 4:35 ` Alexey Kardashevskiy
2020-09-02 6:11 ` Leonardo Bras
2020-09-04 1:00 ` Alexey Kardashevskiy
2020-08-17 23:40 ` [PATCH v1 10/10] powerpc/pseries/iommu: Rename "direct window" to "dma window" Leonardo Bras
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=81f106bd-8962-22f2-f14a-378d3486f57e@ozlabs.ru \
--to=aik@ozlabs.ru \
--cc=bauerman@linux.ibm.com \
--cc=benh@kernel.crashing.org \
--cc=brking@linux.vnet.ibm.com \
--cc=christophe.leroy@c-s.fr \
--cc=joel@jms.id.au \
--cc=leobras.c@gmail.com \
--cc=linux-kernel@vger.kernel.org \
--cc=linuxppc-dev@lists.ozlabs.org \
--cc=linuxram@us.ibm.com \
--cc=mpe@ellerman.id.au \
--cc=muvic@linux.ibm.com \
--cc=paulus@samba.org \
--cc=zdai@linux.vnet.ibm.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).