* [PATCH v2 0/2] swiotlb: allocate padding slots if necessary
@ 2024-03-18 13:04 Petr Tesarik
2024-03-18 13:04 ` [PATCH v2 1/2] swiotlb: extend buffer pre-padding to alloc_align_mask " Petr Tesarik
2024-03-18 13:04 ` [PATCH v2 2/2] bug: introduce ASSERT_VAR_CAN_HOLD() Petr Tesarik
0 siblings, 2 replies; 6+ messages in thread
From: Petr Tesarik @ 2024-03-18 13:04 UTC (permalink / raw)
To: Christoph Hellwig, Marek Szyprowski, Robin Murphy, Petr Tesarik,
Michael Kelley, Will Deacon, open list,
open list:DMA MAPPING HELPERS
Cc: Roberto Sassu, Petr Tesarik
From: Petr Tesarik <petr.tesarik1@huawei-partners.com>
If the allocation alignment is bigger than IO_TLB_SIZE and min_align_mask
covers some bits in the original address between IO_TLB_SIZE and
alloc_align_mask, preserve these bits by allocating additional padding
slots before the actual swiotlb buffer.
Changes from v1
---------------
* Rename padding to pad_slots.
* Set pad_slots only for the first allocated non-padding slot.
* Do not bother initializing orig_addr to INVALID_PHYS_ADDR.
* Change list and pad_slots to unsigned short to avoid growing
struct io_tlb_slot on 32-bit targets.
* Add build-time check that list and pad_slots can hold the maximum
allowed value of IO_TLB_SEGSIZE.
Petr Tesarik (2):
swiotlb: extend buffer pre-padding to alloc_align_mask if necessary
bug: introduce ASSERT_VAR_CAN_HOLD()
include/linux/build_bug.h | 10 ++++++++++
kernel/dma/swiotlb.c | 37 +++++++++++++++++++++++++++++++------
2 files changed, 41 insertions(+), 6 deletions(-)
--
2.34.1
^ permalink raw reply [flat|nested] 6+ messages in thread
* [PATCH v2 1/2] swiotlb: extend buffer pre-padding to alloc_align_mask if necessary
2024-03-18 13:04 [PATCH v2 0/2] swiotlb: allocate padding slots if necessary Petr Tesarik
@ 2024-03-18 13:04 ` Petr Tesarik
2024-03-18 15:37 ` Michael Kelley
2024-03-19 4:45 ` kernel test robot
2024-03-18 13:04 ` [PATCH v2 2/2] bug: introduce ASSERT_VAR_CAN_HOLD() Petr Tesarik
1 sibling, 2 replies; 6+ messages in thread
From: Petr Tesarik @ 2024-03-18 13:04 UTC (permalink / raw)
To: Christoph Hellwig, Marek Szyprowski, Robin Murphy, Petr Tesarik,
Michael Kelley, Will Deacon, open list,
open list:DMA MAPPING HELPERS
Cc: Roberto Sassu, Petr Tesarik
From: Petr Tesarik <petr.tesarik1@huawei-partners.com>
Allow a buffer pre-padding of up to alloc_align_mask. If the allocation
alignment is bigger than IO_TLB_SIZE and min_align_mask covers any non-zero
bits in the original address between IO_TLB_SIZE and alloc_align_mask,
these bits are not preserved in the swiotlb buffer address.
To fix this case, increase the allocation size and use a larger offset
within the allocated buffer. As a result, extra padding slots may be
allocated before the mapping start address.
Set the orig_addr in these padding slots to INVALID_PHYS_ADDR, because they
do not correspond to any CPU buffer and the data must never be synced.
The padding slots should be automatically released when the buffer is
unmapped. However, swiotlb_tbl_unmap_single() takes only the address of the
DMA buffer slot, not the first padding slot. Save the number of padding
slots in struct io_tlb_slot and use it to adjust the slot index in
swiotlb_release_slots(), so all allocated slots are properly freed.
Fixes: 2fd4fa5d3fb5 ("swiotlb: Fix alignment checks when both allocation and DMA masks are present")
Link: https://lore.kernel.org/linux-iommu/20240311210507.217daf8b@meshulam.tesarici.cz/
Signed-off-by: Petr Tesarik <petr.tesarik1@huawei-partners.com>
---
kernel/dma/swiotlb.c | 35 +++++++++++++++++++++++++++++------
1 file changed, 29 insertions(+), 6 deletions(-)
diff --git a/kernel/dma/swiotlb.c b/kernel/dma/swiotlb.c
index 86fe172b5958..aefb05ff55e7 100644
--- a/kernel/dma/swiotlb.c
+++ b/kernel/dma/swiotlb.c
@@ -69,11 +69,14 @@
* @alloc_size: Size of the allocated buffer.
* @list: The free list describing the number of free entries available
* from each index.
+ * @pad_slots: Number of preceding padding slots. Valid only in the first
+ * allocated non-padding slot.
*/
struct io_tlb_slot {
phys_addr_t orig_addr;
size_t alloc_size;
- unsigned int list;
+ unsigned short list;
+ unsigned short pad_slots;
};
static bool swiotlb_force_bounce;
@@ -287,6 +290,7 @@ static void swiotlb_init_io_tlb_pool(struct io_tlb_pool *mem, phys_addr_t start,
mem->nslabs - i);
mem->slots[i].orig_addr = INVALID_PHYS_ADDR;
mem->slots[i].alloc_size = 0;
+ mem->slots[i].pad_slots = 0;
}
memset(vaddr, 0, bytes);
@@ -1328,11 +1332,12 @@ phys_addr_t swiotlb_tbl_map_single(struct device *dev, phys_addr_t orig_addr,
unsigned long attrs)
{
struct io_tlb_mem *mem = dev->dma_io_tlb_mem;
- unsigned int offset = swiotlb_align_offset(dev, orig_addr);
+ unsigned int offset;
struct io_tlb_pool *pool;
unsigned int i;
int index;
phys_addr_t tlb_addr;
+ unsigned short pad_slots;
if (!mem || !mem->nslabs) {
dev_warn_ratelimited(dev,
@@ -1349,6 +1354,15 @@ phys_addr_t swiotlb_tbl_map_single(struct device *dev, phys_addr_t orig_addr,
return (phys_addr_t)DMA_MAPPING_ERROR;
}
+ /*
+ * Calculate buffer pre-padding within the allocated space. Use it to
+ * preserve the low bits of the original address according to device's
+ * min_align_mask. Limit the padding to alloc_align_mask or slot size
+ * (whichever is bigger); higher bits of the original address are
+ * preserved by selecting a suitable IO TLB slot.
+ */
+ offset = orig_addr & dma_get_min_align_mask(dev) &
+ (alloc_align_mask | (IO_TLB_SIZE - 1));
index = swiotlb_find_slots(dev, orig_addr,
alloc_size + offset, alloc_align_mask, &pool);
if (index == -1) {
@@ -1364,6 +1378,10 @@ phys_addr_t swiotlb_tbl_map_single(struct device *dev, phys_addr_t orig_addr,
* This is needed when we sync the memory. Then we sync the buffer if
* needed.
*/
+ pad_slots = offset / IO_TLB_SIZE;
+ offset %= IO_TLB_SIZE;
+ index += pad_slots;
+ pool->slots[index].pad_slots = i;
for (i = 0; i < nr_slots(alloc_size + offset); i++)
pool->slots[index + i].orig_addr = slot_addr(orig_addr, i);
tlb_addr = slot_addr(pool->start, index) + offset;
@@ -1385,12 +1403,16 @@ static void swiotlb_release_slots(struct device *dev, phys_addr_t tlb_addr)
struct io_tlb_pool *mem = swiotlb_find_pool(dev, tlb_addr);
unsigned long flags;
unsigned int offset = swiotlb_align_offset(dev, tlb_addr);
- int index = (tlb_addr - offset - mem->start) >> IO_TLB_SHIFT;
- int nslots = nr_slots(mem->slots[index].alloc_size + offset);
- int aindex = index / mem->area_nslabs;
- struct io_tlb_area *area = &mem->areas[aindex];
+ int index, nslots, aindex;
+ struct io_tlb_area *area;
int count, i;
+ index = (tlb_addr - offset - mem->start) >> IO_TLB_SHIFT;
+ index -= mem->slots[index].pad_slots;
+ nslots = nr_slots(mem->slots[index].alloc_size + offset);
+ aindex = index / mem->area_nslabs;
+ area = &mem->areas[aindex];
+
/*
* Return the buffer to the free list by setting the corresponding
* entries to indicate the number of contiguous entries available.
@@ -1413,6 +1435,7 @@ static void swiotlb_release_slots(struct device *dev, phys_addr_t tlb_addr)
mem->slots[i].list = ++count;
mem->slots[i].orig_addr = INVALID_PHYS_ADDR;
mem->slots[i].alloc_size = 0;
+ mem->slots[i].pad_slots = 0;
}
/*
--
2.34.1
^ permalink raw reply related [flat|nested] 6+ messages in thread
* [PATCH v2 2/2] bug: introduce ASSERT_VAR_CAN_HOLD()
2024-03-18 13:04 [PATCH v2 0/2] swiotlb: allocate padding slots if necessary Petr Tesarik
2024-03-18 13:04 ` [PATCH v2 1/2] swiotlb: extend buffer pre-padding to alloc_align_mask " Petr Tesarik
@ 2024-03-18 13:04 ` Petr Tesarik
1 sibling, 0 replies; 6+ messages in thread
From: Petr Tesarik @ 2024-03-18 13:04 UTC (permalink / raw)
To: Christoph Hellwig, Marek Szyprowski, Robin Murphy, Petr Tesarik,
Michael Kelley, Will Deacon, open list,
open list:DMA MAPPING HELPERS
Cc: Roberto Sassu, Petr Tesarik
From: Petr Tesarik <petr.tesarik1@huawei-partners.com>
Introduce an ASSERT_VAR_CAN_HOLD() macro to check at build time that a
variable can hold the given value.
Use this macro in swiotlb to make sure that the list and pad_slots fields
of struct io_tlb_slot are big enough to hold the maximum possible value of
IO_TLB_SEGSIZE.
Signed-off-by: Petr Tesarik <petr.tesarik1@huawei-partners.com>
---
include/linux/build_bug.h | 10 ++++++++++
kernel/dma/swiotlb.c | 2 ++
2 files changed, 12 insertions(+)
diff --git a/include/linux/build_bug.h b/include/linux/build_bug.h
index 3aa3640f8c18..6e2486508af0 100644
--- a/include/linux/build_bug.h
+++ b/include/linux/build_bug.h
@@ -86,4 +86,14 @@
"Offset of " #field " in " #type " has changed.")
+/*
+ * Compile time check that a variable can hold the given value
+ */
+#define ASSERT_VAR_CAN_HOLD(var, value) ({ \
+ typeof(value) __val = (value); \
+ typeof(var) __tmp = __val; \
+ BUILD_BUG_ON_MSG(__tmp != __val, \
+ #var " cannot hold " #value "."); \
+})
+
#endif /* _LINUX_BUILD_BUG_H */
diff --git a/kernel/dma/swiotlb.c b/kernel/dma/swiotlb.c
index aefb05ff55e7..0737c1283f86 100644
--- a/kernel/dma/swiotlb.c
+++ b/kernel/dma/swiotlb.c
@@ -285,6 +285,8 @@ static void swiotlb_init_io_tlb_pool(struct io_tlb_pool *mem, phys_addr_t start,
mem->areas[i].used = 0;
}
+ ASSERT_VAR_CAN_HOLD(mem->slots[0].list, IO_TLB_SEGSIZE);
+ ASSERT_VAR_CAN_HOLD(mem->slots[0].pad_slots, IO_TLB_SEGSIZE);
for (i = 0; i < mem->nslabs; i++) {
mem->slots[i].list = min(IO_TLB_SEGSIZE - io_tlb_offset(i),
mem->nslabs - i);
--
2.34.1
^ permalink raw reply related [flat|nested] 6+ messages in thread
* RE: [PATCH v2 1/2] swiotlb: extend buffer pre-padding to alloc_align_mask if necessary
2024-03-18 13:04 ` [PATCH v2 1/2] swiotlb: extend buffer pre-padding to alloc_align_mask " Petr Tesarik
@ 2024-03-18 15:37 ` Michael Kelley
2024-03-18 18:50 ` Petr Tesařík
2024-03-19 4:45 ` kernel test robot
1 sibling, 1 reply; 6+ messages in thread
From: Michael Kelley @ 2024-03-18 15:37 UTC (permalink / raw)
To: Petr Tesarik, Christoph Hellwig, Marek Szyprowski, Robin Murphy,
Petr Tesarik, Will Deacon, open list,
open list:DMA MAPPING HELPERS
Cc: Roberto Sassu, Petr Tesarik
From: Petr Tesarik <petrtesarik@huaweicloud.com> Sent: Monday, March 18, 2024 6:05 AM
>
> Allow a buffer pre-padding of up to alloc_align_mask. If the allocation
> alignment is bigger than IO_TLB_SIZE and min_align_mask covers any non-zero
> bits in the original address between IO_TLB_SIZE and alloc_align_mask,
> these bits are not preserved in the swiotlb buffer address.
>
> To fix this case, increase the allocation size and use a larger offset
> within the allocated buffer. As a result, extra padding slots may be
> allocated before the mapping start address.
>
> Set the orig_addr in these padding slots to INVALID_PHYS_ADDR, because they
> do not correspond to any CPU buffer and the data must never be synced.
This paragraph is now obsolete.
>
> The padding slots should be automatically released when the buffer is
> unmapped. However, swiotlb_tbl_unmap_single() takes only the address of the
> DMA buffer slot, not the first padding slot. Save the number of padding
> slots in struct io_tlb_slot and use it to adjust the slot index in
> swiotlb_release_slots(), so all allocated slots are properly freed.
>
> Fixes: 2fd4fa5d3fb5 ("swiotlb: Fix alignment checks when both allocation and DMA masks are present")
> Link: https://lore.kernel.org/linux-iommu/20240311210507.217daf8b@meshulam.tesarici.cz/
> Signed-off-by: Petr Tesarik <petr.tesarik1@huawei-partners.com>
> ---
> kernel/dma/swiotlb.c | 35 +++++++++++++++++++++++++++++------
> 1 file changed, 29 insertions(+), 6 deletions(-)
>
> diff --git a/kernel/dma/swiotlb.c b/kernel/dma/swiotlb.c
> index 86fe172b5958..aefb05ff55e7 100644
> --- a/kernel/dma/swiotlb.c
> +++ b/kernel/dma/swiotlb.c
> @@ -69,11 +69,14 @@
> * @alloc_size: Size of the allocated buffer.
> * @list: The free list describing the number of free entries available
> * from each index.
> + * @pad_slots: Number of preceding padding slots. Valid only in the first
> + * allocated non-padding slot.
> */
> struct io_tlb_slot {
> phys_addr_t orig_addr;
> size_t alloc_size;
> - unsigned int list;
> + unsigned short list;
> + unsigned short pad_slots;
> };
>
> static bool swiotlb_force_bounce;
> @@ -287,6 +290,7 @@ static void swiotlb_init_io_tlb_pool(struct io_tlb_pool *mem, phys_addr_t start,
> mem->nslabs - i);
> mem->slots[i].orig_addr = INVALID_PHYS_ADDR;
> mem->slots[i].alloc_size = 0;
> + mem->slots[i].pad_slots = 0;
> }
>
> memset(vaddr, 0, bytes);
> @@ -1328,11 +1332,12 @@ phys_addr_t swiotlb_tbl_map_single(struct device *dev, phys_addr_t orig_addr,
> unsigned long attrs)
> {
> struct io_tlb_mem *mem = dev->dma_io_tlb_mem;
> - unsigned int offset = swiotlb_align_offset(dev, orig_addr);
> + unsigned int offset;
> struct io_tlb_pool *pool;
> unsigned int i;
> int index;
> phys_addr_t tlb_addr;
> + unsigned short pad_slots;
>
> if (!mem || !mem->nslabs) {
> dev_warn_ratelimited(dev,
> @@ -1349,6 +1354,15 @@ phys_addr_t swiotlb_tbl_map_single(struct device *dev, phys_addr_t orig_addr,
> return (phys_addr_t)DMA_MAPPING_ERROR;
> }
>
> + /*
> + * Calculate buffer pre-padding within the allocated space. Use it to
> + * preserve the low bits of the original address according to device's
> + * min_align_mask. Limit the padding to alloc_align_mask or slot size
> + * (whichever is bigger); higher bits of the original address are
> + * preserved by selecting a suitable IO TLB slot.
> + */
> + offset = orig_addr & dma_get_min_align_mask(dev) &
> + (alloc_align_mask | (IO_TLB_SIZE - 1));
> index = swiotlb_find_slots(dev, orig_addr,
> alloc_size + offset, alloc_align_mask, &pool);
> if (index == -1) {
> @@ -1364,6 +1378,10 @@ phys_addr_t swiotlb_tbl_map_single(struct device *dev, phys_addr_t orig_addr,
> * This is needed when we sync the memory. Then we sync the buffer if
> * needed.
> */
> + pad_slots = offset / IO_TLB_SIZE;
> + offset %= IO_TLB_SIZE;
> + index += pad_slots;
> + pool->slots[index].pad_slots = i;
The above line should be:
pool->slots[index].pad_slots = pad_slots;
> for (i = 0; i < nr_slots(alloc_size + offset); i++)
> pool->slots[index + i].orig_addr = slot_addr(orig_addr, i);
> tlb_addr = slot_addr(pool->start, index) + offset;
> @@ -1385,12 +1403,16 @@ static void swiotlb_release_slots(struct device *dev, phys_addr_t tlb_addr)
> struct io_tlb_pool *mem = swiotlb_find_pool(dev, tlb_addr);
> unsigned long flags;
> unsigned int offset = swiotlb_align_offset(dev, tlb_addr);
> - int index = (tlb_addr - offset - mem->start) >> IO_TLB_SHIFT;
> - int nslots = nr_slots(mem->slots[index].alloc_size + offset);
> - int aindex = index / mem->area_nslabs;
> - struct io_tlb_area *area = &mem->areas[aindex];
> + int index, nslots, aindex;
> + struct io_tlb_area *area;
> int count, i;
>
> + index = (tlb_addr - offset - mem->start) >> IO_TLB_SHIFT;
> + index -= mem->slots[index].pad_slots;
> + nslots = nr_slots(mem->slots[index].alloc_size + offset);
> + aindex = index / mem->area_nslabs;
> + area = &mem->areas[aindex];
> +
> /*
> * Return the buffer to the free list by setting the corresponding
> * entries to indicate the number of contiguous entries available.
> @@ -1413,6 +1435,7 @@ static void swiotlb_release_slots(struct device *dev, phys_addr_t tlb_addr)
> mem->slots[i].list = ++count;
> mem->slots[i].orig_addr = INVALID_PHYS_ADDR;
> mem->slots[i].alloc_size = 0;
> + mem->slots[i].pad_slots = 0;
> }
>
> /*
> --
> 2.34.1
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH v2 1/2] swiotlb: extend buffer pre-padding to alloc_align_mask if necessary
2024-03-18 15:37 ` Michael Kelley
@ 2024-03-18 18:50 ` Petr Tesařík
0 siblings, 0 replies; 6+ messages in thread
From: Petr Tesařík @ 2024-03-18 18:50 UTC (permalink / raw)
To: Michael Kelley
Cc: Christoph Hellwig, Marek Szyprowski, Robin Murphy, Petr Tesarik,
Will Deacon, open list, open list:DMA MAPPING HELPERS,
Roberto Sassu, Petr Tesarik
On Mon, 18 Mar 2024 15:37:16 +0000
Michael Kelley <mhklinux@outlook.com> wrote:
> From: Petr Tesarik <petrtesarik@huaweicloud.com> Sent: Monday, March 18, 2024 6:05 AM
> >
> > Allow a buffer pre-padding of up to alloc_align_mask. If the allocation
> > alignment is bigger than IO_TLB_SIZE and min_align_mask covers any non-zero
> > bits in the original address between IO_TLB_SIZE and alloc_align_mask,
> > these bits are not preserved in the swiotlb buffer address.
> >
> > To fix this case, increase the allocation size and use a larger offset
> > within the allocated buffer. As a result, extra padding slots may be
> > allocated before the mapping start address.
> >
> > Set the orig_addr in these padding slots to INVALID_PHYS_ADDR, because they
> > do not correspond to any CPU buffer and the data must never be synced.
>
> This paragraph is now obsolete.
Right. It should now say that orig_addr already _is_ set to
INVALID_PHYS_ADDR, so attempts to sync data will be ignored.
> >
> > The padding slots should be automatically released when the buffer is
> > unmapped. However, swiotlb_tbl_unmap_single() takes only the address of the
> > DMA buffer slot, not the first padding slot. Save the number of padding
> > slots in struct io_tlb_slot and use it to adjust the slot index in
> > swiotlb_release_slots(), so all allocated slots are properly freed.
> >
> > Fixes: 2fd4fa5d3fb5 ("swiotlb: Fix alignment checks when both allocation and DMA masks are present")
> > Link: https://lore.kernel.org/linux-iommu/20240311210507.217daf8b@meshulam.tesarici.cz/
> > Signed-off-by: Petr Tesarik <petr.tesarik1@huawei-partners.com>
> > ---
> > kernel/dma/swiotlb.c | 35 +++++++++++++++++++++++++++++------
> > 1 file changed, 29 insertions(+), 6 deletions(-)
> >
> > diff --git a/kernel/dma/swiotlb.c b/kernel/dma/swiotlb.c
> > index 86fe172b5958..aefb05ff55e7 100644
> > --- a/kernel/dma/swiotlb.c
> > +++ b/kernel/dma/swiotlb.c
> > @@ -69,11 +69,14 @@
> > * @alloc_size: Size of the allocated buffer.
> > * @list: The free list describing the number of free entries available
> > * from each index.
> > + * @pad_slots: Number of preceding padding slots. Valid only in the first
> > + * allocated non-padding slot.
> > */
> > struct io_tlb_slot {
> > phys_addr_t orig_addr;
> > size_t alloc_size;
> > - unsigned int list;
> > + unsigned short list;
> > + unsigned short pad_slots;
> > };
> >
> > static bool swiotlb_force_bounce;
> > @@ -287,6 +290,7 @@ static void swiotlb_init_io_tlb_pool(struct io_tlb_pool *mem, phys_addr_t start,
> > mem->nslabs - i);
> > mem->slots[i].orig_addr = INVALID_PHYS_ADDR;
> > mem->slots[i].alloc_size = 0;
> > + mem->slots[i].pad_slots = 0;
> > }
> >
> > memset(vaddr, 0, bytes);
> > @@ -1328,11 +1332,12 @@ phys_addr_t swiotlb_tbl_map_single(struct device *dev, phys_addr_t orig_addr,
> > unsigned long attrs)
> > {
> > struct io_tlb_mem *mem = dev->dma_io_tlb_mem;
> > - unsigned int offset = swiotlb_align_offset(dev, orig_addr);
> > + unsigned int offset;
> > struct io_tlb_pool *pool;
> > unsigned int i;
> > int index;
> > phys_addr_t tlb_addr;
> > + unsigned short pad_slots;
> >
> > if (!mem || !mem->nslabs) {
> > dev_warn_ratelimited(dev,
> > @@ -1349,6 +1354,15 @@ phys_addr_t swiotlb_tbl_map_single(struct device *dev, phys_addr_t orig_addr,
> > return (phys_addr_t)DMA_MAPPING_ERROR;
> > }
> >
> > + /*
> > + * Calculate buffer pre-padding within the allocated space. Use it to
> > + * preserve the low bits of the original address according to device's
> > + * min_align_mask. Limit the padding to alloc_align_mask or slot size
> > + * (whichever is bigger); higher bits of the original address are
> > + * preserved by selecting a suitable IO TLB slot.
> > + */
> > + offset = orig_addr & dma_get_min_align_mask(dev) &
> > + (alloc_align_mask | (IO_TLB_SIZE - 1));
> > index = swiotlb_find_slots(dev, orig_addr,
> > alloc_size + offset, alloc_align_mask, &pool);
> > if (index == -1) {
> > @@ -1364,6 +1378,10 @@ phys_addr_t swiotlb_tbl_map_single(struct device *dev, phys_addr_t orig_addr,
> > * This is needed when we sync the memory. Then we sync the buffer if
> > * needed.
> > */
> > + pad_slots = offset / IO_TLB_SIZE;
> > + offset %= IO_TLB_SIZE;
> > + index += pad_slots;
> > + pool->slots[index].pad_slots = i;
>
> The above line should be:
> pool->slots[index].pad_slots = pad_slots;
Doh. Yes, I rewrote it a few times and then forgot to change this.
How did it even pass the test suite?
Thank you for the review.
Petr T
> > for (i = 0; i < nr_slots(alloc_size + offset); i++)
> > pool->slots[index + i].orig_addr = slot_addr(orig_addr, i);
> > tlb_addr = slot_addr(pool->start, index) + offset;
> > @@ -1385,12 +1403,16 @@ static void swiotlb_release_slots(struct device *dev, phys_addr_t tlb_addr)
> > struct io_tlb_pool *mem = swiotlb_find_pool(dev, tlb_addr);
> > unsigned long flags;
> > unsigned int offset = swiotlb_align_offset(dev, tlb_addr);
> > - int index = (tlb_addr - offset - mem->start) >> IO_TLB_SHIFT;
> > - int nslots = nr_slots(mem->slots[index].alloc_size + offset);
> > - int aindex = index / mem->area_nslabs;
> > - struct io_tlb_area *area = &mem->areas[aindex];
> > + int index, nslots, aindex;
> > + struct io_tlb_area *area;
> > int count, i;
> >
> > + index = (tlb_addr - offset - mem->start) >> IO_TLB_SHIFT;
> > + index -= mem->slots[index].pad_slots;
> > + nslots = nr_slots(mem->slots[index].alloc_size + offset);
> > + aindex = index / mem->area_nslabs;
> > + area = &mem->areas[aindex];
> > +
> > /*
> > * Return the buffer to the free list by setting the corresponding
> > * entries to indicate the number of contiguous entries available.
> > @@ -1413,6 +1435,7 @@ static void swiotlb_release_slots(struct device *dev, phys_addr_t tlb_addr)
> > mem->slots[i].list = ++count;
> > mem->slots[i].orig_addr = INVALID_PHYS_ADDR;
> > mem->slots[i].alloc_size = 0;
> > + mem->slots[i].pad_slots = 0;
> > }
> >
> > /*
> > --
> > 2.34.1
>
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH v2 1/2] swiotlb: extend buffer pre-padding to alloc_align_mask if necessary
2024-03-18 13:04 ` [PATCH v2 1/2] swiotlb: extend buffer pre-padding to alloc_align_mask " Petr Tesarik
2024-03-18 15:37 ` Michael Kelley
@ 2024-03-19 4:45 ` kernel test robot
1 sibling, 0 replies; 6+ messages in thread
From: kernel test robot @ 2024-03-19 4:45 UTC (permalink / raw)
To: Petr Tesarik, Christoph Hellwig, Marek Szyprowski, Robin Murphy,
Petr Tesarik, Michael Kelley, Will Deacon, open list,
open list:DMA MAPPING HELPERS
Cc: llvm, oe-kbuild-all, Roberto Sassu
Hi Petr,
kernel test robot noticed the following build warnings:
[auto build test WARNING on v6.8]
[also build test WARNING on linus/master next-20240318]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]
url: https://github.com/intel-lab-lkp/linux/commits/Petr-Tesarik/swiotlb-extend-buffer-pre-padding-to-alloc_align_mask-if-necessary/20240318-212500
base: v6.8
patch link: https://lore.kernel.org/r/20240318130447.594-2-petrtesarik%40huaweicloud.com
patch subject: [PATCH v2 1/2] swiotlb: extend buffer pre-padding to alloc_align_mask if necessary
config: s390-allnoconfig (https://download.01.org/0day-ci/archive/20240319/202403191203.AtV7fvue-lkp@intel.com/config)
compiler: clang version 19.0.0git (https://github.com/llvm/llvm-project 8f68022f8e6e54d1aeae4ed301f5a015963089b7)
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20240319/202403191203.AtV7fvue-lkp@intel.com/reproduce)
If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202403191203.AtV7fvue-lkp@intel.com/
All warnings (new ones prefixed by >>):
In file included from kernel/dma/swiotlb.c:27:
In file included from include/linux/dma-direct.h:9:
In file included from include/linux/dma-mapping.h:8:
In file included from include/linux/device.h:32:
In file included from include/linux/device/driver.h:21:
In file included from include/linux/module.h:19:
In file included from include/linux/elf.h:6:
In file included from arch/s390/include/asm/elf.h:173:
In file included from arch/s390/include/asm/mmu_context.h:11:
In file included from arch/s390/include/asm/pgalloc.h:18:
In file included from include/linux/mm.h:2188:
include/linux/vmstat.h:522:36: warning: arithmetic between different enumeration types ('enum node_stat_item' and 'enum lru_list') [-Wenum-enum-conversion]
522 | return node_stat_name(NR_LRU_BASE + lru) + 3; // skip "nr_"
| ~~~~~~~~~~~ ^ ~~~
In file included from kernel/dma/swiotlb.c:27:
In file included from include/linux/dma-direct.h:9:
In file included from include/linux/dma-mapping.h:11:
In file included from include/linux/scatterlist.h:9:
In file included from arch/s390/include/asm/io.h:78:
include/asm-generic/io.h:547:31: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
547 | val = __raw_readb(PCI_IOBASE + addr);
| ~~~~~~~~~~ ^
include/asm-generic/io.h:560:61: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
560 | val = __le16_to_cpu((__le16 __force)__raw_readw(PCI_IOBASE + addr));
| ~~~~~~~~~~ ^
include/uapi/linux/byteorder/big_endian.h:37:59: note: expanded from macro '__le16_to_cpu'
37 | #define __le16_to_cpu(x) __swab16((__force __u16)(__le16)(x))
| ^
include/uapi/linux/swab.h:102:54: note: expanded from macro '__swab16'
102 | #define __swab16(x) (__u16)__builtin_bswap16((__u16)(x))
| ^
In file included from kernel/dma/swiotlb.c:27:
In file included from include/linux/dma-direct.h:9:
In file included from include/linux/dma-mapping.h:11:
In file included from include/linux/scatterlist.h:9:
In file included from arch/s390/include/asm/io.h:78:
include/asm-generic/io.h:573:61: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
573 | val = __le32_to_cpu((__le32 __force)__raw_readl(PCI_IOBASE + addr));
| ~~~~~~~~~~ ^
include/uapi/linux/byteorder/big_endian.h:35:59: note: expanded from macro '__le32_to_cpu'
35 | #define __le32_to_cpu(x) __swab32((__force __u32)(__le32)(x))
| ^
include/uapi/linux/swab.h:115:54: note: expanded from macro '__swab32'
115 | #define __swab32(x) (__u32)__builtin_bswap32((__u32)(x))
| ^
In file included from kernel/dma/swiotlb.c:27:
In file included from include/linux/dma-direct.h:9:
In file included from include/linux/dma-mapping.h:11:
In file included from include/linux/scatterlist.h:9:
In file included from arch/s390/include/asm/io.h:78:
include/asm-generic/io.h:584:33: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
584 | __raw_writeb(value, PCI_IOBASE + addr);
| ~~~~~~~~~~ ^
include/asm-generic/io.h:594:59: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
594 | __raw_writew((u16 __force)cpu_to_le16(value), PCI_IOBASE + addr);
| ~~~~~~~~~~ ^
include/asm-generic/io.h:604:59: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
604 | __raw_writel((u32 __force)cpu_to_le32(value), PCI_IOBASE + addr);
| ~~~~~~~~~~ ^
include/asm-generic/io.h:692:20: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
692 | readsb(PCI_IOBASE + addr, buffer, count);
| ~~~~~~~~~~ ^
include/asm-generic/io.h:700:20: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
700 | readsw(PCI_IOBASE + addr, buffer, count);
| ~~~~~~~~~~ ^
include/asm-generic/io.h:708:20: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
708 | readsl(PCI_IOBASE + addr, buffer, count);
| ~~~~~~~~~~ ^
include/asm-generic/io.h:717:21: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
717 | writesb(PCI_IOBASE + addr, buffer, count);
| ~~~~~~~~~~ ^
include/asm-generic/io.h:726:21: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
726 | writesw(PCI_IOBASE + addr, buffer, count);
| ~~~~~~~~~~ ^
include/asm-generic/io.h:735:21: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
735 | writesl(PCI_IOBASE + addr, buffer, count);
| ~~~~~~~~~~ ^
>> kernel/dma/swiotlb.c:1348:33: warning: variable 'i' is uninitialized when used here [-Wuninitialized]
1348 | pool->slots[index].pad_slots = i;
| ^
kernel/dma/swiotlb.c:1301:16: note: initialize the variable 'i' to silence this warning
1301 | unsigned int i;
| ^
| = 0
kernel/dma/swiotlb.c:1643:20: warning: unused function 'swiotlb_create_debugfs_files' [-Wunused-function]
1643 | static inline void swiotlb_create_debugfs_files(struct io_tlb_mem *mem,
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~
15 warnings generated.
vim +/i +1348 kernel/dma/swiotlb.c
1292
1293 phys_addr_t swiotlb_tbl_map_single(struct device *dev, phys_addr_t orig_addr,
1294 size_t mapping_size, size_t alloc_size,
1295 unsigned int alloc_align_mask, enum dma_data_direction dir,
1296 unsigned long attrs)
1297 {
1298 struct io_tlb_mem *mem = dev->dma_io_tlb_mem;
1299 unsigned int offset;
1300 struct io_tlb_pool *pool;
1301 unsigned int i;
1302 int index;
1303 phys_addr_t tlb_addr;
1304 unsigned short pad_slots;
1305
1306 if (!mem || !mem->nslabs) {
1307 dev_warn_ratelimited(dev,
1308 "Can not allocate SWIOTLB buffer earlier and can't now provide you with the DMA bounce buffer");
1309 return (phys_addr_t)DMA_MAPPING_ERROR;
1310 }
1311
1312 if (cc_platform_has(CC_ATTR_MEM_ENCRYPT))
1313 pr_warn_once("Memory encryption is active and system is using DMA bounce buffers\n");
1314
1315 if (mapping_size > alloc_size) {
1316 dev_warn_once(dev, "Invalid sizes (mapping: %zd bytes, alloc: %zd bytes)",
1317 mapping_size, alloc_size);
1318 return (phys_addr_t)DMA_MAPPING_ERROR;
1319 }
1320
1321 /*
1322 * Calculate buffer pre-padding within the allocated space. Use it to
1323 * preserve the low bits of the original address according to device's
1324 * min_align_mask. Limit the padding to alloc_align_mask or slot size
1325 * (whichever is bigger); higher bits of the original address are
1326 * preserved by selecting a suitable IO TLB slot.
1327 */
1328 offset = orig_addr & dma_get_min_align_mask(dev) &
1329 (alloc_align_mask | (IO_TLB_SIZE - 1));
1330 index = swiotlb_find_slots(dev, orig_addr,
1331 alloc_size + offset, alloc_align_mask, &pool);
1332 if (index == -1) {
1333 if (!(attrs & DMA_ATTR_NO_WARN))
1334 dev_warn_ratelimited(dev,
1335 "swiotlb buffer is full (sz: %zd bytes), total %lu (slots), used %lu (slots)\n",
1336 alloc_size, mem->nslabs, mem_used(mem));
1337 return (phys_addr_t)DMA_MAPPING_ERROR;
1338 }
1339
1340 /*
1341 * Save away the mapping from the original address to the DMA address.
1342 * This is needed when we sync the memory. Then we sync the buffer if
1343 * needed.
1344 */
1345 pad_slots = offset / IO_TLB_SIZE;
1346 offset %= IO_TLB_SIZE;
1347 index += pad_slots;
> 1348 pool->slots[index].pad_slots = i;
1349 for (i = 0; i < nr_slots(alloc_size + offset); i++)
1350 pool->slots[index + i].orig_addr = slot_addr(orig_addr, i);
1351 tlb_addr = slot_addr(pool->start, index) + offset;
1352 /*
1353 * When the device is writing memory, i.e. dir == DMA_FROM_DEVICE, copy
1354 * the original buffer to the TLB buffer before initiating DMA in order
1355 * to preserve the original's data if the device does a partial write,
1356 * i.e. if the device doesn't overwrite the entire buffer. Preserving
1357 * the original data, even if it's garbage, is necessary to match
1358 * hardware behavior. Use of swiotlb is supposed to be transparent,
1359 * i.e. swiotlb must not corrupt memory by clobbering unwritten bytes.
1360 */
1361 swiotlb_bounce(dev, tlb_addr, mapping_size, DMA_TO_DEVICE);
1362 return tlb_addr;
1363 }
1364
--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki
^ permalink raw reply [flat|nested] 6+ messages in thread
end of thread, other threads:[~2024-03-19 4:46 UTC | newest]
Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2024-03-18 13:04 [PATCH v2 0/2] swiotlb: allocate padding slots if necessary Petr Tesarik
2024-03-18 13:04 ` [PATCH v2 1/2] swiotlb: extend buffer pre-padding to alloc_align_mask " Petr Tesarik
2024-03-18 15:37 ` Michael Kelley
2024-03-18 18:50 ` Petr Tesařík
2024-03-19 4:45 ` kernel test robot
2024-03-18 13:04 ` [PATCH v2 2/2] bug: introduce ASSERT_VAR_CAN_HOLD() Petr Tesarik
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).