linux-kernel.vger.kernel.org archive mirror
* iommu dma mapping alignment requirements
@ 2007-12-20 17:14 Steve Wise
  2007-12-20 17:29 ` [ofa-general] " Tom Tucker
                   ` (2 more replies)
  0 siblings, 3 replies; 14+ messages in thread
From: Steve Wise @ 2007-12-20 17:14 UTC (permalink / raw)
  To: Roland Dreier
  Cc: linux-kernel, OpenFabrics General, Benjamin Herrenschmidt, Wen Xiong

Hey Roland (and any iommu/ppc/dma experts out there):

I'm debugging a data corruption issue that happens on PPC64 systems 
running rdma on kernels where the iommu page size is 4KB yet the host 
page size is 64KB.  This "feature" was added to the PPC64 code recently, 
and is in kernel.org from 2.6.23.  So if the kernel is built with a 4KB 
page size, no problems.  If the kernel is prior to 2.6.23, then 64KB page 
configs work too.  It's just a problem when the iommu page size != host 
page size.

It appears that my problem boils down to a single host page of memory 
that is mapped for dma, and the dma address returned by dma_map_sg() is 
_not_ 64KB aligned.  Here is an example:

app registers va 0x000000002d9a3000 len 12288
ib_umem_get() creates and maps a umem and chunk that looks like (dumping 
state from a registered user memory region):

>     umem len 12288 off 12288 pgsz 65536 shift 16
>     chunk 0: nmap 1 nents 1
>         sglist[0] page 0xc000000000930b08 off 0 len 65536 dma_addr 000000005bff4000 dma_len 65536
> 

So the kernel maps 1 full page for this MR.  But note that the dma 
address is 000000005bff4000 which is 4KB aligned, not 64KB aligned.  I 
think this is causing grief to the RDMA HW.
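The mismatch is quick to verify from the dump: the low 12 bits of 0x5bff4000 are clear, but the low 16 bits are not. A minimal sketch of that check, with the constants taken from the dump above:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* True if addr is aligned to the power-of-two boundary `align`. */
bool is_aligned(uint64_t addr, uint64_t align)
{
    return (addr & (align - 1)) == 0;
}
```

For the dma_addr in the dump, is_aligned(0x5bff4000, 4096) holds but is_aligned(0x5bff4000, 65536) does not.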

My first question is: Is there an assumption or requirement in linux 
that dma_addresses should have the same alignment as the host address 
they are mapped to?  I.e., the rdma core is mapping the entire 64KB page, 
but the mapping doesn't begin on a 64KB page boundary.

If this mapping is considered valid, then perhaps the rdma hw is at 
fault here.  But I'm wondering if this is a PPC/iommu bug.

BTW:  Here is what the Memory Region looks like to the HW:

> TPT entry:  stag idx 0x2e800 key 0xff state VAL type NSMR pdid 0x2
>             perms RW rem_inv_dis 0 addr_type VATO
>             bind_enable 1 pg_size 65536 qpid 0x0 pbl_addr 0x003c67c0
>             len 12288 va 000000002d9a3000 bind_cnt 0
> PBL: 000000005bff4000



Any thoughts?

Steve.



^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [ofa-general] iommu dma mapping alignment requirements
  2007-12-20 17:14 iommu dma mapping alignment requirements Steve Wise
@ 2007-12-20 17:29 ` Tom Tucker
  2007-12-20 18:07 ` Roland Dreier
  2007-12-20 20:17 ` Benjamin Herrenschmidt
  2 siblings, 0 replies; 14+ messages in thread
From: Tom Tucker @ 2007-12-20 17:29 UTC (permalink / raw)
  To: Steve Wise
  Cc: Roland Dreier, Benjamin Herrenschmidt, linux-kernel,
	OpenFabrics General, Wen Xiong


On Thu, 2007-12-20 at 11:14 -0600, Steve Wise wrote:
> Hey Roland (and any iommu/ppc/dma experts out there):
> 
> I'm debugging a data corruption issue that happens on PPC64 systems 
> running rdma on kernels where the iommu page size is 4KB yet the host 
> page size is 64KB.  This "feature" was added to the PPC64 code recently, 
> and is in kernel.org from 2.6.23.  So if the kernel is built with a 4KB 
> page size, no problems.  If the kernel is prior to 2.6.23, then 64KB page 
> configs work too.  It's just a problem when the iommu page size != host 
> page size.
> 
> It appears that my problem boils down to a single host page of memory 
> that is mapped for dma, and the dma address returned by dma_map_sg() is 
> _not_ 64KB aligned.  Here is an example:
> 
> app registers va 0x000000002d9a3000 len 12288
> ib_umem_get() creates and maps a umem and chunk that looks like (dumping 
> state from a registered user memory region):
> 
> >     umem len 12288 off 12288 pgsz 65536 shift 16
> >     chunk 0: nmap 1 nents 1
> >         sglist[0] page 0xc000000000930b08 off 0 len 65536 dma_addr 000000005bff4000 dma_len 65536
> > 
> 
> So the kernel maps 1 full page for this MR.  But note that the dma 
> address is 000000005bff4000 which is 4KB aligned, not 64KB aligned.  I 
> think this is causing grief to the RDMA HW.
> 
> My first question is: Is there an assumption or requirement in linux 
> that dma_addresses should have the same alignment as the host address 
> they are mapped to?  I.e., the rdma core is mapping the entire 64KB page, 
> but the mapping doesn't begin on a 64KB page boundary.
> 
> If this mapping is considered valid, then perhaps the rdma hw is at 
> fault here.  But I'm wondering if this is a PPC/iommu bug.
> 
> BTW:  Here is what the Memory Region looks like to the HW:
> 
> > TPT entry:  stag idx 0x2e800 key 0xff state VAL type NSMR pdid 0x2
> >             perms RW rem_inv_dis 0 addr_type VATO
> >             bind_enable 1 pg_size 65536 qpid 0x0 pbl_addr 0x003c67c0
> >             len 12288 va 000000002d9a3000 bind_cnt 0
> > PBL: 000000005bff4000
> 
> 
> 
> Any thoughts?

The Ammasso certainly works this way. If you tell it the page size is
64KB, it will ignore the low-order bits of the page address (the bits
that encode offsets 0-65535).
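A sketch of the masking behavior described here (an assumption about how such HW truncates page addresses, not vendor documentation):

```c
#include <stdint.h>

/* Sketch (assumed behavior): with a 64KB page size configured, the
 * adapter treats the low 16 bits of the supplied page address as zero. */
uint64_t hw_page_base(uint64_t addr, uint64_t pg_size)
{
    return addr & ~(pg_size - 1);
}
```

Applied to the mapping above, 0x5bff4000 would be treated as page base 0x5bff0000, i.e. the HW would target memory 16KB below where the kernel actually mapped the buffer, which would explain the corruption.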

> 
> Steve.
> 
> 



* Re: [ofa-general] iommu dma mapping alignment requirements
  2007-12-20 17:14 iommu dma mapping alignment requirements Steve Wise
  2007-12-20 17:29 ` [ofa-general] " Tom Tucker
@ 2007-12-20 18:07 ` Roland Dreier
  2007-12-20 19:11   ` Steve Wise
  2007-12-20 20:17 ` Benjamin Herrenschmidt
  2 siblings, 1 reply; 14+ messages in thread
From: Roland Dreier @ 2007-12-20 18:07 UTC (permalink / raw)
  To: Steve Wise
  Cc: Benjamin Herrenschmidt, linux-kernel, OpenFabrics General, Wen Xiong

 > It appears that my problem boils down to a single host page of memory
 > that is mapped for dma, and the dma address returned by dma_map_sg()
 > is _not_ 64KB aligned.  Here is an example:

 > My first question is: Is there an assumption or requirement in linux
 > that dma_addresses should have the same alignment as the host address
 > they are mapped to?  I.e., the rdma core is mapping the entire 64KB page,
 > but the mapping doesn't begin on a 64KB page boundary.

I don't think this is explicitly documented anywhere, but it certainly
seems that we want the bus address to be page-aligned in this case.
For mthca/mlx4 at least, we tell the adapter what the host page size
is (so that it knows how to align doorbell pages etc) and I think this
sort of thing would confuse the HW.

 - R.


* Re: [ofa-general] iommu dma mapping alignment requirements
  2007-12-20 18:07 ` Roland Dreier
@ 2007-12-20 19:11   ` Steve Wise
  2007-12-20 19:29     ` Steve Wise
  0 siblings, 1 reply; 14+ messages in thread
From: Steve Wise @ 2007-12-20 19:11 UTC (permalink / raw)
  To: Roland Dreier
  Cc: Benjamin Herrenschmidt, linux-kernel, OpenFabrics General, Wen Xiong

Roland Dreier wrote:
>  > It appears that my problem boils down to a single host page of memory
>  > that is mapped for dma, and the dma address returned by dma_map_sg()
>  > is _not_ 64KB aligned.  Here is an example:
> 
>  > My first question is: Is there an assumption or requirement in linux
>  > that dma_addresses should have the same alignment as the host address
>  > they are mapped to?  I.e., the rdma core is mapping the entire 64KB page,
>  > but the mapping doesn't begin on a 64KB page boundary.
> 
> I don't think this is explicitly documented anywhere, but it certainly
> seems that we want the bus address to be page-aligned in this case.
> For mthca/mlx4 at least, we tell the adapter what the host page size
> is (so that it knows how to align doorbell pages etc) and I think this
> sort of thing would confuse the HW.
> 
>  - R.


In arch/powerpc/kernel/iommu.c:iommu_map_sg() I see that it calls 
iommu_range_alloc() with an alignment_order of 0:

>                 vaddr = (unsigned long)page_address(s->page) + s->offset;
>                 npages = iommu_num_pages(vaddr, slen);
>                 entry = iommu_range_alloc(tbl, npages, &handle, mask >> IOMMU_PAGE_SHIFT, 0);

But perhaps the alignment order needs to be based on the host page size?
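The order that iommu_range_alloc() would need here can be computed from the two page shifts; a hedged sketch, using the shift values of this 4KB-iommu/64KB-host-page configuration (names are illustrative, not the kernel's macros):

```c
/* Shift values for this configuration (assumed for illustration). */
enum { IOMMU_PG_SHIFT = 12,   /* 4KB iommu pages */
       HOST_PG_SHIFT  = 16 }; /* 64KB host pages */

/* Alignment order, in iommu pages, needed so a mapping starts on a
 * host-page boundary: 16 - 12 = 4, i.e. 2^4 = 16 iommu pages = 64KB. */
unsigned int host_page_align_order(void)
{
    return HOST_PG_SHIFT - IOMMU_PG_SHIFT;
}
```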


Steve.


* Re: [ofa-general] iommu dma mapping alignment requirements
  2007-12-20 19:11   ` Steve Wise
@ 2007-12-20 19:29     ` Steve Wise
  2007-12-20 20:21       ` Benjamin Herrenschmidt
  0 siblings, 1 reply; 14+ messages in thread
From: Steve Wise @ 2007-12-20 19:29 UTC (permalink / raw)
  To: Roland Dreier
  Cc: Benjamin Herrenschmidt, linux-kernel, OpenFabrics General, Wen Xiong

Steve Wise wrote:
> Roland Dreier wrote:
>>  > It appears that my problem boils down to a single host page of memory
>>  > that is mapped for dma, and the dma address returned by dma_map_sg()
>>  > is _not_ 64KB aligned.  Here is an example:
>>
>>  > My first question is: Is there an assumption or requirement in linux
>>  > that dma_addresses should have the same alignment as the host address
>>  > they are mapped to?  I.e., the rdma core is mapping the entire 64KB page,
>>  > but the mapping doesn't begin on a 64KB page boundary.
>>
>> I don't think this is explicitly documented anywhere, but it certainly
>> seems that we want the bus address to be page-aligned in this case.
>> For mthca/mlx4 at least, we tell the adapter what the host page size
>> is (so that it knows how to align doorbell pages etc) and I think this
>> sort of thing would confuse the HW.
>>
>>  - R.
> 
> 
> In arch/powerpc/kernel/iommu.c:iommu_map_sg() I see that it calls 
> iommu_range_alloc() with an alignment_order of 0:
> 
>>                 vaddr = (unsigned long)page_address(s->page) + s->offset;
>>                 npages = iommu_num_pages(vaddr, slen);
>>                 entry = iommu_range_alloc(tbl, npages, &handle, mask >> IOMMU_PAGE_SHIFT, 0);
> 
> But perhaps the alignment order needs to be based on the host page size?
> 

Or based on the alignment of vaddr actually...



* Re: iommu dma mapping alignment requirements
  2007-12-20 17:14 iommu dma mapping alignment requirements Steve Wise
  2007-12-20 17:29 ` [ofa-general] " Tom Tucker
  2007-12-20 18:07 ` Roland Dreier
@ 2007-12-20 20:17 ` Benjamin Herrenschmidt
  2007-12-20 21:02   ` Steve Wise
  2 siblings, 1 reply; 14+ messages in thread
From: Benjamin Herrenschmidt @ 2007-12-20 20:17 UTC (permalink / raw)
  To: Steve Wise
  Cc: Roland Dreier, linux-kernel, OpenFabrics General,
	Benjamin Herrenschmidt, Wen Xiong, Olof Johansson,
	Paul Mackerras

Adding a few more people to the discussion. You may well be right and we
would have to provide the same alignment, though that sucks a bit, as one
of the reasons we switched to 4K for the IOMMU is that the iommu space
available on pSeries is very small and we were running out of it with
64K pages and lots of networking activity.

On Thu, 2007-12-20 at 11:14 -0600, Steve Wise wrote:
> Hey Roland (and any iommu/ppc/dma experts out there):
> 
> I'm debugging a data corruption issue that happens on PPC64 systems 
> running rdma on kernels where the iommu page size is 4KB yet the host 
> page size is 64KB.  This "feature" was added to the PPC64 code recently, 
> and is in kernel.org from 2.6.23.  So if the kernel is built with a 4KB 
> page size, no problems.  If the kernel is prior to 2.6.23 then 64KB page 
>   configs work too. Its just a problem when the iommu page size != host 
> page size.
> 
> It appears that my problem boils down to a single host page of memory 
> that is mapped for dma, and the dma address returned by dma_map_sg() is 
> _not_ 64KB aligned.  Here is an example:
> 
> app registers va 0x000000002d9a3000 len 12288
> ib_umem_get() creates and maps a umem and chunk that looks like (dumping 
> state from a registered user memory region):
> 
> >     umem len 12288 off 12288 pgsz 65536 shift 16
> >     chunk 0: nmap 1 nents 1
> >         sglist[0] page 0xc000000000930b08 off 0 len 65536 dma_addr 000000005bff4000 dma_len 65536
> > 
> 
> So the kernel maps 1 full page for this MR.  But note that the dma 
> address is 000000005bff4000 which is 4KB aligned, not 64KB aligned.  I 
> think this is causing grief to the RDMA HW.
> 
> My first question is: Is there an assumption or requirement in linux 
> that dma_addresses should have the same alignment as the host address 
> they are mapped to?  I.e., the rdma core is mapping the entire 64KB page, 
> but the mapping doesn't begin on a 64KB page boundary.
> 
> If this mapping is considered valid, then perhaps the rdma hw is at 
> fault here.  But I'm wondering if this is a PPC/iommu bug.
> 
> BTW:  Here is what the Memory Region looks like to the HW:
> 
> > TPT entry:  stag idx 0x2e800 key 0xff state VAL type NSMR pdid 0x2
> >             perms RW rem_inv_dis 0 addr_type VATO
> >             bind_enable 1 pg_size 65536 qpid 0x0 pbl_addr 0x003c67c0
> >             len 12288 va 000000002d9a3000 bind_cnt 0
> > PBL: 000000005bff4000
> 
> 
> 
> Any thoughts?
> 
> Steve.
> 



* Re: [ofa-general] iommu dma mapping alignment requirements
  2007-12-20 19:29     ` Steve Wise
@ 2007-12-20 20:21       ` Benjamin Herrenschmidt
  2007-12-20 21:22         ` Steve Wise
  0 siblings, 1 reply; 14+ messages in thread
From: Benjamin Herrenschmidt @ 2007-12-20 20:21 UTC (permalink / raw)
  To: Steve Wise
  Cc: Roland Dreier, Benjamin Herrenschmidt, linux-kernel,
	OpenFabrics General, Wen Xiong, Olof Johansson, Paul Mackerras


On Thu, 2007-12-20 at 13:29 -0600, Steve Wise wrote:

> Or based on the alignment of vaddr actually...

The latter wouldn't be realistic. What I think might be necessary, though
it would definitely cause us problems with running out of iommu space
(which is the reason we did the switch down to 4K), is to provide
alignment to the real page size, and alignment to the allocation order
for dma_map_consistent.

It might be possible to -tweak- and only provide alignment to the page
size for allocations that are larger than IOMMU_PAGE_SIZE. That would
still avoid the problem of small network packets eating up too much
iommu space.

What do you think ?

Ben.




* Re: iommu dma mapping alignment requirements
  2007-12-20 20:17 ` Benjamin Herrenschmidt
@ 2007-12-20 21:02   ` Steve Wise
  2007-12-20 21:26     ` Benjamin Herrenschmidt
  0 siblings, 1 reply; 14+ messages in thread
From: Steve Wise @ 2007-12-20 21:02 UTC (permalink / raw)
  To: benh
  Cc: Roland Dreier, linux-kernel, OpenFabrics General,
	Benjamin Herrenschmidt, Wen Xiong, Olof Johansson,
	Paul Mackerras

Benjamin Herrenschmidt wrote:
> Adding a few more people to the discussion. You may well be right and we
> would have to provide the same alignment, though that sucks a bit, as one
> of the reasons we switched to 4K for the IOMMU is that the iommu space
> available on pSeries is very small and we were running out of it with
> 64K pages and lots of networking activity.
> 

But smarter NIC drivers could resolve this too, I think, perhaps by 
carving up full pages of mapped buffers instead of just assuming mapping 
is free...

perhaps...


* Re: [ofa-general] iommu dma mapping alignment requirements
  2007-12-20 20:21       ` Benjamin Herrenschmidt
@ 2007-12-20 21:22         ` Steve Wise
  0 siblings, 0 replies; 14+ messages in thread
From: Steve Wise @ 2007-12-20 21:22 UTC (permalink / raw)
  To: benh
  Cc: Roland Dreier, Benjamin Herrenschmidt, linux-kernel,
	OpenFabrics General, Wen Xiong, Olof Johansson, Paul Mackerras

Benjamin Herrenschmidt wrote:
> On Thu, 2007-12-20 at 13:29 -0600, Steve Wise wrote:
> 
>> Or based on the alignment of vaddr actually...
> 
> The latter wouldn't be realistic. What I think might be necessary, though
> it would definitely cause us problems with running out of iommu space
> (which is the reason we did the switch down to 4K), is to provide
> alignment to the real page size, and alignment to the allocation order
> for dma_map_consistent.
> 
> It might be possible to -tweak- and only provide alignment to the page
> size for allocations that are larger than IOMMU_PAGE_SIZE. That would
> still avoid the problem of small network packets eating up too much
> iommu space.
> 
> What do you think ?

That might work.

If you gimme a patch, i'll try it out!

Steve.


* Re: iommu dma mapping alignment requirements
  2007-12-20 21:02   ` Steve Wise
@ 2007-12-20 21:26     ` Benjamin Herrenschmidt
  2007-12-20 22:12       ` Steve Wise
  0 siblings, 1 reply; 14+ messages in thread
From: Benjamin Herrenschmidt @ 2007-12-20 21:26 UTC (permalink / raw)
  To: Steve Wise
  Cc: benh, Roland Dreier, linux-kernel, OpenFabrics General,
	Benjamin Herrenschmidt, Wen Xiong, Olof Johansson,
	Paul Mackerras


On Thu, 2007-12-20 at 15:02 -0600, Steve Wise wrote:
> Benjamin Herrenschmidt wrote:
> > Adding a few more people to the discussion. You may well be right and we
> > would have to provide the same alignment, though that sucks a bit, as one
> > of the reasons we switched to 4K for the IOMMU is that the iommu space
> > available on pSeries is very small and we were running out of it with
> > 64K pages and lots of networking activity.
> > 
> 
> But smarter NIC drivers could resolve this too, I think, perhaps by 
> carving up full pages of mapped buffers instead of just assuming mapping 
> is free...

True, but the problem still happens today: if we switch back to 64K
iommu page size (which should be possible, I need to fix that), we
-will- run out of iommu space on typical workloads and that is not
acceptable.

So we need to find a compromise.

What I might do is something along the lines of: If size >= PAGE_SIZE,
and vaddr (page_address + offset) is PAGE_SIZE aligned, then I enforce
alignment of the resulting mapping.

That should fix your case. Anything requesting smaller than PAGE_SIZE
mappings would lose that alignment but I -think- it should be safe, and
you still always get 4K alignment anyway (+/- your offset) so at least
small alignment restrictions are still enforced (such as cache line
alignment etc...).
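The decision described above can be sketched as follows (shift values assumed as in this 4KB-iommu/64KB-page configuration; this mirrors the stated intent, not necessarily the final patch):

```c
enum { IOMMU_PG_SHIFT = 12,   /* 4KB iommu pages */
       HOST_PG_SHIFT  = 16 }; /* 64KB host pages */

#define HOST_PG_SIZE (1UL << HOST_PG_SHIFT)

/* Sketch: grant host-page alignment only when the iommu page is smaller
 * than the host page, the mapping is at least a host page long, and the
 * vaddr is itself host-page aligned. */
unsigned int map_align_order(unsigned long vaddr, unsigned long size)
{
    if (IOMMU_PG_SHIFT < HOST_PG_SHIFT &&
        size >= HOST_PG_SIZE &&
        (vaddr & (HOST_PG_SIZE - 1)) == 0)
        return HOST_PG_SHIFT - IOMMU_PG_SHIFT;
    return 0;
}
```

Smaller or unaligned requests keep order 0, so they still only consume the 4KB-granular iommu space they need.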

I'll send you a test patch later today.

Ben.




* Re: iommu dma mapping alignment requirements
  2007-12-20 21:26     ` Benjamin Herrenschmidt
@ 2007-12-20 22:12       ` Steve Wise
  2007-12-20 23:49         ` Benjamin Herrenschmidt
  0 siblings, 1 reply; 14+ messages in thread
From: Steve Wise @ 2007-12-20 22:12 UTC (permalink / raw)
  To: benh
  Cc: benh, Roland Dreier, linux-kernel, OpenFabrics General,
	Benjamin Herrenschmidt, Wen Xiong, Olof Johansson,
	Paul Mackerras

Benjamin Herrenschmidt wrote:
> On Thu, 2007-12-20 at 15:02 -0600, Steve Wise wrote:
>> Benjamin Herrenschmidt wrote:
>>> Adding a few more people to the discussion. You may well be right and we
>>> would have to provide the same alignment, though that sucks a bit, as one
>>> of the reasons we switched to 4K for the IOMMU is that the iommu space
>>> available on pSeries is very small and we were running out of it with
>>> 64K pages and lots of networking activity.
>>>
>> But smarter NIC drivers could resolve this too, I think, perhaps by 
>> carving up full pages of mapped buffers instead of just assuming mapping 
>> is free...
> 
> True, but the problem still happens today: if we switch back to 64K
> iommu page size (which should be possible, I need to fix that), we
> -will- run out of iommu space on typical workloads and that is not
> acceptable.
> 
> So we need to find a compromise.
> 
> What I might do is something along the lines of: If size >= PAGE_SIZE,
> and vaddr (page_address + offset) is PAGE_SIZE aligned, then I enforce
> alignment of the resulting mapping.
> 
> That should fix your case. Anything requesting smaller than PAGE_SIZE
> mappings would lose that alignment but I -think- it should be safe, and
> you still always get 4K alignment anyway (+/- your offset) so at least
> small alignment restrictions are still enforced (such as cache line
> alignment etc...).
> 
> I'll send you a test patch later today.
> 
> Ben.
> 

Sounds good.  Thanks!

Note that these smaller sub-host-page-sized mappings might pollute the 
address space, causing fully aligned host-page-size maps to become 
scarce...  Maybe there's a clever way to keep those in their own segment 
of the address space?




* Re: iommu dma mapping alignment requirements
  2007-12-20 22:12       ` Steve Wise
@ 2007-12-20 23:49         ` Benjamin Herrenschmidt
  2007-12-21  4:49           ` Steve Wise
  0 siblings, 1 reply; 14+ messages in thread
From: Benjamin Herrenschmidt @ 2007-12-20 23:49 UTC (permalink / raw)
  To: Steve Wise
  Cc: benh, Roland Dreier, linux-kernel, OpenFabrics General,
	Benjamin Herrenschmidt, Wen Xiong, Olof Johansson,
	Paul Mackerras


> Sounds good.  Thanks!
> 
> Note that these smaller sub-host-page-sized mappings might pollute the 
> address space, causing fully aligned host-page-size maps to become 
> scarce...  Maybe there's a clever way to keep those in their own segment 
> of the address space?

We already have a large vs. small split in the iommu virtual space to
alleviate this (though it's not a hard constraint, we can still get
into the "other" side if the default one is full).

Try that patch and let me know:

Index: linux-work/arch/powerpc/kernel/iommu.c
===================================================================
--- linux-work.orig/arch/powerpc/kernel/iommu.c	2007-12-21 10:39:39.000000000 +1100
+++ linux-work/arch/powerpc/kernel/iommu.c	2007-12-21 10:46:18.000000000 +1100
@@ -278,6 +278,7 @@ int iommu_map_sg(struct iommu_table *tbl
 	unsigned long flags;
 	struct scatterlist *s, *outs, *segstart;
 	int outcount, incount, i;
+	unsigned int align;
 	unsigned long handle;
 
 	BUG_ON(direction == DMA_NONE);
@@ -309,7 +310,11 @@ int iommu_map_sg(struct iommu_table *tbl
 		/* Allocate iommu entries for that segment */
 		vaddr = (unsigned long) sg_virt(s);
 		npages = iommu_num_pages(vaddr, slen);
-		entry = iommu_range_alloc(tbl, npages, &handle, mask >> IOMMU_PAGE_SHIFT, 0);
+		align = 0;
+		if (IOMMU_PAGE_SHIFT < PAGE_SHIFT && (vaddr & ~PAGE_MASK) == 0)
+			align = PAGE_SHIFT - IOMMU_PAGE_SHIFT;
+		entry = iommu_range_alloc(tbl, npages, &handle,
+					  mask >> IOMMU_PAGE_SHIFT, align);
 
 		DBG("  - vaddr: %lx, size: %lx\n", vaddr, slen);
 
@@ -572,7 +577,7 @@ dma_addr_t iommu_map_single(struct iommu
 {
 	dma_addr_t dma_handle = DMA_ERROR_CODE;
 	unsigned long uaddr;
-	unsigned int npages;
+	unsigned int npages, align;
 
 	BUG_ON(direction == DMA_NONE);
 
@@ -580,8 +585,13 @@ dma_addr_t iommu_map_single(struct iommu
 	npages = iommu_num_pages(uaddr, size);
 
 	if (tbl) {
+		align = 0;
+		if (IOMMU_PAGE_SHIFT < PAGE_SHIFT &&
+		    ((unsigned long)vaddr & ~PAGE_MASK) == 0)
+			align = PAGE_SHIFT - IOMMU_PAGE_SHIFT;
+
 		dma_handle = iommu_alloc(tbl, vaddr, npages, direction,
-					 mask >> IOMMU_PAGE_SHIFT, 0);
+					 mask >> IOMMU_PAGE_SHIFT, align);
 		if (dma_handle == DMA_ERROR_CODE) {
 			if (printk_ratelimit())  {
 				printk(KERN_INFO "iommu_alloc failed, "




* Re: iommu dma mapping alignment requirements
  2007-12-20 23:49         ` Benjamin Herrenschmidt
@ 2007-12-21  4:49           ` Steve Wise
  2007-12-21  5:38             ` Benjamin Herrenschmidt
  0 siblings, 1 reply; 14+ messages in thread
From: Steve Wise @ 2007-12-21  4:49 UTC (permalink / raw)
  To: benh
  Cc: benh, Roland Dreier, linux-kernel, OpenFabrics General,
	Benjamin Herrenschmidt, Wen Xiong, Olof Johansson,
	Paul Mackerras



Benjamin Herrenschmidt wrote:
>> Sounds good.  Thanks!
>>
>> Note, that these smaller sub-host-page-sized mappings might pollute the 
>> address space causing full aligned host-page-size maps to become 
>> scarce...  Maybe there's a clever way to keep those in their own segment 
>> of the address space?
> 
> We already have a large vs. small split in the iommu virtual space to
> alleviate this (though it's not a hard constraint, we can still get
> into the "other" side if the default one is full).
> 
> Try that patch and let me know:

Seems to be working!

:)


> 
> Index: linux-work/arch/powerpc/kernel/iommu.c
> ===================================================================
> --- linux-work.orig/arch/powerpc/kernel/iommu.c	2007-12-21 10:39:39.000000000 +1100
> +++ linux-work/arch/powerpc/kernel/iommu.c	2007-12-21 10:46:18.000000000 +1100
> @@ -278,6 +278,7 @@ int iommu_map_sg(struct iommu_table *tbl
>  	unsigned long flags;
>  	struct scatterlist *s, *outs, *segstart;
>  	int outcount, incount, i;
> +	unsigned int align;
>  	unsigned long handle;
>  
>  	BUG_ON(direction == DMA_NONE);
> @@ -309,7 +310,11 @@ int iommu_map_sg(struct iommu_table *tbl
>  		/* Allocate iommu entries for that segment */
>  		vaddr = (unsigned long) sg_virt(s);
>  		npages = iommu_num_pages(vaddr, slen);
> -		entry = iommu_range_alloc(tbl, npages, &handle, mask >> IOMMU_PAGE_SHIFT, 0);
> +		align = 0;
> +		if (IOMMU_PAGE_SHIFT < PAGE_SHIFT && (vaddr & ~PAGE_MASK) == 0)
> +			align = PAGE_SHIFT - IOMMU_PAGE_SHIFT;
> +		entry = iommu_range_alloc(tbl, npages, &handle,
> +					  mask >> IOMMU_PAGE_SHIFT, align);
>  
>  		DBG("  - vaddr: %lx, size: %lx\n", vaddr, slen);
>  
> @@ -572,7 +577,7 @@ dma_addr_t iommu_map_single(struct iommu
>  {
>  	dma_addr_t dma_handle = DMA_ERROR_CODE;
>  	unsigned long uaddr;
> -	unsigned int npages;
> +	unsigned int npages, align;
>  
>  	BUG_ON(direction == DMA_NONE);
>  
> @@ -580,8 +585,13 @@ dma_addr_t iommu_map_single(struct iommu
>  	npages = iommu_num_pages(uaddr, size);
>  
>  	if (tbl) {
> +		align = 0;
> +		if (IOMMU_PAGE_SHIFT < PAGE_SHIFT &&
> +		    ((unsigned long)vaddr & ~PAGE_MASK) == 0)
> +			align = PAGE_SHIFT - IOMMU_PAGE_SHIFT;
> +
>  		dma_handle = iommu_alloc(tbl, vaddr, npages, direction,
> -					 mask >> IOMMU_PAGE_SHIFT, 0);
> +					 mask >> IOMMU_PAGE_SHIFT, align);
>  		if (dma_handle == DMA_ERROR_CODE) {
>  			if (printk_ratelimit())  {
>  				printk(KERN_INFO "iommu_alloc failed, "
> 


* Re: iommu dma mapping alignment requirements
  2007-12-21  4:49           ` Steve Wise
@ 2007-12-21  5:38             ` Benjamin Herrenschmidt
  0 siblings, 0 replies; 14+ messages in thread
From: Benjamin Herrenschmidt @ 2007-12-21  5:38 UTC (permalink / raw)
  To: Steve Wise
  Cc: benh, Roland Dreier, linux-kernel, OpenFabrics General,
	Benjamin Herrenschmidt, Wen Xiong, Olof Johansson,
	Paul Mackerras

BTW, I need to know urgently what HW is broken by this.

Ben.




end of thread, other threads:[~2007-12-21  5:39 UTC | newest]

Thread overview: 14+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2007-12-20 17:14 iommu dma mapping alignment requirements Steve Wise
2007-12-20 17:29 ` [ofa-general] " Tom Tucker
2007-12-20 18:07 ` Roland Dreier
2007-12-20 19:11   ` Steve Wise
2007-12-20 19:29     ` Steve Wise
2007-12-20 20:21       ` Benjamin Herrenschmidt
2007-12-20 21:22         ` Steve Wise
2007-12-20 20:17 ` Benjamin Herrenschmidt
2007-12-20 21:02   ` Steve Wise
2007-12-20 21:26     ` Benjamin Herrenschmidt
2007-12-20 22:12       ` Steve Wise
2007-12-20 23:49         ` Benjamin Herrenschmidt
2007-12-21  4:49           ` Steve Wise
2007-12-21  5:38             ` Benjamin Herrenschmidt
