linux-kernel.vger.kernel.org archive mirror
* Problems with alpha/pci + radeon/ttm
@ 2010-06-21 21:19 Matt Turner
  2010-06-22  5:59 ` FUJITA Tomonori
  0 siblings, 1 reply; 14+ messages in thread
From: Matt Turner @ 2010-06-21 21:19 UTC (permalink / raw)
  To: LKML, linux-alpha
  Cc: FUJITA Tomonori, Richard Henderson, Ivan Kokshaysky,
	Michael Cree, Jesse Barnes, linux-pci,
	Mailing list - DRI developers, Dave Airlie, Alex Deucher,
	Jerome Glisse

Michael Cree and I have been debugging FDO bug 26403 [1]. I tried
booting with `radeon.test=1` and found this, which I think is related:

> [drm] Tested GTT->VRAM and VRAM->GTT copy for GTT offset 0x202000
> [drm] Tested GTT->VRAM and VRAM->GTT copy for GTT offset 0x302000
[snip]
> [drm] Tested GTT->VRAM and VRAM->GTT copy for GTT offset 0xfd02000
> [drm] Tested GTT->VRAM and VRAM->GTT copy for GTT offset 0xfe02000
> pci_map_single failed: could not allocate dma page tables
> [drm:radeon_ttm_backend_bind] *ERROR* failed to bind 128 pages at 0x0FF02000
> [TTM] Couldn't bind backend.
> radeon 0000:00:07.0: object_init failed for (1048576, 0x00000002)
> [drm:radeon_test_moves] *ERROR* Failed to create GTT object 253
> Error while testing BO move.

From what I can see, the call chain is
radeon_test_moves
 (radeon_ttm_backend_bind called through callback function)
 - radeon_ttm.c:radeon_ttm_backend_bind calls radeon_gart_bind
  - radeon_gart.c:radeon_gart_bind calls pci_map_page
   - pci_map_page is alpha_pci_map_page, which calls...
    - alpha_pci_map_page calls pci_iommu.c:pci_map_single_1
     - pci_map_single_1 calls iommu_arena_alloc
      - iommu_arena_alloc calls iommu_arena_find_pages
       - iommu_arena_find_pages returns non-0
      - iommu_arena_alloc returns non-0
     - pci_map_single_1 returns 0 after printing
       "could not allocate dma page tables" error
    - alpha_pci_map_page returns 0 from pci_map_single_1
  - radeon_gart_bind returns non-0, error path prints
    "*ERROR* failed to bind 128 pages at 0x0FF02000"

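The numbers in the log above are mutually consistent, which a little arithmetic makes visible. This is only a sketch under two assumptions: that Alpha Linux uses 8 KiB pages, and that the test lays its objects out back to back starting at the first offset shown (0x202000). The names are made up for illustration.

```c
#include <stdint.h>

/* Assumption: 8 KiB pages on Alpha (PAGE_SHIFT = 13). */
#define SKETCH_PAGE_SIZE        8192u
/* Values read off the log: 128 pages per object, first offset 0x202000. */
#define SKETCH_PAGES_PER_OBJECT 128u
#define SKETCH_FIRST_GTT_OFFSET 0x202000u

/* Size in bytes of one test object. */
static uint32_t sketch_object_size(void)
{
    return SKETCH_PAGES_PER_OBJECT * SKETCH_PAGE_SIZE;
}

/* GTT offset of test object `index`, assuming consecutive placement. */
static uint32_t sketch_gtt_offset(uint32_t index)
{
    return SKETCH_FIRST_GTT_OFFSET + index * sketch_object_size();
}
```

Under those assumptions, 128 pages is 1048576 bytes, matching the size in the object_init message, and object 253 lands at 0x0FF02000, matching the failed bind. So roughly 253 MiB of GTT objects had been bound by the time the mapping failed.
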
Is this the cause of the bug we're seeing in the report [1]?

Anyone know what's going wrong here?

Thanks!
Matt Turner

[1] https://bugs.freedesktop.org/show_bug.cgi?id=26403

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: Problems with alpha/pci + radeon/ttm
  2010-06-21 21:19 Problems with alpha/pci + radeon/ttm Matt Turner
@ 2010-06-22  5:59 ` FUJITA Tomonori
  2010-06-22  8:32   ` Dave Airlie
  2010-06-24 14:53   ` Matt Turner
  0 siblings, 2 replies; 14+ messages in thread
From: FUJITA Tomonori @ 2010-06-22  5:59 UTC (permalink / raw)
  To: mattst88
  Cc: linux-kernel, linux-alpha, fujita.tomonori, rth, ink, mcree,
	jbarnes, linux-pci, dri-devel, airlied, alexdeucher, jglisse

On Mon, 21 Jun 2010 17:19:43 -0400
Matt Turner <mattst88@gmail.com> wrote:

> Michael Cree and I have been debugging FDO bug 26403 [1]. I tried
> booting with `radeon.test=1` and found this, which I think is related:
> 
> > [drm] Tested GTT->VRAM and VRAM->GTT copy for GTT offset 0x202000
> > [drm] Tested GTT->VRAM and VRAM->GTT copy for GTT offset 0x302000
> [snip]
> > [drm] Tested GTT->VRAM and VRAM->GTT copy for GTT offset 0xfd02000
> > [drm] Tested GTT->VRAM and VRAM->GTT copy for GTT offset 0xfe02000
> > pci_map_single failed: could not allocate dma page tables
> > [drm:radeon_ttm_backend_bind] *ERROR* failed to bind 128 pages at 0x0FF02000
> > [TTM] Couldn't bind backend.
> > radeon 0000:00:07.0: object_init failed for (1048576, 0x00000002)
> > [drm:radeon_test_moves] *ERROR* Failed to create GTT object 253
> > Error while testing BO move.
> 
> From what I can see, the call chain is
> radeon_test_moves
>  (radeon_ttm_backend_bind called through callback function)
>  - radeon_ttm.c:radeon_ttm_backend_bind calls radeon_gart_bind
>   - radeon_gart.c:radeon_gart_bind calls pci_map_page
>    - pci_map_page is alpha_pci_map_page, which calls...
>     - alpha_pci_map_page calls pci_iommu.c:pci_map_single_1
>      - pci_map_single_1 calls iommu_arena_alloc
>       - iommu_arena_alloc calls iommu_arena_find_pages
>        - iommu_arena_find_pages returns non-0
>       - iommu_arena_alloc returns non-0
>      - pci_map_single_1 returns 0 after printing
>        "could not allocate dma page tables" error
>     - alpha_pci_map_page returns 0 from pci_map_single_1
>   - radeon_gart_bind returns non-0, error path prints
>     "*ERROR* failed to bind 128 pages at 0x0FF02000"

This happens in the latest git, right?

Is this a regression (what kernel version worked)?


Seems that the IOMMU can't find 128 pages. It's likely due to:

- out of the IOMMU space (possibly someone doesn't free the IOMMU
  space).

or

- the mapping parameters (such as align) aren't appropriate so the
  IOMMU can't find space.
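
Both failure modes can be shown with a toy allocator. This is a hypothetical sketch, not the code in arch/alpha/kernel/pci_iommu.c: the arena is one byte per entry, and the search is a simple first fit over aligned start positions.

```c
#include <stddef.h>

/* Hypothetical toy arena: one byte per page-table entry, 0 = free. */
struct toy_arena {
    unsigned char *used;
    size_t nent;
};

/*
 * Find n consecutive free entries whose start index is a multiple of
 * (mask + 1). mask must be a power of two minus one.
 * Returns the start index, or -1 if no such run exists.
 */
static long toy_find_pages(const struct toy_arena *a, size_t n, size_t mask)
{
    size_t start = 0;

    if (n == 0)
        return -1;

    while (start + n <= a->nent) {
        size_t i = 0;

        while (i < n && !a->used[start + i])
            i++;
        if (i == n)
            return (long)start;
        /* Skip the busy entry, then round up to the next aligned index. */
        start = (start + i + 1 + mask) & ~mask;
    }
    return -1;
}

/* Allocate n aligned entries; returns the start index or -1. */
static long toy_alloc(struct toy_arena *a, size_t n, size_t mask)
{
    long p = toy_find_pages(a, n, mask);

    if (p >= 0) {
        for (size_t i = 0; i < n; i++)
            a->used[(size_t)p + i] = 1;
    }
    return p;
}
```

With a 16-entry arena where only entry 0 is busy, a 4-entry request with mask 15 fails even though 15 entries are free (the alignment case), while the same request with mask 7 succeeds at index 8. Once every aligned run is taken, any request fails (the exhaustion case).
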


> Is this the cause of the bug we're seeing in the report [1]?
>
> Anyone know what's going wrong here?


I've attached a patch to print the debug info about the mapping
parameters.


diff --git a/arch/alpha/kernel/pci_iommu.c b/arch/alpha/kernel/pci_iommu.c
index d1dbd9a..17cf0d8 100644
--- a/arch/alpha/kernel/pci_iommu.c
+++ b/arch/alpha/kernel/pci_iommu.c
@@ -187,6 +187,10 @@ iommu_arena_alloc(struct device *dev, struct pci_iommu_arena *arena, long n,
 	/* Search for N empty ptes */
 	ptes = arena->ptes;
 	mask = max(align, arena->align_entry) - 1;
+
+	printk("%s: %p, %p, %d, %ld, %lx, %u\n", __func__, dev, arena, arena->size,
+	       n, mask, align);
+
 	p = iommu_arena_find_pages(dev, arena, n, mask);
 	if (p < 0) {
 		spin_unlock_irqrestore(&arena->lock, flags);



* Re: Problems with alpha/pci + radeon/ttm
  2010-06-22  5:59 ` FUJITA Tomonori
@ 2010-06-22  8:32   ` Dave Airlie
  2010-06-24  9:51     ` Michael Cree
  2010-06-24 14:53   ` Matt Turner
  1 sibling, 1 reply; 14+ messages in thread
From: Dave Airlie @ 2010-06-22  8:32 UTC (permalink / raw)
  To: FUJITA Tomonori
  Cc: mattst88, linux-kernel, linux-alpha, rth, ink, mcree, jbarnes,
	linux-pci, dri-devel, alexdeucher, jglisse

On Tue, Jun 22, 2010 at 3:59 PM, FUJITA Tomonori
<fujita.tomonori@lab.ntt.co.jp> wrote:
> On Mon, 21 Jun 2010 17:19:43 -0400
> Matt Turner <mattst88@gmail.com> wrote:
>
>> Michael Cree and I have been debugging FDO bug 26403 [1]. I tried
>> booting with `radeon.test=1` and found this, which I think is related:
>>
>> > [drm] Tested GTT->VRAM and VRAM->GTT copy for GTT offset 0x202000
>> > [drm] Tested GTT->VRAM and VRAM->GTT copy for GTT offset 0x302000
>> [snip]
>> > [drm] Tested GTT->VRAM and VRAM->GTT copy for GTT offset 0xfd02000
>> > [drm] Tested GTT->VRAM and VRAM->GTT copy for GTT offset 0xfe02000
>> > pci_map_single failed: could not allocate dma page tables
>> > [drm:radeon_ttm_backend_bind] *ERROR* failed to bind 128 pages at 0x0FF02000
>> > [TTM] Couldn't bind backend.
>> > radeon 0000:00:07.0: object_init failed for (1048576, 0x00000002)
>> > [drm:radeon_test_moves] *ERROR* Failed to create GTT object 253
>> > Error while testing BO move.
>>
>> From what I can see, the call chain is
>> radeon_test_moves
>>  (radeon_ttm_backend_bind called through callback function)
>>  - radeon_ttm.c:radeon_ttm_backend_bind calls radeon_gart_bind
>>   - radeon_gart.c:radeon_gart_bind calls pci_map_page
>>    - pci_map_page is alpha_pci_map_page, which calls...
>>     - alpha_pci_map_page calls pci_iommu.c:pci_map_single_1
>>      - pci_map_single_1 calls iommu_arena_alloc
>>       - iommu_arena_alloc calls iommu_arena_find_pages
>>        - iommu_arena_find_pages returns non-0
>>       - iommu_arena_alloc returns non-0
>>      - pci_map_single_1 returns 0 after printing
>>        "could not allocate dma page tables" error
>>     - alpha_pci_map_page returns 0 from pci_map_single_1
>>   - radeon_gart_bind returns non-0, error path prints
>>     "*ERROR* failed to bind 128 pages at 0x0FF02000"
>
> This happens in the latest git, right?
>
> Is this a regression (what kernel version worked)?
>
>
> Seems that the IOMMU can't find 128 pages. It's likely due to:
>
> - out of the IOMMU space (possibly someone doesn't free the IOMMU
>  space).
>
> or
>
> - the mapping parameters (such as align) aren't appropriate so the
>  IOMMU can't find space.

I don't think KMS drivers have ever worked on alpha, so it's not a
regression. They are working fine on x86 + powerpc, and sparc has been
run at least once.

I suspect we are simply hitting the limits of the IOMMU. How big an
address space does it handle? Graphics drivers generally try to bind a
lot of things to the GART.

It might be worth limiting the PCIGART in radeon to 32MB to see if the
lower limit helps.

Dave.

>
>
>> Is this the cause of the bug we're seeing in the report [1]?
>>
>> Anyone know what's going wrong here?
>
>
> I've attached a patch to print the debug info about the mapping
> parameters.
>
>
> diff --git a/arch/alpha/kernel/pci_iommu.c b/arch/alpha/kernel/pci_iommu.c
> index d1dbd9a..17cf0d8 100644
> --- a/arch/alpha/kernel/pci_iommu.c
> +++ b/arch/alpha/kernel/pci_iommu.c
> @@ -187,6 +187,10 @@ iommu_arena_alloc(struct device *dev, struct pci_iommu_arena *arena, long n,
>        /* Search for N empty ptes */
>        ptes = arena->ptes;
>        mask = max(align, arena->align_entry) - 1;
> +
> +       printk("%s: %p, %p, %d, %ld, %lx, %u\n", __func__, dev, arena, arena->size,
> +              n, mask, align);
> +
>        p = iommu_arena_find_pages(dev, arena, n, mask);
>        if (p < 0) {
>                spin_unlock_irqrestore(&arena->lock, flags);
>
>


* Re: Problems with alpha/pci + radeon/ttm
  2010-06-22  8:32   ` Dave Airlie
@ 2010-06-24  9:51     ` Michael Cree
  2010-06-24 15:02       ` Matt Turner
  2010-06-27  4:20       ` FUJITA Tomonori
  0 siblings, 2 replies; 14+ messages in thread
From: Michael Cree @ 2010-06-24  9:51 UTC (permalink / raw)
  To: Dave Airlie
  Cc: FUJITA Tomonori, mattst88, linux-kernel, linux-alpha, rth, ink,
	jbarnes, linux-pci, dri-devel, alexdeucher, jglisse

On 22/06/10 20:32, Dave Airlie wrote:
> On Tue, Jun 22, 2010 at 3:59 PM, FUJITA Tomonori
> <fujita.tomonori@lab.ntt.co.jp>  wrote:
>> On Mon, 21 Jun 2010 17:19:43 -0400
>> Matt Turner<mattst88@gmail.com>  wrote:
>>
>>> Michael Cree and I have been debugging FDO bug 26403 [1]. I tried
>>> booting with `radeon.test=1` and found this, which I think is related:

Note that my radeon card is PCI whereas I think Matt may be using an AGP 
card.

My logs are very similar to Matt's except I don't see the following line:

>>>> pci_map_single failed: could not allocate dma page tables


>> This happens in the latest git, right?

Indeed, testing 2.6.35-rc3 (plus a couple or so extra patches to fix 
unrelated compile errors).

>> Is this a regression (what kernel version worked)?
>>
>> Seems that the IOMMU can't find 128 pages. It's likely due to:
>>
>> - out of the IOMMU space (possibly someone doesn't free the IOMMU
>>   space).
>>
>> or
>>
>> - the mapping parameters (such as align) aren't appropriate so the
>>   IOMMU can't find space.
>
> I don't think KMS drivers have ever worked on alpha so its not a
> regression, they are working fine on x86 + powerpc and sparc has been
> run at least once.

KMS on the console boot up has worked since about 2.6.32, but starting 
up the X server has always failed and, in my case, the system becomes 
unstable and eventually OOPs.

> I suspect we are simply hitting the limits of the iommu, how big an
> address space does it handle? since generally graphics drivers try to
> bind a lot of things to the GART.

No idea on the address space limit.  I applied the patch of Fujita that 
logs all IOMMU allocations, and also inserted some extra printks in the 
ttm kernel code so that I could see which routines failed and the error 
code returned.  Running the radeon test on boot exhibits the following:

[  238.712768] [drm] Tested GTT->VRAM and VRAM->GTT copy for GTT offset 0x1a312000
[  239.281127] [drm] Tested GTT->VRAM and VRAM->GTT copy for GTT offset 0x1a412000
[  239.281127] ttm_tt_bind belched -12
[  239.282104] ttm_bo_handle_move_mem belched -12
[  239.282104] ttm_bo_move_buffer belched -12
[  239.282104] ttm_bo_validate belched -12
[  239.282104] radeon 0000:01:00.0: object_init failed for (1048576, 0x00000002) err=-12
[  239.282104] [drm:radeon_test_moves] *ERROR* Failed to create GTT object 419
[  239.399291] Error while testing BO move.

Note that no IOMMU allocations are printed while radeon_test_moves is
running, so iommu_arena_alloc doesn't appear to be called.  Also, the
error code returned up to radeon_test_moves is -12, which is ENOMEM.  So
there does appear to be some memory limit.
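
For anyone decoding the -12 by hand: kernel functions conventionally return 0 on success or a negated errno value, and on Linux errno 12 is ENOMEM. A small userspace sketch (the helper name is made up for illustration):

```c
#include <errno.h>
#include <string.h>

/*
 * Turn a kernel-style return code (0 or a negated errno) into a
 * human-readable string using the C library's errno table.
 */
static const char *sketch_describe(int ret)
{
    if (ret >= 0)
        return "success";
    return strerror(-ret);
}
```

Calling sketch_describe(-12) yields the C library's message for ENOMEM; the exact wording depends on the libc in use.
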

> It might be worth limiting the PCIGART in radeon to 32MB to see if the
> lower limit helps.

So, how does one do that?

Cheers
Michael.


* Re: Problems with alpha/pci + radeon/ttm
  2010-06-22  5:59 ` FUJITA Tomonori
  2010-06-22  8:32   ` Dave Airlie
@ 2010-06-24 14:53   ` Matt Turner
  2010-06-27  4:20     ` FUJITA Tomonori
  1 sibling, 1 reply; 14+ messages in thread
From: Matt Turner @ 2010-06-24 14:53 UTC (permalink / raw)
  To: FUJITA Tomonori
  Cc: linux-kernel, linux-alpha, rth, ink, mcree, jbarnes, linux-pci,
	dri-devel, airlied, alexdeucher, jglisse

[-- Attachment #1: Type: text/plain, Size: 3479 bytes --]

On Tue, Jun 22, 2010 at 1:59 AM, FUJITA Tomonori
<fujita.tomonori@lab.ntt.co.jp> wrote:
> On Mon, 21 Jun 2010 17:19:43 -0400
> Matt Turner <mattst88@gmail.com> wrote:
>
>> Michael Cree and I have been debugging FDO bug 26403 [1]. I tried
>> booting with `radeon.test=1` and found this, which I think is related:
>>
>> > [drm] Tested GTT->VRAM and VRAM->GTT copy for GTT offset 0x202000
>> > [drm] Tested GTT->VRAM and VRAM->GTT copy for GTT offset 0x302000
>> [snip]
>> > [drm] Tested GTT->VRAM and VRAM->GTT copy for GTT offset 0xfd02000
>> > [drm] Tested GTT->VRAM and VRAM->GTT copy for GTT offset 0xfe02000
>> > pci_map_single failed: could not allocate dma page tables
>> > [drm:radeon_ttm_backend_bind] *ERROR* failed to bind 128 pages at 0x0FF02000
>> > [TTM] Couldn't bind backend.
>> > radeon 0000:00:07.0: object_init failed for (1048576, 0x00000002)
>> > [drm:radeon_test_moves] *ERROR* Failed to create GTT object 253
>> > Error while testing BO move.
>>
>> From what I can see, the call chain is
>> radeon_test_moves
>>  (radeon_ttm_backend_bind called through callback function)
>>  - radeon_ttm.c:radeon_ttm_backend_bind calls radeon_gart_bind
>>   - radeon_gart.c:radeon_gart_bind calls pci_map_page
>>    - pci_map_page is alpha_pci_map_page, which calls...
>>     - alpha_pci_map_page calls pci_iommu.c:pci_map_single_1
>>      - pci_map_single_1 calls iommu_arena_alloc
>>       - iommu_arena_alloc calls iommu_arena_find_pages
>>        - iommu_arena_find_pages returns non-0
>>       - iommu_arena_alloc returns non-0
>>      - pci_map_single_1 returns 0 after printing
>>        "could not allocate dma page tables" error
>>     - alpha_pci_map_page returns 0 from pci_map_single_1
>>   - radeon_gart_bind returns non-0, error path prints
>>     "*ERROR* failed to bind 128 pages at 0x0FF02000"
>
> This happens in the latest git, right?

I'm using 2.6.35-rc2, but I could try rc3 if you think it would make a
difference.

> Is this a regression (what kernel version worked)?

The framebuffer console has always worked, but I've never known X on
KMS to work. The radeon.test parameter hasn't existed the entire time,
but I could still try earlier kernels.

> Seems that the IOMMU can't find 128 pages. It's likely due to:
>
> - out of the IOMMU space (possibly someone doesn't free the IOMMU
>  space).
>
> or
>
> - the mapping parameters (such as align) aren't appropriate so the
>  IOMMU can't find space.
>
>
>> Is this the cause of the bug we're seeing in the report [1]?
>>
>> Anyone know what's going wrong here?
>
>
> I've attached a patch to print the debug info about the mapping
> parameters.
>
>
> diff --git a/arch/alpha/kernel/pci_iommu.c b/arch/alpha/kernel/pci_iommu.c
> index d1dbd9a..17cf0d8 100644
> --- a/arch/alpha/kernel/pci_iommu.c
> +++ b/arch/alpha/kernel/pci_iommu.c
> @@ -187,6 +187,10 @@ iommu_arena_alloc(struct device *dev, struct pci_iommu_arena *arena, long n,
>        /* Search for N empty ptes */
>        ptes = arena->ptes;
>        mask = max(align, arena->align_entry) - 1;
> +
> +       printk("%s: %p, %p, %d, %ld, %lx, %u\n", __func__, dev, arena, arena->size,
> +              n, mask, align);
> +
>        p = iommu_arena_find_pages(dev, arena, n, mask);
>        if (p < 0) {
>                spin_unlock_irqrestore(&arena->lock, flags);

Using this patch, I log the attached output.

Thanks for your help so far. :)

Matt

[-- Attachment #2: screenlog.0.gz --]
[-- Type: application/x-gzip, Size: 13412 bytes --]


* Re: Problems with alpha/pci + radeon/ttm
  2010-06-24  9:51     ` Michael Cree
@ 2010-06-24 15:02       ` Matt Turner
  2010-06-27  4:20       ` FUJITA Tomonori
  1 sibling, 0 replies; 14+ messages in thread
From: Matt Turner @ 2010-06-24 15:02 UTC (permalink / raw)
  To: Michael Cree
  Cc: Dave Airlie, FUJITA Tomonori, linux-kernel, linux-alpha, rth,
	ink, jbarnes, linux-pci, dri-devel, alexdeucher, jglisse

On Thu, Jun 24, 2010 at 5:51 AM, Michael Cree <mcree@orcon.net.nz> wrote:
> On 22/06/10 20:32, Dave Airlie wrote:
>>
>> On Tue, Jun 22, 2010 at 3:59 PM, FUJITA Tomonori
>> <fujita.tomonori@lab.ntt.co.jp>  wrote:
>>>
>>> On Mon, 21 Jun 2010 17:19:43 -0400
>>> Matt Turner<mattst88@gmail.com>  wrote:
>>>
>>>> Michael Cree and I have been debugging FDO bug 26403 [1]. I tried
>>>> booting with `radeon.test=1` and found this, which I think is related:
>
> Note that my radeon card is PCI whereas I think Matt may be using an AGP
> card.

Actually, I'm using a plain Radeon 9100 PCI.

> My logs are very similar to Matt's except I don't see the following line:
>
>>>>> pci_map_single failed: could not allocate dma page tables
>
>
>>> This happens in the latest git, right?
>
> Indeed, testing 2.6.35-rc3 (plus a couple or so extra patches to fix
> unrelated compile errors).
>
>>> Is this a regression (what kernel version worked)?
>>>
>>> Seems that the IOMMU can't find 128 pages. It's likely due to:
>>>
>>> - out of the IOMMU space (possibly someone doesn't free the IOMMU
>>>  space).
>>>
>>> or
>>>
>>> - the mapping parameters (such as align) aren't appropriate so the
>>>  IOMMU can't find space.
>>
>> I don't think KMS drivers have ever worked on alpha so its not a
>> regression, they are working fine on x86 + powerpc and sparc has been
>> run at least once.
>
> KMS on the console boot up has worked since about 2.6.32, but starting up
> the X server has always failed and, in my case, the system becomes unstable
> and eventually OOPs.
>
>> I suspect we are simply hitting the limits of the iommu, how big an
>> address space does it handle? since generally graphics drivers try to
>> bind a lot of things to the GART.
>
> No idea on the address space limit.  I applied the patch of Fujita that logs
> all IOMMU allocations, and also inserted some extra printks in the ttm
> kernel code so that I could see which routines failed and the error code
> returned.  Running the radeon test on boot exhibits the following:
>
> [  238.712768] [drm] Tested GTT->VRAM and VRAM->GTT copy for GTT offset
> 0x1a312000
> [  239.281127] [drm] Tested GTT->VRAM and VRAM->GTT copy for GTT offset
> 0x1a412000
> [  239.281127] ttm_tt_bind belched -12
> [  239.282104] ttm_bo_handle_move_mem belched -12
> [  239.282104] ttm_bo_move_buffer belched -12
> [  239.282104] ttm_bo_validate belched -12
> [  239.282104] radeon 0000:01:00.0: object_init failed for (1048576,
> 0x00000002) err=-12
> [  239.282104] [drm:radeon_test_moves] *ERROR* Failed to create GTT object
> 419
> [  239.399291] Error while testing BO move.
>
> Note that no IOMMU allocations are printed while radeon_test_moves is
> running so iommu_arena_alloc doesn't appear to be called.  Also the error
> code returned up to radeon_test_moves is -12 which is ENOMEM.  So does
> appear to be some memory limit.

I confirm that we're getting -ENOMEM. I don't know if it's coming from
radeon_gart_bind(), but if it is there's an interesting comment
immediately after the call to pci_map_page:

if (pci_dma_mapping_error(rdev->pdev, rdev->gart.pages_addr[p])) {
            /* FIXME: failed to map page (return -ENOMEM?) */
            radeon_gart_unbind(rdev, offset, pages);
            return -ENOMEM;
}
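
The pattern in that snippet is map each page, and on the first failure undo the pages already mapped and report -ENOMEM. A userspace sketch of the same idea follows. Everything here is hypothetical: the callbacks stand in for the DMA mapping calls, and a returned address of 0 is treated as failure, as in the call chain from the first mail.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-ins for the map/unmap calls; map returns 0 on failure. */
typedef unsigned long (*sketch_map_fn)(size_t page, void *ctx);
typedef void (*sketch_unmap_fn)(unsigned long addr, void *ctx);

/* Map npages pages; on any failure, unmap what was mapped so far. */
static int sketch_bind(unsigned long *addrs, size_t npages,
                       sketch_map_fn map, sketch_unmap_fn unmap, void *ctx)
{
    for (size_t i = 0; i < npages; i++) {
        addrs[i] = map(i, ctx);
        if (addrs[i] == 0) {
            /* Unwind pages 0..i-1 in reverse order. */
            while (i-- > 0) {
                unmap(addrs[i], ctx);
                addrs[i] = 0;
            }
            return -ENOMEM;
        }
    }
    return 0;
}

/* Toy mapper with a fixed budget, used to exercise the failure path. */
struct sketch_budget {
    size_t left; /* mappings still available */
    size_t live; /* mappings currently held */
};

static unsigned long sketch_budget_map(size_t page, void *ctx)
{
    struct sketch_budget *b = ctx;

    if (b->left == 0)
        return 0;
    b->left--;
    b->live++;
    return 0x1000ul * (page + 1);
}

static void sketch_budget_unmap(unsigned long addr, void *ctx)
{
    struct sketch_budget *b = ctx;

    (void)addr;
    b->left++;
    b->live--;
}
```

With a budget of 3 and a 5-page request, sketch_bind returns -ENOMEM and leaves no mappings held, which is the behaviour the FIXME comment is asking about.
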

>> It might be worth limiting the PCIGART in radeon to 32MB to see if the
>> lower limit helps.
>
> So, how does one do that?

Boot with `radeon.test=1 radeon.gartsize=<size in MB>`.

> Cheers
> Michael.

Thanks,
Matt


* Re: Problems with alpha/pci + radeon/ttm
  2010-06-24 14:53   ` Matt Turner
@ 2010-06-27  4:20     ` FUJITA Tomonori
  2010-06-27  4:58       ` Matt Turner
  0 siblings, 1 reply; 14+ messages in thread
From: FUJITA Tomonori @ 2010-06-27  4:20 UTC (permalink / raw)
  To: mattst88
  Cc: fujita.tomonori, linux-kernel, linux-alpha, rth, ink, mcree,
	jbarnes, linux-pci, dri-devel, airlied, alexdeucher, jglisse

On Thu, 24 Jun 2010 10:53:52 -0400
Matt Turner <mattst88@gmail.com> wrote:

> > Seems that the IOMMU can't find 128 pages. It's likely due to:
> >
> > - out of the IOMMU space (possibly someone doesn't free the IOMMU
> >  space).
> >
> > or
> >
> > - the mapping parameters (such as align) aren't appropriate so the
> >  IOMMU can't find space.
> >
> >
> >> Is this the cause of the bug we're seeing in the report [1]?
> >>
> >> Anyone know what's going wrong here?
> >
> >
> > I've attached a patch to print the debug info about the mapping
> > parameters.
> >
> >
> > diff --git a/arch/alpha/kernel/pci_iommu.c b/arch/alpha/kernel/pci_iommu.c
> > index d1dbd9a..17cf0d8 100644
> > --- a/arch/alpha/kernel/pci_iommu.c
> > +++ b/arch/alpha/kernel/pci_iommu.c
> > @@ -187,6 +187,10 @@ iommu_arena_alloc(struct device *dev, struct pci_iommu_arena *arena, long n,
> >        /* Search for N empty ptes */
> >        ptes = arena->ptes;
> >        mask = max(align, arena->align_entry) - 1;
> > +
> > +       printk("%s: %p, %p, %d, %ld, %lx, %u\n", __func__, dev, arena, arena->size,
> > +              n, mask, align);
> > +
> >        p = iommu_arena_find_pages(dev, arena, n, mask);
> >        if (p < 0) {
> >                spin_unlock_irqrestore(&arena->lock, flags);
> 
> Using this patch, I log the attached output.

Your system has 1GB of IOMMU address space. I guess that it's enough
for KMS?

The parameters in the log look good. But did you get this log before
you started X?
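
To put the 1GB figure in perspective, here is a rough calculation. It assumes 1GB means 2^30 bytes and that pages are 8 KiB, as on Alpha; the function names are made up for illustration.

```c
#include <stdint.h>

/* Assumptions: a 1 GiB arena and 8 KiB pages. */
#define SKETCH_ARENA_BYTES (UINT64_C(1) << 30)
#define SKETCH_PAGE_BYTES  UINT64_C(8192)

/* Number of page-table entries the arena can hold. */
static uint64_t sketch_arena_entries(void)
{
    return SKETCH_ARENA_BYTES / SKETCH_PAGE_BYTES;
}

/* How many requests of the given size fit in an otherwise empty arena. */
static uint64_t sketch_max_requests(uint64_t pages_per_request)
{
    return sketch_arena_entries() / pages_per_request;
}
```

That gives 131072 entries, or room for 1024 requests of 128 pages each if nothing else is mapped.
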


* Re: Problems with alpha/pci + radeon/ttm
  2010-06-24  9:51     ` Michael Cree
  2010-06-24 15:02       ` Matt Turner
@ 2010-06-27  4:20       ` FUJITA Tomonori
  2010-06-27 10:46         ` Michael Cree
  1 sibling, 1 reply; 14+ messages in thread
From: FUJITA Tomonori @ 2010-06-27  4:20 UTC (permalink / raw)
  To: mcree
  Cc: airlied, fujita.tomonori, mattst88, linux-kernel, linux-alpha,
	rth, ink, jbarnes, linux-pci, dri-devel, alexdeucher, jglisse

On Thu, 24 Jun 2010 21:51:40 +1200
Michael Cree <mcree@orcon.net.nz> wrote:

> >> Is this a regression (what kernel version worked)?
> >>
> >> Seems that the IOMMU can't find 128 pages. It's likely due to:
> >>
> >> - out of the IOMMU space (possibly someone doesn't free the IOMMU
> >>   space).
> >>
> >> or
> >>
> >> - the mapping parameters (such as align) aren't appropriate so the
> >>   IOMMU can't find space.
> >
> > I don't think KMS drivers have ever worked on alpha so its not a
> > regression, they are working fine on x86 + powerpc and sparc has been
> > run at least once.
> 
> KMS on the console boot up has worked since about 2.6.32, but starting 
> up the X server has always failed and, in my case, the system becomes 
> unstable and eventually OOPs.
> 
> > I suspect we are simply hitting the limits of the iommu, how big an
> > address space does it handle? since generally graphics drivers try to
> > bind a lot of things to the GART.
> 
> No idea on the address space limit.  I applied the patch of Fujita that 
> logs all IOMMU allocations, and also inserted some extra printks in the 
> ttm kernel code so that I could see which routines failed and the error 
> code returned.  Running the radeon test on boot exhibits the following:
> 
> [  238.712768] [drm] Tested GTT->VRAM and VRAM->GTT copy for GTT offset 
> 0x1a312000
> [  239.281127] [drm] Tested GTT->VRAM and VRAM->GTT copy for GTT offset 
> 0x1a412000
> [  239.281127] ttm_tt_bind belched -12
> [  239.282104] ttm_bo_handle_move_mem belched -12
> [  239.282104] ttm_bo_move_buffer belched -12
> [  239.282104] ttm_bo_validate belched -12
> [  239.282104] radeon 0000:01:00.0: object_init failed for (1048576, 
> 0x00000002) err=-12
> [  239.282104] [drm:radeon_test_moves] *ERROR* Failed to create GTT 
> object 419
> [  239.399291] Error while testing BO move.
> 
> Note that no IOMMU allocations are printed while radeon_test_moves is 
> running so iommu_arena_alloc doesn't appear to be called.  Also the 
> error code returned up to radeon_test_moves is -12 which is ENOMEM.  So 
> does appear to be some memory limit.

Hmm, so it's not related to the IOMMU? It looks like ttm_tt_populate
could return ENOMEM too. Can we locate where we hit ENOMEM first?


* Re: Problems with alpha/pci + radeon/ttm
  2010-06-27  4:20     ` FUJITA Tomonori
@ 2010-06-27  4:58       ` Matt Turner
  2010-06-30 18:43         ` Konrad Rzeszutek Wilk
  0 siblings, 1 reply; 14+ messages in thread
From: Matt Turner @ 2010-06-27  4:58 UTC (permalink / raw)
  To: FUJITA Tomonori
  Cc: linux-kernel, linux-alpha, rth, ink, mcree, jbarnes, linux-pci,
	dri-devel, airlied, alexdeucher, jglisse

On Sun, Jun 27, 2010 at 12:20 AM, FUJITA Tomonori
<fujita.tomonori@lab.ntt.co.jp> wrote:
> On Thu, 24 Jun 2010 10:53:52 -0400
> Matt Turner <mattst88@gmail.com> wrote:
>
>> > Seems that the IOMMU can't find 128 pages. It's likely due to:
>> >
>> > - out of the IOMMU space (possibly someone doesn't free the IOMMU
>> >  space).
>> >
>> > or
>> >
>> > - the mapping parameters (such as align) aren't appropriate so the
>> >  IOMMU can't find space.
>> >
>> >
>> >> Is this the cause of the bug we're seeing in the report [1]?
>> >>
>> >> Anyone know what's going wrong here?
>> >
>> >
>> > I've attached a patch to print the debug info about the mapping
>> > parameters.
>> >
>> >
>> > diff --git a/arch/alpha/kernel/pci_iommu.c b/arch/alpha/kernel/pci_iommu.c
>> > index d1dbd9a..17cf0d8 100644
>> > --- a/arch/alpha/kernel/pci_iommu.c
>> > +++ b/arch/alpha/kernel/pci_iommu.c
>> > @@ -187,6 +187,10 @@ iommu_arena_alloc(struct device *dev, struct pci_iommu_arena *arena, long n,
>> >        /* Search for N empty ptes */
>> >        ptes = arena->ptes;
>> >        mask = max(align, arena->align_entry) - 1;
>> > +
>> > +       printk("%s: %p, %p, %d, %ld, %lx, %u\n", __func__, dev, arena, arena->size,
>> > +              n, mask, align);
>> > +
>> >        p = iommu_arena_find_pages(dev, arena, n, mask);
>> >        if (p < 0) {
>> >                spin_unlock_irqrestore(&arena->lock, flags);
>>
>> Using this patch, I log the attached output.
>
> Your system has 1GB iommu address space. I guess that it's enough for
> KSM?

I would definitely think so. The video card I'm using here is a 64MB
Radeon 9100 PCI, with a 128MB BAR.

> The parameters in the log looks good. But you got this log before you
> started X?

Yes, that's right.

I'll see if I can isolate where the first -ENOMEM is coming from.

Thanks Fujita for helping with this!

Matt


* Re: Problems with alpha/pci + radeon/ttm
  2010-06-27  4:20       ` FUJITA Tomonori
@ 2010-06-27 10:46         ` Michael Cree
  2010-06-27 23:14           ` Dave Airlie
  0 siblings, 1 reply; 14+ messages in thread
From: Michael Cree @ 2010-06-27 10:46 UTC (permalink / raw)
  To: FUJITA Tomonori
  Cc: airlied, mattst88, linux-kernel, linux-alpha, rth, ink, jbarnes,
	linux-pci, dri-devel, alexdeucher, jglisse

On 27/06/10 16:20, FUJITA Tomonori wrote:
> On Thu, 24 Jun 2010 21:51:40 +1200
> Michael Cree<mcree@orcon.net.nz>  wrote:
>
>>>> Is this a regression (what kernel version worked)?
>>>>
>>>> Seems that the IOMMU can't find 128 pages. It's likely due to:
>>>>
>>>> - out of the IOMMU space (possibly someone doesn't free the IOMMU
>>>>    space).
>>>>
>>>> or
>>>>
>>>> - the mapping parameters (such as align) aren't appropriate so the
>>>>    IOMMU can't find space.
>>>
>>> I don't think KMS drivers have ever worked on alpha so its not a
>>> regression, they are working fine on x86 + powerpc and sparc has been
>>> run at least once.
>>
>> KMS on the console boot up has worked since about 2.6.32, but starting
>> up the X server has always failed and, in my case, the system becomes
>> unstable and eventually OOPs.
>>
>>> I suspect we are simply hitting the limits of the iommu, how big an
>>> address space does it handle? since generally graphics drivers try to
>>> bind a lot of things to the GART.
>>
>> No idea on the address space limit.  I applied the patch of Fujita that
>> logs all IOMMU allocations, and also inserted some extra printks in the
>> ttm kernel code so that I could see which routines failed and the error
>> code returned.  Running the radeon test on boot exhibits the following:
>>
>> [  238.712768] [drm] Tested GTT->VRAM and VRAM->GTT copy for GTT offset
>> 0x1a312000
>> [  239.281127] [drm] Tested GTT->VRAM and VRAM->GTT copy for GTT offset
>> 0x1a412000
>> [  239.281127] ttm_tt_bind belched -12
>> [  239.282104] ttm_bo_handle_move_mem belched -12
>> [  239.282104] ttm_bo_move_buffer belched -12
>> [  239.282104] ttm_bo_validate belched -12
>> [  239.282104] radeon 0000:01:00.0: object_init failed for (1048576,
>> 0x00000002) err=-12
>> [  239.282104] [drm:radeon_test_moves] *ERROR* Failed to create GTT
>> object 419
>> [  239.399291] Error while testing BO move.
>>
>> Note that no IOMMU allocations are printed while radeon_test_moves is
>> running so iommu_arena_alloc doesn't appear to be called.  Also the
>> error code returned up to radeon_test_moves is -12 which is ENOMEM.  So
>> does appear to be some memory limit.
>
> Hmm, not related with IOMMU? looks like ttm_tt_populate could return
> ENOMEM too. Can we locate where we hit ENOMEM first?

Yeah, in ttm_mem_global_reserve while it is walking glob->zones:

[  239.303588] [drm] Tested GTT->VRAM and VRAM->GTT copy for GTT offset 0x1a412000
[  239.304564] ttm_mem_global_reserve zone used_mem (0x1a5f0000) exceeds limit (0x1a5ef000)
[  239.304564] ttm_mem_global_reserve zone used_mem (0x1a5f0000) exceeds limit (0x1a5ef000)
[  239.304564] ttm_mem_global_reserve zone used_mem (0x1a5f0000) exceeds limit (0x1a5ef000)
[  239.304564] ttm_mem_global_reserve zone used_mem (0x1a5f0000) exceeds limit (0x1a5ef000)
[  239.304564] ttm_mem_global_reserve zone used_mem (0x1a5f0000) exceeds limit (0x1a5ef000)
[  239.304564] ttm_mem_global_reserve return non-zero count decs to zero
[  239.304564] ttm_mem_global_alloc_page belched -12
[  239.304564] __ttm_tt_get_page coughed NULL
[  239.304564] ttm_tt_populate belched -12
[  239.304564] ttm_tt_bind belched -12
[  239.304564] ttm_bo_handle_move_mem belched -12
[  239.304564] ttm_bo_move_buffer belched -12
[  239.304564] ttm_bo_validate belched -12

On a hunch that we are chasing a red herring I installed another 256MB 
of memory into the machine (was 576MB for the test reported above) for a 
total of 832MB.

Now radeon_test_moves runs to completion without error.
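
The failure logged above amounts to an accounting check: once a zone's used_mem is past its limit, further reservations are refused with -ENOMEM. A toy model of that check follows. It is only a sketch with made-up names, not the TTM code, and the comparison is taken from the wording of the printk output.

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical accounting zone, modelled on the values in the log. */
struct toy_zone {
    uint64_t used_mem;
    uint64_t limit;
};

/* Refuse the reservation once the zone is already over its limit. */
static int toy_reserve(struct toy_zone *z, uint64_t amount)
{
    if (z->used_mem > z->limit)
        return -ENOMEM;
    z->used_mem += amount;
    return 0;
}
```

The logged limit, 0x1a5ef000, is about 422 MiB, which lines up with the test failing at object 419 when each object is 1048576 bytes. That the test passes with more RAM installed suggests the limit scales with system memory, though that is an inference from the experiment rather than something the logs show directly.
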

OK, now a test of starting up the X server - ah, a bus error again but 
now it looks like it's in the radeon driver:

[  1435.014] (II) EXA(0): Driver allocated offscreen pixmaps
[  1435.014] (II) EXA(0): Driver registered support for the following operations:
[  1435.014] (II)         Solid
[  1435.014] (II)         Copy
[  1435.014] (II)         Composite (RENDER acceleration)
[  1435.014] (II)         UploadToScreen
[  1435.014] (II)         DownloadFromScreen
[  1435.030]
Backtrace:
[  1435.032] 0: /opt/xorg-ev56/bin/X (xorg_backtrace+0x54) [0x120070884]
[  1435.032] 1: /opt/xorg-ev56/bin/X (0x120000000+0x65608) [0x120065608]
[  1435.033] 2: /lib/libc.so.6.1 (0x20000310000+0x3d610) [0x2000034d610]
[  1435.034] 3: /opt/xorg-ev56/lib/xorg/modules/drivers/radeon_drv.so (0x20000758000+0x15b890) [0x200008b3890]
[  1435.034] 4: /opt/xorg-ev56/lib/xorg/modules/drivers/radeon_drv.so (0x20000758000+0x1392a0) [0x200008912a0]
[  1435.034] 5: /opt/xorg-ev56/lib/xorg/modules/drivers/radeon_drv.so (0x20000758000+0x139bec) [0x20000891bec]
[  1435.034] 6: /opt/xorg-ev56/lib/xorg/modules/drivers/radeon_drv.so (0x20000758000+0x4f088) [0x200007a7088]
[  1435.035] 7: /opt/xorg-ev56/lib/xorg/modules/drivers/radeon_drv.so (0x20000758000+0x16f0f8) [0x200008c70f8]
[  1435.035] 8: /opt/xorg-ev56/bin/X (AddScreen+0x1c0) [0x1200532b0]
[  1435.036] 9: /opt/xorg-ev56/bin/X (InitOutput+0x29c) [0x12008c6ec]
[  1435.036] 10: /opt/xorg-ev56/bin/X (0x120000000+0x24b48) [0x120024b48]
[  1435.037] 11: /lib/libc.so.6.1 (__libc_start_main+0xec) [0x2000033267c]
[  1435.037] 12: /opt/xorg-ev56/bin/X (__start+0x38) [0x120024788]
[  1435.038] Bus error at address 0x20000030000

And nothing in dmesg.  Now I'm not triggering the nasty page alloc errors.

Cheers
Michael.

* Re: Problems with alpha/pci + radeon/ttm
  2010-06-27 10:46         ` Michael Cree
@ 2010-06-27 23:14           ` Dave Airlie
  2010-06-28  9:03             ` Michael Cree
  0 siblings, 1 reply; 14+ messages in thread
From: Dave Airlie @ 2010-06-27 23:14 UTC (permalink / raw)
  To: Michael Cree
  Cc: FUJITA Tomonori, mattst88, linux-kernel, linux-alpha, rth, ink,
	jbarnes, linux-pci, dri-devel, alexdeucher, jglisse

On Sun, Jun 27, 2010 at 8:46 PM, Michael Cree <mcree@orcon.net.nz> wrote:
> On 27/06/10 16:20, FUJITA Tomonori wrote:
>>
>> On Thu, 24 Jun 2010 21:51:40 +1200
>> Michael Cree<mcree@orcon.net.nz>  wrote:
>>
>>>>> Is this a regression (what kernel version worked)?
>>>>>
>>>>> Seems that the IOMMU can't find 128 pages. It's likely due to:
>>>>>
>>>>> - out of the IOMMU space (possibly someone doesn't free the IOMMU
>>>>>   space).
>>>>>
>>>>> or
>>>>>
>>>>> - the mapping parameters (such as align) aren't appropriate so the
>>>>>   IOMMU can't find space.
>>>>
>>>> I don't think KMS drivers have ever worked on alpha so its not a
>>>> regression, they are working fine on x86 + powerpc and sparc has been
>>>> run at least once.
>>>
>>> KMS on the console boot up has worked since about 2.6.32, but starting
>>> up the X server has always failed and, in my case, the system becomes
>>> unstable and eventually OOPs.
>>>
>>>> I suspect we are simply hitting the limits of the iommu, how big an
>>>> address space does it handle? since generally graphics drivers try to
>>>> bind a lot of things to the GART.
>>>
>>> No idea on the address space limit.  I applied the patch of Fujita that
>>> logs all IOMMU allocations, and also inserted some extra printks in the
>>> ttm kernel code so that I could see which routines failed and the error
>>> code returned.  Running the radeon test on boot exhibits the following:
>>>
>>> [  238.712768] [drm] Tested GTT->VRAM and VRAM->GTT copy for GTT offset
>>> 0x1a312000
>>> [  239.281127] [drm] Tested GTT->VRAM and VRAM->GTT copy for GTT offset
>>> 0x1a412000
>>> [  239.281127] ttm_tt_bind belched -12
>>> [  239.282104] ttm_bo_handle_move_mem belched -12
>>> [  239.282104] ttm_bo_move_buffer belched -12
>>> [  239.282104] ttm_bo_validate belched -12
>>> [  239.282104] radeon 0000:01:00.0: object_init failed for (1048576,
>>> 0x00000002) err=-12
>>> [  239.282104] [drm:radeon_test_moves] *ERROR* Failed to create GTT
>>> object 419
>>> [  239.399291] Error while testing BO move.
>>>
>>> Note that no IOMMU allocations are printed while radeon_test_moves is
>>> running so iommu_arena_alloc doesn't appear to be called.  Also the
>>> error code returned up to radeon_test_moves is -12 which is ENOMEM.  So
>>> does appear to be some memory limit.
>>
>> Hmm, not related with IOMMU? looks like ttm_tt_populate could return
>> ENOMEM too. Can we locate where we hit ENOMEM first?
>
> Yeah, in ttm_mem_global_reserve while it is walking glob->zones:
>
> [  239.303588] [drm] Tested GTT->VRAM and VRAM->GTT copy for GTT offset
> 0x1a412000
> [  239.304564] ttm_mem_global_reserve zone used_mem (0x1a5f0000) exceeds
> limit (0x1a5ef000)
> [  239.304564] ttm_mem_global_reserve zone used_mem (0x1a5f0000) exceeds
> limit (0x1a5ef000)
> [  239.304564] ttm_mem_global_reserve zone used_mem (0x1a5f0000) exceeds
> limit (0x1a5ef000)
> [  239.304564] ttm_mem_global_reserve zone used_mem (0x1a5f0000) exceeds
> limit (0x1a5ef000)
> [  239.304564] ttm_mem_global_reserve zone used_mem (0x1a5f0000) exceeds
> limit (0x1a5ef000)
> [  239.304564] ttm_mem_global_reserve return non-zero count decs to zero
> [  239.304564] ttm_mem_global_alloc_page belched -12
> [  239.304564] __ttm_tt_get_page coughed NULL
> [  239.304564] ttm_tt_populate belched -12
> [  239.304564] ttm_tt_bind belched -12
> [  239.304564] ttm_bo_handle_move_mem belched -12
> [  239.304564] ttm_bo_move_buffer belched -12
> [  239.304564] ttm_bo_validate belched -12
>
> On a hunch that we are chasing a red herring I installed another 256MB of
> memory into the machine (was 576MB for the test reported above) for a total
> of 832MB.
>
> Now radeon_test_moves runs to completion without error.
>
> OK, now a test of starting up the X server - ah, a bus error again but now
> it looks like it's in the radeon driver:
>
> [  1435.014] (II) EXA(0): Driver allocated offscreen pixmaps
> [  1435.014] (II) EXA(0): Driver registered support for the following
> operations:
> [  1435.014] (II)         Solid
> [  1435.014] (II)         Copy
> [  1435.014] (II)         Composite (RENDER acceleration)
> [  1435.014] (II)         UploadToScreen
> [  1435.014] (II)         DownloadFromScreen
> [  1435.030]
> Backtrace:
> [  1435.032] 0: /opt/xorg-ev56/bin/X (xorg_backtrace+0x54) [0x120070884]
> [  1435.032] 1: /opt/xorg-ev56/bin/X (0x120000000+0x65608) [0x120065608]
> [  1435.033] 2: /lib/libc.so.6.1 (0x20000310000+0x3d610) [0x2000034d610]
> [  1435.034] 3: /opt/xorg-ev56/lib/xorg/modules/drivers/radeon_drv.so
> (0x20000758000+0x15b890) [0x200008b3890]
> [  1435.034] 4: /opt/xorg-ev56/lib/xorg/modules/drivers/radeon_drv.so
> (0x20000758000+0x1392a0) [0x200008912a0]
> [  1435.034] 5: /opt/xorg-ev56/lib/xorg/modules/drivers/radeon_drv.so
> (0x20000758000+0x139bec) [0x20000891bec]
> [  1435.034] 6: /opt/xorg-ev56/lib/xorg/modules/drivers/radeon_drv.so
> (0x20000758000+0x4f088) [0x200007a7088]
> [  1435.035] 7: /opt/xorg-ev56/lib/xorg/modules/drivers/radeon_drv.so
> (0x20000758000+0x16f0f8) [0x200008c70f8]
> [  1435.035] 8: /opt/xorg-ev56/bin/X (AddScreen+0x1c0) [0x1200532b0]
> [  1435.036] 9: /opt/xorg-ev56/bin/X (InitOutput+0x29c) [0x12008c6ec]
> [  1435.036] 10: /opt/xorg-ev56/bin/X (0x120000000+0x24b48) [0x120024b48]
> [  1435.037] 11: /lib/libc.so.6.1 (__libc_start_main+0xec) [0x2000033267c]
> [  1435.037] 12: /opt/xorg-ev56/bin/X (__start+0x38) [0x120024788]
> [  1435.038] Bus error at address 0x20000030000
>
> And nothing in dmesg.  Now I'm not triggering the nasty page alloc errors.

The bus error is caused by the kernel; it's something Alpha-specific in
how mmap works. I'm not sure whether Alpha needs some special mmap flags
or something similar.

Dave.

* Re: Problems with alpha/pci + radeon/ttm
  2010-06-27 23:14           ` Dave Airlie
@ 2010-06-28  9:03             ` Michael Cree
  2010-06-28 16:08               ` Richard Henderson
  0 siblings, 1 reply; 14+ messages in thread
From: Michael Cree @ 2010-06-28  9:03 UTC (permalink / raw)
  To: Dave Airlie
  Cc: FUJITA Tomonori, mattst88, linux-kernel, linux-alpha, rth, ink,
	jbarnes, linux-pci, dri-devel, alexdeucher, jglisse

On 28/06/10 11:14, Dave Airlie wrote:
> The bus error is caused by the kernel, its something alpha specific
> with how mmap works,
> I'm not sure if alpha needs some special mmap flags or something,

Neither am I.  All I know is that Alpha reorders CPU instructions more 
aggressively than most other architectures, the page map size is 8kB, 
and memory accesses must be aligned to the datum size.

Maybe Ivan or Richard can comment on any relevant Alpha mmap specific 
issues.

BTW, I discovered a couple of weeks ago that DRI is broken under UMS. 
It was working a year or so ago so something has happened to it.  Am I 
correct in thinking that the DRM code has pretty much been shifted into 
the kernel even for UMS?

On the Alpha I have been testing on (PWS600au, EV56 cpu and a radeon 
RV710 graphics card) running glxgears under UMS displays artefacts in 
rendering the gears, that is, some facets are not clipped to the 
rotating gear but extend to the edge of the window.  On another Alpha 
(XP1000, EV67 cpu and a radeon RV610 card) it locked up completely 
(couldn't even ping it) when I ran glxgears.  They are both running 
Debian unstable.

Cheers
Michael.

* Re: Problems with alpha/pci + radeon/ttm
  2010-06-28  9:03             ` Michael Cree
@ 2010-06-28 16:08               ` Richard Henderson
  0 siblings, 0 replies; 14+ messages in thread
From: Richard Henderson @ 2010-06-28 16:08 UTC (permalink / raw)
  To: Michael Cree
  Cc: Dave Airlie, FUJITA Tomonori, mattst88, linux-kernel,
	linux-alpha, ink, jbarnes, linux-pci, dri-devel, alexdeucher,
	jglisse

On 06/28/2010 02:03 AM, Michael Cree wrote:
> On 28/06/10 11:14, Dave Airlie wrote:
>> The bus error is caused by the kernel, its something alpha specific
>> with how mmap works,
>> I'm not sure if alpha needs some special mmap flags or something,
> 
> Neither am I.  All I know is that Alpha reorders CPU instructions more
> aggressively than most other architectures, the page map size is 8kB,
> and memory accesses must be aligned to the datum size.

There are no special mmap flags on alpha.  The non-cacheable property
is a function of the physical address (e.g. bit 40 set for ev5), and
this has already been taken care of by the kernel.
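
To make the "function of the physical address" point concrete, a minimal
sketch of the bit test follows. The helper and macro names are invented
for illustration, and bit 40 is only the ev5 example given above; other
CPU generations may differ, and this is not code from the kernel.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Bit position from the ev5 example above; the macro name is illustrative. */
#define DEMO_EV5_NONCACHE_BIT 40

static_assert(DEMO_EV5_NONCACHE_BIT < 64, "bit must fit in a 64-bit address");

/* True if the given physical address has the non-cacheable bit set. */
static bool demo_ev5_is_noncacheable(uint64_t phys_addr)
{
    return ((phys_addr >> DEMO_EV5_NONCACHE_BIT) & 1u) != 0;
}
```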


r~

* Re: Problems with alpha/pci + radeon/ttm
  2010-06-27  4:58       ` Matt Turner
@ 2010-06-30 18:43         ` Konrad Rzeszutek Wilk
  0 siblings, 0 replies; 14+ messages in thread
From: Konrad Rzeszutek Wilk @ 2010-06-30 18:43 UTC (permalink / raw)
  To: Matt Turner
  Cc: FUJITA Tomonori, linux-kernel, linux-alpha, rth, ink, mcree,
	jbarnes, linux-pci, dri-devel, airlied, alexdeucher, jglisse

On Sun, Jun 27, 2010 at 12:58:07AM -0400, Matt Turner wrote:
> On Sun, Jun 27, 2010 at 12:20 AM, FUJITA Tomonori
> <fujita.tomonori@lab.ntt.co.jp> wrote:
> > On Thu, 24 Jun 2010 10:53:52 -0400
> > Matt Turner <mattst88@gmail.com> wrote:
> >
> >> > Seems that the IOMMU can't find 128 pages. It's likely due to:
> >> >
> >> > - out of the IOMMU space (possibly someone doesn't free the IOMMU
> >> >  space).
> >> >
> >> > or
> >> >
> >> > - the mapping parameters (such as align) aren't appropriate so the
> >> >  IOMMU can't find space.
> >> >
> >> >
> >> >> Is this the cause of the bug we're seeing in the report [1]?

For fun I did:
 cat /tmp/screenlog.0 | grep iommu_arena | wc
  28509  199588 2167164

Which is to say 28509 * 4096 = 116772864 bytes. That looks to be about 111MB.
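
The estimate can be checked with a few lines of C. This is only a sketch:
it assumes each logged iommu_arena line stands for exactly one page, and
the helper names are made up. The 4096-byte page size is the one used in
the estimate; 8192 is included because the Alpha page size was given as
8kB elsewhere in this thread.

```c
#include <assert.h>
#include <stdint.h>

/* Total bytes covered by `pages` pages of `page_size` bytes each. */
static uint64_t pages_to_bytes(uint64_t pages, uint64_t page_size)
{
    assert(page_size > 0);
    return pages * page_size;
}

/* Convert bytes to whole MiB, rounding down. */
static uint64_t bytes_to_mib(uint64_t bytes)
{
    return bytes / (1024u * 1024u);
}
```

For 28509 pages, `bytes_to_mib(pages_to_bytes(28509, 4096))` gives 111 and
`bytes_to_mib(pages_to_bytes(28509, 8192))` gives 222, so the page size
assumption changes the total by a factor of two.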

That does not look right when you are trying to allocate 128 pages. Are
you sure it is 128 pages? Can you make the ttm_ code print out the total
number of pages it is trying to PCI map? It might also be interesting to
see how many pages pci_iommu.c has already set aside for other devices.

The BAR is 128MB, but that would have been an ioremap call - and done
much earlier I think.

Thread overview: 14+ messages
2010-06-21 21:19 Problems with alpha/pci + radeon/ttm Matt Turner
2010-06-22  5:59 ` FUJITA Tomonori
2010-06-22  8:32   ` Dave Airlie
2010-06-24  9:51     ` Michael Cree
2010-06-24 15:02       ` Matt Turner
2010-06-27  4:20       ` FUJITA Tomonori
2010-06-27 10:46         ` Michael Cree
2010-06-27 23:14           ` Dave Airlie
2010-06-28  9:03             ` Michael Cree
2010-06-28 16:08               ` Richard Henderson
2010-06-24 14:53   ` Matt Turner
2010-06-27  4:20     ` FUJITA Tomonori
2010-06-27  4:58       ` Matt Turner
2010-06-30 18:43         ` Konrad Rzeszutek Wilk
