linux-kernel.vger.kernel.org archive mirror
From: Michael Cree <mcree@orcon.net.nz>
To: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Cc: airlied@gmail.com, mattst88@gmail.com,
	linux-kernel@vger.kernel.org, linux-alpha@vger.kernel.org,
	rth@twiddle.net, ink@jurassic.park.msu.ru,
	jbarnes@virtuousgeek.org, linux-pci@vger.kernel.org,
	dri-devel@lists.freedesktop.org, alexdeucher@gmail.com,
	jglisse@redhat.com
Subject: Re: Problems with alpha/pci + radeon/ttm
Date: Sun, 27 Jun 2010 22:46:52 +1200
Message-ID: <4C272C1C.9000802@orcon.net.nz>
In-Reply-To: <20100627131836T.fujita.tomonori@lab.ntt.co.jp>

On 27/06/10 16:20, FUJITA Tomonori wrote:
> On Thu, 24 Jun 2010 21:51:40 +1200
> Michael Cree<mcree@orcon.net.nz>  wrote:
>
>>>> Is this a regression (what kernel version worked)?
>>>>
>>>> Seems that the IOMMU can't find 128 pages. It's likely due to:
>>>>
>>>> - out of the IOMMU space (possibly someone doesn't free the IOMMU
>>>>    space).
>>>>
>>>> or
>>>>
>>>> - the mapping parameters (such as align) aren't appropriate so the
>>>>    IOMMU can't find space.
>>>
>>> I don't think KMS drivers have ever worked on alpha so its not a
>>> regression, they are working fine on x86 + powerpc and sparc has been
>>> run at least once.
>>
>> KMS on the console boot up has worked since about 2.6.32, but starting
>> up the X server has always failed and, in my case, the system becomes
>> unstable and eventually OOPs.
>>
>>> I suspect we are simply hitting the limits of the iommu, how big an
>>> address space does it handle? since generally graphics drivers try to
>>> bind a lot of things to the GART.
>>
>> No idea on the address space limit.  I applied the patch of Fujita that
>> logs all IOMMU allocations, and also inserted some extra printks in the
>> ttm kernel code so that I could see which routines failed and the error
>> code returned.  Running the radeon test on boot exhibits the following:
>>
>> [  238.712768] [drm] Tested GTT->VRAM and VRAM->GTT copy for GTT offset 0x1a312000
>> [  239.281127] [drm] Tested GTT->VRAM and VRAM->GTT copy for GTT offset 0x1a412000
>> [  239.281127] ttm_tt_bind belched -12
>> [  239.282104] ttm_bo_handle_move_mem belched -12
>> [  239.282104] ttm_bo_move_buffer belched -12
>> [  239.282104] ttm_bo_validate belched -12
>> [  239.282104] radeon 0000:01:00.0: object_init failed for (1048576, 0x00000002) err=-12
>> [  239.282104] [drm:radeon_test_moves] *ERROR* Failed to create GTT object 419
>> [  239.399291] Error while testing BO move.
>> [  239.399291] Error while testing BO move.
>>
>> Note that no IOMMU allocations are printed while radeon_test_moves is
>> running so iommu_arena_alloc doesn't appear to be called.  Also the
>> error code returned up to radeon_test_moves is -12 which is ENOMEM.  So
>> does appear to be some memory limit.
>
> Hmm, not related with IOMMU? looks like ttm_tt_populate could return
> ENOMEM too. Can we locate where we hit ENOMEM first?

Yeah, in ttm_mem_global_reserve while it is walking glob->zones:

[  239.303588] [drm] Tested GTT->VRAM and VRAM->GTT copy for GTT offset 0x1a412000
[  239.304564] ttm_mem_global_reserve zone used_mem (0x1a5f0000) exceeds limit (0x1a5ef000)
[  239.304564] ttm_mem_global_reserve zone used_mem (0x1a5f0000) exceeds limit (0x1a5ef000)
[  239.304564] ttm_mem_global_reserve zone used_mem (0x1a5f0000) exceeds limit (0x1a5ef000)
[  239.304564] ttm_mem_global_reserve zone used_mem (0x1a5f0000) exceeds limit (0x1a5ef000)
[  239.304564] ttm_mem_global_reserve zone used_mem (0x1a5f0000) exceeds limit (0x1a5ef000)
[  239.304564] ttm_mem_global_reserve return non-zero count decs to zero
[  239.304564] ttm_mem_global_alloc_page belched -12
[  239.304564] __ttm_tt_get_page coughed NULL
[  239.304564] ttm_tt_populate belched -12
[  239.304564] ttm_tt_bind belched -12
[  239.304564] ttm_bo_handle_move_mem belched -12
[  239.304564] ttm_bo_move_buffer belched -12
[  239.304564] ttm_bo_validate belched -12
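
For anyone following along, the failure pattern in the log above (five identical "exceeds limit" lines, then -12) can be modelled roughly as below. This is only a Python sketch of the accounting idea, not the kernel's C code: the Zone class, reserve(), alloc_with_retries(), the retry count of 5 and the 8192-byte request (assuming alpha's 8KB pages) are all my own illustrative assumptions. Only the two hex values are taken from the log.

```python
import errno

# Values copied from the log above.
USED_MEM = 0x1A5F0000
LIMIT = 0x1A5EF000


class Zone:
    """Hypothetical stand-in for one TTM accounting zone."""

    def __init__(self, used_mem, limit):
        self.used_mem = used_mem
        self.limit = limit


def reserve(zones, amount):
    """Account `amount` bytes in every zone, or fail with -ENOMEM.

    Simplified model: refuse if any zone is already over its limit.
    """
    if any(z.used_mem > z.limit for z in zones):
        return -errno.ENOMEM
    for z in zones:
        z.used_mem += amount
    return 0


def alloc_with_retries(zones, amount, retries, shrink):
    """Retry `reserve`, calling `shrink` between attempts (assumed shape)."""
    for _ in range(retries):
        if reserve(zones, amount) == 0:
            return 0
        shrink(zones)
    return -errno.ENOMEM


zones = [Zone(USED_MEM, LIMIT)]
# A shrink step that frees nothing, as when nothing can be swapped out.
result = alloc_with_retries(zones, 8192, retries=5, shrink=lambda zs: None)
print(result, hex(USED_MEM - LIMIT))
```

In this model the zone is over its limit by only 0x1000 bytes, so every retry fails the same way unless the shrink step actually releases memory, which would fit the picture of a machine that is simply short of RAM for this test.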

On a hunch that we are chasing a red herring, I installed another 256MB
of memory in the machine (it had 576MB for the test reported above), for
a total of 832MB.

Now radeon_test_moves runs to completion without error.

OK, now a test of starting up the X server - ah, a bus error again, but
now it looks like it's in the radeon driver:

[  1435.014] (II) EXA(0): Driver allocated offscreen pixmaps
[  1435.014] (II) EXA(0): Driver registered support for the following operations:
[  1435.014] (II)         Solid
[  1435.014] (II)         Copy
[  1435.014] (II)         Composite (RENDER acceleration)
[  1435.014] (II)         UploadToScreen
[  1435.014] (II)         DownloadFromScreen
[  1435.030]
Backtrace:
[  1435.032] 0: /opt/xorg-ev56/bin/X (xorg_backtrace+0x54) [0x120070884]
[  1435.032] 1: /opt/xorg-ev56/bin/X (0x120000000+0x65608) [0x120065608]
[  1435.033] 2: /lib/libc.so.6.1 (0x20000310000+0x3d610) [0x2000034d610]
[  1435.034] 3: /opt/xorg-ev56/lib/xorg/modules/drivers/radeon_drv.so (0x20000758000+0x15b890) [0x200008b3890]
[  1435.034] 4: /opt/xorg-ev56/lib/xorg/modules/drivers/radeon_drv.so (0x20000758000+0x1392a0) [0x200008912a0]
[  1435.034] 5: /opt/xorg-ev56/lib/xorg/modules/drivers/radeon_drv.so (0x20000758000+0x139bec) [0x20000891bec]
[  1435.034] 6: /opt/xorg-ev56/lib/xorg/modules/drivers/radeon_drv.so (0x20000758000+0x4f088) [0x200007a7088]
[  1435.035] 7: /opt/xorg-ev56/lib/xorg/modules/drivers/radeon_drv.so (0x20000758000+0x16f0f8) [0x200008c70f8]
[  1435.035] 8: /opt/xorg-ev56/bin/X (AddScreen+0x1c0) [0x1200532b0]
[  1435.036] 9: /opt/xorg-ev56/bin/X (InitOutput+0x29c) [0x12008c6ec]
[  1435.036] 10: /opt/xorg-ev56/bin/X (0x120000000+0x24b48) [0x120024b48]
[  1435.037] 11: /lib/libc.so.6.1 (__libc_start_main+0xec) [0x2000033267c]
[  1435.037] 12: /opt/xorg-ev56/bin/X (__start+0x38) [0x120024788]
[  1435.038] Bus error at address 0x20000030000
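
The radeon_drv.so frames above carry no symbol names, only a load base plus an offset. A small Python sketch like the following can pull those offsets out of the Xorg log so they can be looked up in the driver binary (for example with addr2line -e on a radeon_drv.so built with debug info). The FRAME pattern and the module_offsets() helper are my own illustrative names, and the sample text is just two frames copied from the backtrace.

```python
import re

# Matches frames of the form "N: /path/to/object (0xBASE+0xOFFSET) [0xADDR]".
# Frames that already show a symbol, e.g. (AddScreen+0x1c0), are skipped.
FRAME = re.compile(
    r"(\d+): (\S+)\s+\((0x[0-9a-f]+)\+(0x[0-9a-f]+)\)\s+\[(0x[0-9a-f]+)\]"
)


def module_offsets(log_text):
    """Return (frame, path, offset, consistent) for each unsymbolised frame.

    `consistent` is True when base + offset equals the logged address.
    """
    frames = []
    for idx, path, base, off, addr in FRAME.findall(log_text):
        base, off, addr = int(base, 16), int(off, 16), int(addr, 16)
        frames.append((int(idx), path, off, base + off == addr))
    return frames


sample = """\
[  1435.034] 3: /opt/xorg-ev56/lib/xorg/modules/drivers/radeon_drv.so
(0x20000758000+0x15b890) [0x200008b3890]
[  1435.034] 6: /opt/xorg-ev56/lib/xorg/modules/drivers/radeon_drv.so
(0x20000758000+0x4f088) [0x200007a7088]
"""

for idx, path, off, ok in module_offsets(sample):
    print(idx, path, hex(off), ok)
```

The pattern tolerates a line break between the path and the parenthesised address, since the log is often wrapped by mail clients. Whether the offsets resolve to useful names depends on how the driver was built.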

And nothing in dmesg.  At least I'm no longer triggering the nasty page alloc errors.

Cheers
Michael.
