linux-kernel.vger.kernel.org archive mirror
From: Dave Airlie <airlied@gmail.com>
To: Michael Cree <mcree@orcon.net.nz>
Cc: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>,
	mattst88@gmail.com, linux-kernel@vger.kernel.org,
	linux-alpha@vger.kernel.org, rth@twiddle.net,
	ink@jurassic.park.msu.ru, jbarnes@virtuousgeek.org,
	linux-pci@vger.kernel.org, dri-devel@lists.freedesktop.org,
	alexdeucher@gmail.com, jglisse@redhat.com
Subject: Re: Problems with alpha/pci + radeon/ttm
Date: Mon, 28 Jun 2010 09:14:29 +1000
Message-ID: <AANLkTinrscp_Vqh69dbH5t5yqNiR4Mgl7T2JRrmWwI4a@mail.gmail.com>
In-Reply-To: <4C272C1C.9000802@orcon.net.nz>

On Sun, Jun 27, 2010 at 8:46 PM, Michael Cree <mcree@orcon.net.nz> wrote:
> On 27/06/10 16:20, FUJITA Tomonori wrote:
>>
>> On Thu, 24 Jun 2010 21:51:40 +1200
>> Michael Cree <mcree@orcon.net.nz> wrote:
>>
>>>>> Is this a regression (what kernel version worked)?
>>>>>
>>>>> Seems that the IOMMU can't find 128 pages. It's likely due to:
>>>>>
>>>>> - out of the IOMMU space (possibly someone doesn't free the IOMMU
>>>>>   space).
>>>>>
>>>>> or
>>>>>
>>>>> - the mapping parameters (such as align) aren't appropriate so the
>>>>>   IOMMU can't find space.
>>>>
>>>> I don't think KMS drivers have ever worked on alpha, so it's not a
>>>> regression. They are working fine on x86 + powerpc, and sparc has been
>>>> run at least once.
>>>
>>> KMS on the console at boot has worked since about 2.6.32, but starting
>>> up the X server has always failed and, in my case, the system becomes
>>> unstable and eventually oopses.
>>>
>>>> I suspect we are simply hitting the limits of the IOMMU. How big an
>>>> address space does it handle? Graphics drivers generally try to
>>>> bind a lot of things to the GART.
>>>
>>> No idea on the address space limit.  I applied the patch of Fujita that
>>> logs all IOMMU allocations, and also inserted some extra printks in the
>>> ttm kernel code so that I could see which routines failed and the error
>>> code returned.  Running the radeon test on boot exhibits the following:
>>>
>>> [  238.712768] [drm] Tested GTT->VRAM and VRAM->GTT copy for GTT offset 0x1a312000
>>> [  239.281127] [drm] Tested GTT->VRAM and VRAM->GTT copy for GTT offset 0x1a412000
>>> [  239.281127] ttm_tt_bind belched -12
>>> [  239.282104] ttm_bo_handle_move_mem belched -12
>>> [  239.282104] ttm_bo_move_buffer belched -12
>>> [  239.282104] ttm_bo_validate belched -12
>>> [  239.282104] radeon 0000:01:00.0: object_init failed for (1048576, 0x00000002) err=-12
>>> [  239.282104] [drm:radeon_test_moves] *ERROR* Failed to create GTT object 419
>>> [  239.399291] Error while testing BO move.
>>>
>>> Note that no IOMMU allocations are printed while radeon_test_moves is
>>> running, so iommu_arena_alloc doesn't appear to be called.  Also, the
>>> error code returned up to radeon_test_moves is -12, which is ENOMEM.  So
>>> there does appear to be some memory limit.
>>
>> Hmm, so it's not related to the IOMMU? It looks like ttm_tt_populate could
>> return ENOMEM too. Can we locate where we hit ENOMEM first?
>
> Yeah, in ttm_mem_global_reserve while it is walking glob->zones:
>
> [  239.303588] [drm] Tested GTT->VRAM and VRAM->GTT copy for GTT offset 0x1a412000
> [  239.304564] ttm_mem_global_reserve zone used_mem (0x1a5f0000) exceeds limit (0x1a5ef000)
> [  239.304564] ttm_mem_global_reserve zone used_mem (0x1a5f0000) exceeds limit (0x1a5ef000)
> [  239.304564] ttm_mem_global_reserve zone used_mem (0x1a5f0000) exceeds limit (0x1a5ef000)
> [  239.304564] ttm_mem_global_reserve zone used_mem (0x1a5f0000) exceeds limit (0x1a5ef000)
> [  239.304564] ttm_mem_global_reserve zone used_mem (0x1a5f0000) exceeds limit (0x1a5ef000)
> [  239.304564] ttm_mem_global_reserve return non-zero count decs to zero
> [  239.304564] ttm_mem_global_alloc_page belched -12
> [  239.304564] __ttm_tt_get_page coughed NULL
> [  239.304564] ttm_tt_populate belched -12
> [  239.304564] ttm_tt_bind belched -12
> [  239.304564] ttm_bo_handle_move_mem belched -12
> [  239.304564] ttm_bo_move_buffer belched -12
> [  239.304564] ttm_bo_validate belched -12
>
> On a hunch that we are chasing a red herring I installed another 256MB of
> memory into the machine (was 576MB for the test reported above) for a total
> of 832MB.
>
> Now radeon_test_moves runs to completion without error.
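
The failure path in the quoted log can be modelled roughly as follows. This is a minimal sketch, not the actual TTM code: the function names and the retry count of 4 are assumptions, the latter picked only because it yields the five "exceeds limit" lines seen before "count decs to zero". The two hex values are copied from the log.

```python
import errno

# Values copied from the quoted log lines.
USED_MEM = 0x1a5f0000
LIMIT = 0x1a5ef000


def zone_reserve_status(used_mem: int, limit: int) -> int:
    """Hypothetical simplification: 0 if the zone is within its limit, else -ENOMEM."""
    if used_mem > limit:
        return -errno.ENOMEM
    return 0


def alloc_with_retries(used_mem: int, limit: int, retries: int = 4):
    """Retry the reservation until it succeeds or the retry count reaches zero.

    `retries=4` is an assumption chosen to match the five log lines.
    Returns (status, number_of_attempts).
    """
    attempts = 0
    count = retries
    while True:
        attempts += 1
        if zone_reserve_status(used_mem, limit) == 0:
            return 0, attempts
        if count == 0:
            return -errno.ENOMEM, attempts
        count -= 1
        # Real code would try to free or swap out memory here;
        # this sketch frees nothing, so every retry fails the same way.


status, attempts = alloc_with_retries(USED_MEM, LIMIT)
overshoot = USED_MEM - LIMIT
limit_mib = round(LIMIT / (1 << 20), 1)
print(status, attempts, hex(overshoot), limit_mib)
```

With the logged values the zone is over its limit by 0x1000 bytes, and the limit itself works out to roughly 421.9 MiB, so the error is an accounting limit being hit rather than an IOMMU allocation failing.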
>
> OK, now a test of starting up the X server - ah, a bus error again but now
> it looks like it's in the radeon driver:
>
> [  1435.014] (II) EXA(0): Driver allocated offscreen pixmaps
> [  1435.014] (II) EXA(0): Driver registered support for the following operations:
> [  1435.014] (II)         Solid
> [  1435.014] (II)         Copy
> [  1435.014] (II)         Composite (RENDER acceleration)
> [  1435.014] (II)         UploadToScreen
> [  1435.014] (II)         DownloadFromScreen
> [  1435.030]
> Backtrace:
> [  1435.032] 0: /opt/xorg-ev56/bin/X (xorg_backtrace+0x54) [0x120070884]
> [  1435.032] 1: /opt/xorg-ev56/bin/X (0x120000000+0x65608) [0x120065608]
> [  1435.033] 2: /lib/libc.so.6.1 (0x20000310000+0x3d610) [0x2000034d610]
> [  1435.034] 3: /opt/xorg-ev56/lib/xorg/modules/drivers/radeon_drv.so (0x20000758000+0x15b890) [0x200008b3890]
> [  1435.034] 4: /opt/xorg-ev56/lib/xorg/modules/drivers/radeon_drv.so (0x20000758000+0x1392a0) [0x200008912a0]
> [  1435.034] 5: /opt/xorg-ev56/lib/xorg/modules/drivers/radeon_drv.so (0x20000758000+0x139bec) [0x20000891bec]
> [  1435.034] 6: /opt/xorg-ev56/lib/xorg/modules/drivers/radeon_drv.so (0x20000758000+0x4f088) [0x200007a7088]
> [  1435.035] 7: /opt/xorg-ev56/lib/xorg/modules/drivers/radeon_drv.so (0x20000758000+0x16f0f8) [0x200008c70f8]
> [  1435.035] 8: /opt/xorg-ev56/bin/X (AddScreen+0x1c0) [0x1200532b0]
> [  1435.036] 9: /opt/xorg-ev56/bin/X (InitOutput+0x29c) [0x12008c6ec]
> [  1435.036] 10: /opt/xorg-ev56/bin/X (0x120000000+0x24b48) [0x120024b48]
> [  1435.037] 11: /lib/libc.so.6.1 (__libc_start_main+0xec) [0x2000033267c]
> [  1435.037] 12: /opt/xorg-ev56/bin/X (__start+0x38) [0x120024788]
> [  1435.038] Bus error at address 0x20000030000
>
> And nothing in dmesg.  Now I'm not triggering the nasty page alloc errors.
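
The addresses in the quoted backtrace can be sanity-checked with a short script. This is only a sketch: the triples are copied from the frames that print a (base+offset) pair, and the variable names are made up for illustration. It checks that each base plus offset equals the reported address, and measures how far the faulting address sits below the lowest shared-object base that appears in the trace.

```python
# (load base, offset, reported address) copied from the quoted backtrace.
FRAMES = [
    (0x120000000, 0x65608, 0x120065608),       # X binary
    (0x20000310000, 0x3d610, 0x2000034d610),   # libc.so.6.1
    (0x20000758000, 0x15b890, 0x200008b3890),  # radeon_drv.so
    (0x20000758000, 0x1392a0, 0x200008912a0),
    (0x20000758000, 0x139bec, 0x20000891bec),
    (0x20000758000, 0x4f088, 0x200007a7088),
    (0x20000758000, 0x16f0f8, 0x200008c70f8),
    (0x120000000, 0x24b48, 0x120024b48),       # X binary
]
X_BINARY_BASE = 0x120000000
FAULT_ADDRESS = 0x20000030000  # "Bus error at address ..."


def frame_is_consistent(base: int, offset: int, address: int) -> bool:
    """True when the reported address is exactly base + offset."""
    return base + offset == address


all_consistent = all(frame_is_consistent(*frame) for frame in FRAMES)

# Lowest load base among the shared objects named in the trace.
lowest_shared_base = min(base for base, _, _ in FRAMES if base != X_BINARY_BASE)
gap_below_libs = lowest_shared_base - FAULT_ADDRESS

print(all_consistent, hex(lowest_shared_base), hex(gap_below_libs))
```

The faulting address is not one of the return addresses and lies 0x2e0000 bytes below the libc load base, which fits a bad data access through some other mapping rather than a jump into unmapped code. Which mapping that is cannot be told from the trace alone.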

The bus error is caused by the kernel; it's something alpha-specific
in how mmap works. I'm not sure whether alpha needs some special mmap
flags or something similar.

Dave.
