From: "Christian König" <christian.koenig@amd.com>
To: Mikhail Gavrilov <mikhail.v.gavrilov@gmail.com>
Cc: "Deucher, Alexander" <alexander.deucher@amd.com>,
	dri-devel <dri-devel@lists.freedesktop.org>,
	amd-gfx list <amd-gfx@lists.freedesktop.org>,
	Linux List Kernel Mailing <linux-kernel@vger.kernel.org>
Subject: Re: [drm:dm_plane_helper_prepare_fb [amdgpu]] *ERROR* Failed to pin framebuffer with error -12
Date: Mon, 11 Jan 2021 21:45:46 +0100
Message-ID: <77b696b9-3248-d329-4f7d-5e27a21eabff@amd.com>
In-Reply-To: <CABXGCsM8yYNz7gQW26a4hHwBR+MunXoopHEiyDJdC-muNrRxkQ@mail.gmail.com>

Hi Mike,

On 11.01.21 at 20:23, Mikhail Gavrilov wrote:
> On Mon, 11 Jan 2021 at 19:01, Christian König <christian.koenig@amd.com> wrote:
>
>> Changing the page table attributes while releasing memory might sleep.
>> So we can't use a spinlock here.
>>
>> Thanks for the report, a patch to fix this is on the mailing list now.
> Can you also look at the first trace?

Unfortunately not, that's DC stuff. The easiest option is to file it in
the bug tracker and assign it to our DC team.

> It has the same error message "sleeping function called from invalid
> context" and a lot of [amdgpu] code.

[SNIP]

>>> -12 is just -ENOMEM. Looks like a memory leak to me, maybe caused by
>>> the problem above, maybe something completely unrelated.
>>>
>>> I will take a look.
>> That looks like a completely unrelated memory leak to me.
>>
>> Probably best if you open up a bug report for this.
> Yes, the monitor still turns off after applying the patch "make the
> pool shrinker lock a mutex".
> Anyway, the patch fixed the flood of "BUG: sleeping function called
> from invalid context at mm/vmalloc.c:1756" messages, so the kernel
> log became cleaner.

At least some progress. Any objections to me adding your e-mail address
in a Tested-by tag?

> Now the issue with the monitor turning off looks like this in the logs:
>
> DMA-API: cacheline tracking ENOMEM, dma-debug disabled
> amdgpu 0000:0b:00.0: amdgpu: 000000006b791523 pin failed
> [drm:dm_plane_helper_prepare_fb [amdgpu]] *ERROR* Failed to pin
> framebuffer with error -12
> BUG: kernel NULL pointer dereference, address: 0000000000000060
> #PF: supervisor read access in kernel mode
> #PF: error_code(0x0000) - not-present page
> PGD 0 P4D 0
> Oops: 0000 [#1] SMP NOPTI
> CPU: 20 PID: 3780 Comm: brave:cs0 Tainted: G        W        ---------
> ---  5.11.0-0.rc2.20210108gitf5e6c330254a.120.fc34.x86_64 #1
> Hardware name: System manufacturer System Product Name/ROG STRIX
> X570-I GAMING, BIOS 2802 10/21/2020
> RIP: 0010:ttm_tt_swapin+0x34/0x1b0 [ttm]
> Code: 55 41 54 55 53 48 83 ec 10 48 8b 47 20 48 89 44 24 08 48 85 c0
> 0f 84 86 01 00 00 48 8b 44 24 08 49 89 fc 4c 8b a8 e0 01 00 00 <41> 8b
> 45 60 89 44 24 04 8b 47 0c 85 c0 0f 84 df 00 00 00 31 db 65
> RSP: 0018:ffffa7400532b9c0 EFLAGS: 00010286
> RAX: ffff978e2ae25800 RBX: ffff97910ec12058 RCX: ffff978e12caac70
> RDX: 0000000080000010 RSI: 0000000000000000 RDI: ffff97912c3d99c0
> RBP: ffff97912c3d99c0 R08: 0000000000000000 R09: 0000000070b3a000
> R10: 0000000000000002 R11: 0000000000000000 R12: ffff97912c3d99c0
> R13: 0000000000000000 R14: ffffa7400532ba90 R15: ffff978e182c6350
> FS:  00007f070bb1b640(0000) GS:ffff979509200000(0000) knlGS:0000000000000000
> CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: 0000000000000060 CR3: 00000001f0cd2000 CR4: 0000000000350ee0
> Call Trace:
>   ttm_tt_populate+0xa9/0xe0 [ttm]
>   ttm_bo_handle_move_mem+0x142/0x180 [ttm]
>   ttm_bo_validate+0x12e/0x1c0 [ttm]

I can take a look at this one here. Looks like some missing error 
handling when allocating memory.

Can you decode which line number ttm_tt_swapin+0x34 points to?
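
One way to do that is the scripts/faddr2line helper from the kernel
tree. The following is only a sketch: it assumes you run it from the
top of a source/build tree matching the running kernel, that the module
was built with debug info, and that ttm.ko sits at the path shown (on a
distribution kernel the debuginfo package puts it somewhere else).

```shell
# Assumed location of the ttm module inside the kernel build tree.
module=drivers/gpu/drm/ttm/ttm.ko
# Symbol+offset/size exactly as printed on the RIP line of the oops.
location='ttm_tt_swapin+0x34/0x1b0'

if [ -x ./scripts/faddr2line ] && [ -f "$module" ]; then
    # faddr2line takes an object file followed by func+offset/size entries
    # and prints the matching source file and line.
    ./scripts/faddr2line "$module" "$location"
else
    echo "run this from a kernel tree that has scripts/faddr2line and $module" >&2
fi
```

The reported file and line should show which pointer is NULL at that
offset.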

[SNIP]

> You said that I need to open a bug report; do you mean at
> https://bugzilla.kernel.org/ ?
> I thought the mailing lists were better, because bug reports on
> bugzilla.kernel.org are usually left open for several years without
> attention.

Please use this one here: 
https://gitlab.freedesktop.org/drm/amd/-/issues/new

If you can't find the DC guys offhand in the assignee list, just assign
it to me and I will forward it.

But what you have in your logs so far are only unrelated symptoms; the
root of the problem is that somebody is leaking memory.

What you could do as well is enable kmemleak, and maybe try some
bleeding edge branch like drm-misc-fixes or Alex's
amd-staging-drm-next branch.
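
For kmemleak, a rough sketch of the usual procedure looks like this. It
assumes a kernel built with CONFIG_DEBUG_KMEMLEAK=y, debugfs mounted at
/sys/kernel/debug, and a root shell; none of that is verified for your
setup.

```shell
# Control file exposed by kmemleak through debugfs.
kmemleak=/sys/kernel/debug/kmemleak

if [ -w "$kmemleak" ]; then
    # Trigger an immediate memory scan instead of waiting for the
    # periodic one.
    echo scan > "$kmemleak"
    # Print the suspected leaks together with their allocation backtraces.
    cat "$kmemleak"
else
    echo "kmemleak unavailable: needs CONFIG_DEBUG_KMEMLEAK=y, debugfs and root" >&2
fi
```

Running it again after the monitor has turned off should show which
allocations are piling up.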

Thanks for the help,
Christian.
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel
