dri-devel.lists.freedesktop.org archive mirror
 help / color / mirror / Atom feed
From: "Christian König" <christian.koenig@amd.com>
To: "Thomas Hellström (Intel)" <thomas_os@shipmail.org>,
	"Chris Wilson" <chris@chris-wilson.co.uk>,
	"Daniel Vetter" <daniel@ffwll.ch>,
	"Lepton Wu" <ytht.net@gmail.com>
Cc: dri-devel <dri-devel@lists.freedesktop.org>
Subject: Re: [RFC] drm/vgem: Don't use get_page in fault handler.
Date: Tue, 7 Jul 2020 15:55:41 +0200	[thread overview]
Message-ID: <0feb1cee-ca75-938e-9752-c03eec18346d@amd.com> (raw)
In-Reply-To: <a0f8e299-da40-ddc9-3c97-a348a56d3408@shipmail.org>

Am 07.07.20 um 14:05 schrieb Thomas Hellström (Intel):
>
> On 7/7/20 1:07 PM, Chris Wilson wrote:
>> Quoting Christian König (2020-07-07 11:58:26)
>>> Am 07.07.20 um 10:59 schrieb Daniel Vetter:
>>>> On Tue, Jul 7, 2020 at 9:27 AM Lepton Wu <ytht.net@gmail.com> wrote:
>>>>> For pages which are allocated in ttm with transparent huge pages,
>>>>> tail pages have zero as reference count. The current vgem code use
>>>>> get_page on them and it will trigger BUG when release_pages get 
>>>>> called
>>>>> on those pages later.
>>>>>
>>>>> Here I attach the minimal code to trigger this bug on a qemu VM which
>>>>> enables virtio gpu (card1) and vgem (card0). BTW, since the upstream
>>>>> virtio gpu has changed to use drm gem and moved away from ttm. So we
>>>>> have to use an old kernel like 5.4 to reproduce it. But I guess
>>>>> same thing could happen for a real GPU if the real GPU use similar 
>>>>> code
>>>>> path to allocate pages from ttm. I am not sure about two things: 
>>>>> first, do we
>>>>> need to fix this? will a real GPU hit this code path? Second, 
>>>>> suppose this
>>>>> need to be fixed, should this be fixed in ttm side or vgem side?  
>>>>> If we remove
>>>>> "huge_flags &= ~__GFP_COMP" from ttm_get_pages, then get_page in 
>>>>> vgem works
>>>>> fine. But it's there for fix another bug:
>>>>> https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fbugs.freedesktop.org%2Fshow_bug.cgi%3Fid%3D103138&amp;data=02%7C01%7Cchristian.koenig%40amd.com%7C6c4170559c0a4d0d5f5708d8226e0f97%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637297203404761846&amp;sdata=qXBQBKvBM7zvi6PIkcAPfBjmOC8UrKmjnO03hWKzNHU%3D&amp;reserved=0 
>>>>>
>>>>> It could also be "fixed" with this patch. But I am really not sure 
>>>>> if this is
>>>>> the way to go.
>>>> For imported dma-buf objects, vgem should not handle this itself, but
>>>> forward to the exporter through the dma_buf_mmap stuff. We have
>>>> helpers for this all now, probably just not wired up correctly. Trying
>>>> to ensure that all combinations of mmap code across all drivers work
>>>> the same doesn't work.
>>> Yes, Daniel is right what vgem does here is fundamentally broken in 
>>> many
>>> ways.
>>>
>>> First of all nobody should ever call get_page() on the pages TTM has
>>> allocated. Those pages come out of a block box and should not be 
>>> touched
>>> by anybody.
>> It doesn't. It's only used on the pages vgem allocates (from 
>> shmemfs). So
>> I'm really confused as to how ttm gets involved here.
>> -Chris
>
> Sounds like vgem is allowing mmap of imported dma-bufs?

Yeah agree, this is most likely the underlying problem and should be 
fixed instead.

Christian.

>
> /Thomas
>
>> _______________________________________________
>> dri-devel mailing list
>> dri-devel@lists.freedesktop.org
>> https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Flists.freedesktop.org%2Fmailman%2Flistinfo%2Fdri-devel&amp;data=02%7C01%7Cchristian.koenig%40amd.com%7C6c4170559c0a4d0d5f5708d8226e0f97%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637297203404761846&amp;sdata=Ny4sdibo9o8%2F8UqxdP74HaOXQjoKkmd%2FrRAPHszr3rk%3D&amp;reserved=0 
>>

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

      parent reply	other threads:[~2020-07-07 13:55 UTC|newest]

Thread overview: 8+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-07-07  4:21 [RFC] drm/vgem: Don't use get_page in fault handler Lepton Wu
2020-07-07  8:22 ` Chris Wilson
2020-07-07  8:59 ` Daniel Vetter
2020-07-07 10:58   ` Christian König
2020-07-07 11:07     ` Chris Wilson
2020-07-07 12:05       ` Thomas Hellström (Intel)
2020-07-07 12:15         ` Chris Wilson
2020-07-07 13:55         ` Christian König [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=0feb1cee-ca75-938e-9752-c03eec18346d@amd.com \
    --to=christian.koenig@amd.com \
    --cc=chris@chris-wilson.co.uk \
    --cc=daniel@ffwll.ch \
    --cc=dri-devel@lists.freedesktop.org \
    --cc=thomas_os@shipmail.org \
    --cc=ytht.net@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).