From: "Christian König" <deathsimple-ANTagKRnAhcb1SvskN2V4Q@public.gmane.org>
To: zhoucm1 <david1.zhou-5C7GfCeVMHo@public.gmane.org>,
	amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW@public.gmane.org
Subject: Re: [PATCH 00/10] GART table recovery
Date: Thu, 18 Aug 2016 11:03:06 +0200
Message-ID: <09a51be0-3f54-8250-fd63-4ca9ae11094f@vodafone.de>
In-Reply-To: <57B576BB.4030400-5C7GfCeVMHo@public.gmane.org>

On 18.08.2016 at 10:50, zhoucm1 wrote:
>
>
> On 2016-08-04 17:58, Christian König wrote:
>> On 04.08.2016 at 05:35, zhoucm1 wrote:
>>>
>>>
>>> On 2016-08-03 22:01, Christian König wrote:
>>>> Well patch #10 is incorrect. The SA BO will be set to NULL by 
>>>> amdgpu_sa_bo_free(), so it can't be freed twice and so you can't 
>>>> reference the fence twice.
>>> I see.
>>> But amdgpu_job_free_resources() still shouldn't be called twice, 
>>> right? That's an obvious duplication, although it seems to have no 
>>> effect now. Is there any other reason?
>>
>> It's actually called from a couple of different locations:
>> 1. From the CS path in amdgpu_cs.c as soon as we have a scheduler fence.
>> 2. From the amdgpu_job_submit() path as soon as we have a scheduler 
>> fence.
>> 3. From amdgpu_job_run() after submitting the job to the hardware ring.
>> 4. From amdgpu_job_free(), this is for direct submissions or for 
>> freeing the job when something went wrong.
>>
>> Thinking about it you could be right and we could probably drop the 
>> one in amdgpu_job_run(), because amdgpu_job_submit() should have 
>> already taken care of that. But I'm not 100% sure of that.
>>
>>>
>>>>
>>>> In addition to that, the whole approach here of restoring the GART 
>>>> from the backup using the SDMA won't work either. For the SDMA to 
>>>> work you need the GART to access the ring buffer.
>>>>
>>>> So you run into a chicken and egg problem here, for the ring buffer 
>>>> to work you need the GART and for the GART backup to work you need 
>>>> the ring buffer.
>>> Good catch, ring buffer is a GTT buffer as well.
>>>
>>> Then can we use memcpy to copy GTT to VRAM? Fortunately, the GART 
>>> table is only one bo.
>>
>> Yeah that is what we did with radeon as well. Unfortunately the 
>> double housekeeping costs quite a bunch of memory.
>>
>> And actually we have exactly the same information in the TTM MM as 
>> well, we would just need to bind all BOs again.
>>
>> Give me a day or two to double check that. Might be that the solution 
>> is rather simple.
> How about this? Do you have any better idea for it?

Sorry for not answering earlier, but you know humans have only one head 
and two hands to type :)

The information needed to restore the GART is already stored in the 
amdgpu_ttm_tt structures.

We just need to link them together in amdgpu_ttm_backend_bind() and 
unlink them in amdgpu_ttm_backend_unbind() to be able to restore the 
GART table after a reset.
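
A minimal userspace sketch of that idea follows. It is not the amdgpu
code: every name in it (toy_ttm_tt, toy_gart, toy_bind, toy_unbind,
toy_gart_recover) is hypothetical, the table is a plain array, and
locking is ignored. It only shows the bookkeeping: each bound object
is kept on a list, and recovery replays that list into the table.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define GART_NUM_PAGES 16	/* toy table size, assumption */

/* Hypothetical stand-in for the per-BO amdgpu_ttm_tt state. */
struct toy_ttm_tt {
	size_t offset;			/* first GART page index */
	size_t num_pages;		/* number of pages mapped */
	const uint64_t *dma_addrs;	/* one bus address per page */
	struct toy_ttm_tt *prev, *next;	/* links in the bound list */
};

struct toy_gart {
	uint64_t table[GART_NUM_PAGES];	/* stand-in for the page table */
	struct toy_ttm_tt head;		/* sentinel of the circular list */
};

void toy_gart_init(struct toy_gart *g)
{
	memset(g, 0, sizeof(*g));
	g->head.prev = g->head.next = &g->head;
}

/* Write the entries of one bound object into the table. */
void toy_gart_write(struct toy_gart *g, const struct toy_ttm_tt *tt)
{
	assert(tt->offset + tt->num_pages <= GART_NUM_PAGES);
	for (size_t i = 0; i < tt->num_pages; i++)
		g->table[tt->offset + i] = tt->dma_addrs[i];
}

/* Bind: fill the table and link the object at the list tail. */
void toy_bind(struct toy_gart *g, struct toy_ttm_tt *tt)
{
	toy_gart_write(g, tt);
	tt->prev = g->head.prev;
	tt->next = &g->head;
	g->head.prev->next = tt;
	g->head.prev = tt;
}

/* Unbind: clear the entries and unlink the object. */
void toy_unbind(struct toy_gart *g, struct toy_ttm_tt *tt)
{
	assert(tt->offset + tt->num_pages <= GART_NUM_PAGES);
	for (size_t i = 0; i < tt->num_pages; i++)
		g->table[tt->offset + i] = 0;
	tt->prev->next = tt->next;
	tt->next->prev = tt->prev;
	tt->prev = tt->next = NULL;
}

/* After a reset: rebuild the table from the list of bound objects. */
void toy_gart_recover(struct toy_gart *g)
{
	memset(g->table, 0, sizeof(g->table));
	for (struct toy_ttm_tt *tt = g->head.next; tt != &g->head; tt = tt->next)
		toy_gart_write(g, tt);
}
```

With this layout no shadow copy of the table is needed; the cost is
two pointers per bound object plus a walk over the list on recovery.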

> or just change ring buffer to VRAM bo?

Interesting idea, but considering how much overhead writing to VRAM 
from the CPU has, it clearly wouldn't be a good idea in general.

Regards,
Christian.

>
> Regards,
> David Zhou
>>
>> Regards,
>> Christian.
>>
>>>
>>> Regards,
>>> David Zhou
>>>
>>>>
>>>> We should just restore the GART content from the housekeeping 
>>>> structure instead. Going to evaluate if and how that might be 
>>>> possible.
>>>
>>>>
>>>> Regards,
>>>> Christian.
>>>>
>>>> On 02.08.2016 at 10:00, Chunming Zhou wrote:
>>>>> The GART table is stored in one bo, which must be ready before 
>>>>> GART init, but the shadow bo must be created after the GART is 
>>>>> ready, so they cannot be created at the same time. The shadow bo 
>>>>> itself is also included in the GART table, so the shadow bo needs 
>>>>> a synchronization after device init. After the sync, the contents 
>>>>> of the bo and the shadow bo will be the same, and they will be 
>>>>> updated at the same time. Then we will be able to recover the 
>>>>> GART table from the shadow bo on a full GPU reset.
>>>>>
>>>>> Patch 10 is a fix for a memory leak.
>>>>>
>>>>> Chunming Zhou (10):
>>>>>    drm/amdgpu: make need_backup generic
>>>>>    drm/amdgpu: implement gart late_init/fini
>>>>>    drm/amdgpu: add gart_late_init/fini to gmc V7/8
>>>>>    drm/amdgpu: abstract amdgpu_bo_create_shadow
>>>>>    drm/amdgpu: shadow gart table support
>>>>>    drm/amdgpu: make recover_bo_from_shadow be generic
>>>>>    drm/amdgpu: implement gart recovery
>>>>>    drm/amdgpu: recover gart table first when full reset
>>>>>    drm/amdgpu: sync gart table before initialization completed
>>>>>    drm/amdgpu: fix memory leak of sched fence
>>>>>
>>>>>   drivers/gpu/drm/amd/amdgpu/amdgpu.h        |   9 ++
>>>>>   drivers/gpu/drm/amd/amdgpu/amdgpu_device.c |   2 +
>>>>>   drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c   | 139 +++++++++++++++++++++++++++++
>>>>>   drivers/gpu/drm/amd/amdgpu/amdgpu_job.c    |   2 +-
>>>>>   drivers/gpu/drm/amd/amdgpu/amdgpu_object.c |  80 ++++++++++++++---
>>>>>   drivers/gpu/drm/amd/amdgpu/amdgpu_object.h |   9 ++
>>>>>   drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c     |  50 ++---------
>>>>>   drivers/gpu/drm/amd/amdgpu/gmc_v7_0.c      |  39 +++++++-
>>>>>   drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c      |  40 ++++++++-
>>>>>   9 files changed, 304 insertions(+), 66 deletions(-)
>>>>>
>>>>
>>>> _______________________________________________
>>>> amd-gfx mailing list
>>>> amd-gfx@lists.freedesktop.org
>>>> https://lists.freedesktop.org/mailman/listinfo/amd-gfx
>>>
>>
>

