All of lore.kernel.org
 help / color / mirror / Atom feed
From: "Grodzovsky, Andrey" <Andrey.Grodzovsky-5C7GfCeVMHo@public.gmane.org>
To: "Chen, Guchun" <Guchun.Chen-5C7GfCeVMHo@public.gmane.org>,
	"amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW@public.gmane.org"
	<amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW@public.gmane.org>
Cc: "Deucher,
	Alexander" <Alexander.Deucher-5C7GfCeVMHo@public.gmane.org>,
	"Zhou1, Tao" <Tao.Zhou1-5C7GfCeVMHo@public.gmane.org>,
	"Quan, Evan" <Evan.Quan-5C7GfCeVMHo@public.gmane.org>,
	"noreply-confluence-5C7GfCeVMHo@public.gmane.org"
	<noreply-confluence-5C7GfCeVMHo@public.gmane.org>
Subject: Re: [PATCH 4/4] drm/amdgpu: Move amdgpu_ras_recovery_init to after SMU ready.
Date: Tue, 22 Oct 2019 14:33:12 +0000	[thread overview]
Message-ID: <236963ac-d383-209b-ba63-258fabf9bb02@amd.com> (raw)
In-Reply-To: <BYAPR12MB2806716EFABC7A26DD9DABD5F1690-ZGDeBxoHBPk0CuAkIMgl3QdYzm3356FpvxpqHgZTriW3zl9H0oFU5g@public.gmane.org>

I am well aware that we want to do it ASAP, but what is your suggestion 
for the Arcturus use case where we have to do it AFTER SMU is up and 
running ? Do you want to call amdgpu_ras_recovery_init in two different 
places depending if this is Vega 20 or Arcturus ? This will over 
complicate an already complicated init sequence of RAS.

Andrey

On 10/20/19 9:51 PM, Chen, Guchun wrote:
> I don't think postpone RAS recovery init is not one reasonable proposal. What we do in recovery init is to read the retired page if there is, and retire corresponding memory, this will make sure these pages are reserved well before setting up memory manager and reserving other memory blocks, it will be safe.
>
> Regards,
> Guchun
>
> -----Original Message-----
> From: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
> Sent: Saturday, October 19, 2019 4:49 AM
> To: amd-gfx@lists.freedesktop.org
> Cc: Chen, Guchun <Guchun.Chen@amd.com>; Zhou1, Tao <Tao.Zhou1@amd.com>; Deucher, Alexander <Alexander.Deucher@amd.com>; noreply-confluence@amd.com; Quan, Evan <Evan.Quan@amd.com>; Grodzovsky, Andrey <Andrey.Grodzovsky@amd.com>
> Subject: [PATCH 4/4] drm/amdgpu: Move amdgpu_ras_recovery_init to after SMU ready.
>
> For Arcturus the I2C traffic is done through SMU tables and so we must postpone RAS recovery init to after they are ready which is in amdgpu_device_ip_hw_init_phase2.
>
> Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
> ---
>   drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 13 +++++++++++++
>   drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c    | 11 -----------
>   2 files changed, 13 insertions(+), 11 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> index 17cfdaf..c40e9a5 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> @@ -1850,6 +1850,19 @@ static int amdgpu_device_ip_init(struct amdgpu_device *adev)
>   	if (r)
>   		goto init_failed;
>   
> +	/*
> +	 * retired pages will be loaded from eeprom and reserved here,
> +	 * it should be called after amdgpu_device_ip_hw_init_phase2  since
> +	 * for some ASICs the RAS EEPROM code relies on SMU fully functioning
> +	 * for I2C communication which only true at this point.
> +	 * recovery_init may fail, but it can free all resources allocated by
> +	 * itself and its failure should not stop amdgpu init process.
> +	 *
> +	 * Note: theoretically, this should be called before all vram allocations
> +	 * to protect retired page from abusing
> +	 */
> +	amdgpu_ras_recovery_init(adev);
> +
>   	if (adev->gmc.xgmi.num_physical_nodes > 1)
>   		amdgpu_xgmi_add_device(adev);
>   	amdgpu_amdkfd_device_init(adev);
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
> index 2e85a51..1045c3f 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
> @@ -1721,17 +1721,6 @@ int amdgpu_ttm_init(struct amdgpu_device *adev)  #endif
>   
>   	/*
> -	 * retired pages will be loaded from eeprom and reserved here,
> -	 * it should be called after ttm init since new bo may be created,
> -	 * recovery_init may fail, but it can free all resources allocated by
> -	 * itself and its failure should not stop amdgpu init process.
> -	 *
> -	 * Note: theoretically, this should be called before all vram allocations
> -	 * to protect retired page from abusing
> -	 */
> -	amdgpu_ras_recovery_init(adev);
> -
> -	/*
>   	 *The reserved vram for firmware must be pinned to the specified
>   	 *place on the VRAM, so reserve it early.
>   	 */
> --
> 2.7.4
>
_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

  parent reply	other threads:[~2019-10-22 14:33 UTC|newest]

Thread overview: 19+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-10-18 20:48 [PATCH 0/4] Add RAS EEPROM table support for Arcturus Andrey Grodzovsky
     [not found] ` <1571431711-30149-1-git-send-email-andrey.grodzovsky-5C7GfCeVMHo@public.gmane.org>
2019-10-18 20:48   ` [PATCH 1/4] drm/amd/powerplay: Add interface for I2C transactions to SMU Andrey Grodzovsky
     [not found]     ` <1571431711-30149-2-git-send-email-andrey.grodzovsky-5C7GfCeVMHo@public.gmane.org>
2019-10-21 13:21       ` Deucher, Alexander
2019-10-18 20:48   ` [PATCH 2/4] drm/amd/powerplay: Add EEPROM I2C read/write support to Arcturus Andrey Grodzovsky
     [not found]     ` <1571431711-30149-3-git-send-email-andrey.grodzovsky-5C7GfCeVMHo@public.gmane.org>
2019-10-21  1:44       ` Chen, Guchun
     [not found]         ` <BYAPR12MB2806C91E6A75F053D9C88718F1690-ZGDeBxoHBPk0CuAkIMgl3QdYzm3356FpvxpqHgZTriW3zl9H0oFU5g@public.gmane.org>
2019-10-22 14:59           ` Grodzovsky, Andrey
     [not found]             ` <ff1fd8fc-71c3-47db-b8d4-f11b11147be9-5C7GfCeVMHo@public.gmane.org>
2019-10-22 19:09               ` Grodzovsky, Andrey
     [not found]                 ` <fa4dc427-95d4-5ae5-aa3e-75816dbd7884-5C7GfCeVMHo@public.gmane.org>
2019-10-23  1:55                   ` Chen, Guchun
2019-10-21  3:01       ` Li, Dennis
     [not found]         ` <MN2PR12MB3167E8AFBF75FA273FC9E9B5ED690-rweVpJHSKTpa4QM4akQw/gdYzm3356FpvxpqHgZTriW3zl9H0oFU5g@public.gmane.org>
2019-10-21 13:46           ` Deucher, Alexander
2019-10-21 13:23       ` Deucher, Alexander
2019-10-18 20:48   ` [PATCH 3/4] drm/amdgpu: Use ARCTURUS in RAS EEPROM Andrey Grodzovsky
     [not found]     ` <1571431711-30149-4-git-send-email-andrey.grodzovsky-5C7GfCeVMHo@public.gmane.org>
2019-10-21 13:23       ` Deucher, Alexander
2019-10-18 20:48   ` [PATCH 4/4] drm/amdgpu: Move amdgpu_ras_recovery_init to after SMU ready Andrey Grodzovsky
     [not found]     ` <1571431711-30149-5-git-send-email-andrey.grodzovsky-5C7GfCeVMHo@public.gmane.org>
2019-10-21  1:51       ` Chen, Guchun
     [not found]         ` <BYAPR12MB2806716EFABC7A26DD9DABD5F1690-ZGDeBxoHBPk0CuAkIMgl3QdYzm3356FpvxpqHgZTriW3zl9H0oFU5g@public.gmane.org>
2019-10-22 14:33           ` Grodzovsky, Andrey [this message]
     [not found]             ` <236963ac-d383-209b-ba63-258fabf9bb02-5C7GfCeVMHo@public.gmane.org>
2019-10-23  1:24               ` Chen, Guchun
2019-10-21 13:24       ` Deucher, Alexander
2019-10-21  1:43   ` [PATCH 0/4] Add RAS EEPROM table support for Arcturus Quan, Evan

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=236963ac-d383-209b-ba63-258fabf9bb02@amd.com \
    --to=andrey.grodzovsky-5c7gfcevmho@public.gmane.org \
    --cc=Alexander.Deucher-5C7GfCeVMHo@public.gmane.org \
    --cc=Evan.Quan-5C7GfCeVMHo@public.gmane.org \
    --cc=Guchun.Chen-5C7GfCeVMHo@public.gmane.org \
    --cc=Tao.Zhou1-5C7GfCeVMHo@public.gmane.org \
    --cc=amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW@public.gmane.org \
    --cc=noreply-confluence-5C7GfCeVMHo@public.gmane.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.