From: Alex Deucher <alexdeucher@gmail.com>
To: "Zhu, James" <James.Zhu@amd.com>
Cc: "Deucher, Alexander" <Alexander.Deucher@amd.com>,
	"Zhang, Yifan" <Yifan1.Zhang@amd.com>,
	James Zhu <jzhums@gmail.com>,
	amd-gfx list <amd-gfx@lists.freedesktop.org>,
	Ken Moffat <zarniwhoop@ntlworld.com>
Subject: Re: [PATCH] drm/amdgpu: remove duplicated kfd_resume_iommu
Date: Wed, 3 Nov 2021 11:57:17 -0400
Message-ID: <CADnq5_NF82=PC-n-c=Bf2gqpECsXvNZBbq=OZs+faFDMMCp1Ng@mail.gmail.com>
In-Reply-To: <BN6PR12MB1874A9156EF80C63D96EBD06E48C9@BN6PR12MB1874.namprd12.prod.outlook.com>


I think just applying your patch is fine for drm-next (I'll take care of
that).  For 5.14.x and 5.15.x, we can just cherry-pick afd1818.

Alex

On Wed, Nov 3, 2021 at 11:54 AM Zhu, James <James.Zhu@amd.com> wrote:

> [AMD Official Use Only]
>
> Hi Alex,
>
> The following two patches were introduced for stable@vger.kernel.org
>
> 714d9e4 drm/amdgpu: init iommu after amdkfd device init
> f02abeb drm/amdgpu: move iommu_resume before ip init/resume
>
> After commit 970eae15600a883e4ad27dd0757b18871cc983ab
> (Merge: 27f4432 3906fe9, BackMerge tag 'v5.15-rc7' into drm-next),
> they became redundant and overwrote afd1818.
>
> I saw that you just submitted (afd1818) "[PATCH] drm/amdkfd: fix boot
> failure when iommu is disabled in Picasso" to stable@vger.kernel.org.
>
> I checked that re-applying afd1818 on current drm-next does the same
> thing as my patch once it is auto-merged.
>
> I am wondering whether a future backmerge of stable into drm-next will
> correct the current breakage.
>
> Given the above situation, I am not sure what the proper way to fix this
> breakage is.
>
> Please let me know your final decision given all this information.
>
>
> Thanks & Best Regards!
>
>
> James Zhu
> ------------------------------
> *From:* Alex Deucher <alexdeucher@gmail.com>
> *Sent:* Wednesday, November 3, 2021 11:03 AM
> *To:* Zhu, James <James.Zhu@amd.com>
> *Cc:* amd-gfx list <amd-gfx@lists.freedesktop.org>; Deucher, Alexander <
> Alexander.Deucher@amd.com>; Zhang, Yifan <Yifan1.Zhang@amd.com>; James
> Zhu <jzhums@gmail.com>; Ken Moffat <zarniwhoop@ntlworld.com>
> *Subject:* Re: [PATCH] drm/amdgpu: remove duplicated kfd_resume_iommu
>
> Reverting 714d9e4 and f02abeb results in this diff, which changes more
> than this patch does.  Is that correct or should I just use your patch?
>
> Alex
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> index e56bc925afcf..70540712ff2d 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> @@ -2360,6 +2360,10 @@ static int amdgpu_device_ip_init(struct amdgpu_device *adev)
>         if (r)
>                 goto init_failed;
>
> +       r = amdgpu_amdkfd_resume_iommu(adev);
> +       if (r)
> +               goto init_failed;
> +
>         r = amdgpu_device_ip_hw_init_phase1(adev);
>         if (r)
>                 goto init_failed;
> @@ -2398,10 +2402,6 @@ static int amdgpu_device_ip_init(struct amdgpu_device *adev)
>         if (!adev->gmc.xgmi.pending_reset)
>                 amdgpu_amdkfd_device_init(adev);
>
> -       r = amdgpu_amdkfd_resume_iommu(adev);
> -       if (r)
> -               goto init_failed;
> -
>         amdgpu_fru_get_product_info(adev);
>
>  init_failed:
> @@ -3119,10 +3119,6 @@ static int amdgpu_device_ip_resume(struct amdgpu_device *adev)
>  {
>         int r;
>
> -       r = amdgpu_amdkfd_resume_iommu(adev);
> -       if (r)
> -               return r;
> -
>         r = amdgpu_device_ip_resume_phase1(adev);
>         if (r)
>                 return r;
> @@ -4595,10 +4591,6 @@ int amdgpu_do_asic_reset(struct list_head *device_list_handle,
>                                 dev_warn(tmp_adev->dev, "asic atom init failed!");
>                         } else {
>                                 dev_info(tmp_adev->dev, "GPU reset succeeded, trying to resume\n");
> -                               r = amdgpu_amdkfd_resume_iommu(tmp_adev);
> -                               if (r)
> -                                       goto out;
> -
>                                 r = amdgpu_device_ip_resume_phase1(tmp_adev);
>                                 if (r)
>                                         goto out;
>
>
> On Wed, Nov 3, 2021 at 10:50 AM Alex Deucher <alexdeucher@gmail.com>
> wrote:
>
>
>
> On Wed, Nov 3, 2021 at 10:34 AM Zhu, James <James.Zhu@amd.com> wrote:
>
> [AMD Official Use Only]
>
> Hi Alex,
>
> I finally figured out the root cause of this breakage.
>
> Linux 5.14.15 + afd1818 fixes the issue.
>
>
> I'll do that for stable.
>
>
> Linux 5.15-rc7 re-applied "init iommu after amdkfd device init" and "move iommu_resume before ip init/resume", which overwrote afd1818 and caused the issue again.
>
> 714d9e4 drm/amdgpu: init iommu after amdkfd device init
>
> f02abeb drm/amdgpu: move iommu_resume before ip init/resume
>
> afd1818 drm/amdkfd: fix boot failure when iommu is disabled in Picasso.
>
> 286826d drm/amdgpu: init iommu after amdkfd device init
>
> 9cec53c drm/amdgpu: move iommu_resume before ip init/resume
>
>
>
> So, do we just discard this patch and revert 714d9e4 and f02abeb?
>
>
> I'll do that for 5.15+
>
> Thanks for sorting this out.
>
> Alex
>
>
>
> Thanks & Best Regards!
>
>
> James Zhu
> ------------------------------
> *From:* Alex Deucher <alexdeucher@gmail.com>
> *Sent:* Tuesday, November 2, 2021 10:01 PM
> *To:* Zhu, James <James.Zhu@amd.com>
> *Cc:* amd-gfx list <amd-gfx@lists.freedesktop.org>; Deucher, Alexander <
> Alexander.Deucher@amd.com>; Zhang, Yifan <Yifan1.Zhang@amd.com>; James
> Zhu <jzhums@gmail.com>; Ken Moffat <zarniwhoop@ntlworld.com>
> *Subject:* Re: [PATCH] drm/amdgpu: remove duplicated kfd_resume_iommu
>
> On Tue, Nov 2, 2021 at 9:34 PM James Zhu <James.Zhu@amd.com> wrote:
> >
> > Remove duplicated kfd_resume_iommu which already runs
> > in amdgpu_amdkfd_device_init.
> >
> > Signed-off-by: James Zhu <James.Zhu@amd.com>
>
> Once you get confirmation, please add:
> Bug: https://bugzilla.kernel.org/show_bug.cgi?id=214859
> Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/1770
>
> Acked-by: Alex Deucher <alexander.deucher@amd.com>
>
>
> > ---
> >  drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 4 ----
> >  1 file changed, 4 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> > index e56bc925afcf..f77823ce7ae8 100644
> > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> > @@ -2398,10 +2398,6 @@ static int amdgpu_device_ip_init(struct amdgpu_device *adev)
> >         if (!adev->gmc.xgmi.pending_reset)
> >                 amdgpu_amdkfd_device_init(adev);
> >
> > -       r = amdgpu_amdkfd_resume_iommu(adev);
> > -       if (r)
> > -               goto init_failed;
> > -
> >         amdgpu_fru_get_product_info(adev);
> >
> >  init_failed:
> > --
> > 2.25.1
> >
>
>

