All of lore.kernel.org
 help / color / mirror / Atom feed
From: Alex Deucher <alexdeucher@gmail.com>
To: Luben Tuikov <luben.tuikov@amd.com>
Cc: "Pelloux-prayer,
	Pierre-eric" <Pierre-eric.Pelloux-prayer@amd.com>,
	"Liu, Aaron" <Aaron.Liu@amd.com>,
	"amd-gfx@lists.freedesktop.org" <amd-gfx@lists.freedesktop.org>,
	Huang Rui <ray.huang@amd.com>,
	"Deucher, Alexander" <Alexander.Deucher@amd.com>,
	"Koenig, Christian" <Christian.Koenig@amd.com>
Subject: Re: [PATCH 1/1] drm/amdgpu: Fix per-IB secure flag GFX hang
Date: Thu, 27 Feb 2020 16:30:09 -0500	[thread overview]
Message-ID: <CADnq5_MEiEkXcif17pTbRys+O6RPrXf=ksTy9SdH3gFwULyrmQ@mail.gmail.com> (raw)
In-Reply-To: <bb5c3ef4-81ce-d129-bbed-99133c8c602e@amd.com>

On Thu, Feb 27, 2020 at 4:15 PM Luben Tuikov <luben.tuikov@amd.com> wrote:
>
> On 2020-02-27 6:56 a.m., Huang Rui wrote:
> > On Thu, Feb 27, 2020 at 06:39:03AM +0800, Tuikov, Luben wrote:
> >> Since commit "Move to a per-IB secure flag (TMZ)",
> >> we've been seeing hangs in GFX. Ray H. pointed out
> >> by sending a patch that we need to send FRAME
> >> CONTROL stop/start back-to-back, every time we
> >> flip the TMZ flag as per each IB we submit. That
> >> is, when we transition from TMZ to non-TMZ we have
> >> to send a stop with TMZ followed by a start with
> >> non-TMZ, and similarly for transitioning from
> >> non-TMZ into TMZ.
> >>
> >> This patch implements this, thus fixing the GFX
> >> hang.
> >>
> >> Signed-off-by: Luben Tuikov <luben.tuikov@amd.com>
> >> ---
> >>  drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c   | 87 +++++++++++++++++-------
> >>  drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h |  5 +-
> >>  drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c   | 15 ++--
> >>  drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c    | 13 ++--
> >>  4 files changed, 79 insertions(+), 41 deletions(-)
> >>
> >> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c
> >> index 4b2342d11520..16d6df3304d3 100644
> >> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c
> >> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c
> >> @@ -216,40 +216,75 @@ int amdgpu_ib_schedule(struct amdgpu_ring *ring, unsigned num_ibs,
> >>              amdgpu_ring_emit_cntxcntl(ring, status);
> >>      }
> >>
> >> -    secure = false;
> >> +    /* Find the first non-preamble IB.
> >> +     */
> >>      for (i = 0; i < num_ibs; ++i) {
> >>              ib = &ibs[i];
> >>
> >>              /* drop preamble IBs if we don't have a context switch */
> >> -            if ((ib->flags & AMDGPU_IB_FLAG_PREAMBLE) &&
> >> -                skip_preamble &&
> >> -                !(status & AMDGPU_PREAMBLE_IB_PRESENT_FIRST) &&
> >> -                !amdgpu_mcbp &&
> >> -                !amdgpu_sriov_vf(adev)) /* for SRIOV preemption, Preamble CE ib must be inserted anyway */
> >> -                    continue;
> >> -
> >> -            /* If this IB is TMZ, add frame TMZ start packet,
> >> -             * else, turn off TMZ.
> >> -             */
> >> -            if (ib->flags & AMDGPU_IB_FLAGS_SECURE && ring->funcs->emit_tmz) {
> >> -                    if (!secure) {
> >> -                            secure = true;
> >> -                            amdgpu_ring_emit_tmz(ring, true);
> >> -                    }
> >> -            } else if (secure) {
> >> +            if (!(ib->flags & AMDGPU_IB_FLAG_PREAMBLE) ||
> >> +                !skip_preamble ||
> >> +                (status & AMDGPU_PREAMBLE_IB_PRESENT_FIRST) ||
> >> +                amdgpu_mcbp ||
> >> +                amdgpu_sriov_vf(adev)) /* for SRIOV preemption, Preamble CE ib must be inserted anyway */
> >> +                    break;
> >> +    }
> >> +    if (i >= num_ibs)
> >> +            goto Done;
> >> +    /* Setup initial TMZiness and send it off.
> >> +     */
> >> +    secure = false;
> >> +    if (job && ring->funcs->emit_frame_cntl) {
> >> +            if (ib->flags & AMDGPU_IB_FLAGS_SECURE)
> >> +                    secure = true;
> >> +            else
> >>                      secure = false;
> >> -                    amdgpu_ring_emit_tmz(ring, false);
> >> -            }
> >> -
> >> -            amdgpu_ring_emit_ib(ring, job, ib, status);
> >> -            status &= ~AMDGPU_HAVE_CTX_SWITCH;
> >> +            amdgpu_ring_emit_frame_cntl(ring, true, secure);
> >>      }
> >> +    amdgpu_ring_emit_ib(ring, job, ib, status);
> >> +    status &= ~AMDGPU_HAVE_CTX_SWITCH;
> >> +    i += 1;
> >> +    /* Send the rest of the IBs.
> >> +     */
> >> +    if (job && ring->funcs->emit_frame_cntl) {
> >> +            for ( ; i < num_ibs; ++i) {
> >> +                    ib = &ibs[i];
> >> +
> >> +                    /* drop preamble IBs if we don't have a context switch */
> >> +                    if ((ib->flags & AMDGPU_IB_FLAG_PREAMBLE) &&
> >> +                        skip_preamble &&
> >> +                        !(status & AMDGPU_PREAMBLE_IB_PRESENT_FIRST) &&
> >> +                        !amdgpu_mcbp &&
> >> +                        !amdgpu_sriov_vf(adev)) /* for SRIOV preemption, Preamble CE ib must be inserted anyway */
> >> +                            continue;
> >
> > Snip.
> >
> >> +
> >> +                    if (!!secure ^ !!(ib->flags & AMDGPU_IB_FLAGS_SECURE)) {
> >> +                            amdgpu_ring_emit_frame_cntl(ring, false, secure);
> >> +                            secure = !secure;
> >> +                            amdgpu_ring_emit_frame_cntl(ring, true, secure);
> >> +                    }
> >
> > That's pretty good optimization! I spend quit a few time to understand this.
>
> I know. I know you did. It's called experience.
>
> When I saw you v1, it was a cringe. Seriously?

It may be a good optimization but if it's hard to understand it makes
the code harder to read and comprehend, that can lead to
misinterpretation and maintenance burdens.  It took me a while to
understand what was intended here.

Alex

>
> >
> >>
> >> -    if (secure) {
> >> -            secure = false;
> >> -            amdgpu_ring_emit_tmz(ring, false);
> >> +                    amdgpu_ring_emit_ib(ring, job, ib, status);
> >> +                    status &= ~AMDGPU_HAVE_CTX_SWITCH;
> >> +            }
> >> +            amdgpu_ring_emit_frame_cntl(ring, false, secure);
> >> +    } else {
> >> +            for ( ; i < num_ibs; ++i) {
> >> +                    ib = &ibs[i];
> >> +
> >> +                    /* drop preamble IBs if we don't have a context switch */
> >> +                    if ((ib->flags & AMDGPU_IB_FLAG_PREAMBLE) &&
> >> +                        skip_preamble &&
> >> +                        !(status & AMDGPU_PREAMBLE_IB_PRESENT_FIRST) &&
> >> +                        !amdgpu_mcbp &&
> >> +                        !amdgpu_sriov_vf(adev)) /* for SRIOV preemption, Preamble CE ib must be inserted anyway */
> >> +                            continue;
> >> +
> >> +                    amdgpu_ring_emit_ib(ring, job, ib, status);
> >> +                    status &= ~AMDGPU_HAVE_CTX_SWITCH;
> >
> > To pull the checking "job && ring->funcs->emit_frame_cntl" out of the loop,
> > that make the implemenation more duplicated like below, we have to write
>
> Yes, that is exactly what we want.
>
> As I explained before and will explain again, and again, and again,
> "job && ring->funcs->emit_frame_cntl" is static to the body of the loop,
> and as such can be pulled OUT of the loop and it should.
>
> This is a *formulaic* optimization exercise. Compilers and optimizers do it first.
>
> To extrapolate, consider that what you'd eventually have is a HUGE long long loop
> and everything inside it. Not good.
>
> Regards,
> Luben
>
>
> > the same codes multiple times. I hope the implementation is more simple and
> > readable. Please see my V2 patch that I just send out. We can try to use
> > minimum changes to fix the issue.
> >
> >               for ( ; i < num_ibs; ++i) {
> >                       ib = &ibs[i];
> >
> >                       /* drop preamble IBs if we don't have a context switch */
> >                       if ((ib->flags & AMDGPU_IB_FLAG_PREAMBLE) &&
> >                           skip_preamble &&
> >                           !(status & AMDGPU_PREAMBLE_IB_PRESENT_FIRST) &&
> >                           !amdgpu_mcbp &&
> >                           !amdgpu_sriov_vf(adev)) /* for SRIOV preemption, Preamble CE ib must be inserted anyway */
> >                               continue;
> >
> >                         ...
> >                       amdgpu_ring_emit_ib(ring, job, ib, status);
> >
> > Thanks,
> > Ray
> >
>
> _______________________________________________
> amd-gfx mailing list
> amd-gfx@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/amd-gfx
_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

      reply	other threads:[~2020-02-27 21:30 UTC|newest]

Thread overview: 12+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-02-25 13:57 [PATCH] drm/amdgpu: fix the gfx hang while use per-ib secure flag Huang Rui
2020-02-25 14:10 ` Pierre-Eric Pelloux-Prayer
2020-02-25 14:10 ` Alex Deucher
2020-02-25 14:30 ` Liu, Aaron
2020-02-25 22:56 ` Luben Tuikov
2020-02-25 23:40 ` Luben Tuikov
2020-02-26 16:26   ` Luben Tuikov
2020-02-26 22:39     ` [PATCH 0/1] Fixing the GFX hang Luben Tuikov
2020-02-26 22:39       ` [PATCH 1/1] drm/amdgpu: Fix per-IB secure flag " Luben Tuikov
2020-02-27 11:56         ` Huang Rui
2020-02-27 21:15           ` Luben Tuikov
2020-02-27 21:30             ` Alex Deucher [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='CADnq5_MEiEkXcif17pTbRys+O6RPrXf=ksTy9SdH3gFwULyrmQ@mail.gmail.com' \
    --to=alexdeucher@gmail.com \
    --cc=Aaron.Liu@amd.com \
    --cc=Alexander.Deucher@amd.com \
    --cc=Christian.Koenig@amd.com \
    --cc=Pierre-eric.Pelloux-prayer@amd.com \
    --cc=amd-gfx@lists.freedesktop.org \
    --cc=luben.tuikov@amd.com \
    --cc=ray.huang@amd.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.