All of lore.kernel.org
 help / color / mirror / Atom feed
From: Alex Deucher <alexdeucher@gmail.com>
To: Jay Cornwall <jay.cornwall@amd.com>
Cc: Kaibo Ma <ent3rm4n@gmail.com>,
	"Kuehling, Felix" <Felix.Kuehling@amd.com>,
	amd-gfx@lists.freedesktop.org
Subject: Re: [PATCH] Revert "drm/amdkfd: Relocate TBA/TMA to opposite side of VM hole"
Date: Wed, 3 Jan 2024 13:50:54 -0500	[thread overview]
Message-ID: <CADnq5_PF8DrWHiSfwq3ju0eKkX0dN8JwZHwnCoBpzXFhucFmzA@mail.gmail.com> (raw)
In-Reply-To: <10b32f43-7b0c-1232-1070-cf51731c5d5f@amd.com>

Applied.  Thanks!

Alex

On Wed, Jan 3, 2024 at 10:33 AM Jay Cornwall <jay.cornwall@amd.com> wrote:
>
> On 1/3/2024 09:19, Alex Deucher wrote:
> > + Jay, Felix
> >
> > On Wed, Jan 3, 2024 at 5:16 AM Kaibo Ma <ent3rm4n@gmail.com> wrote:
> >>
> >> That commit causes NULL pointer dereferences in dmesgs when
> >> running applications using ROCm, including clinfo, blender,
> >> and PyTorch, since v6.6.1. Revert it to fix blender again.
> >>
> >> This reverts commit 96c211f1f9ef82183493f4ceed4e347b52849149.
> >>
> >> Closes: https://github.com/ROCm/ROCm/issues/2596
> >> Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/2991
> >> Signed-off-by: Kaibo Ma <ent3rm4n@gmail.com>
> >> ---
> >>  drivers/gpu/drm/amd/amdkfd/kfd_flat_memory.c | 26 ++++++++++----------
> >>  1 file changed, 13 insertions(+), 13 deletions(-)
> >>
> >> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_flat_memory.c b/drivers/gpu/drm/amd/amdkfd/kfd_flat_memory.c
> >> index 62b205dac..6604a3f99 100644
> >> --- a/drivers/gpu/drm/amd/amdkfd/kfd_flat_memory.c
> >> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_flat_memory.c
> >> @@ -330,12 +330,6 @@ static void kfd_init_apertures_vi(struct kfd_process_device *pdd, uint8_t id)
> >>         pdd->gpuvm_limit =
> >>                 pdd->dev->kfd->shared_resources.gpuvm_size - 1;
> >>
> >> -       /* dGPUs: the reserved space for kernel
> >> -        * before SVM
> >> -        */
> >> -       pdd->qpd.cwsr_base = SVM_CWSR_BASE;
> >> -       pdd->qpd.ib_base = SVM_IB_BASE;
> >> -
> >>         pdd->scratch_base = MAKE_SCRATCH_APP_BASE_VI();
> >>         pdd->scratch_limit = MAKE_SCRATCH_APP_LIMIT(pdd->scratch_base);
> >>  }
> >> @@ -345,18 +339,18 @@ static void kfd_init_apertures_v9(struct kfd_process_device *pdd, uint8_t id)
> >>         pdd->lds_base = MAKE_LDS_APP_BASE_V9();
> >>         pdd->lds_limit = MAKE_LDS_APP_LIMIT(pdd->lds_base);
> >>
> >> -       pdd->gpuvm_base = PAGE_SIZE;
> >> +        /* Raven needs SVM to support graphic handle, etc. Leave the small
> >> +         * reserved space before SVM on Raven as well, even though we don't
> >> +         * have to.
> >> +         * Set gpuvm_base and gpuvm_limit to CANONICAL addresses so that they
> >> +         * are used in Thunk to reserve SVM.
> >> +         */
> >> +        pdd->gpuvm_base = SVM_USER_BASE;
> >>         pdd->gpuvm_limit =
> >>                 pdd->dev->kfd->shared_resources.gpuvm_size - 1;
> >>
> >>         pdd->scratch_base = MAKE_SCRATCH_APP_BASE_V9();
> >>         pdd->scratch_limit = MAKE_SCRATCH_APP_LIMIT(pdd->scratch_base);
> >> -
> >> -       /*
> >> -        * Place TBA/TMA on opposite side of VM hole to prevent
> >> -        * stray faults from triggering SVM on these pages.
> >> -        */
> >> -       pdd->qpd.cwsr_base = pdd->dev->kfd->shared_resources.gpuvm_size;
> >>  }
> >>
> >>  int kfd_init_apertures(struct kfd_process *process)
> >> @@ -413,6 +407,12 @@ int kfd_init_apertures(struct kfd_process *process)
> >>                                         return -EINVAL;
> >>                                 }
> >>                         }
> >> +
> >> +                        /* dGPUs: the reserved space for kernel
> >> +                         * before SVM
> >> +                         */
> >> +                        pdd->qpd.cwsr_base = SVM_CWSR_BASE;
> >> +                        pdd->qpd.ib_base = SVM_IB_BASE;
> >>                 }
> >>
> >>                 dev_dbg(kfd_device, "node id %u\n", id);
> >> --
> >> 2.42.0
> >>
>
> I saw a segfault issue in Mesa yesterday. Not sure about the others, but I don't know how to make this change while compatibility with older UMDs.
>
> So I agree, let's revert it.
>
> Reviewed-by: Jay Cornwall <jay.cornwall@amd.com>

  reply	other threads:[~2024-01-03 18:51 UTC|newest]

Thread overview: 12+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-01-03  4:29 [PATCH] Revert "drm/amdkfd: Relocate TBA/TMA to opposite side of VM hole" Kaibo Ma
2024-01-03 15:19 ` Alex Deucher
2024-01-03 15:32   ` Jay Cornwall
2024-01-03 18:50     ` Alex Deucher [this message]
2024-01-03 18:58     ` Felix Kuehling
2024-01-03 19:13       ` Jay Cornwall
2024-01-04 18:29         ` Marek Olšák
2024-01-05 20:20           ` Felix Kuehling
2024-01-06  6:48             ` Marek Olšák
2024-01-10 17:45               ` Marek Olšák
2024-01-11 22:41                 ` Felix Kuehling
2024-01-12  6:51                   ` Christian König

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=CADnq5_PF8DrWHiSfwq3ju0eKkX0dN8JwZHwnCoBpzXFhucFmzA@mail.gmail.com \
    --to=alexdeucher@gmail.com \
    --cc=Felix.Kuehling@amd.com \
    --cc=amd-gfx@lists.freedesktop.org \
    --cc=ent3rm4n@gmail.com \
    --cc=jay.cornwall@amd.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.