From: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
To: "Christian König" <christian.koenig@amd.com>
Cc: ML dri-devel <dri-devel@lists.freedesktop.org>
Subject: Re: [RFC PATCH 3/5] drm/amdgpu: Allow explicit sync for VM ops.
Date: Fri, 3 Jun 2022 12:08:45 +0200
Message-ID: <CAP+8YyGgam6Hr40PS_Rc7Dg=S2dLJdce=87=wNt2B0yAyPEPOw@mail.gmail.com>
In-Reply-To: <ea49dfd3-3c20-c330-3412-5b48481331cd@amd.com>

On Fri, Jun 3, 2022 at 10:11 AM Christian König
<christian.koenig@amd.com> wrote:
>
> Am 03.06.22 um 03:21 schrieb Bas Nieuwenhuizen:
> > [SNIP]
> >> The problem is we need to wait on fences *not* added to the buffer object.
> > What fences wouldn't be added to the buffer object that we need here?
>
> Basically all still running submissions from the VM which could
> potentially access the BO.
>
> That's why we have the AMDGPU_SYNC_EQ_OWNER in amdgpu_vm_update_range().
>
> >> E.g. what we currently do here while freeing memory is:
> >> 1. Update the PTEs and make that update wait for everything!
> >> 2. Add the fence of that update to the freed up BO so that this BO isn't
> >> freed before the next CS.
> >>
> >> We might be able to fix this by adding the fences to the BO before
> >> freeing it manually, but I'm not 100% sure we can actually allocate
> >> memory for the fences in that moment.
> > I think we don't need to be able to. We're already adding the unmap
> > fence to the BO in the gem close ioctl, and that has the fallback that
> > if we can't allocate space for the fence in the BO, we wait on the
> > fence manually on the CPU. I think that is a reasonable fallback for
> > this as well?
>
> Yes, just blocking might work in an OOM situation as well.
>
> > For the TTM move path, amdgpu_copy_buffer will wait on the BO resv, and
> > subsequent submissions will trigger VM updates that wait on the
> > amdgpu_copy_buffer jobs (and hence, transitively, on that
> > work). AFAICT amdgpu_bo_move does not trigger any VM updates by
> > itself, and amdgpu_bo_move_notify runs well after the move (and after
> > ttm_bo_move_accel_cleanup, which would free the old resource), so
> > any VM changes triggered by that would see the TTM copy and sync to
> > it.
> >
> > I do have to fix some stuff indeed, especially for the GEM close but
> > with that we should be able to keep the same basic approach?
>
> Nope, not even remotely.
>
> What we need is the following:
> 1. Rolling out my drm_exec patch set, so that we can lock buffers as needed.
> 2. When we get a VM operation we not only lock the VM page tables, but
> also all buffers we potentially need to unmap.
> 3. Nuking the freed list in the amdgpu_vm structure by updating freed
> areas directly when they are unmapped.
> 4. Tracking those updates inside the bo_va structure for the BO+VM
> combination.
> 5. When the bo_va structure is destroyed because of closing the handle,
> move the last clear operation over to the VM as implicit sync.
>

Hi Christian, isn't that a different problem though (that we're also
trying to solve, but in your series)?

What this patch tries to achieve is letting the following run concurrently:

(t+0) a CS submission setting BOOKKEEP fences (i.e. no implicit sync)
(t+1) a VM operation on a BO/VM accessed by the CS

What it *doesn't* try to do is let the reverse order run concurrently:

(t+0) a VM operation on a BO/VM accessed by the CS
(t+1) a CS submission setting BOOKKEEP fences (i.e. no implicit sync)

When you write

> Only when all this is done we then can resolve the dependency that the
> CS currently must wait for any clear operation on the VM.

isn't that all about the second problem?
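
To make the first case concrete, here is a toy userspace model (plain C,
not kernel code). The only thing it borrows from the kernel is the
ordering of enum dma_resv_usage (KERNEL < WRITE < READ < BOOKKEEP), where
asking for a usage also returns the fences of all lower usages. The toy_*
names are made up for illustration, and the assumption that an
explicit-sync VM op would sync only up to READ is mine. It also ignores
the owner filtering that the AMDGPU_SYNC_* modes do.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Mirrors the ordering of enum dma_resv_usage: a query for usage U
 * also covers every fence whose usage is lower than U. */
enum toy_usage { TOY_KERNEL, TOY_WRITE, TOY_READ, TOY_BOOKKEEP };

/* Hypothetical stand-in for a fence stored in a reservation object. */
struct toy_fence {
	enum toy_usage usage;
	bool signaled;
};

/* Count the unsignaled fences that a job syncing at @sync_usage would
 * have to wait for. */
static size_t toy_count_waits(const struct toy_fence *fences, size_t n,
			      enum toy_usage sync_usage)
{
	size_t waits = 0;

	for (size_t i = 0; i < n; i++)
		if (!fences[i].signaled && fences[i].usage <= sync_usage)
			waits++;
	return waits;
}

/* Case 1: the CS at t+0 left a BOOKKEEP fence, the VM op arrives at t+1. */
static void toy_check_case1(void)
{
	struct toy_fence resv[] = { { TOY_BOOKKEEP, false } };

	/* Assumed explicit-sync VM op: syncs up to READ, so no wait. */
	assert(toy_count_waits(resv, 1, TOY_READ) == 0);
	/* VM op that waits for everything: blocked on the CS fence. */
	assert(toy_count_waits(resv, 1, TOY_BOOKKEEP) == 1);
}
```

Calling toy_check_case1() from a main() exercises both variants. The
model says nothing about the second case, since there the CS would have
to decide whether to wait on the VM op, which is the part your list
above addresses.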


>
> Regards,
> Christian.
>
>
