From: Boris Brezillon <boris.brezillon@collabora.com>
To: Danilo Krummrich <dakr@redhat.com>
Cc: airlied@gmail.com, daniel@ffwll.ch, tzimmermann@suse.de,
mripard@kernel.org, corbet@lwn.net, christian.koenig@amd.com,
bskeggs@redhat.com, Liam.Howlett@oracle.com,
matthew.brost@intel.com, alexdeucher@gmail.com,
ogabbay@kernel.org, bagasdotme@gmail.com, willy@infradead.org,
jason@jlekstrand.net, dri-devel@lists.freedesktop.org,
nouveau@lists.freedesktop.org, linux-doc@vger.kernel.org,
linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH drm-next v5 00/14] [RFC] DRM GPUVA Manager & Nouveau VM_BIND UAPI
Date: Tue, 20 Jun 2023 11:25:40 +0200
Message-ID: <20230620112540.19142ef3@collabora.com>
In-Reply-To: <20230620004217.4700-1-dakr@redhat.com>
Hi Danilo,
On Tue, 20 Jun 2023 02:42:03 +0200
Danilo Krummrich <dakr@redhat.com> wrote:
> This patch series provides a new UAPI for the Nouveau driver in order to
> support Vulkan features, such as sparse bindings and sparse residency.
>
> Furthermore, with the DRM GPUVA manager it provides a new DRM core feature to
> keep track of GPU virtual address (VA) mappings in a more generic way.
>
> The DRM GPUVA manager is intended to help drivers implement userspace-manageable
> GPU VA spaces in reference to the Vulkan API. In order to achieve this goal it
> serves the following purposes in this context.
>
> 1) Provide infrastructure to track GPU VA allocations and mappings,
> making use of the maple_tree.
>
> 2) Generically connect GPU VA mappings to their backing buffers, in
> particular DRM GEM objects.
>
> 3) Provide a common implementation to perform more complex mapping
> operations on the GPU VA space. In particular splitting and merging
> of GPU VA mappings, e.g. for intersecting mapping requests or partial
> unmap requests.
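For the record, the split performed on a partial unmap (purpose 3 above) boils
down to something like the following standalone C model. The struct and
function names are made up for illustration; this is not the actual
drm_gpuva_mgr code, which operates on its own gpuva structures and op lists:

```c
#include <stdint.h>

/* Hypothetical, simplified model of a GPU VA mapping. */
struct va_mapping {
	uint64_t addr;   /* start of the mapped range */
	uint64_t range;  /* length of the mapped range */
	uint64_t offset; /* offset into the backing GEM object */
};

/*
 * Compute the remainders kept when [req_addr, req_addr + req_range) is
 * unmapped from an existing mapping.  Returns the number of remainders
 * written to 'out' (0, 1 or 2); two remainders means the request split
 * the mapping in the middle.
 */
static int split_on_unmap(const struct va_mapping *m,
			  uint64_t req_addr, uint64_t req_range,
			  struct va_mapping out[2])
{
	uint64_t m_end = m->addr + m->range;
	uint64_t req_end = req_addr + req_range;
	int n = 0;

	if (req_addr > m->addr) {	/* keep the low part */
		out[n].addr = m->addr;
		out[n].range = req_addr - m->addr;
		out[n].offset = m->offset;
		n++;
	}
	if (req_end < m_end) {		/* keep the high part */
		out[n].addr = req_end;
		out[n].range = m_end - req_end;
		out[n].offset = m->offset + (req_end - m->addr);
		n++;
	}
	return n;
}
```

The interesting part is the offset adjustment of the high remainder, which has
to skip everything below the unmapped hole in the backing object.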
>
> The new VM_BIND Nouveau UAPI builds on top of the DRM GPUVA manager and
> provides the following new interfaces.
>
> 1) Initialize a GPU VA space via the new DRM_IOCTL_NOUVEAU_VM_INIT ioctl
> for UMDs to specify the portion of VA space managed by the kernel and
> userspace, respectively.
>
> 2) Allocate and free a VA space region as well as bind and unbind memory
> to the GPU's VA space via the new DRM_IOCTL_NOUVEAU_VM_BIND ioctl.
>
> 3) Execute push buffers with the new DRM_IOCTL_NOUVEAU_EXEC ioctl.
>
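The kernel/userspace VA space split configured through VM_INIT implies a
validity check on every userspace bind request. A sanity check along these
lines is what I would expect; note that this is a hypothetical helper with
invented struct and field names, not the actual nouveau_uvmm code:

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical VA space layout as configured at VM_INIT time. */
struct va_space {
	uint64_t kernel_start; /* start of the kernel-managed region */
	uint64_t kernel_range; /* length of the kernel-managed region */
	uint64_t va_end;       /* one past the last usable VA */
};

/* A userspace bind request must stay within the user-managed portion. */
static bool user_request_valid(const struct va_space *vas,
			       uint64_t addr, uint64_t range)
{
	uint64_t end;

	if (range == 0 || __builtin_add_overflow(addr, range, &end))
		return false;
	if (end > vas->va_end)
		return false;
	/* Reject any overlap with the kernel-managed region. */
	return end <= vas->kernel_start ||
	       addr >= vas->kernel_start + vas->kernel_range;
}
```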
> Both DRM_IOCTL_NOUVEAU_VM_BIND and DRM_IOCTL_NOUVEAU_EXEC make use of the DRM
> scheduler to queue jobs and support asynchronous processing with DRM syncobjs
> as synchronization mechanism.
>
> By default DRM_IOCTL_NOUVEAU_VM_BIND does synchronous processing, while
> DRM_IOCTL_NOUVEAU_EXEC supports asynchronous processing only.
>
> The new VM_BIND UAPI for Nouveau makes also use of drm_exec (execution context
> for GEM buffers) by Christian König. Since the patch implementing drm_exec was
> not yet merged into drm-next it is part of this series, as well as a small fix
> for this patch, which was found while testing this series.
>
> This patch series is also available at [1].
>
> There is a Mesa NVK merge request by Dave Airlie [2] implementing the
> corresponding userspace parts for this series.
>
> The Vulkan CTS test suite passes the sparse binding and sparse residency test
> cases for the new UAPI together with Dave's Mesa work.
>
> There are also some test cases in the igt-gpu-tools project [3] for the new UAPI
> and hence the DRM GPU VA manager. However, most of them are testing the DRM GPU
> VA manager's logic through Nouveau's new UAPI and should be considered just as
> helpers for the implementation.
>
> However, I absolutely intend to change those test cases to proper kunit test
> cases for the DRM GPUVA manager, if and once we agree on its usefulness and
> design.
>
> [1] https://gitlab.freedesktop.org/nouvelles/kernel/-/tree/new-uapi-drm-next /
> https://gitlab.freedesktop.org/nouvelles/kernel/-/merge_requests/1
> [2] https://gitlab.freedesktop.org/nouveau/mesa/-/merge_requests/150/
> [3] https://gitlab.freedesktop.org/dakr/igt-gpu-tools/-/tree/wip_nouveau_vm_bind
>
> Changes in V2:
> ==============
> Nouveau:
> - Reworked the Nouveau VM_BIND UAPI to avoid memory allocations in fence
> signalling critical sections. Updates to the VA space are split up in three
> separate stages, where only the second stage executes in a fence signalling
> critical section:
>
> 1. update the VA space, allocate new structures and page tables
> 2. (un-)map the requested memory bindings
> 3. free structures and page tables
>
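The three-stage scheme can be modeled with a pre-allocated resource pool that
stage 2 may only consume, never grow. This is a deliberately simplified,
standalone sketch (invented names, heap nodes standing in for page tables),
not the nouveau implementation:

```c
#include <stdlib.h>

#define MAX_OPS 8

/* Pre-allocated resources for one bind job. */
struct bind_job {
	void *nodes[MAX_OPS]; /* page-table/VA nodes allocated up front */
	int n_nodes;
	int n_used;
};

/* Stage 1: allocate everything the map step could possibly need. */
static int stage1_prepare(struct bind_job *job, int n)
{
	for (job->n_nodes = 0; job->n_nodes < n; job->n_nodes++) {
		job->nodes[job->n_nodes] = malloc(64);
		if (!job->nodes[job->n_nodes])
			return -1;
	}
	return 0;
}

/* Stage 2: fence signalling critical section -- consume, never allocate. */
static int stage2_map(struct bind_job *job)
{
	if (job->n_used >= job->n_nodes)
		return -1; /* would require an allocation: not allowed here */
	job->n_used++;
	return 0;
}

/* Stage 3: release everything once the job is done signalling. */
static void stage3_cleanup(struct bind_job *job)
{
	for (int i = 0; i < job->n_nodes; i++)
		free(job->nodes[i]);
	job->n_nodes = job->n_used = 0;
}
```

The point of the split is that stage 2 can then run under fence signalling
constraints without ever calling into an allocator.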
> - Separated generic job scheduler code from specific job implementations.
> - Separated the EXEC and VM_BIND implementation of the UAPI.
> - Reworked the locking parts of the nvkm/vmm RAW interface, such that
> (un-)map operations can be executed in fence signalling critical sections.
>
> GPUVA Manager:
> - made drm_gpuva_regions optional for users of the GPUVA manager
> - allow NULL GEMs for drm_gpuva entries
> - switched from drm_mm to maple_tree to track drm_gpuva / drm_gpuva_region
> entries
> - provide callbacks for users to allocate custom drm_gpuva_op structures to
> allow inheritance
> - added user bits to drm_gpuva_flags
> - added a prefetch operation type in order to support generating prefetch
> operations the same way other operations are generated
> - hand the responsibility for mutual exclusion for a GEM's
> drm_gpuva list to the user; simplified corresponding (un-)link functions
>
> Maple Tree:
> - I added two maple tree patches to the series, one to support custom tree
> walk macros and one to hand the locking responsibility to the user of the
> GPUVA manager without pre-defined lockdep checks.
>
> Changes in V3:
> ==============
> Nouveau:
> - Reworked the Nouveau VM_BIND UAPI to do the job cleanup (including page
> table cleanup) within a workqueue rather than the job_free() callback of
> the scheduler itself. A job_free() callback can stall the execution (run()
> callback) of the next job in the queue. Since the page table cleanup
> requires taking the same locks as page table allocation, doing it
> directly in the job_free() callback would still violate the fence
> signalling critical path.
> - Separated Nouveau fence allocation and emit, such that we do not violate
> the fence signalling critical path in EXEC jobs.
> - Implement "regions" (for handling sparse mappings through PDEs and dual
> page tables) within Nouveau.
> - Drop the requirement for every mapping to be contained within a region.
> - Add necessary synchronization of VM_BIND job operation sequences in order
> to work around limitations in page table handling. This will be addressed
> in a future re-work of Nouveau's page table handling.
> - Fixed a couple of race conditions found through more testing. Thanks to
> Dave for consistently trying to break it. :-)
>
> GPUVA Manager:
> - Implement pre-allocation capabilities for tree modifications within fence
> signalling critical sections.
> - Implement accessors to apply tree modifications while walking the GPUVA
> tree in order to actually support processing of drm_gpuva_ops through
> callbacks in fence signalling critical sections rather than through
> pre-allocated operation lists.
> - Remove merging of GPUVAs; the kernel has limited to no knowledge about
> the semantics of mapping sequences. Hence, merging is purely speculative.
> It seems that gaining a significant (or at least a measurable) performance
> increase through merging is way more likely to happen when userspace is
> responsible for merging mappings up to the next larger page size if
> possible.
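The userspace-side merge alluded to here (merging neighbours up to the next
larger page size) reduces to a contiguity-and-alignment check. A hypothetical
helper, purely for illustration:

```c
#include <stdbool.h>
#include <stdint.h>

/* Simplified model of a GPU VA mapping backed by a buffer object. */
struct va_mapping {
	uint64_t addr;   /* GPU virtual address */
	uint64_t range;  /* length of the mapping */
	uint64_t offset; /* offset into the backing buffer object */
};

/*
 * Two adjacent mappings of the same buffer can be merged into one
 * mapping usable with the given (larger, power-of-two) page size iff
 * they are VA- and offset-contiguous and the merged range starts and
 * ends on a big-page boundary.
 */
static bool can_merge(const struct va_mapping *a,
		      const struct va_mapping *b,
		      uint64_t big_page)
{
	if (a->addr + a->range != b->addr)	/* VA-contiguous? */
		return false;
	if (a->offset + a->range != b->offset)	/* BO-contiguous? */
		return false;
	if (a->addr & (big_page - 1))		/* aligned start? */
		return false;
	return ((a->range + b->range) & (big_page - 1)) == 0;
}
```

This is exactly the kind of decision userspace can make better than the
kernel, since it knows whether the merged mapping will ever be split again.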
> - Since merging was removed, regions pretty much lose their right to exist.
> They might still be useful for handling dual page tables or similar
> mechanisms, but since Nouveau seems to be the only driver having a need
> for this for now, regions were removed from the GPUVA manager.
> - Fixed a couple of maple_tree related issues; thanks to Liam for helping me
> out.
>
> Changes in V4:
> ==============
> Nouveau:
> - Refactored how specific VM_BIND and EXEC jobs are created and how their
> arguments are passed to the generic job implementation.
> - Fixed a UAF race condition where bind job ops could have been freed
> already while still waiting for a job cleanup to finish. This is because
> in certain cases we need to wait for mappings actually being unmapped
> before creating sparse regions in the same area.
> - Re-based the code onto drm_exec v4 patch.
>
> GPUVA Manager:
> - Fixed a maple tree related bug when pre-allocating MA states.
> (Boris Brezillon)
> - Made struct drm_gpuva_fn_ops a const object in all occurrences.
> (Boris Brezillon)
>
> Changes in V5:
> ==============
> Nouveau:
> - Link and unlink GPUVAs outside the fence signalling critical path in
> nouveau_uvmm_bind_job_submit() holding the dma-resv lock. Mutual exclusion
> of BO evicts causing mapping invalidation and regular mapping operations
> is ensured with dma-fences.
>
> GPUVA Manager:
> - Removed the separate GEM's GPUVA list lock. Link and unlink as well as
> iterating the GEM's GPUVA list should be protected with the GEM's dma-resv
> lock instead.
> - Renamed DRM_GPUVA_EVICTED flag to DRM_GPUVA_INVALIDATED. Mappings do not
> get evicted, they might get invalidated due to eviction.
> - Maple tree uses the 'unsigned long' type for node entries. While this
> works for GPU VA spaces larger than 32 bit on 64-bit kernels, the GPU VA
> space is limited to 32 bit on 32-bit kernels.
> As long as we do not have a 64-bit capable maple tree for 32-bit kernels,
> the GPU VA manager contains checks to throw warnings when GPU VA entries
> exceed the maple tree's storage capabilities.
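The warning described here boils down to a bounds check of the following
shape. This is a standalone model (the real check lives in the GPUVA manager
and uses the kernel's types); 'ulong_max' stands in for ULONG_MAX on the
target architecture:

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Maple tree indices are 'unsigned long', so on a 32-bit kernel a GPU
 * VA range whose last byte lies above 2^32 - 1 cannot be stored and
 * must be rejected (with a warning).
 */
static bool va_fits_tree(uint64_t addr, uint64_t range, uint64_t ulong_max)
{
	uint64_t last;

	if (range == 0 || __builtin_add_overflow(addr, range - 1, &last))
		return false;
	return last <= ulong_max; /* otherwise: warn and reject */
}
```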
> - Extended the Documentation and added example code as requested by Donald
> Robson.
>
> Christian König (1):
> drm: execution context for GEM buffers v4
>
> Danilo Krummrich (13):
> maple_tree: split up MA_STATE() macro
> drm: manager to keep track of GPUs VA mappings
> drm: debugfs: provide infrastructure to dump a DRM GPU VA space
Core drm patches are
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
The only thing I'm worried about is the 'sync mapping requests have to
go through the async path and wait for all previous async requests to
be processed' problem I mentioned in one of your previous submissions,
but I'm happy to leave that for later.
> drm/nouveau: new VM_BIND uapi interfaces
> drm/nouveau: get vmm via nouveau_cli_vmm()
> drm/nouveau: bo: initialize GEM GPU VA interface
> drm/nouveau: move usercopy helpers to nouveau_drv.h
> drm/nouveau: fence: separate fence alloc and emit
> drm/nouveau: fence: fail to emit when fence context is killed
> drm/nouveau: chan: provide nouveau_channel_kill()
> drm/nouveau: nvkm/vmm: implement raw ops to manage uvmm
> drm/nouveau: implement new VM_BIND uAPI
> drm/nouveau: debugfs: implement DRM GPU VA debugfs
>
> Documentation/gpu/driver-uapi.rst | 11 +
> Documentation/gpu/drm-mm.rst | 54 +
> drivers/gpu/drm/Kconfig | 6 +
> drivers/gpu/drm/Makefile | 3 +
> drivers/gpu/drm/drm_debugfs.c | 41 +
> drivers/gpu/drm/drm_exec.c | 278 +++
> drivers/gpu/drm/drm_gem.c | 3 +
> drivers/gpu/drm/drm_gpuva_mgr.c | 1971 ++++++++++++++++
> drivers/gpu/drm/nouveau/Kbuild | 3 +
> drivers/gpu/drm/nouveau/Kconfig | 2 +
> drivers/gpu/drm/nouveau/dispnv04/crtc.c | 9 +-
> drivers/gpu/drm/nouveau/include/nvif/if000c.h | 26 +-
> drivers/gpu/drm/nouveau/include/nvif/vmm.h | 19 +-
> .../gpu/drm/nouveau/include/nvkm/subdev/mmu.h | 20 +-
> drivers/gpu/drm/nouveau/nouveau_abi16.c | 24 +
> drivers/gpu/drm/nouveau/nouveau_abi16.h | 1 +
> drivers/gpu/drm/nouveau/nouveau_bo.c | 204 +-
> drivers/gpu/drm/nouveau/nouveau_bo.h | 2 +-
> drivers/gpu/drm/nouveau/nouveau_chan.c | 22 +-
> drivers/gpu/drm/nouveau/nouveau_chan.h | 1 +
> drivers/gpu/drm/nouveau/nouveau_debugfs.c | 39 +
> drivers/gpu/drm/nouveau/nouveau_dmem.c | 9 +-
> drivers/gpu/drm/nouveau/nouveau_drm.c | 27 +-
> drivers/gpu/drm/nouveau/nouveau_drv.h | 94 +-
> drivers/gpu/drm/nouveau/nouveau_exec.c | 418 ++++
> drivers/gpu/drm/nouveau/nouveau_exec.h | 54 +
> drivers/gpu/drm/nouveau/nouveau_fence.c | 23 +-
> drivers/gpu/drm/nouveau/nouveau_fence.h | 5 +-
> drivers/gpu/drm/nouveau/nouveau_gem.c | 62 +-
> drivers/gpu/drm/nouveau/nouveau_mem.h | 5 +
> drivers/gpu/drm/nouveau/nouveau_prime.c | 2 +-
> drivers/gpu/drm/nouveau/nouveau_sched.c | 461 ++++
> drivers/gpu/drm/nouveau/nouveau_sched.h | 123 +
> drivers/gpu/drm/nouveau/nouveau_svm.c | 2 +-
> drivers/gpu/drm/nouveau/nouveau_uvmm.c | 1979 +++++++++++++++++
> drivers/gpu/drm/nouveau/nouveau_uvmm.h | 107 +
> drivers/gpu/drm/nouveau/nouveau_vmm.c | 4 +-
> drivers/gpu/drm/nouveau/nvif/vmm.c | 100 +-
> .../gpu/drm/nouveau/nvkm/subdev/mmu/uvmm.c | 213 +-
> drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmm.c | 197 +-
> drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmm.h | 25 +
> .../drm/nouveau/nvkm/subdev/mmu/vmmgf100.c | 16 +-
> .../drm/nouveau/nvkm/subdev/mmu/vmmgp100.c | 16 +-
> .../gpu/drm/nouveau/nvkm/subdev/mmu/vmmnv50.c | 27 +-
> include/drm/drm_debugfs.h | 25 +
> include/drm/drm_drv.h | 6 +
> include/drm/drm_exec.h | 119 +
> include/drm/drm_gem.h | 52 +
> include/drm/drm_gpuva_mgr.h | 682 ++++++
> include/linux/maple_tree.h | 7 +-
> include/uapi/drm/nouveau_drm.h | 209 ++
> 51 files changed, 7566 insertions(+), 242 deletions(-)
> create mode 100644 drivers/gpu/drm/drm_exec.c
> create mode 100644 drivers/gpu/drm/drm_gpuva_mgr.c
> create mode 100644 drivers/gpu/drm/nouveau/nouveau_exec.c
> create mode 100644 drivers/gpu/drm/nouveau/nouveau_exec.h
> create mode 100644 drivers/gpu/drm/nouveau/nouveau_sched.c
> create mode 100644 drivers/gpu/drm/nouveau/nouveau_sched.h
> create mode 100644 drivers/gpu/drm/nouveau/nouveau_uvmm.c
> create mode 100644 drivers/gpu/drm/nouveau/nouveau_uvmm.h
> create mode 100644 include/drm/drm_exec.h
> create mode 100644 include/drm/drm_gpuva_mgr.h
>
>
> base-commit: 2222dcb0775d36de28992f56455ab3967b30d380