* [RFC PATCH 000/162] DG1 + LMEM enabling
From: Matthew Auld @ 2020-11-27 12:04 UTC
  To: intel-gfx; +Cc: dri-devel

This series includes a version of Maarten's series[1], which converts more of
the driver locking over to dma-resv. On top of this we now implement things
like LMEM eviction, which depends on the new locking design.
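
For reference, the core pattern that callers are converted to is a ww
(wound/wait) transaction with backoff-and-retry on -EDEADLK. A rough sketch,
using the i915_gem_ww_ctx helpers introduced later in the series (exact call
sites vary per patch, and some_operation() is just a stand-in for whatever
work needs the object lock):

	struct i915_gem_ww_ctx ww;
	int err;

	/* intr == true: lock acquisition may be interrupted by signals */
	i915_gem_ww_ctx_init(&ww, true);
retry:
	err = i915_gem_object_lock(obj, &ww);
	if (!err)
		err = some_operation(obj);
	if (err == -EDEADLK) {
		/*
		 * We lost a deadlock-avoidance race against another
		 * transaction: drop all held locks, sleep on the contended
		 * object, then restart the whole transaction.
		 */
		err = i915_gem_ww_ctx_backoff(&ww);
		if (!err)
			goto retry;
	}
	i915_gem_ww_ctx_fini(&ww);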

In terms of new uAPI we have gem_create_ext, which adds extension support to
gem_create. For now the only extension lets userspace optionally provide a
priority list of potential placements for the object. The
other bit of new uAPI is the query interface for memory regions, which describes
the supported memory regions for the device. What this reports can then be fed
into gem_create_ext to specify where an object might reside, like device local
memory. Note that the class/instance complexity in the uAPI is not very relevant
for DG1, but is in preparation for the Xe HP multi-tile architecture with
multiple memory regions.
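
To make the intended flow concrete, here is a userspace sketch. The struct
and ioctl names below follow the uAPI as it eventually landed in
include/uapi/drm/i915_drm.h, and may differ in detail from this RFC revision:

	#include <stdint.h>
	#include <sys/ioctl.h>
	#include <drm/i915_drm.h>

	static uint32_t create_lmem_bo(int fd, uint64_t size)
	{
		/*
		 * 1) Discover the supported memory regions. The first query
		 * call only fills in item.length; a real caller would then
		 * allocate that many bytes, set item.data_ptr and query again
		 * to read the drm_i915_memory_region_info array.
		 */
		struct drm_i915_query_item item = {
			.query_id = DRM_I915_QUERY_MEMORY_REGIONS,
		};
		struct drm_i915_query query = {
			.num_items = 1,
			.items_ptr = (uintptr_t)&item,
		};
		ioctl(fd, DRM_IOCTL_I915_QUERY, &query);

		/*
		 * 2) Create an object with a placement priority list: prefer
		 * device local-memory, fall back to system memory.
		 */
		struct drm_i915_gem_memory_class_instance placements[] = {
			{ .memory_class = I915_MEMORY_CLASS_DEVICE, .memory_instance = 0 },
			{ .memory_class = I915_MEMORY_CLASS_SYSTEM, .memory_instance = 0 },
		};
		struct drm_i915_gem_create_ext_memory_regions regions = {
			.base = { .name = I915_GEM_CREATE_EXT_MEMORY_REGIONS },
			.num_regions = 2,
			.regions = (uintptr_t)placements,
		};
		struct drm_i915_gem_create_ext create = {
			.size = size,
			.extensions = (uintptr_t)&regions,
		};
		if (ioctl(fd, DRM_IOCTL_I915_GEM_CREATE_EXT, &create))
			return 0; /* creation failed */
		return create.handle;
	}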

The series still includes relocation support, but that is purely to keep CI
happy until all of the IGT rework[2] is completed, and so it will not be
merged. Likewise for pread/pwrite, which will also be dropped for DG1+.

[1] https://patchwork.freedesktop.org/series/82337/
[2] https://patchwork.freedesktop.org/series/82954/

Abdiel Janulgue (3):
  drm/i915/query: Expose memory regions through the query uAPI
  drm/i915: Provide a way to disable PCIe relaxed write ordering
  drm/i915: Reintroduce mem->reserved

Animesh Manna (2):
  drm/i915/lmem: reset the lmem buffer created by fbdev
  drm/i915/dsb: Enable lmem for dsb

Anshuman Gupta (1):
  drm/i915/oprom: Basic sanitization

Anusha Srivatsa (1):
  drm/i915/lmem: Bypass aperture when lmem is available

Bommu Krishnaiah (1):
  drm/i915/gem: Update shmem available memory

CQ Tang (13):
  drm/i915/dg1: Fix occasional migration error
  drm/i915: i915 returns -EBUSY on thread contention
  drm/i915: setup GPU device lmem region
  drm/i915: Fix object page offset within a region
  drm/i915: add i915_gem_object_is_devmem() function
  drm/i915: finish memory region support for stolen objects.
  drm/i915: Create stolen memory region from local memory
  drm/i915/dg1: intel_memory_region_evict() changes for eviction
  drm/i915/dg1: i915_gem_object_memcpy(..) infrastructure
  drm/i915/dg1: Eviction logic
  drm/i915/dg1: Add enable_eviction modparam
  drm/i915/dg1: Add lmem_size modparam
  drm/i915: need to consider system BO snoop for dgfx

Chris Wilson (2):
  drm/i915/gt: Move context layout registers and offsets to lrc_reg.h
  drm/i915/gt: Rename lrc.c to execlists_submission.c

Clint Taylor (3):
  drm/i915/dg1: Read OPROM via SPI controller
  drm/i915/dg1: Compute MEM Bandwidth using MCHBAR
  drm/i915/dg1: Double memory bandwidth available

Daniele Ceraolo Spurio (5):
  drm/i915: split gen8+ flush and bb_start emission functions to their
    own file
  drm/i915: split wa_bb code to its own file
  drm/i915: Make intel_init_workaround_bb more compatible with ww
    locking.
  drm/i915/guc: put all guc objects in lmem when available
  drm/i915: WA for zero memory channel

Imre Deak (1):
  drm/i915/dg1: Reserve first 1MB of local memory

Kui Wen (1):
  drm/i915/dg1: Do not check r->sgt.pfn for NULL

Lucas De Marchi (2):
  drm/i915: move eviction to prepare hook
  drm/i915/dg1: allow pci to auto probe

Maarten Lankhorst (60):
  drm/i915: Pin timeline map after first timeline pin, v5.
  drm/i915: Move cmd parser pinning to execbuffer
  drm/i915: Add missing -EDEADLK handling to execbuf pinning, v2.
  drm/i915: Ensure we hold the object mutex in pin correctly v2
  drm/i915: Add gem object locking to madvise.
  drm/i915: Move HAS_STRUCT_PAGE to obj->flags
  drm/i915: Rework struct phys attachment handling
  drm/i915: Convert i915_gem_object_attach_phys() to ww locking, v2.
  drm/i915: make lockdep slightly happier about execbuf.
  drm/i915: Disable userptr pread/pwrite support.
  drm/i915: No longer allow exporting userptr through dma-buf
  drm/i915: Reject more ioctls for userptr
  drm/i915: Reject UNSYNCHRONIZED for userptr, v2.
  drm/i915: Make compilation of userptr code depend on MMU_NOTIFIER.
  drm/i915: Fix userptr so we do not have to worry about obj->mm.lock,
    v5.
  drm/i915: Flatten obj->mm.lock
  drm/i915: Populate logical context during first pin.
  drm/i915: Make ring submission compatible with obj->mm.lock removal,
    v2.
  drm/i915: Handle ww locking in init_status_page
  drm/i915: Rework clflush to work correctly without obj->mm.lock.
  drm/i915: Pass ww ctx to intel_pin_to_display_plane
  drm/i915: Add object locking to vm_fault_cpu
  drm/i915: Move pinning to inside engine_wa_list_verify()
  drm/i915: Take reservation lock around i915_vma_pin.
  drm/i915: Make __engine_unpark() compatible with ww locking v2
  drm/i915: Take obj lock around set_domain ioctl
  drm/i915: Defer pin calls in buffer pool until first use by caller.
  drm/i915: Fix pread/pwrite to work with new locking rules.
  drm/i915: Fix workarounds selftest, part 1
  drm/i915: Add igt_spinner_pin() to allow for ww locking around
    spinner.
  drm/i915: Add ww locking around vm_access()
  drm/i915: Increase ww locking for perf.
  drm/i915: Lock ww in ucode objects correctly
  drm/i915: Add ww locking to dma-buf ops.
  drm/i915: Add missing ww lock in intel_dsb_prepare.
  drm/i915: Fix ww locking in shmem_create_from_object
  drm/i915: Use a single page table lock for each gtt.
  drm/i915/selftests: Prepare huge_pages testcases for obj->mm.lock
    removal.
  drm/i915/selftests: Prepare client blit for obj->mm.lock removal.
  drm/i915/selftests: Prepare coherency tests for obj->mm.lock removal.
  drm/i915/selftests: Prepare context tests for obj->mm.lock removal.
  drm/i915/selftests: Prepare dma-buf tests for obj->mm.lock removal.
  drm/i915/selftests: Prepare execbuf tests for obj->mm.lock removal.
  drm/i915/selftests: Prepare mman testcases for obj->mm.lock removal.
  drm/i915/selftests: Prepare object tests for obj->mm.lock removal.
  drm/i915/selftests: Prepare object blit tests for obj->mm.lock
    removal.
  drm/i915/selftests: Prepare igt_gem_utils for obj->mm.lock removal
  drm/i915/selftests: Prepare context selftest for obj->mm.lock removal
  drm/i915/selftests: Prepare hangcheck for obj->mm.lock removal
  drm/i915/selftests: Prepare execlists for obj->mm.lock removal
  drm/i915/selftests: Prepare mocs tests for obj->mm.lock removal
  drm/i915/selftests: Prepare ring submission for obj->mm.lock removal
  drm/i915/selftests: Prepare timeline tests for obj->mm.lock removal
  drm/i915/selftests: Prepare i915_request tests for obj->mm.lock
    removal
  drm/i915/selftests: Prepare memory region tests for obj->mm.lock
    removal
  drm/i915/selftests: Prepare cs engine tests for obj->mm.lock removal
  drm/i915/selftests: Prepare gtt tests for obj->mm.lock removal
  drm/i915: Finally remove obj->mm.lock.
  drm/i915: Keep userpointer bindings if seqcount is unchanged, v2.
  drm/i915: Implement eviction locking v2

Matt Roper (1):
  drm/i915/lmem: Fail driver init if LMEM training failed

Matthew Auld (19):
  drm/i915/selftest: also consider non-contiguous objects
  drm/i915/selftest: assert we get 2M GTT pages
  drm/i915/selftest: handle local-memory in perf_memcpy
  HAX drm/i915/lmem: support CPU relocations
  HAX drm/i915/lmem: support pread and pwrite
  drm/i915: introduce kernel blitter_context
  drm/i915/region: support basic eviction
  drm/i915: support basic object migration
  drm/i915/uapi: introduce drm_i915_gem_create_ext
  drm/i915: setup the LMEM region
  drm/i915/gtt: map the PD up front
  drm/i915/gtt/dgfx: place the PD in LMEM
  drm/i915/gtt: make flushing conditional
  drm/i915/gtt/dg1: add PTE_LM plumbing for PPGTT
  drm/i915/gtt/dg1: add PTE_LM plumbing for GGTT
  drm/i915: allocate context from LMEM
  drm/i915: move engine scratch to LMEM
  drm/i915/lmem: support optional CPU clearing for special internal use
  drm/i915: drop fake lmem

Michael J. Ruhl (2):
  drm/i915/dmabuf: Disallow LMEM objects from dma-buf
  drm/i915/dg1: Introduce dmabuf mmap to LMEM

Michel Thierry (2):
  drm/i915/lmem: allocate cmd ring in lmem
  drm/i915/lmem: allocate HWSP in lmem

Mohammed Khajapasha (2):
  drm/i915/fbdev: Use lmem physical addresses for fb_mmap() on discrete
  drm/i915: Return error value when bo not in LMEM for discrete

Prathap Kumar Valsan (2):
  drm/i915: Store gt in memory region
  drm/i915/pm: suspend and restore ppgtt mapping

Ramalingam C (6):
  drm/i915: define intel_partial_pages_for_sg_table
  drm/i915: create and destroy dummy vma
  drm/i915: blt copy between objs using pre-created vma windows
  drm/i915: window_blt_copy is used for swapin and swapout
  drm/i915: Lmem eviction statistics by category
  drm/i915/gem/selftest: test and measure window based blt cpy

Stuart Summers (1):
  drm/i915: Allow non-uniform subslices in gen12+

Sudeep Dutt (2):
  drm/i915/dg1: Track swap in/out stats via debugfs
  drm/i915/dg1: Measure swap in/out timing stats

Thomas Hellström (15):
  HAX drm/i915: Work around the selftest timeline lock splat workaround
  drm/i915: Introduce drm_i915_lock_isolated
  drm/i915: Lock hwsp objects isolated for pinning at create time
  drm/i915: Prepare for obj->mm.lock removal
  drm/i915: Avoid some false positives in assert_object_held()
  drm/i915: Reference contending lock objects
  drm/i915: Break out dma_resv ww locking utilities to separate files
  drm/i915: Introduce a for_i915_gem_ww(){}
  drm/i915: Untangle the vma pages_mutex
  drm/i915: Add blit functions that can be called from within a WW
    transaction
  drm/i915: Delay publishing objects on the eviction lists
  drm/i915: Perform execbuffer object locking as a separate step
  drm/i915: Support ww eviction
  drm/i915: Use a ww transaction in the fault handler
  drm/i915: Use a ww transaction in i915_gem_object_pin_map_unlocked()

Tvrtko Ursulin (4):
  drm/i915/dg1: Eliminate eviction mutex
  drm/i915/dg1: Keep engine awake across whole blit
  drm/i915/dg1: Add dedicated context for blitter eviction
  drm/i915: Improve accuracy of eviction stats

Venkata Ramana Nayana (8):
  drm/i915: suspend/resume eviction
  drm/i915: Reset blitter context when unpark engine
  drm/i915/gt: Allocate default ctx objects in SMEM
  drm/i915: suspend/resume enable blitter eviction
  drm/i915: suspend/resume handling of perma-pinned objects
  drm/i915: Support ww locks in suspend/resume
  drm/i915/dg1: Fix mapping type for default state object
  drm/i915/dg1: Fix GPU hang due to shmemfs page drop

Venkata Sandeep Dhanalakota (2):
  drm/i915: Update the helper to set correct mapping
  drm/i915/lmem: Limit block size to 4G

Zbigniew Kempczyński (1):
  drm/i915: Distinction of memory regions

 drivers/gpu/drm/i915/Kconfig.debug            |  11 +
 drivers/gpu/drm/i915/Makefile                 |   7 +-
 drivers/gpu/drm/i915/display/intel_bios.c     |  75 +-
 drivers/gpu/drm/i915/display/intel_bw.c       |  64 +-
 drivers/gpu/drm/i915/display/intel_display.c  |  80 +-
 drivers/gpu/drm/i915/display/intel_display.h  |   2 +-
 drivers/gpu/drm/i915/display/intel_dsb.c      |   9 +-
 drivers/gpu/drm/i915/display/intel_fbc.c      |  20 +-
 drivers/gpu/drm/i915/display/intel_fbdev.c    |  54 +-
 drivers/gpu/drm/i915/display/intel_opregion.c | 169 ++++
 drivers/gpu/drm/i915/display/intel_opregion.h |  31 +-
 drivers/gpu/drm/i915/display/intel_overlay.c  |  34 +-
 drivers/gpu/drm/i915/gem/i915_gem_clflush.c   |  15 +-
 drivers/gpu/drm/i915/gem/i915_gem_context.c   |   1 +
 drivers/gpu/drm/i915/gem/i915_gem_create.c    | 398 ++++++++
 drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c    | 123 ++-
 drivers/gpu/drm/i915/gem/i915_gem_domain.c    |  52 +-
 .../gpu/drm/i915/gem/i915_gem_execbuffer.c    | 302 +++++-
 drivers/gpu/drm/i915/gem/i915_gem_fence.c     |  95 --
 drivers/gpu/drm/i915/gem/i915_gem_internal.c  |   6 +-
 drivers/gpu/drm/i915/gem/i915_gem_lmem.c      | 254 ++++-
 drivers/gpu/drm/i915/gem/i915_gem_lmem.h      |  22 +
 drivers/gpu/drm/i915/gem/i915_gem_mman.c      | 187 ++--
 drivers/gpu/drm/i915/gem/i915_gem_mman.h      |  11 +
 drivers/gpu/drm/i915/gem/i915_gem_object.c    | 711 +++++++++++++-
 drivers/gpu/drm/i915/gem/i915_gem_object.h    | 206 +++-
 .../gpu/drm/i915/gem/i915_gem_object_blt.c    | 101 +-
 .../gpu/drm/i915/gem/i915_gem_object_blt.h    |  10 +
 .../gpu/drm/i915/gem/i915_gem_object_types.h  |  46 +-
 drivers/gpu/drm/i915/gem/i915_gem_pages.c     | 140 ++-
 drivers/gpu/drm/i915/gem/i915_gem_phys.c      | 110 +--
 drivers/gpu/drm/i915/gem/i915_gem_region.c    | 234 ++++-
 drivers/gpu/drm/i915/gem/i915_gem_region.h    |   3 +-
 drivers/gpu/drm/i915/gem/i915_gem_shmem.c     |  46 +-
 drivers/gpu/drm/i915/gem/i915_gem_shrinker.c  |  52 +-
 drivers/gpu/drm/i915/gem/i915_gem_shrinker.h  |   6 +-
 drivers/gpu/drm/i915/gem/i915_gem_stolen.c    | 247 +++--
 drivers/gpu/drm/i915/gem/i915_gem_stolen.h    |  10 +-
 drivers/gpu/drm/i915/gem/i915_gem_tiling.c    |   2 -
 drivers/gpu/drm/i915/gem/i915_gem_userptr.c   | 870 ++++++-----------
 .../drm/i915/gem/selftests/huge_gem_object.c  |   4 +-
 .../gpu/drm/i915/gem/selftests/huge_pages.c   |  60 +-
 .../i915/gem/selftests/i915_gem_client_blt.c  |   8 +-
 .../i915/gem/selftests/i915_gem_coherency.c   |  18 +-
 .../drm/i915/gem/selftests/i915_gem_context.c |  21 +-
 .../drm/i915/gem/selftests/i915_gem_dmabuf.c  |   2 +-
 .../i915/gem/selftests/i915_gem_execbuffer.c  |   2 +-
 .../drm/i915/gem/selftests/i915_gem_mman.c    |  33 +-
 .../drm/i915/gem/selftests/i915_gem_object.c  |   2 +-
 .../i915/gem/selftests/i915_gem_object_blt.c  | 172 +++-
 .../drm/i915/gem/selftests/i915_gem_phys.c    |  10 +-
 .../drm/i915/gem/selftests/igt_gem_utils.c    |   2 +-
 drivers/gpu/drm/i915/gt/gen6_ppgtt.c          |  11 +-
 drivers/gpu/drm/i915/gt/gen8_engine_cs.c      | 393 ++++++++
 drivers/gpu/drm/i915/gt/gen8_engine_cs.h      |  26 +
 drivers/gpu/drm/i915/gt/gen8_ppgtt.c          |  91 +-
 drivers/gpu/drm/i915/gt/gen8_ppgtt.h          |   2 +
 drivers/gpu/drm/i915/gt/intel_context.c       |   3 +-
 drivers/gpu/drm/i915/gt/intel_context.h       |   2 +
 drivers/gpu/drm/i915/gt/intel_context_sseu.c  |   2 +-
 drivers/gpu/drm/i915/gt/intel_context_types.h |  13 +-
 drivers/gpu/drm/i915/gt/intel_engine.h        |   4 +
 drivers/gpu/drm/i915/gt/intel_engine_cs.c     | 125 ++-
 drivers/gpu/drm/i915/gt/intel_engine_pm.c     |  14 +-
 drivers/gpu/drm/i915/gt/intel_engine_types.h  |   2 +
 .../drm/i915/gt/intel_engine_workaround_bb.c  | 364 +++++++
 .../drm/i915/gt/intel_engine_workaround_bb.h  |  14 +
 ...tel_lrc.c => intel_execlists_submission.c} | 899 +++---------------
 .../drm/i915/gt/intel_execlists_submission.h  |  66 ++
 drivers/gpu/drm/i915/gt/intel_ggtt.c          |  93 +-
 drivers/gpu/drm/i915/gt/intel_gt.c            |  14 +-
 .../gpu/drm/i915/gt/intel_gt_buffer_pool.c    |  47 +-
 .../gpu/drm/i915/gt/intel_gt_buffer_pool.h    |   5 +
 .../drm/i915/gt/intel_gt_buffer_pool_types.h  |   1 +
 drivers/gpu/drm/i915/gt/intel_gt_irq.c        |   1 +
 drivers/gpu/drm/i915/gt/intel_gtt.c           | 105 +-
 drivers/gpu/drm/i915/gt/intel_gtt.h           |  23 +-
 drivers/gpu/drm/i915/gt/intel_lrc.h           | 128 ---
 drivers/gpu/drm/i915/gt/intel_lrc_reg.h       |  39 +
 drivers/gpu/drm/i915/gt/intel_mocs.c          |   2 +-
 drivers/gpu/drm/i915/gt/intel_ppgtt.c         |  20 +-
 drivers/gpu/drm/i915/gt/intel_renderstate.c   |   2 +-
 drivers/gpu/drm/i915/gt/intel_renderstate.h   |   1 +
 drivers/gpu/drm/i915/gt/intel_ring.c          |  24 +-
 .../gpu/drm/i915/gt/intel_ring_submission.c   | 184 ++--
 drivers/gpu/drm/i915/gt/intel_sseu.c          |   6 +-
 drivers/gpu/drm/i915/gt/intel_timeline.c      | 121 ++-
 drivers/gpu/drm/i915/gt/intel_timeline.h      |   1 +
 .../gpu/drm/i915/gt/intel_timeline_types.h    |   1 +
 drivers/gpu/drm/i915/gt/intel_workarounds.c   |  24 +-
 drivers/gpu/drm/i915/gt/mock_engine.c         |  24 +-
 drivers/gpu/drm/i915/gt/selftest_context.c    |   5 +-
 drivers/gpu/drm/i915/gt/selftest_engine_cs.c  |   4 +-
 .../{selftest_lrc.c => selftest_execlists.c}  |  37 +-
 drivers/gpu/drm/i915/gt/selftest_hangcheck.c  |   8 +-
 drivers/gpu/drm/i915/gt/selftest_mocs.c       |   2 +-
 drivers/gpu/drm/i915/gt/selftest_reset.c      |   5 +-
 .../drm/i915/gt/selftest_ring_submission.c    |   4 +-
 drivers/gpu/drm/i915/gt/selftest_timeline.c   | 100 +-
 .../gpu/drm/i915/gt/selftest_workarounds.c    | 101 +-
 drivers/gpu/drm/i915/gt/shmem_utils.c         |  11 +-
 drivers/gpu/drm/i915/gt/uc/intel_guc.c        |  13 +-
 drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c    |   1 +
 drivers/gpu/drm/i915/gt/uc/intel_guc_fw.c     |  11 +-
 drivers/gpu/drm/i915/gt/uc/intel_guc_log.c    |   4 +-
 .../gpu/drm/i915/gt/uc/intel_guc_submission.c |   1 +
 drivers/gpu/drm/i915/gt/uc/intel_huc.c        |  18 +-
 drivers/gpu/drm/i915/gt/uc/intel_uc_fw.c      |  37 +-
 drivers/gpu/drm/i915/gvt/dmabuf.c             |   2 +-
 drivers/gpu/drm/i915/gvt/mmio_context.h       |   2 +
 drivers/gpu/drm/i915/gvt/scheduler.c          |   1 +
 drivers/gpu/drm/i915/i915_active.c            |  20 +-
 drivers/gpu/drm/i915/i915_cmd_parser.c        | 104 +-
 drivers/gpu/drm/i915/i915_debugfs.c           |  42 +-
 drivers/gpu/drm/i915/i915_drv.c               | 277 +++++-
 drivers/gpu/drm/i915/i915_drv.h               |  57 +-
 drivers/gpu/drm/i915/i915_gem.c               | 418 +++-----
 drivers/gpu/drm/i915/i915_gem.h               |  12 -
 drivers/gpu/drm/i915/i915_gem_gtt.c           |   2 +-
 drivers/gpu/drm/i915/i915_gem_ww.c            |  93 ++
 drivers/gpu/drm/i915/i915_gem_ww.h            |  53 ++
 drivers/gpu/drm/i915/i915_gpu_error.c         |   4 +-
 drivers/gpu/drm/i915/i915_memcpy.c            |   2 +-
 drivers/gpu/drm/i915/i915_memcpy.h            |   2 +-
 drivers/gpu/drm/i915/i915_mm.c                |   2 +-
 drivers/gpu/drm/i915/i915_params.c            |  11 +-
 drivers/gpu/drm/i915/i915_params.h            |   3 +-
 drivers/gpu/drm/i915/i915_pci.c               |   5 +-
 drivers/gpu/drm/i915/i915_perf.c              |  57 +-
 drivers/gpu/drm/i915/i915_query.c             |  62 ++
 drivers/gpu/drm/i915/i915_reg.h               |  17 +
 drivers/gpu/drm/i915/i915_selftest.h          |   2 +
 drivers/gpu/drm/i915/i915_vma.c               | 154 ++-
 drivers/gpu/drm/i915/i915_vma.h               |  28 +-
 drivers/gpu/drm/i915/intel_memory_region.c    | 229 ++++-
 drivers/gpu/drm/i915/intel_memory_region.h    |  53 +-
 drivers/gpu/drm/i915/intel_region_lmem.c      | 168 ++--
 drivers/gpu/drm/i915/intel_region_lmem.h      |   3 +-
 drivers/gpu/drm/i915/intel_uncore.c           |  12 +
 drivers/gpu/drm/i915/selftests/i915_gem_gtt.c | 100 +-
 .../drm/i915/selftests/i915_live_selftests.h  |   1 +
 drivers/gpu/drm/i915/selftests/i915_perf.c    |   3 +-
 drivers/gpu/drm/i915/selftests/i915_request.c |  10 +-
 drivers/gpu/drm/i915/selftests/igt_spinner.c  | 136 ++-
 drivers/gpu/drm/i915/selftests/igt_spinner.h  |   5 +
 .../drm/i915/selftests/intel_memory_region.c  | 442 ++++++++-
 drivers/gpu/drm/i915/selftests/mock_region.c  |   4 +-
 include/uapi/drm/i915_drm.h                   | 118 +++
 148 files changed, 7813 insertions(+), 3312 deletions(-)
 create mode 100644 drivers/gpu/drm/i915/gem/i915_gem_create.c
 delete mode 100644 drivers/gpu/drm/i915/gem/i915_gem_fence.c
 create mode 100644 drivers/gpu/drm/i915/gt/gen8_engine_cs.c
 create mode 100644 drivers/gpu/drm/i915/gt/gen8_engine_cs.h
 create mode 100644 drivers/gpu/drm/i915/gt/intel_engine_workaround_bb.c
 create mode 100644 drivers/gpu/drm/i915/gt/intel_engine_workaround_bb.h
 rename drivers/gpu/drm/i915/gt/{intel_lrc.c => intel_execlists_submission.c} (87%)
 create mode 100644 drivers/gpu/drm/i915/gt/intel_execlists_submission.h
 delete mode 100644 drivers/gpu/drm/i915/gt/intel_lrc.h
 rename drivers/gpu/drm/i915/gt/{selftest_lrc.c => selftest_execlists.c} (99%)
 create mode 100644 drivers/gpu/drm/i915/i915_gem_ww.c
 create mode 100644 drivers/gpu/drm/i915/i915_gem_ww.h

-- 
2.26.2


* [RFC PATCH 001/162] drm/i915/selftest: also consider non-contiguous objects
From: Matthew Auld @ 2020-11-27 12:04 UTC
  To: intel-gfx; +Cc: dri-devel

In igt_ppgtt_sanity_check we should also exercise the non-contiguous
option for LMEM, since this will give us slightly different sg layouts
and alignment.

Signed-off-by: Matthew Auld <matthew.auld@intel.com>
---
 drivers/gpu/drm/i915/gem/selftests/huge_pages.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/gpu/drm/i915/gem/selftests/huge_pages.c b/drivers/gpu/drm/i915/gem/selftests/huge_pages.c
index 1f35e71429b4..0bf93947d89d 100644
--- a/drivers/gpu/drm/i915/gem/selftests/huge_pages.c
+++ b/drivers/gpu/drm/i915/gem/selftests/huge_pages.c
@@ -1333,6 +1333,7 @@ static int igt_ppgtt_sanity_check(void *arg)
 		unsigned int flags;
 	} backends[] = {
 		{ igt_create_system, 0,                        },
+		{ igt_create_local,  0,                        },
 		{ igt_create_local,  I915_BO_ALLOC_CONTIGUOUS, },
 	};
 	struct {
-- 
2.26.2


* [RFC PATCH 002/162] drm/i915/selftest: assert we get 2M GTT pages
From: Matthew Auld @ 2020-11-27 12:04 UTC
  To: intel-gfx; +Cc: dri-devel

For the LMEM case, if we have suitable alignment and 2M physical pages, we
should always get 2M GTT pages within the constraints of the hugepages
selftest. If we don't, then something might be wrong in our construction of
the backing pages.

Signed-off-by: Matthew Auld <matthew.auld@intel.com>
---
 .../gpu/drm/i915/gem/selftests/huge_pages.c   | 21 +++++++++++++++++++
 1 file changed, 21 insertions(+)

diff --git a/drivers/gpu/drm/i915/gem/selftests/huge_pages.c b/drivers/gpu/drm/i915/gem/selftests/huge_pages.c
index 0bf93947d89d..77a13527a7e6 100644
--- a/drivers/gpu/drm/i915/gem/selftests/huge_pages.c
+++ b/drivers/gpu/drm/i915/gem/selftests/huge_pages.c
@@ -368,6 +368,27 @@ static int igt_check_page_sizes(struct i915_vma *vma)
 		err = -EINVAL;
 	}
 
+
+	/*
+	 * The dma-api is like a box of chocolates when it comes to the
+	 * alignment of dma addresses, however for LMEM we have total control
+	 * and so can guarantee alignment, likewise when we allocate our blocks
+	 * they should appear in descending order, and if we know that we align
+	 * to the largest page size for the GTT address, we should be able to
+	 * assert that if we see 2M physical pages then we should also get 2M
+	 * GTT pages. If we don't then something might be wrong in our
+	 * construction of the backing pages.
+	 */
+	if (i915_gem_object_is_lmem(obj) &&
+	    IS_ALIGNED(vma->node.start, SZ_2M) &&
+	    vma->page_sizes.sg & SZ_2M &&
+	    vma->page_sizes.gtt < SZ_2M) {
+		pr_err("gtt pages mismatch for LMEM, expected 2M GTT pages, sg(%u), gtt(%u)\n",
+		       vma->page_sizes.sg, vma->page_sizes.gtt);
+		err = -EINVAL;
+	}
+
+
 	if (obj->mm.page_sizes.gtt) {
 		pr_err("obj->page_sizes.gtt(%u) should never be set\n",
 		       obj->mm.page_sizes.gtt);
-- 
2.26.2


* [RFC PATCH 003/162] drm/i915/selftest: handle local-memory in perf_memcpy
From: Matthew Auld @ 2020-11-27 12:04 UTC
  To: intel-gfx; +Cc: dri-devel

We currently only support WC when mapping device local-memory, so mapping
the object with an unsupported type fails with a generic -ENOMEM. Try to
handle that case as well, although it's starting to get pretty ugly in
there.
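
For context, the failure being interpreted comes from i915's internal
mapping interface; roughly, assuming the pin_map interface as used
elsewhere in the series:

	void *vaddr;

	/*
	 * Local-memory can currently only be mapped write-combined;
	 * requesting a write-back mapping fails with a generic -ENOMEM
	 * rather than a distinctive error code, so the selftest has to
	 * treat -ENOMEM as "mapping type unsupported" for lmem.
	 */
	vaddr = i915_gem_object_pin_map(obj, I915_MAP_WB);
	if (IS_ERR(vaddr) && PTR_ERR(vaddr) == -ENOMEM) {
		/* not an allocation failure for lmem objects */
	}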

Signed-off-by: Matthew Auld <matthew.auld@intel.com>
---
 drivers/gpu/drm/i915/selftests/intel_memory_region.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/gpu/drm/i915/selftests/intel_memory_region.c b/drivers/gpu/drm/i915/selftests/intel_memory_region.c
index 0aeba8e3af28..27389fb19951 100644
--- a/drivers/gpu/drm/i915/selftests/intel_memory_region.c
+++ b/drivers/gpu/drm/i915/selftests/intel_memory_region.c
@@ -681,6 +681,8 @@ create_region_for_mapping(struct intel_memory_region *mr, u64 size, u32 type,
 		i915_gem_object_put(obj);
 		if (PTR_ERR(addr) == -ENXIO)
 			return ERR_PTR(-ENODEV);
+		if (PTR_ERR(addr) == -ENOMEM) /* WB local-memory */
+			return ERR_PTR(-ENODEV);
 		return addr;
 	}
 
-- 
2.26.2


* [RFC PATCH 004/162] drm/i915/gt: Move context layout registers and offsets to lrc_reg.h
From: Matthew Auld @ 2020-11-27 12:04 UTC
  To: intel-gfx
  Cc: Tvrtko Ursulin, dri-devel, Chris Wilson, Daniele Ceraolo Spurio,
	John Harrison

From: Chris Wilson <chris@chris-wilson.co.uk>

Clean up intel_lrc.h by moving some of the residual common register
definitions into intel_lrc_reg.h, prior to rebranding and splitting off
the submission backends.

v2: keep the SCHEDULE enum in the old file, since it is specific to the
gvt usage of the execlists submission backend (John)

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> #v2
Cc: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
---
 drivers/gpu/drm/i915/gt/intel_engine_cs.c |  2 +-
 drivers/gpu/drm/i915/gt/intel_gt_irq.c    |  1 +
 drivers/gpu/drm/i915/gt/intel_lrc.h       | 39 -----------------------
 drivers/gpu/drm/i915/gt/intel_lrc_reg.h   | 39 +++++++++++++++++++++++
 drivers/gpu/drm/i915/gvt/mmio_context.h   |  2 ++
 5 files changed, 43 insertions(+), 40 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_engine_cs.c b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
index d4e988b2816a..02ea16b29c9f 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine_cs.c
+++ b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
@@ -36,7 +36,7 @@
 #include "intel_gt.h"
 #include "intel_gt_requests.h"
 #include "intel_gt_pm.h"
-#include "intel_lrc.h"
+#include "intel_lrc_reg.h"
 #include "intel_reset.h"
 #include "intel_ring.h"
 
diff --git a/drivers/gpu/drm/i915/gt/intel_gt_irq.c b/drivers/gpu/drm/i915/gt/intel_gt_irq.c
index 257063a57101..9830342aa6f4 100644
--- a/drivers/gpu/drm/i915/gt/intel_gt_irq.c
+++ b/drivers/gpu/drm/i915/gt/intel_gt_irq.c
@@ -11,6 +11,7 @@
 #include "intel_breadcrumbs.h"
 #include "intel_gt.h"
 #include "intel_gt_irq.h"
+#include "intel_lrc_reg.h"
 #include "intel_uncore.h"
 #include "intel_rps.h"
 
diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.h b/drivers/gpu/drm/i915/gt/intel_lrc.h
index 802585a308e9..9116b46844a2 100644
--- a/drivers/gpu/drm/i915/gt/intel_lrc.h
+++ b/drivers/gpu/drm/i915/gt/intel_lrc.h
@@ -34,45 +34,6 @@ struct i915_request;
 struct intel_context;
 struct intel_engine_cs;
 
-/* Execlists regs */
-#define RING_ELSP(base)				_MMIO((base) + 0x230)
-#define RING_EXECLIST_STATUS_LO(base)		_MMIO((base) + 0x234)
-#define RING_EXECLIST_STATUS_HI(base)		_MMIO((base) + 0x234 + 4)
-#define RING_CONTEXT_CONTROL(base)		_MMIO((base) + 0x244)
-#define	  CTX_CTRL_INHIBIT_SYN_CTX_SWITCH	(1 << 3)
-#define	  CTX_CTRL_ENGINE_CTX_RESTORE_INHIBIT	(1 << 0)
-#define   CTX_CTRL_RS_CTX_ENABLE		(1 << 1)
-#define	  CTX_CTRL_ENGINE_CTX_SAVE_INHIBIT	(1 << 2)
-#define	  GEN12_CTX_CTRL_OAR_CONTEXT_ENABLE	(1 << 8)
-#define RING_CONTEXT_STATUS_PTR(base)		_MMIO((base) + 0x3a0)
-#define RING_EXECLIST_SQ_CONTENTS(base)		_MMIO((base) + 0x510)
-#define RING_EXECLIST_CONTROL(base)		_MMIO((base) + 0x550)
-
-#define	  EL_CTRL_LOAD				(1 << 0)
-
-/* The docs specify that the write pointer wraps around after 5h, "After status
- * is written out to the last available status QW at offset 5h, this pointer
- * wraps to 0."
- *
- * Therefore, one must infer than even though there are 3 bits available, 6 and
- * 7 appear to be * reserved.
- */
-#define GEN8_CSB_ENTRIES 6
-#define GEN8_CSB_PTR_MASK 0x7
-#define GEN8_CSB_READ_PTR_MASK (GEN8_CSB_PTR_MASK << 8)
-#define GEN8_CSB_WRITE_PTR_MASK (GEN8_CSB_PTR_MASK << 0)
-
-#define GEN11_CSB_ENTRIES 12
-#define GEN11_CSB_PTR_MASK 0xf
-#define GEN11_CSB_READ_PTR_MASK (GEN11_CSB_PTR_MASK << 8)
-#define GEN11_CSB_WRITE_PTR_MASK (GEN11_CSB_PTR_MASK << 0)
-
-#define MAX_CONTEXT_HW_ID (1<<21) /* exclusive */
-#define MAX_GUC_CONTEXT_HW_ID (1 << 20) /* exclusive */
-#define GEN11_MAX_CONTEXT_HW_ID (1<<11) /* exclusive */
-/* in Gen12 ID 0x7FF is reserved to indicate idle */
-#define GEN12_MAX_CONTEXT_HW_ID	(GEN11_MAX_CONTEXT_HW_ID - 1)
-
 enum {
 	INTEL_CONTEXT_SCHEDULE_IN = 0,
 	INTEL_CONTEXT_SCHEDULE_OUT,
diff --git a/drivers/gpu/drm/i915/gt/intel_lrc_reg.h b/drivers/gpu/drm/i915/gt/intel_lrc_reg.h
index 1b51f7b9a5c3..b2e03ce35599 100644
--- a/drivers/gpu/drm/i915/gt/intel_lrc_reg.h
+++ b/drivers/gpu/drm/i915/gt/intel_lrc_reg.h
@@ -52,4 +52,43 @@
 #define GEN8_EXECLISTS_STATUS_BUF 0x370
 #define GEN11_EXECLISTS_STATUS_BUF2 0x3c0
 
+/* Execlists regs */
+#define RING_ELSP(base)				_MMIO((base) + 0x230)
+#define RING_EXECLIST_STATUS_LO(base)		_MMIO((base) + 0x234)
+#define RING_EXECLIST_STATUS_HI(base)		_MMIO((base) + 0x234 + 4)
+#define RING_CONTEXT_CONTROL(base)		_MMIO((base) + 0x244)
+#define	  CTX_CTRL_ENGINE_CTX_RESTORE_INHIBIT	REG_BIT(0)
+#define   CTX_CTRL_RS_CTX_ENABLE		REG_BIT(1)
+#define	  CTX_CTRL_ENGINE_CTX_SAVE_INHIBIT	REG_BIT(2)
+#define	  CTX_CTRL_INHIBIT_SYN_CTX_SWITCH	REG_BIT(3)
+#define	  GEN12_CTX_CTRL_OAR_CONTEXT_ENABLE	REG_BIT(8)
+#define RING_CONTEXT_STATUS_PTR(base)		_MMIO((base) + 0x3a0)
+#define RING_EXECLIST_SQ_CONTENTS(base)		_MMIO((base) + 0x510)
+#define RING_EXECLIST_CONTROL(base)		_MMIO((base) + 0x550)
+#define	  EL_CTRL_LOAD				REG_BIT(0)
+
+/*
+ * The docs specify that the write pointer wraps around after 5h, "After status
+ * is written out to the last available status QW at offset 5h, this pointer
+ * wraps to 0."
+ *
+ * Therefore, one must infer that even though there are 3 bits available, 6 and
+ * 7 appear to be reserved.
+ */
+#define GEN8_CSB_ENTRIES 6
+#define GEN8_CSB_PTR_MASK 0x7
+#define GEN8_CSB_READ_PTR_MASK	(GEN8_CSB_PTR_MASK << 8)
+#define GEN8_CSB_WRITE_PTR_MASK	(GEN8_CSB_PTR_MASK << 0)
+
+#define GEN11_CSB_ENTRIES 12
+#define GEN11_CSB_PTR_MASK 0xf
+#define GEN11_CSB_READ_PTR_MASK		(GEN11_CSB_PTR_MASK << 8)
+#define GEN11_CSB_WRITE_PTR_MASK	(GEN11_CSB_PTR_MASK << 0)
+
+#define MAX_CONTEXT_HW_ID	(1 << 21) /* exclusive */
+#define MAX_GUC_CONTEXT_HW_ID	(1 << 20) /* exclusive */
+#define GEN11_MAX_CONTEXT_HW_ID	(1 << 11) /* exclusive */
+/* in Gen12 ID 0x7FF is reserved to indicate idle */
+#define GEN12_MAX_CONTEXT_HW_ID	(GEN11_MAX_CONTEXT_HW_ID - 1)
+
 #endif /* _INTEL_LRC_REG_H_ */
diff --git a/drivers/gpu/drm/i915/gvt/mmio_context.h b/drivers/gpu/drm/i915/gvt/mmio_context.h
index 3b25e7fe32f6..412b96ee6883 100644
--- a/drivers/gpu/drm/i915/gvt/mmio_context.h
+++ b/drivers/gpu/drm/i915/gvt/mmio_context.h
@@ -36,6 +36,8 @@
 #ifndef __GVT_RENDER_H__
 #define __GVT_RENDER_H__
 
+#include "gt/intel_lrc_reg.h"
+
 struct engine_mmio {
 	enum intel_engine_id id;
 	i915_reg_t reg;
-- 
2.26.2


* [RFC PATCH 005/162] drm/i915/gt: Rename lrc.c to execlists_submission.c
From: Matthew Auld @ 2020-11-27 12:04 UTC
  To: intel-gfx; +Cc: Tvrtko Ursulin, Daniele Ceraolo Spurio, dri-devel, Chris Wilson

From: Chris Wilson <chris@chris-wilson.co.uk>

We want to separate the utility functions for controlling the logical
ring context from the execlists submission mechanism (which is an
overgrown scheduler).

This is similar to Daniele's work to split up the files, but being
selfish I wanted to base it after my own changes to intel_lrc.c petered
out.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
---
 drivers/gpu/drm/i915/Makefile                 |  2 +-
 drivers/gpu/drm/i915/gem/i915_gem_context.c   |  1 +
 drivers/gpu/drm/i915/gt/intel_context_sseu.c  |  2 +-
 drivers/gpu/drm/i915/gt/intel_engine_cs.c     |  1 +
 ...tel_lrc.c => intel_execlists_submission.c} | 30 ++----------------
 ...tel_lrc.h => intel_execlists_submission.h} | 31 +++----------------
 drivers/gpu/drm/i915/gt/intel_mocs.c          |  2 +-
 .../{selftest_lrc.c => selftest_execlists.c}  |  0
 drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c    |  1 +
 .../gpu/drm/i915/gt/uc/intel_guc_submission.c |  1 +
 drivers/gpu/drm/i915/gvt/scheduler.c          |  1 +
 drivers/gpu/drm/i915/i915_drv.h               |  1 -
 drivers/gpu/drm/i915/i915_perf.c              |  1 +
 13 files changed, 16 insertions(+), 58 deletions(-)
 rename drivers/gpu/drm/i915/gt/{intel_lrc.c => intel_execlists_submission.c} (99%)
 rename drivers/gpu/drm/i915/gt/{intel_lrc.h => intel_execlists_submission.h} (57%)
 rename drivers/gpu/drm/i915/gt/{selftest_lrc.c => selftest_execlists.c} (100%)

diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile
index e5574e506a5c..aedbd8f52be8 100644
--- a/drivers/gpu/drm/i915/Makefile
+++ b/drivers/gpu/drm/i915/Makefile
@@ -91,6 +91,7 @@ gt-y += \
 	gt/intel_engine_heartbeat.o \
 	gt/intel_engine_pm.o \
 	gt/intel_engine_user.o \
+	gt/intel_execlists_submission.o \
 	gt/intel_ggtt.o \
 	gt/intel_ggtt_fencing.o \
 	gt/intel_gt.o \
@@ -102,7 +103,6 @@ gt-y += \
 	gt/intel_gt_requests.o \
 	gt/intel_gtt.o \
 	gt/intel_llc.o \
-	gt/intel_lrc.o \
 	gt/intel_mocs.o \
 	gt/intel_ppgtt.o \
 	gt/intel_rc6.o \
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c b/drivers/gpu/drm/i915/gem/i915_gem_context.c
index a6299da64de4..ad136d009d9b 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c
@@ -72,6 +72,7 @@
 #include "gt/intel_context_param.h"
 #include "gt/intel_engine_heartbeat.h"
 #include "gt/intel_engine_user.h"
+#include "gt/intel_execlists_submission.h" /* virtual_engine */
 #include "gt/intel_ring.h"
 
 #include "i915_gem_context.h"
diff --git a/drivers/gpu/drm/i915/gt/intel_context_sseu.c b/drivers/gpu/drm/i915/gt/intel_context_sseu.c
index b9c8163978a3..5f94b44022dc 100644
--- a/drivers/gpu/drm/i915/gt/intel_context_sseu.c
+++ b/drivers/gpu/drm/i915/gt/intel_context_sseu.c
@@ -8,7 +8,7 @@
 #include "intel_context.h"
 #include "intel_engine_pm.h"
 #include "intel_gpu_commands.h"
-#include "intel_lrc.h"
+#include "intel_execlists_submission.h"
 #include "intel_lrc_reg.h"
 #include "intel_ring.h"
 #include "intel_sseu.h"
diff --git a/drivers/gpu/drm/i915/gt/intel_engine_cs.c b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
index 02ea16b29c9f..97ceaf7116e8 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine_cs.c
+++ b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
@@ -33,6 +33,7 @@
 #include "intel_engine.h"
 #include "intel_engine_pm.h"
 #include "intel_engine_user.h"
+#include "intel_execlists_submission.h"
 #include "intel_gt.h"
 #include "intel_gt_requests.h"
 #include "intel_gt_pm.h"
diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c b/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
similarity index 99%
rename from drivers/gpu/drm/i915/gt/intel_lrc.c
rename to drivers/gpu/drm/i915/gt/intel_execlists_submission.c
index 43703efb36d1..fc330233ea20 100644
--- a/drivers/gpu/drm/i915/gt/intel_lrc.c
+++ b/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
@@ -1,31 +1,6 @@
+// SPDX-License-Identifier: MIT
 /*
  * Copyright © 2014 Intel Corporation
- *
- * Permission is hereby granted, free of charge, to any person obtaining a
- * copy of this software and associated documentation files (the "Software"),
- * to deal in the Software without restriction, including without limitation
- * the rights to use, copy, modify, merge, publish, distribute, sublicense,
- * and/or sell copies of the Software, and to permit persons to whom the
- * Software is furnished to do so, subject to the following conditions:
- *
- * The above copyright notice and this permission notice (including the next
- * paragraph) shall be included in all copies or substantial portions of the
- * Software.
- *
- * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
- * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
- * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
- * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
- * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
- * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
- * IN THE SOFTWARE.
- *
- * Authors:
- *    Ben Widawsky <ben@bwidawsk.net>
- *    Michel Thierry <michel.thierry@intel.com>
- *    Thomas Daniel <thomas.daniel@intel.com>
- *    Oscar Mateo <oscar.mateo@intel.com>
- *
  */
 
 /**
@@ -140,6 +115,7 @@
 #include "intel_breadcrumbs.h"
 #include "intel_context.h"
 #include "intel_engine_pm.h"
+#include "intel_execlists_submission.h"
 #include "intel_gt.h"
 #include "intel_gt_pm.h"
 #include "intel_gt_requests.h"
@@ -6127,5 +6103,5 @@ intel_engine_in_execlists_submission_mode(const struct intel_engine_cs *engine)
 }
 
 #if IS_ENABLED(CONFIG_DRM_I915_SELFTEST)
-#include "selftest_lrc.c"
+#include "selftest_execlists.c"
 #endif
diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.h b/drivers/gpu/drm/i915/gt/intel_execlists_submission.h
similarity index 57%
rename from drivers/gpu/drm/i915/gt/intel_lrc.h
rename to drivers/gpu/drm/i915/gt/intel_execlists_submission.h
index 9116b46844a2..2c9d7354b42f 100644
--- a/drivers/gpu/drm/i915/gt/intel_lrc.h
+++ b/drivers/gpu/drm/i915/gt/intel_execlists_submission.h
@@ -1,35 +1,15 @@
+/* SPDX-License-Identifier: MIT */
 /*
  * Copyright © 2014 Intel Corporation
- *
- * Permission is hereby granted, free of charge, to any person obtaining a
- * copy of this software and associated documentation files (the "Software"),
- * to deal in the Software without restriction, including without limitation
- * the rights to use, copy, modify, merge, publish, distribute, sublicense,
- * and/or sell copies of the Software, and to permit persons to whom the
- * Software is furnished to do so, subject to the following conditions:
- *
- * The above copyright notice and this permission notice (including the next
- * paragraph) shall be included in all copies or substantial portions of the
- * Software.
- *
- * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
- * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
- * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
- * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
- * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
- * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
- * DEALINGS IN THE SOFTWARE.
  */
 
-#ifndef _INTEL_LRC_H_
-#define _INTEL_LRC_H_
+#ifndef __INTEL_EXECLISTS_SUBMISSION_H__
+#define __INTEL_EXECLISTS_SUBMISSION_H__
 
 #include <linux/types.h>
 
 struct drm_printer;
 
-struct drm_i915_private;
-struct i915_gem_context;
 struct i915_request;
 struct intel_context;
 struct intel_engine_cs;
@@ -40,9 +20,6 @@ enum {
 	INTEL_CONTEXT_SCHEDULE_PREEMPTED,
 };
 
-/* Logical Rings */
-void intel_logical_ring_cleanup(struct intel_engine_cs *engine);
-
 int intel_execlists_submission_setup(struct intel_engine_cs *engine);
 
 /* Logical Ring Contexts */
@@ -86,4 +63,4 @@ int intel_virtual_engine_attach_bond(struct intel_engine_cs *engine,
 bool
 intel_engine_in_execlists_submission_mode(const struct intel_engine_cs *engine);
 
-#endif /* _INTEL_LRC_H_ */
+#endif /* __INTEL_EXECLISTS_SUBMISSION_H__ */
diff --git a/drivers/gpu/drm/i915/gt/intel_mocs.c b/drivers/gpu/drm/i915/gt/intel_mocs.c
index b8d0c32ae9dd..516206007398 100644
--- a/drivers/gpu/drm/i915/gt/intel_mocs.c
+++ b/drivers/gpu/drm/i915/gt/intel_mocs.c
@@ -24,8 +24,8 @@
 
 #include "intel_engine.h"
 #include "intel_gt.h"
+#include "intel_lrc_reg.h"
 #include "intel_mocs.h"
-#include "intel_lrc.h"
 #include "intel_ring.h"
 
 /* structures required */
diff --git a/drivers/gpu/drm/i915/gt/selftest_lrc.c b/drivers/gpu/drm/i915/gt/selftest_execlists.c
similarity index 100%
rename from drivers/gpu/drm/i915/gt/selftest_lrc.c
rename to drivers/gpu/drm/i915/gt/selftest_execlists.c
diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c
index 5212ff844292..1a2e4f631763 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c
@@ -3,6 +3,7 @@
  * Copyright © 2014-2019 Intel Corporation
  */
 
+#include "gt/intel_execlists_submission.h" /* lrc layout */
 #include "gt/intel_gt.h"
 #include "intel_guc_ads.h"
 #include "intel_uc.h"
diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
index fdfeb4b9b0f5..8528ab574dbe 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
@@ -8,6 +8,7 @@
 #include "gem/i915_gem_context.h"
 #include "gt/intel_context.h"
 #include "gt/intel_engine_pm.h"
+#include "gt/intel_execlists_submission.h" /* XXX */
 #include "gt/intel_gt.h"
 #include "gt/intel_gt_pm.h"
 #include "gt/intel_lrc_reg.h"
diff --git a/drivers/gpu/drm/i915/gvt/scheduler.c b/drivers/gpu/drm/i915/gvt/scheduler.c
index aed2ef6466a2..ed30fdde4114 100644
--- a/drivers/gpu/drm/i915/gvt/scheduler.c
+++ b/drivers/gpu/drm/i915/gvt/scheduler.c
@@ -37,6 +37,7 @@
 
 #include "gem/i915_gem_pm.h"
 #include "gt/intel_context.h"
+#include "gt/intel_execlists_submission.h"
 #include "gt/intel_ring.h"
 
 #include "i915_drv.h"
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 15be8debae54..0f7bf6831633 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -79,7 +79,6 @@
 #include "gem/i915_gem_shrinker.h"
 #include "gem/i915_gem_stolen.h"
 
-#include "gt/intel_lrc.h"
 #include "gt/intel_engine.h"
 #include "gt/intel_gt_types.h"
 #include "gt/intel_workarounds.h"
diff --git a/drivers/gpu/drm/i915/i915_perf.c b/drivers/gpu/drm/i915/i915_perf.c
index 3b12c8ff7182..0b300e0d9561 100644
--- a/drivers/gpu/drm/i915/i915_perf.c
+++ b/drivers/gpu/drm/i915/i915_perf.c
@@ -198,6 +198,7 @@
 #include "gem/i915_gem_context.h"
 #include "gt/intel_engine_pm.h"
 #include "gt/intel_engine_user.h"
+#include "gt/intel_execlists_submission.h"
 #include "gt/intel_gt.h"
 #include "gt/intel_lrc_reg.h"
 #include "gt/intel_ring.h"
-- 
2.26.2


* [RFC PATCH 006/162] drm/i915: split gen8+ flush and bb_start emission functions to their own file
From: Matthew Auld @ 2020-11-27 12:04 UTC
  To: intel-gfx
  Cc: Tvrtko Ursulin, Chris P Wilson, Daniele Ceraolo Spurio,
	dri-devel, John Harrison

From: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>

These functions are independent of the backend used and can therefore be
split out of the execlists submission file, so they can be re-used by the
upcoming GuC submission backend.

Based on a patch by Chris Wilson.
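
To make the reuse concrete: these emitters are installed through the common
engine vfuncs, so any submission backend can pick them up. A sketch, not
lifted verbatim from any single patch:

	/* e.g. in a backend's *_submission_setup(engine) */
	engine->emit_flush = gen12_emit_flush;
	engine->emit_bb_start = gen8_emit_bb_start;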

Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Chris P Wilson <chris.p.wilson@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Reviewed-by: John Harrison <John.C.Harrison@Intel.com>
---
 drivers/gpu/drm/i915/Makefile                 |   1 +
 drivers/gpu/drm/i915/gt/gen8_engine_cs.c      | 393 ++++++++++++++++++
 drivers/gpu/drm/i915/gt/gen8_engine_cs.h      |  26 ++
 .../drm/i915/gt/intel_execlists_submission.c  | 385 +----------------
 4 files changed, 421 insertions(+), 384 deletions(-)
 create mode 100644 drivers/gpu/drm/i915/gt/gen8_engine_cs.c
 create mode 100644 drivers/gpu/drm/i915/gt/gen8_engine_cs.h

diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile
index aedbd8f52be8..f9ef5199b124 100644
--- a/drivers/gpu/drm/i915/Makefile
+++ b/drivers/gpu/drm/i915/Makefile
@@ -82,6 +82,7 @@ gt-y += \
 	gt/gen6_engine_cs.o \
 	gt/gen6_ppgtt.o \
 	gt/gen7_renderclear.o \
+	gt/gen8_engine_cs.o \
 	gt/gen8_ppgtt.o \
 	gt/intel_breadcrumbs.o \
 	gt/intel_context.o \
diff --git a/drivers/gpu/drm/i915/gt/gen8_engine_cs.c b/drivers/gpu/drm/i915/gt/gen8_engine_cs.c
new file mode 100644
index 000000000000..a96fe108685e
--- /dev/null
+++ b/drivers/gpu/drm/i915/gt/gen8_engine_cs.c
@@ -0,0 +1,393 @@
+// SPDX-License-Identifier: MIT
+/*
+ * Copyright © 2014 Intel Corporation
+ */
+
+#include "i915_drv.h"
+#include "intel_execlists_submission.h" /* XXX */
+#include "intel_gpu_commands.h"
+#include "intel_ring.h"
+
+int gen8_emit_flush_render(struct i915_request *request, u32 mode)
+{
+	bool vf_flush_wa = false, dc_flush_wa = false;
+	u32 *cs, flags = 0;
+	int len;
+
+	flags |= PIPE_CONTROL_CS_STALL;
+
+	if (mode & EMIT_FLUSH) {
+		flags |= PIPE_CONTROL_RENDER_TARGET_CACHE_FLUSH;
+		flags |= PIPE_CONTROL_DEPTH_CACHE_FLUSH;
+		flags |= PIPE_CONTROL_DC_FLUSH_ENABLE;
+		flags |= PIPE_CONTROL_FLUSH_ENABLE;
+	}
+
+	if (mode & EMIT_INVALIDATE) {
+		flags |= PIPE_CONTROL_TLB_INVALIDATE;
+		flags |= PIPE_CONTROL_INSTRUCTION_CACHE_INVALIDATE;
+		flags |= PIPE_CONTROL_TEXTURE_CACHE_INVALIDATE;
+		flags |= PIPE_CONTROL_VF_CACHE_INVALIDATE;
+		flags |= PIPE_CONTROL_CONST_CACHE_INVALIDATE;
+		flags |= PIPE_CONTROL_STATE_CACHE_INVALIDATE;
+		flags |= PIPE_CONTROL_QW_WRITE;
+		flags |= PIPE_CONTROL_STORE_DATA_INDEX;
+
+		/*
+		 * On GEN9: before VF_CACHE_INVALIDATE we need to emit a NULL
+		 * pipe control.
+		 */
+		if (IS_GEN(request->engine->i915, 9))
+			vf_flush_wa = true;
+
+		/* WaForGAMHang:kbl */
+		if (IS_KBL_GT_REVID(request->engine->i915, 0, KBL_REVID_B0))
+			dc_flush_wa = true;
+	}
+
+	len = 6;
+
+	if (vf_flush_wa)
+		len += 6;
+
+	if (dc_flush_wa)
+		len += 12;
+
+	cs = intel_ring_begin(request, len);
+	if (IS_ERR(cs))
+		return PTR_ERR(cs);
+
+	if (vf_flush_wa)
+		cs = gen8_emit_pipe_control(cs, 0, 0);
+
+	if (dc_flush_wa)
+		cs = gen8_emit_pipe_control(cs, PIPE_CONTROL_DC_FLUSH_ENABLE,
+					    0);
+
+	cs = gen8_emit_pipe_control(cs, flags, LRC_PPHWSP_SCRATCH_ADDR);
+
+	if (dc_flush_wa)
+		cs = gen8_emit_pipe_control(cs, PIPE_CONTROL_CS_STALL, 0);
+
+	intel_ring_advance(request, cs);
+
+	return 0;
+}
+
+int gen8_emit_flush(struct i915_request *request, u32 mode)
+{
+	u32 cmd, *cs;
+
+	cs = intel_ring_begin(request, 4);
+	if (IS_ERR(cs))
+		return PTR_ERR(cs);
+
+	cmd = MI_FLUSH_DW + 1;
+
+	/* We always require a command barrier so that subsequent
+	 * commands, such as breadcrumb interrupts, are strictly ordered
+	 * wrt the contents of the write cache being flushed to memory
+	 * (and thus being coherent from the CPU).
+	 */
+	cmd |= MI_FLUSH_DW_STORE_INDEX | MI_FLUSH_DW_OP_STOREDW;
+
+	if (mode & EMIT_INVALIDATE) {
+		cmd |= MI_INVALIDATE_TLB;
+		if (request->engine->class == VIDEO_DECODE_CLASS)
+			cmd |= MI_INVALIDATE_BSD;
+	}
+
+	*cs++ = cmd;
+	*cs++ = LRC_PPHWSP_SCRATCH_ADDR;
+	*cs++ = 0; /* upper addr */
+	*cs++ = 0; /* value */
+	intel_ring_advance(request, cs);
+
+	return 0;
+}
+
+int gen11_emit_flush_render(struct i915_request *request, u32 mode)
+{
+	if (mode & EMIT_FLUSH) {
+		u32 *cs;
+		u32 flags = 0;
+
+		flags |= PIPE_CONTROL_CS_STALL;
+
+		flags |= PIPE_CONTROL_TILE_CACHE_FLUSH;
+		flags |= PIPE_CONTROL_RENDER_TARGET_CACHE_FLUSH;
+		flags |= PIPE_CONTROL_DEPTH_CACHE_FLUSH;
+		flags |= PIPE_CONTROL_DC_FLUSH_ENABLE;
+		flags |= PIPE_CONTROL_FLUSH_ENABLE;
+		flags |= PIPE_CONTROL_QW_WRITE;
+		flags |= PIPE_CONTROL_STORE_DATA_INDEX;
+
+		cs = intel_ring_begin(request, 6);
+		if (IS_ERR(cs))
+			return PTR_ERR(cs);
+
+		cs = gen8_emit_pipe_control(cs, flags, LRC_PPHWSP_SCRATCH_ADDR);
+		intel_ring_advance(request, cs);
+	}
+
+	if (mode & EMIT_INVALIDATE) {
+		u32 *cs;
+		u32 flags = 0;
+
+		flags |= PIPE_CONTROL_CS_STALL;
+
+		flags |= PIPE_CONTROL_COMMAND_CACHE_INVALIDATE;
+		flags |= PIPE_CONTROL_TLB_INVALIDATE;
+		flags |= PIPE_CONTROL_INSTRUCTION_CACHE_INVALIDATE;
+		flags |= PIPE_CONTROL_TEXTURE_CACHE_INVALIDATE;
+		flags |= PIPE_CONTROL_VF_CACHE_INVALIDATE;
+		flags |= PIPE_CONTROL_CONST_CACHE_INVALIDATE;
+		flags |= PIPE_CONTROL_STATE_CACHE_INVALIDATE;
+		flags |= PIPE_CONTROL_QW_WRITE;
+		flags |= PIPE_CONTROL_STORE_DATA_INDEX;
+
+		cs = intel_ring_begin(request, 6);
+		if (IS_ERR(cs))
+			return PTR_ERR(cs);
+
+		cs = gen8_emit_pipe_control(cs, flags, LRC_PPHWSP_SCRATCH_ADDR);
+		intel_ring_advance(request, cs);
+	}
+
+	return 0;
+}
+
+static u32 preparser_disable(bool state)
+{
+	return MI_ARB_CHECK | 1 << 8 | state;
+}
+
+static i915_reg_t aux_inv_reg(const struct intel_engine_cs *engine)
+{
+	static const i915_reg_t vd[] = {
+		GEN12_VD0_AUX_NV,
+		GEN12_VD1_AUX_NV,
+		GEN12_VD2_AUX_NV,
+		GEN12_VD3_AUX_NV,
+	};
+
+	static const i915_reg_t ve[] = {
+		GEN12_VE0_AUX_NV,
+		GEN12_VE1_AUX_NV,
+	};
+
+	if (engine->class == VIDEO_DECODE_CLASS)
+		return vd[engine->instance];
+
+	if (engine->class == VIDEO_ENHANCEMENT_CLASS)
+		return ve[engine->instance];
+
+	GEM_BUG_ON("unknown aux_inv_reg\n");
+
+	return INVALID_MMIO_REG;
+}
+
+static u32 *
+gen12_emit_aux_table_inv(const i915_reg_t inv_reg, u32 *cs)
+{
+	*cs++ = MI_LOAD_REGISTER_IMM(1);
+	*cs++ = i915_mmio_reg_offset(inv_reg);
+	*cs++ = AUX_INV;
+	*cs++ = MI_NOOP;
+
+	return cs;
+}
+
+int gen12_emit_flush_render(struct i915_request *request, u32 mode)
+{
+	if (mode & EMIT_FLUSH) {
+		u32 flags = 0;
+		u32 *cs;
+
+		flags |= PIPE_CONTROL_TILE_CACHE_FLUSH;
+		flags |= PIPE_CONTROL_FLUSH_L3;
+		flags |= PIPE_CONTROL_RENDER_TARGET_CACHE_FLUSH;
+		flags |= PIPE_CONTROL_DEPTH_CACHE_FLUSH;
+		/* Wa_1409600907:tgl */
+		flags |= PIPE_CONTROL_DEPTH_STALL;
+		flags |= PIPE_CONTROL_DC_FLUSH_ENABLE;
+		flags |= PIPE_CONTROL_FLUSH_ENABLE;
+
+		flags |= PIPE_CONTROL_STORE_DATA_INDEX;
+		flags |= PIPE_CONTROL_QW_WRITE;
+
+		flags |= PIPE_CONTROL_CS_STALL;
+
+		cs = intel_ring_begin(request, 6);
+		if (IS_ERR(cs))
+			return PTR_ERR(cs);
+
+		cs = gen12_emit_pipe_control(cs,
+					     PIPE_CONTROL0_HDC_PIPELINE_FLUSH,
+					     flags, LRC_PPHWSP_SCRATCH_ADDR);
+		intel_ring_advance(request, cs);
+	}
+
+	if (mode & EMIT_INVALIDATE) {
+		u32 flags = 0;
+		u32 *cs;
+
+		flags |= PIPE_CONTROL_COMMAND_CACHE_INVALIDATE;
+		flags |= PIPE_CONTROL_TLB_INVALIDATE;
+		flags |= PIPE_CONTROL_INSTRUCTION_CACHE_INVALIDATE;
+		flags |= PIPE_CONTROL_TEXTURE_CACHE_INVALIDATE;
+		flags |= PIPE_CONTROL_VF_CACHE_INVALIDATE;
+		flags |= PIPE_CONTROL_CONST_CACHE_INVALIDATE;
+		flags |= PIPE_CONTROL_STATE_CACHE_INVALIDATE;
+
+		flags |= PIPE_CONTROL_STORE_DATA_INDEX;
+		flags |= PIPE_CONTROL_QW_WRITE;
+
+		flags |= PIPE_CONTROL_CS_STALL;
+
+		cs = intel_ring_begin(request, 8 + 4);
+		if (IS_ERR(cs))
+			return PTR_ERR(cs);
+
+		/*
+		 * Prevent the pre-parser from skipping past the TLB
+		 * invalidate and loading a stale page for the batch
+		 * buffer / request payload.
+		 */
+		*cs++ = preparser_disable(true);
+
+		cs = gen8_emit_pipe_control(cs, flags, LRC_PPHWSP_SCRATCH_ADDR);
+
+		/* hsdes: 1809175790 */
+		cs = gen12_emit_aux_table_inv(GEN12_GFX_CCS_AUX_NV, cs);
+
+		*cs++ = preparser_disable(false);
+		intel_ring_advance(request, cs);
+	}
+
+	return 0;
+}
+
+int gen12_emit_flush(struct i915_request *request, u32 mode)
+{
+	intel_engine_mask_t aux_inv = 0;
+	u32 cmd, *cs;
+
+	cmd = 4;
+	if (mode & EMIT_INVALIDATE)
+		cmd += 2;
+	if (mode & EMIT_INVALIDATE)
+		aux_inv = request->engine->mask & ~BIT(BCS0);
+	if (aux_inv)
+		cmd += 2 * hweight8(aux_inv) + 2;
+
+	cs = intel_ring_begin(request, cmd);
+	if (IS_ERR(cs))
+		return PTR_ERR(cs);
+
+	if (mode & EMIT_INVALIDATE)
+		*cs++ = preparser_disable(true);
+
+	cmd = MI_FLUSH_DW + 1;
+
+	/* We always require a command barrier so that subsequent
+	 * commands, such as breadcrumb interrupts, are strictly ordered
+	 * wrt the contents of the write cache being flushed to memory
+	 * (and thus being coherent from the CPU).
+	 */
+	cmd |= MI_FLUSH_DW_STORE_INDEX | MI_FLUSH_DW_OP_STOREDW;
+
+	if (mode & EMIT_INVALIDATE) {
+		cmd |= MI_INVALIDATE_TLB;
+		if (request->engine->class == VIDEO_DECODE_CLASS)
+			cmd |= MI_INVALIDATE_BSD;
+	}
+
+	*cs++ = cmd;
+	*cs++ = LRC_PPHWSP_SCRATCH_ADDR;
+	*cs++ = 0; /* upper addr */
+	*cs++ = 0; /* value */
+
+	if (aux_inv) { /* hsdes: 1809175790 */
+		struct intel_engine_cs *engine;
+		unsigned int tmp;
+
+		*cs++ = MI_LOAD_REGISTER_IMM(hweight8(aux_inv));
+		for_each_engine_masked(engine, request->engine->gt,
+				       aux_inv, tmp) {
+			*cs++ = i915_mmio_reg_offset(aux_inv_reg(engine));
+			*cs++ = AUX_INV;
+		}
+		*cs++ = MI_NOOP;
+	}
+
+	if (mode & EMIT_INVALIDATE)
+		*cs++ = preparser_disable(false);
+
+	intel_ring_advance(request, cs);
+
+	return 0;
+}
+
+int gen8_emit_bb_start_noarb(struct i915_request *rq,
+			     u64 offset, u32 len,
+			     const unsigned int flags)
+{
+	u32 *cs;
+
+	cs = intel_ring_begin(rq, 4);
+	if (IS_ERR(cs))
+		return PTR_ERR(cs);
+
+	/*
+	 * WaDisableCtxRestoreArbitration:bdw,chv
+	 *
+	 * We don't need to perform MI_ARB_ENABLE as often as we do (in
+	 * particular all the gen that do not need the w/a at all!), if we
+	 * took care to make sure that on every switch into this context
+	 * (both ordinary and for preemption) that arbitration was enabled
+	 * we would be fine.  However, for gen8 there is another w/a that
+	 * requires us to not preempt inside GPGPU execution, so we keep
+	 * arbitration disabled for gen8 batches. Arbitration will be
+	 * re-enabled before we close the request
+	 * (engine->emit_fini_breadcrumb).
+	 */
+	*cs++ = MI_ARB_ON_OFF | MI_ARB_DISABLE;
+
+	/* FIXME(BDW+): Address space and security selectors. */
+	*cs++ = MI_BATCH_BUFFER_START_GEN8 |
+		(flags & I915_DISPATCH_SECURE ? 0 : BIT(8));
+	*cs++ = lower_32_bits(offset);
+	*cs++ = upper_32_bits(offset);
+
+	intel_ring_advance(rq, cs);
+
+	return 0;
+}
+
+int gen8_emit_bb_start(struct i915_request *rq,
+		       u64 offset, u32 len,
+		       const unsigned int flags)
+{
+	u32 *cs;
+
+	cs = intel_ring_begin(rq, 6);
+	if (IS_ERR(cs))
+		return PTR_ERR(cs);
+
+	*cs++ = MI_ARB_ON_OFF | MI_ARB_ENABLE;
+
+	*cs++ = MI_BATCH_BUFFER_START_GEN8 |
+		(flags & I915_DISPATCH_SECURE ? 0 : BIT(8));
+	*cs++ = lower_32_bits(offset);
+	*cs++ = upper_32_bits(offset);
+
+	*cs++ = MI_ARB_ON_OFF | MI_ARB_DISABLE;
+	*cs++ = MI_NOOP;
+
+	intel_ring_advance(rq, cs);
+
+	return 0;
+}
diff --git a/drivers/gpu/drm/i915/gt/gen8_engine_cs.h b/drivers/gpu/drm/i915/gt/gen8_engine_cs.h
new file mode 100644
index 000000000000..c0c62284b650
--- /dev/null
+++ b/drivers/gpu/drm/i915/gt/gen8_engine_cs.h
@@ -0,0 +1,26 @@
+/* SPDX-License-Identifier: MIT */
+/*
+ * Copyright © 2014 Intel Corporation
+ */
+
+#ifndef __GEN8_ENGINE_CS_H__
+#define __GEN8_ENGINE_CS_H__
+
+#include <linux/types.h>
+
+struct i915_request;
+
+int gen8_emit_flush_render(struct i915_request *request, u32 mode);
+int gen8_emit_flush(struct i915_request *request, u32 mode);
+int gen11_emit_flush_render(struct i915_request *request, u32 mode);
+int gen12_emit_flush_render(struct i915_request *request, u32 mode);
+int gen12_emit_flush(struct i915_request *request, u32 mode);
+
+int gen8_emit_bb_start_noarb(struct i915_request *rq,
+			     u64 offset, u32 len,
+			     const unsigned int flags);
+int gen8_emit_bb_start(struct i915_request *rq,
+		       u64 offset, u32 len,
+		       const unsigned int flags);
+
+#endif /* __GEN8_ENGINE_CS_H__ */
diff --git a/drivers/gpu/drm/i915/gt/intel_execlists_submission.c b/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
index fc330233ea20..9069a456d2f7 100644
--- a/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
+++ b/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
@@ -112,6 +112,7 @@
 #include "i915_perf.h"
 #include "i915_trace.h"
 #include "i915_vgpu.h"
+#include "gen8_engine_cs.h"
 #include "intel_breadcrumbs.h"
 #include "intel_context.h"
 #include "intel_engine_pm.h"
@@ -4465,67 +4466,6 @@ static void execlists_reset_finish(struct intel_engine_cs *engine)
 		     atomic_read(&execlists->tasklet.count));
 }
 
-static int gen8_emit_bb_start_noarb(struct i915_request *rq,
-				    u64 offset, u32 len,
-				    const unsigned int flags)
-{
-	u32 *cs;
-
-	cs = intel_ring_begin(rq, 4);
-	if (IS_ERR(cs))
-		return PTR_ERR(cs);
-
-	/*
-	 * WaDisableCtxRestoreArbitration:bdw,chv
-	 *
-	 * We don't need to perform MI_ARB_ENABLE as often as we do (in
-	 * particular all the gen that do not need the w/a at all!), if we
-	 * took care to make sure that on every switch into this context
-	 * (both ordinary and for preemption) that arbitrartion was enabled
-	 * we would be fine.  However, for gen8 there is another w/a that
-	 * requires us to not preempt inside GPGPU execution, so we keep
-	 * arbitration disabled for gen8 batches. Arbitration will be
-	 * re-enabled before we close the request
-	 * (engine->emit_fini_breadcrumb).
-	 */
-	*cs++ = MI_ARB_ON_OFF | MI_ARB_DISABLE;
-
-	/* FIXME(BDW+): Address space and security selectors. */
-	*cs++ = MI_BATCH_BUFFER_START_GEN8 |
-		(flags & I915_DISPATCH_SECURE ? 0 : BIT(8));
-	*cs++ = lower_32_bits(offset);
-	*cs++ = upper_32_bits(offset);
-
-	intel_ring_advance(rq, cs);
-
-	return 0;
-}
-
-static int gen8_emit_bb_start(struct i915_request *rq,
-			      u64 offset, u32 len,
-			      const unsigned int flags)
-{
-	u32 *cs;
-
-	cs = intel_ring_begin(rq, 6);
-	if (IS_ERR(cs))
-		return PTR_ERR(cs);
-
-	*cs++ = MI_ARB_ON_OFF | MI_ARB_ENABLE;
-
-	*cs++ = MI_BATCH_BUFFER_START_GEN8 |
-		(flags & I915_DISPATCH_SECURE ? 0 : BIT(8));
-	*cs++ = lower_32_bits(offset);
-	*cs++ = upper_32_bits(offset);
-
-	*cs++ = MI_ARB_ON_OFF | MI_ARB_DISABLE;
-	*cs++ = MI_NOOP;
-
-	intel_ring_advance(rq, cs);
-
-	return 0;
-}
-
 static void gen8_logical_ring_enable_irq(struct intel_engine_cs *engine)
 {
 	ENGINE_WRITE(engine, RING_IMR,
@@ -4538,329 +4478,6 @@ static void gen8_logical_ring_disable_irq(struct intel_engine_cs *engine)
 	ENGINE_WRITE(engine, RING_IMR, ~engine->irq_keep_mask);
 }
 
-static int gen8_emit_flush(struct i915_request *request, u32 mode)
-{
-	u32 cmd, *cs;
-
-	cs = intel_ring_begin(request, 4);
-	if (IS_ERR(cs))
-		return PTR_ERR(cs);
-
-	cmd = MI_FLUSH_DW + 1;
-
-	/* We always require a command barrier so that subsequent
-	 * commands, such as breadcrumb interrupts, are strictly ordered
-	 * wrt the contents of the write cache being flushed to memory
-	 * (and thus being coherent from the CPU).
-	 */
-	cmd |= MI_FLUSH_DW_STORE_INDEX | MI_FLUSH_DW_OP_STOREDW;
-
-	if (mode & EMIT_INVALIDATE) {
-		cmd |= MI_INVALIDATE_TLB;
-		if (request->engine->class == VIDEO_DECODE_CLASS)
-			cmd |= MI_INVALIDATE_BSD;
-	}
-
-	*cs++ = cmd;
-	*cs++ = LRC_PPHWSP_SCRATCH_ADDR;
-	*cs++ = 0; /* upper addr */
-	*cs++ = 0; /* value */
-	intel_ring_advance(request, cs);
-
-	return 0;
-}
-
-static int gen8_emit_flush_render(struct i915_request *request,
-				  u32 mode)
-{
-	bool vf_flush_wa = false, dc_flush_wa = false;
-	u32 *cs, flags = 0;
-	int len;
-
-	flags |= PIPE_CONTROL_CS_STALL;
-
-	if (mode & EMIT_FLUSH) {
-		flags |= PIPE_CONTROL_RENDER_TARGET_CACHE_FLUSH;
-		flags |= PIPE_CONTROL_DEPTH_CACHE_FLUSH;
-		flags |= PIPE_CONTROL_DC_FLUSH_ENABLE;
-		flags |= PIPE_CONTROL_FLUSH_ENABLE;
-	}
-
-	if (mode & EMIT_INVALIDATE) {
-		flags |= PIPE_CONTROL_TLB_INVALIDATE;
-		flags |= PIPE_CONTROL_INSTRUCTION_CACHE_INVALIDATE;
-		flags |= PIPE_CONTROL_TEXTURE_CACHE_INVALIDATE;
-		flags |= PIPE_CONTROL_VF_CACHE_INVALIDATE;
-		flags |= PIPE_CONTROL_CONST_CACHE_INVALIDATE;
-		flags |= PIPE_CONTROL_STATE_CACHE_INVALIDATE;
-		flags |= PIPE_CONTROL_QW_WRITE;
-		flags |= PIPE_CONTROL_STORE_DATA_INDEX;
-
-		/*
-		 * On GEN9: before VF_CACHE_INVALIDATE we need to emit a NULL
-		 * pipe control.
-		 */
-		if (IS_GEN(request->engine->i915, 9))
-			vf_flush_wa = true;
-
-		/* WaForGAMHang:kbl */
-		if (IS_KBL_GT_REVID(request->engine->i915, 0, KBL_REVID_B0))
-			dc_flush_wa = true;
-	}
-
-	len = 6;
-
-	if (vf_flush_wa)
-		len += 6;
-
-	if (dc_flush_wa)
-		len += 12;
-
-	cs = intel_ring_begin(request, len);
-	if (IS_ERR(cs))
-		return PTR_ERR(cs);
-
-	if (vf_flush_wa)
-		cs = gen8_emit_pipe_control(cs, 0, 0);
-
-	if (dc_flush_wa)
-		cs = gen8_emit_pipe_control(cs, PIPE_CONTROL_DC_FLUSH_ENABLE,
-					    0);
-
-	cs = gen8_emit_pipe_control(cs, flags, LRC_PPHWSP_SCRATCH_ADDR);
-
-	if (dc_flush_wa)
-		cs = gen8_emit_pipe_control(cs, PIPE_CONTROL_CS_STALL, 0);
-
-	intel_ring_advance(request, cs);
-
-	return 0;
-}
-
-static int gen11_emit_flush_render(struct i915_request *request,
-				   u32 mode)
-{
-	if (mode & EMIT_FLUSH) {
-		u32 *cs;
-		u32 flags = 0;
-
-		flags |= PIPE_CONTROL_CS_STALL;
-
-		flags |= PIPE_CONTROL_TILE_CACHE_FLUSH;
-		flags |= PIPE_CONTROL_RENDER_TARGET_CACHE_FLUSH;
-		flags |= PIPE_CONTROL_DEPTH_CACHE_FLUSH;
-		flags |= PIPE_CONTROL_DC_FLUSH_ENABLE;
-		flags |= PIPE_CONTROL_FLUSH_ENABLE;
-		flags |= PIPE_CONTROL_QW_WRITE;
-		flags |= PIPE_CONTROL_STORE_DATA_INDEX;
-
-		cs = intel_ring_begin(request, 6);
-		if (IS_ERR(cs))
-			return PTR_ERR(cs);
-
-		cs = gen8_emit_pipe_control(cs, flags, LRC_PPHWSP_SCRATCH_ADDR);
-		intel_ring_advance(request, cs);
-	}
-
-	if (mode & EMIT_INVALIDATE) {
-		u32 *cs;
-		u32 flags = 0;
-
-		flags |= PIPE_CONTROL_CS_STALL;
-
-		flags |= PIPE_CONTROL_COMMAND_CACHE_INVALIDATE;
-		flags |= PIPE_CONTROL_TLB_INVALIDATE;
-		flags |= PIPE_CONTROL_INSTRUCTION_CACHE_INVALIDATE;
-		flags |= PIPE_CONTROL_TEXTURE_CACHE_INVALIDATE;
-		flags |= PIPE_CONTROL_VF_CACHE_INVALIDATE;
-		flags |= PIPE_CONTROL_CONST_CACHE_INVALIDATE;
-		flags |= PIPE_CONTROL_STATE_CACHE_INVALIDATE;
-		flags |= PIPE_CONTROL_QW_WRITE;
-		flags |= PIPE_CONTROL_STORE_DATA_INDEX;
-
-		cs = intel_ring_begin(request, 6);
-		if (IS_ERR(cs))
-			return PTR_ERR(cs);
-
-		cs = gen8_emit_pipe_control(cs, flags, LRC_PPHWSP_SCRATCH_ADDR);
-		intel_ring_advance(request, cs);
-	}
-
-	return 0;
-}
-
-static u32 preparser_disable(bool state)
-{
-	return MI_ARB_CHECK | 1 << 8 | state;
-}
-
-static i915_reg_t aux_inv_reg(const struct intel_engine_cs *engine)
-{
-	static const i915_reg_t vd[] = {
-		GEN12_VD0_AUX_NV,
-		GEN12_VD1_AUX_NV,
-		GEN12_VD2_AUX_NV,
-		GEN12_VD3_AUX_NV,
-	};
-
-	static const i915_reg_t ve[] = {
-		GEN12_VE0_AUX_NV,
-		GEN12_VE1_AUX_NV,
-	};
-
-	if (engine->class == VIDEO_DECODE_CLASS)
-		return vd[engine->instance];
-
-	if (engine->class == VIDEO_ENHANCEMENT_CLASS)
-		return ve[engine->instance];
-
-	GEM_BUG_ON("unknown aux_inv_reg\n");
-
-	return INVALID_MMIO_REG;
-}
-
-static u32 *
-gen12_emit_aux_table_inv(const i915_reg_t inv_reg, u32 *cs)
-{
-	*cs++ = MI_LOAD_REGISTER_IMM(1);
-	*cs++ = i915_mmio_reg_offset(inv_reg);
-	*cs++ = AUX_INV;
-	*cs++ = MI_NOOP;
-
-	return cs;
-}
-
-static int gen12_emit_flush_render(struct i915_request *request,
-				   u32 mode)
-{
-	if (mode & EMIT_FLUSH) {
-		u32 flags = 0;
-		u32 *cs;
-
-		flags |= PIPE_CONTROL_TILE_CACHE_FLUSH;
-		flags |= PIPE_CONTROL_FLUSH_L3;
-		flags |= PIPE_CONTROL_RENDER_TARGET_CACHE_FLUSH;
-		flags |= PIPE_CONTROL_DEPTH_CACHE_FLUSH;
-		/* Wa_1409600907:tgl */
-		flags |= PIPE_CONTROL_DEPTH_STALL;
-		flags |= PIPE_CONTROL_DC_FLUSH_ENABLE;
-		flags |= PIPE_CONTROL_FLUSH_ENABLE;
-
-		flags |= PIPE_CONTROL_STORE_DATA_INDEX;
-		flags |= PIPE_CONTROL_QW_WRITE;
-
-		flags |= PIPE_CONTROL_CS_STALL;
-
-		cs = intel_ring_begin(request, 6);
-		if (IS_ERR(cs))
-			return PTR_ERR(cs);
-
-		cs = gen12_emit_pipe_control(cs,
-					     PIPE_CONTROL0_HDC_PIPELINE_FLUSH,
-					     flags, LRC_PPHWSP_SCRATCH_ADDR);
-		intel_ring_advance(request, cs);
-	}
-
-	if (mode & EMIT_INVALIDATE) {
-		u32 flags = 0;
-		u32 *cs;
-
-		flags |= PIPE_CONTROL_COMMAND_CACHE_INVALIDATE;
-		flags |= PIPE_CONTROL_TLB_INVALIDATE;
-		flags |= PIPE_CONTROL_INSTRUCTION_CACHE_INVALIDATE;
-		flags |= PIPE_CONTROL_TEXTURE_CACHE_INVALIDATE;
-		flags |= PIPE_CONTROL_VF_CACHE_INVALIDATE;
-		flags |= PIPE_CONTROL_CONST_CACHE_INVALIDATE;
-		flags |= PIPE_CONTROL_STATE_CACHE_INVALIDATE;
-
-		flags |= PIPE_CONTROL_STORE_DATA_INDEX;
-		flags |= PIPE_CONTROL_QW_WRITE;
-
-		flags |= PIPE_CONTROL_CS_STALL;
-
-		cs = intel_ring_begin(request, 8 + 4);
-		if (IS_ERR(cs))
-			return PTR_ERR(cs);
-
-		/*
-		 * Prevent the pre-parser from skipping past the TLB
-		 * invalidate and loading a stale page for the batch
-		 * buffer / request payload.
-		 */
-		*cs++ = preparser_disable(true);
-
-		cs = gen8_emit_pipe_control(cs, flags, LRC_PPHWSP_SCRATCH_ADDR);
-
-		/* hsdes: 1809175790 */
-		cs = gen12_emit_aux_table_inv(GEN12_GFX_CCS_AUX_NV, cs);
-
-		*cs++ = preparser_disable(false);
-		intel_ring_advance(request, cs);
-	}
-
-	return 0;
-}
-
-static int gen12_emit_flush(struct i915_request *request, u32 mode)
-{
-	intel_engine_mask_t aux_inv = 0;
-	u32 cmd, *cs;
-
-	cmd = 4;
-	if (mode & EMIT_INVALIDATE)
-		cmd += 2;
-	if (mode & EMIT_INVALIDATE)
-		aux_inv = request->engine->mask & ~BIT(BCS0);
-	if (aux_inv)
-		cmd += 2 * hweight8(aux_inv) + 2;
-
-	cs = intel_ring_begin(request, cmd);
-	if (IS_ERR(cs))
-		return PTR_ERR(cs);
-
-	if (mode & EMIT_INVALIDATE)
-		*cs++ = preparser_disable(true);
-
-	cmd = MI_FLUSH_DW + 1;
-
-	/* We always require a command barrier so that subsequent
-	 * commands, such as breadcrumb interrupts, are strictly ordered
-	 * wrt the contents of the write cache being flushed to memory
-	 * (and thus being coherent from the CPU).
-	 */
-	cmd |= MI_FLUSH_DW_STORE_INDEX | MI_FLUSH_DW_OP_STOREDW;
-
-	if (mode & EMIT_INVALIDATE) {
-		cmd |= MI_INVALIDATE_TLB;
-		if (request->engine->class == VIDEO_DECODE_CLASS)
-			cmd |= MI_INVALIDATE_BSD;
-	}
-
-	*cs++ = cmd;
-	*cs++ = LRC_PPHWSP_SCRATCH_ADDR;
-	*cs++ = 0; /* upper addr */
-	*cs++ = 0; /* value */
-
-	if (aux_inv) { /* hsdes: 1809175790 */
-		struct intel_engine_cs *engine;
-		unsigned int tmp;
-
-		*cs++ = MI_LOAD_REGISTER_IMM(hweight8(aux_inv));
-		for_each_engine_masked(engine, request->engine->gt,
-				       aux_inv, tmp) {
-			*cs++ = i915_mmio_reg_offset(aux_inv_reg(engine));
-			*cs++ = AUX_INV;
-		}
-		*cs++ = MI_NOOP;
-	}
-
-	if (mode & EMIT_INVALIDATE)
-		*cs++ = preparser_disable(false);
-
-	intel_ring_advance(request, cs);
-
-	return 0;
-}
 
 static void assert_request_valid(struct i915_request *rq)
 {
-- 
2.26.2

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply related	[flat|nested] 208+ messages in thread

* [RFC PATCH 007/162] drm/i915: split wa_bb code to its own file
  2020-11-27 12:04 [RFC PATCH 000/162] DG1 + LMEM enabling Matthew Auld
                   ` (5 preceding siblings ...)
  2020-11-27 12:04 ` [RFC PATCH 006/162] drm/i915: split gen8+ flush and bb_start emission functions to their own file Matthew Auld
@ 2020-11-27 12:04 ` Matthew Auld
  2020-11-27 12:04 ` [RFC PATCH 008/162] HAX drm/i915: Work around the selftest timeline lock splat workaround Matthew Auld
                   ` (154 subsequent siblings)
  161 siblings, 0 replies; 208+ messages in thread
From: Matthew Auld @ 2020-11-27 12:04 UTC (permalink / raw)
  To: intel-gfx
  Cc: Tvrtko Ursulin, Chris P Wilson, Daniele Ceraolo Spurio,
	dri-devel, John Harrison

From: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>

Continuing the split of back-end independent code from the execlist
submission specific file.

Based on a patch by Chris Wilson.

Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Chris P Wilson <chris.p.wilson@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Reviewed-by: John Harrison <John.C.Harrison@Intel.com>
---
 drivers/gpu/drm/i915/Makefile                 |   1 +
 .../drm/i915/gt/intel_engine_workaround_bb.c  | 335 ++++++++++++++++++
 .../drm/i915/gt/intel_engine_workaround_bb.h  |  14 +
 .../drm/i915/gt/intel_execlists_submission.c  | 327 +----------------
 4 files changed, 352 insertions(+), 325 deletions(-)
 create mode 100644 drivers/gpu/drm/i915/gt/intel_engine_workaround_bb.c
 create mode 100644 drivers/gpu/drm/i915/gt/intel_engine_workaround_bb.h

diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile
index f9ef5199b124..2445cc990e15 100644
--- a/drivers/gpu/drm/i915/Makefile
+++ b/drivers/gpu/drm/i915/Makefile
@@ -92,6 +92,7 @@ gt-y += \
 	gt/intel_engine_heartbeat.o \
 	gt/intel_engine_pm.o \
 	gt/intel_engine_user.o \
+	gt/intel_engine_workaround_bb.o \
 	gt/intel_execlists_submission.o \
 	gt/intel_ggtt.o \
 	gt/intel_ggtt_fencing.o \
diff --git a/drivers/gpu/drm/i915/gt/intel_engine_workaround_bb.c b/drivers/gpu/drm/i915/gt/intel_engine_workaround_bb.c
new file mode 100644
index 000000000000..b03bdfc92bb2
--- /dev/null
+++ b/drivers/gpu/drm/i915/gt/intel_engine_workaround_bb.c
@@ -0,0 +1,335 @@
+// SPDX-License-Identifier: MIT
+/*
+ * Copyright © 2014 Intel Corporation
+ */
+
+#include "i915_drv.h"
+#include "intel_engine_types.h"
+#include "intel_engine_workaround_bb.h"
+#include "intel_execlists_submission.h" /* XXX */
+#include "intel_gpu_commands.h"
+#include "intel_gt.h"
+
+/*
+ * In this WA we need to set GEN8_L3SQCREG4[21:21] and reset it after
+ * PIPE_CONTROL instruction. This is required for the flush to happen correctly
+ * but there is a slight complication as this is applied in WA batch where the
+ * values are only initialized once so we cannot take register value at the
+ * beginning and reuse it further; hence we save its value to memory, upload a
+ * constant value with bit21 set and then we restore it back with the saved value.
+ * To simplify the WA, a constant value is formed by using the default value
+ * of this register. This shouldn't be a problem because we are only modifying
+ * it for a short period and this batch is non-preemptible. We can of course
+ * use additional instructions that read the actual value of the register
+ * at that time and set our bit of interest but it makes the WA complicated.
+ *
+ * This WA is also required for Gen9 so extracting as a function avoids
+ * code duplication.
+ */
+static u32 *
+gen8_emit_flush_coherentl3_wa(struct intel_engine_cs *engine, u32 *batch)
+{
+	/* NB no one else is allowed to scribble over scratch + 256! */
+	*batch++ = MI_STORE_REGISTER_MEM_GEN8 | MI_SRM_LRM_GLOBAL_GTT;
+	*batch++ = i915_mmio_reg_offset(GEN8_L3SQCREG4);
+	*batch++ = intel_gt_scratch_offset(engine->gt,
+					   INTEL_GT_SCRATCH_FIELD_COHERENTL3_WA);
+	*batch++ = 0;
+
+	*batch++ = MI_LOAD_REGISTER_IMM(1);
+	*batch++ = i915_mmio_reg_offset(GEN8_L3SQCREG4);
+	*batch++ = 0x40400000 | GEN8_LQSC_FLUSH_COHERENT_LINES;
+
+	batch = gen8_emit_pipe_control(batch,
+				       PIPE_CONTROL_CS_STALL |
+				       PIPE_CONTROL_DC_FLUSH_ENABLE,
+				       0);
+
+	*batch++ = MI_LOAD_REGISTER_MEM_GEN8 | MI_SRM_LRM_GLOBAL_GTT;
+	*batch++ = i915_mmio_reg_offset(GEN8_L3SQCREG4);
+	*batch++ = intel_gt_scratch_offset(engine->gt,
+					   INTEL_GT_SCRATCH_FIELD_COHERENTL3_WA);
+	*batch++ = 0;
+
+	return batch;
+}
+
+/*
+ * Typically we only have one indirect_ctx and per_ctx batch buffer which are
+ * initialized at the beginning and shared across all contexts but this field
+ * helps us to have multiple batches at different offsets and select them based
+ * on some criteria. At the moment this batch always starts at the beginning of the page
+ * and at this point we don't have multiple wa_ctx batch buffers.
+ *
+ * The number of WAs applied is not known at the beginning; we use this field
+ * to return the number of DWORDS written.
+ *
+ * It is to be noted that this batch does not contain MI_BATCH_BUFFER_END
+ * so it adds NOOPs as padding to make it cacheline aligned.
+ * MI_BATCH_BUFFER_END will be added to perctx batch and both of them together
+ * makes a complete batch buffer.
+ */
+static u32 *gen8_init_indirectctx_bb(struct intel_engine_cs *engine, u32 *batch)
+{
+	/* WaDisableCtxRestoreArbitration:bdw,chv */
+	*batch++ = MI_ARB_ON_OFF | MI_ARB_DISABLE;
+
+	/* WaFlushCoherentL3CacheLinesAtContextSwitch:bdw */
+	if (IS_BROADWELL(engine->i915))
+		batch = gen8_emit_flush_coherentl3_wa(engine, batch);
+
+	/* WaClearSlmSpaceAtContextSwitch:bdw,chv */
+	/* Actual scratch location is at 128 bytes offset */
+	batch = gen8_emit_pipe_control(batch,
+				       PIPE_CONTROL_FLUSH_L3 |
+				       PIPE_CONTROL_STORE_DATA_INDEX |
+				       PIPE_CONTROL_CS_STALL |
+				       PIPE_CONTROL_QW_WRITE,
+				       LRC_PPHWSP_SCRATCH_ADDR);
+
+	*batch++ = MI_ARB_ON_OFF | MI_ARB_ENABLE;
+
+	/* Pad to end of cacheline */
+	while ((unsigned long)batch % CACHELINE_BYTES)
+		*batch++ = MI_NOOP;
+
+	/*
+	 * MI_BATCH_BUFFER_END is not required in Indirect ctx BB because
+	 * execution depends on the length specified in terms of cache lines
+	 * in the register CTX_RCS_INDIRECT_CTX
+	 */
+
+	return batch;
+}
+
+struct lri {
+	i915_reg_t reg;
+	u32 value;
+};
+
+static u32 *emit_lri(u32 *batch, const struct lri *lri, unsigned int count)
+{
+	GEM_BUG_ON(!count || count > 63);
+
+	*batch++ = MI_LOAD_REGISTER_IMM(count);
+	do {
+		*batch++ = i915_mmio_reg_offset(lri->reg);
+		*batch++ = lri->value;
+	} while (lri++, --count);
+	*batch++ = MI_NOOP;
+
+	return batch;
+}
+
+static u32 *gen9_init_indirectctx_bb(struct intel_engine_cs *engine, u32 *batch)
+{
+	static const struct lri lri[] = {
+		/* WaDisableGatherAtSetShaderCommonSlice:skl,bxt,kbl,glk */
+		{
+			COMMON_SLICE_CHICKEN2,
+			__MASKED_FIELD(GEN9_DISABLE_GATHER_AT_SET_SHADER_COMMON_SLICE,
+				       0),
+		},
+
+		/* BSpec: 11391 */
+		{
+			FF_SLICE_CHICKEN,
+			__MASKED_FIELD(FF_SLICE_CHICKEN_CL_PROVOKING_VERTEX_FIX,
+				       FF_SLICE_CHICKEN_CL_PROVOKING_VERTEX_FIX),
+		},
+
+		/* BSpec: 11299 */
+		{
+			_3D_CHICKEN3,
+			__MASKED_FIELD(_3D_CHICKEN_SF_PROVOKING_VERTEX_FIX,
+				       _3D_CHICKEN_SF_PROVOKING_VERTEX_FIX),
+		}
+	};
+
+	*batch++ = MI_ARB_ON_OFF | MI_ARB_DISABLE;
+
+	/* WaFlushCoherentL3CacheLinesAtContextSwitch:skl,bxt,glk */
+	batch = gen8_emit_flush_coherentl3_wa(engine, batch);
+
+	/* WaClearSlmSpaceAtContextSwitch:skl,bxt,kbl,glk,cfl */
+	batch = gen8_emit_pipe_control(batch,
+				       PIPE_CONTROL_FLUSH_L3 |
+				       PIPE_CONTROL_STORE_DATA_INDEX |
+				       PIPE_CONTROL_CS_STALL |
+				       PIPE_CONTROL_QW_WRITE,
+				       LRC_PPHWSP_SCRATCH_ADDR);
+
+	batch = emit_lri(batch, lri, ARRAY_SIZE(lri));
+
+	/* WaMediaPoolStateCmdInWABB:bxt,glk */
+	if (HAS_POOLED_EU(engine->i915)) {
+		/*
+		 * EU pool configuration is setup along with golden context
+		 * during context initialization. This value depends on
+		 * device type (2x6 or 3x6) and needs to be updated based
+		 * on which subslice is disabled especially for 2x6
+		 * devices, however it is safe to load default
+		 * configuration of 3x6 device instead of masking off
+		 * corresponding bits because HW ignores bits of a disabled
+		 * subslice and drops down to appropriate config. Please
+		 * see render_state_setup() in i915_gem_render_state.c for
+		 * possible configurations, to avoid duplication they are
+		 * not shown here again.
+		 */
+		*batch++ = GEN9_MEDIA_POOL_STATE;
+		*batch++ = GEN9_MEDIA_POOL_ENABLE;
+		*batch++ = 0x00777000;
+		*batch++ = 0;
+		*batch++ = 0;
+		*batch++ = 0;
+	}
+
+	*batch++ = MI_ARB_ON_OFF | MI_ARB_ENABLE;
+
+	/* Pad to end of cacheline */
+	while ((unsigned long)batch % CACHELINE_BYTES)
+		*batch++ = MI_NOOP;
+
+	return batch;
+}
+
+static u32 *
+gen10_init_indirectctx_bb(struct intel_engine_cs *engine, u32 *batch)
+{
+	int i;
+
+	/*
+	 * WaPipeControlBefore3DStateSamplePattern: cnl
+	 *
+	 * Ensure the engine is idle prior to programming a
+	 * 3DSTATE_SAMPLE_PATTERN during a context restore.
+	 */
+	batch = gen8_emit_pipe_control(batch,
+				       PIPE_CONTROL_CS_STALL,
+				       0);
+	/*
+	 * WaPipeControlBefore3DStateSamplePattern says we need 4 dwords for
+	 * the PIPE_CONTROL followed by 12 dwords of 0x0, so 16 dwords in
+	 * total. However, a PIPE_CONTROL is 6 dwords long, not 4, which is
+	 * confusing. Since gen8_emit_pipe_control() already advances the
+	 * batch by 6 dwords, we advance the other 10 here, completing a
+	 * cacheline. It's not clear if the workaround requires this padding
+	 * before other commands, or if it's just the regular padding we would
+	 * already have for the workaround bb, so leave it here for now.
+	 */
+	for (i = 0; i < 10; i++)
+		*batch++ = MI_NOOP;
+
+	/* Pad to end of cacheline */
+	while ((unsigned long)batch % CACHELINE_BYTES)
+		*batch++ = MI_NOOP;
+
+	return batch;
+}
+
+#define CTX_WA_BB_OBJ_SIZE (PAGE_SIZE)
+
+static int lrc_setup_wa_ctx(struct intel_engine_cs *engine)
+{
+	struct drm_i915_gem_object *obj;
+	struct i915_vma *vma;
+	int err;
+
+	obj = i915_gem_object_create_shmem(engine->i915, CTX_WA_BB_OBJ_SIZE);
+	if (IS_ERR(obj))
+		return PTR_ERR(obj);
+
+	vma = i915_vma_instance(obj, &engine->gt->ggtt->vm, NULL);
+	if (IS_ERR(vma)) {
+		err = PTR_ERR(vma);
+		goto err;
+	}
+
+	err = i915_ggtt_pin(vma, NULL, 0, PIN_HIGH);
+	if (err)
+		goto err;
+
+	engine->wa_ctx.vma = vma;
+	return 0;
+
+err:
+	i915_gem_object_put(obj);
+	return err;
+}
+
+typedef u32 *(*wa_bb_func_t)(struct intel_engine_cs *engine, u32 *batch);
+
+int intel_init_workaround_bb(struct intel_engine_cs *engine)
+{
+	struct i915_ctx_workarounds *wa_ctx = &engine->wa_ctx;
+	struct i915_wa_ctx_bb *wa_bb[2] = { &wa_ctx->indirect_ctx,
+					    &wa_ctx->per_ctx };
+	wa_bb_func_t wa_bb_fn[2];
+	void *batch, *batch_ptr;
+	unsigned int i;
+	int ret;
+
+	if (engine->class != RENDER_CLASS)
+		return 0;
+
+	switch (INTEL_GEN(engine->i915)) {
+	case 12:
+	case 11:
+		return 0;
+	case 10:
+		wa_bb_fn[0] = gen10_init_indirectctx_bb;
+		wa_bb_fn[1] = NULL;
+		break;
+	case 9:
+		wa_bb_fn[0] = gen9_init_indirectctx_bb;
+		wa_bb_fn[1] = NULL;
+		break;
+	case 8:
+		wa_bb_fn[0] = gen8_init_indirectctx_bb;
+		wa_bb_fn[1] = NULL;
+		break;
+	default:
+		MISSING_CASE(INTEL_GEN(engine->i915));
+		return 0;
+	}
+
+	ret = lrc_setup_wa_ctx(engine);
+	if (ret) {
+		drm_dbg(&engine->i915->drm,
+			"Failed to setup context WA page: %d\n", ret);
+		return ret;
+	}
+
+	batch = i915_gem_object_pin_map(wa_ctx->vma->obj, I915_MAP_WB);
+
+	/*
+	 * Emit the two workaround batch buffers, recording the offset from the
+	 * start of the workaround batch buffer object for each and their
+	 * respective sizes.
+	 */
+	batch_ptr = batch;
+	for (i = 0; i < ARRAY_SIZE(wa_bb_fn); i++) {
+		wa_bb[i]->offset = batch_ptr - batch;
+		if (GEM_DEBUG_WARN_ON(!IS_ALIGNED(wa_bb[i]->offset,
+						  CACHELINE_BYTES))) {
+			ret = -EINVAL;
+			break;
+		}
+		if (wa_bb_fn[i])
+			batch_ptr = wa_bb_fn[i](engine, batch_ptr);
+		wa_bb[i]->size = batch_ptr - (batch + wa_bb[i]->offset);
+	}
+	GEM_BUG_ON(batch_ptr - batch > CTX_WA_BB_OBJ_SIZE);
+
+	__i915_gem_object_flush_map(wa_ctx->vma->obj, 0, batch_ptr - batch);
+	__i915_gem_object_release_map(wa_ctx->vma->obj);
+	if (ret)
+		intel_fini_workaround_bb(engine);
+
+	return ret;
+}
+
+void intel_fini_workaround_bb(struct intel_engine_cs *engine)
+{
+	i915_vma_unpin_and_release(&engine->wa_ctx.vma, 0);
+}
diff --git a/drivers/gpu/drm/i915/gt/intel_engine_workaround_bb.h b/drivers/gpu/drm/i915/gt/intel_engine_workaround_bb.h
new file mode 100644
index 000000000000..88771d77fd42
--- /dev/null
+++ b/drivers/gpu/drm/i915/gt/intel_engine_workaround_bb.h
@@ -0,0 +1,14 @@
+/* SPDX-License-Identifier: MIT */
+/*
+ * Copyright © 2014 Intel Corporation
+ */
+
+#ifndef __INTEL_ENGINE_WORKAROUND_BB_H__
+#define __INTEL_ENGINE_WORKAROUND_BB_H__
+
+struct intel_engine_cs;
+
+int intel_init_workaround_bb(struct intel_engine_cs *engine);
+void intel_fini_workaround_bb(struct intel_engine_cs *engine);
+
+#endif /* __INTEL_ENGINE_WORKAROUND_BB_H__ */
diff --git a/drivers/gpu/drm/i915/gt/intel_execlists_submission.c b/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
index 9069a456d2f7..1cc93ea6b7f0 100644
--- a/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
+++ b/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
@@ -116,6 +116,7 @@
 #include "intel_breadcrumbs.h"
 #include "intel_context.h"
 #include "intel_engine_pm.h"
+#include "intel_engine_workaround_bb.h"
 #include "intel_execlists_submission.h"
 #include "intel_gt.h"
 #include "intel_gt_pm.h"
@@ -3695,330 +3696,6 @@ static int execlists_request_alloc(struct i915_request *request)
 	return 0;
 }
 
-/*
- * In this WA we need to set GEN8_L3SQCREG4[21:21] and reset it after
- * PIPE_CONTROL instruction. This is required for the flush to happen correctly
- * but there is a slight complication as this is applied in WA batch where the
- * values are only initialized once so we cannot take register value at the
- * beginning and reuse it further; hence we save its value to memory, upload a
- * constant value with bit21 set and then we restore it back with the saved value.
- * To simplify the WA, a constant value is formed by using the default value
- * of this register. This shouldn't be a problem because we are only modifying
- * it for a short period and this batch in non-premptible. We can ofcourse
- * use additional instructions that read the actual value of the register
- * at that time and set our bit of interest but it makes the WA complicated.
- *
- * This WA is also required for Gen9 so extracting as a function avoids
- * code duplication.
- */
-static u32 *
-gen8_emit_flush_coherentl3_wa(struct intel_engine_cs *engine, u32 *batch)
-{
-	/* NB no one else is allowed to scribble over scratch + 256! */
-	*batch++ = MI_STORE_REGISTER_MEM_GEN8 | MI_SRM_LRM_GLOBAL_GTT;
-	*batch++ = i915_mmio_reg_offset(GEN8_L3SQCREG4);
-	*batch++ = intel_gt_scratch_offset(engine->gt,
-					   INTEL_GT_SCRATCH_FIELD_COHERENTL3_WA);
-	*batch++ = 0;
-
-	*batch++ = MI_LOAD_REGISTER_IMM(1);
-	*batch++ = i915_mmio_reg_offset(GEN8_L3SQCREG4);
-	*batch++ = 0x40400000 | GEN8_LQSC_FLUSH_COHERENT_LINES;
-
-	batch = gen8_emit_pipe_control(batch,
-				       PIPE_CONTROL_CS_STALL |
-				       PIPE_CONTROL_DC_FLUSH_ENABLE,
-				       0);
-
-	*batch++ = MI_LOAD_REGISTER_MEM_GEN8 | MI_SRM_LRM_GLOBAL_GTT;
-	*batch++ = i915_mmio_reg_offset(GEN8_L3SQCREG4);
-	*batch++ = intel_gt_scratch_offset(engine->gt,
-					   INTEL_GT_SCRATCH_FIELD_COHERENTL3_WA);
-	*batch++ = 0;
-
-	return batch;
-}
-
-/*
- * Typically we only have one indirect_ctx and per_ctx batch buffer which are
- * initialized at the beginning and shared across all contexts but this field
- * helps us to have multiple batches at different offsets and select them based
- * on a criteria. At the moment this batch always start at the beginning of the page
- * and at this point we don't have multiple wa_ctx batch buffers.
- *
- * The number of WA applied are not known at the beginning; we use this field
- * to return the no of DWORDS written.
- *
- * It is to be noted that this batch does not contain MI_BATCH_BUFFER_END
- * so it adds NOOPs as padding to make it cacheline aligned.
- * MI_BATCH_BUFFER_END will be added to perctx batch and both of them together
- * makes a complete batch buffer.
- */
-static u32 *gen8_init_indirectctx_bb(struct intel_engine_cs *engine, u32 *batch)
-{
-	/* WaDisableCtxRestoreArbitration:bdw,chv */
-	*batch++ = MI_ARB_ON_OFF | MI_ARB_DISABLE;
-
-	/* WaFlushCoherentL3CacheLinesAtContextSwitch:bdw */
-	if (IS_BROADWELL(engine->i915))
-		batch = gen8_emit_flush_coherentl3_wa(engine, batch);
-
-	/* WaClearSlmSpaceAtContextSwitch:bdw,chv */
-	/* Actual scratch location is at 128 bytes offset */
-	batch = gen8_emit_pipe_control(batch,
-				       PIPE_CONTROL_FLUSH_L3 |
-				       PIPE_CONTROL_STORE_DATA_INDEX |
-				       PIPE_CONTROL_CS_STALL |
-				       PIPE_CONTROL_QW_WRITE,
-				       LRC_PPHWSP_SCRATCH_ADDR);
-
-	*batch++ = MI_ARB_ON_OFF | MI_ARB_ENABLE;
-
-	/* Pad to end of cacheline */
-	while ((unsigned long)batch % CACHELINE_BYTES)
-		*batch++ = MI_NOOP;
-
-	/*
-	 * MI_BATCH_BUFFER_END is not required in Indirect ctx BB because
-	 * execution depends on the length specified in terms of cache lines
-	 * in the register CTX_RCS_INDIRECT_CTX
-	 */
-
-	return batch;
-}
-
-struct lri {
-	i915_reg_t reg;
-	u32 value;
-};
-
-static u32 *emit_lri(u32 *batch, const struct lri *lri, unsigned int count)
-{
-	GEM_BUG_ON(!count || count > 63);
-
-	*batch++ = MI_LOAD_REGISTER_IMM(count);
-	do {
-		*batch++ = i915_mmio_reg_offset(lri->reg);
-		*batch++ = lri->value;
-	} while (lri++, --count);
-	*batch++ = MI_NOOP;
-
-	return batch;
-}
-
-static u32 *gen9_init_indirectctx_bb(struct intel_engine_cs *engine, u32 *batch)
-{
-	static const struct lri lri[] = {
-		/* WaDisableGatherAtSetShaderCommonSlice:skl,bxt,kbl,glk */
-		{
-			COMMON_SLICE_CHICKEN2,
-			__MASKED_FIELD(GEN9_DISABLE_GATHER_AT_SET_SHADER_COMMON_SLICE,
-				       0),
-		},
-
-		/* BSpec: 11391 */
-		{
-			FF_SLICE_CHICKEN,
-			__MASKED_FIELD(FF_SLICE_CHICKEN_CL_PROVOKING_VERTEX_FIX,
-				       FF_SLICE_CHICKEN_CL_PROVOKING_VERTEX_FIX),
-		},
-
-		/* BSpec: 11299 */
-		{
-			_3D_CHICKEN3,
-			__MASKED_FIELD(_3D_CHICKEN_SF_PROVOKING_VERTEX_FIX,
-				       _3D_CHICKEN_SF_PROVOKING_VERTEX_FIX),
-		}
-	};
-
-	*batch++ = MI_ARB_ON_OFF | MI_ARB_DISABLE;
-
-	/* WaFlushCoherentL3CacheLinesAtContextSwitch:skl,bxt,glk */
-	batch = gen8_emit_flush_coherentl3_wa(engine, batch);
-
-	/* WaClearSlmSpaceAtContextSwitch:skl,bxt,kbl,glk,cfl */
-	batch = gen8_emit_pipe_control(batch,
-				       PIPE_CONTROL_FLUSH_L3 |
-				       PIPE_CONTROL_STORE_DATA_INDEX |
-				       PIPE_CONTROL_CS_STALL |
-				       PIPE_CONTROL_QW_WRITE,
-				       LRC_PPHWSP_SCRATCH_ADDR);
-
-	batch = emit_lri(batch, lri, ARRAY_SIZE(lri));
-
-	/* WaMediaPoolStateCmdInWABB:bxt,glk */
-	if (HAS_POOLED_EU(engine->i915)) {
-		/*
-		 * EU pool configuration is setup along with golden context
-		 * during context initialization. This value depends on
-		 * device type (2x6 or 3x6) and needs to be updated based
-		 * on which subslice is disabled especially for 2x6
-		 * devices, however it is safe to load default
-		 * configuration of 3x6 device instead of masking off
-		 * corresponding bits because HW ignores bits of a disabled
-		 * subslice and drops down to appropriate config. Please
-		 * see render_state_setup() in i915_gem_render_state.c for
-		 * possible configurations, to avoid duplication they are
-		 * not shown here again.
-		 */
-		*batch++ = GEN9_MEDIA_POOL_STATE;
-		*batch++ = GEN9_MEDIA_POOL_ENABLE;
-		*batch++ = 0x00777000;
-		*batch++ = 0;
-		*batch++ = 0;
-		*batch++ = 0;
-	}
-
-	*batch++ = MI_ARB_ON_OFF | MI_ARB_ENABLE;
-
-	/* Pad to end of cacheline */
-	while ((unsigned long)batch % CACHELINE_BYTES)
-		*batch++ = MI_NOOP;
-
-	return batch;
-}
-
-static u32 *
-gen10_init_indirectctx_bb(struct intel_engine_cs *engine, u32 *batch)
-{
-	int i;
-
-	/*
-	 * WaPipeControlBefore3DStateSamplePattern: cnl
-	 *
-	 * Ensure the engine is idle prior to programming a
-	 * 3DSTATE_SAMPLE_PATTERN during a context restore.
-	 */
-	batch = gen8_emit_pipe_control(batch,
-				       PIPE_CONTROL_CS_STALL,
-				       0);
-	/*
-	 * WaPipeControlBefore3DStateSamplePattern says we need 4 dwords for
-	 * the PIPE_CONTROL followed by 12 dwords of 0x0, so 16 dwords in
-	 * total. However, a PIPE_CONTROL is 6 dwords long, not 4, which is
-	 * confusing. Since gen8_emit_pipe_control() already advances the
-	 * batch by 6 dwords, we advance the other 10 here, completing a
-	 * cacheline. It's not clear if the workaround requires this padding
-	 * before other commands, or if it's just the regular padding we would
-	 * already have for the workaround bb, so leave it here for now.
-	 */
-	for (i = 0; i < 10; i++)
-		*batch++ = MI_NOOP;
-
-	/* Pad to end of cacheline */
-	while ((unsigned long)batch % CACHELINE_BYTES)
-		*batch++ = MI_NOOP;
-
-	return batch;
-}
-
-#define CTX_WA_BB_OBJ_SIZE (PAGE_SIZE)
-
-static int lrc_setup_wa_ctx(struct intel_engine_cs *engine)
-{
-	struct drm_i915_gem_object *obj;
-	struct i915_vma *vma;
-	int err;
-
-	obj = i915_gem_object_create_shmem(engine->i915, CTX_WA_BB_OBJ_SIZE);
-	if (IS_ERR(obj))
-		return PTR_ERR(obj);
-
-	vma = i915_vma_instance(obj, &engine->gt->ggtt->vm, NULL);
-	if (IS_ERR(vma)) {
-		err = PTR_ERR(vma);
-		goto err;
-	}
-
-	err = i915_ggtt_pin(vma, NULL, 0, PIN_HIGH);
-	if (err)
-		goto err;
-
-	engine->wa_ctx.vma = vma;
-	return 0;
-
-err:
-	i915_gem_object_put(obj);
-	return err;
-}
-
-static void lrc_destroy_wa_ctx(struct intel_engine_cs *engine)
-{
-	i915_vma_unpin_and_release(&engine->wa_ctx.vma, 0);
-}
-
-typedef u32 *(*wa_bb_func_t)(struct intel_engine_cs *engine, u32 *batch);
-
-static int intel_init_workaround_bb(struct intel_engine_cs *engine)
-{
-	struct i915_ctx_workarounds *wa_ctx = &engine->wa_ctx;
-	struct i915_wa_ctx_bb *wa_bb[2] = { &wa_ctx->indirect_ctx,
-					    &wa_ctx->per_ctx };
-	wa_bb_func_t wa_bb_fn[2];
-	void *batch, *batch_ptr;
-	unsigned int i;
-	int ret;
-
-	if (engine->class != RENDER_CLASS)
-		return 0;
-
-	switch (INTEL_GEN(engine->i915)) {
-	case 12:
-	case 11:
-		return 0;
-	case 10:
-		wa_bb_fn[0] = gen10_init_indirectctx_bb;
-		wa_bb_fn[1] = NULL;
-		break;
-	case 9:
-		wa_bb_fn[0] = gen9_init_indirectctx_bb;
-		wa_bb_fn[1] = NULL;
-		break;
-	case 8:
-		wa_bb_fn[0] = gen8_init_indirectctx_bb;
-		wa_bb_fn[1] = NULL;
-		break;
-	default:
-		MISSING_CASE(INTEL_GEN(engine->i915));
-		return 0;
-	}
-
-	ret = lrc_setup_wa_ctx(engine);
-	if (ret) {
-		drm_dbg(&engine->i915->drm,
-			"Failed to setup context WA page: %d\n", ret);
-		return ret;
-	}
-
-	batch = i915_gem_object_pin_map(wa_ctx->vma->obj, I915_MAP_WB);
-
-	/*
-	 * Emit the two workaround batch buffers, recording the offset from the
-	 * start of the workaround batch buffer object for each and their
-	 * respective sizes.
-	 */
-	batch_ptr = batch;
-	for (i = 0; i < ARRAY_SIZE(wa_bb_fn); i++) {
-		wa_bb[i]->offset = batch_ptr - batch;
-		if (GEM_DEBUG_WARN_ON(!IS_ALIGNED(wa_bb[i]->offset,
-						  CACHELINE_BYTES))) {
-			ret = -EINVAL;
-			break;
-		}
-		if (wa_bb_fn[i])
-			batch_ptr = wa_bb_fn[i](engine, batch_ptr);
-		wa_bb[i]->size = batch_ptr - (batch + wa_bb[i]->offset);
-	}
-	GEM_BUG_ON(batch_ptr - batch > CTX_WA_BB_OBJ_SIZE);
-
-	__i915_gem_object_flush_map(wa_ctx->vma->obj, 0, batch_ptr - batch);
-	__i915_gem_object_release_map(wa_ctx->vma->obj);
-	if (ret)
-		lrc_destroy_wa_ctx(engine);
-
-	return ret;
-}
-
 static void reset_csb_pointers(struct intel_engine_cs *engine)
 {
 	struct intel_engine_execlists * const execlists = &engine->execlists;
@@ -4707,7 +4384,7 @@ static void execlists_release(struct intel_engine_cs *engine)
 	execlists_shutdown(engine);
 
 	intel_engine_cleanup_common(engine);
-	lrc_destroy_wa_ctx(engine);
+	intel_fini_workaround_bb(engine);
 }
 
 static void
-- 
2.26.2

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply related	[flat|nested] 208+ messages in thread

* [RFC PATCH 008/162] HAX drm/i915: Work around the selftest timeline lock splat workaround
  2020-11-27 12:04 [RFC PATCH 000/162] DG1 + LMEM enabling Matthew Auld
                   ` (6 preceding siblings ...)
  2020-11-27 12:04 ` [RFC PATCH 007/162] drm/i915: split wa_bb code to its " Matthew Auld
@ 2020-11-27 12:04 ` Matthew Auld
  2020-11-27 12:04 ` [RFC PATCH 009/162] drm/i915: Introduce drm_i915_lock_isolated Matthew Auld
                   ` (153 subsequent siblings)
  161 siblings, 0 replies; 208+ messages in thread
From: Matthew Auld @ 2020-11-27 12:04 UTC (permalink / raw)
  To: intel-gfx; +Cc: Thomas Hellström, Matthew Auld, dri-devel

From: Thomas Hellström <thomas.hellstrom@intel.com>

There is a dirty hack to work around a lockdep splat caused by incorrect
ordering of the selftest timeline lock against other locks. However, some
selftests recently started to use the same nesting level as the workaround
and thus introduced more splats. Add a workaround to the workaround, making
those selftests aware of the original one.
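
A minimal sketch of the scheme, for illustration only (SELFTEST_WA_NESTING
is simply an alias for SINGLE_DEPTH_NESTING, as defined in the diff below):

	/* request creation re-acquires the timeline lock at the WA level */
	mutex_acquire(&tl->mutex.dep_map, SELFTEST_WA_NESTING, 0, _RET_IP_);

	/*
	 * Selftests that take a second timeline lock on top now use the
	 * next level, so lockdep sees two distinct subclasses rather than
	 * an apparent recursive acquisition.
	 */
	mutex_lock_nested(&tl->mutex, SELFTEST_WA_NESTING + 1);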

Signed-off-by: Thomas Hellström <thomas.hellstrom@intel.com>
Cc: Matthew Auld <matthew.auld@intel.com>
---
 drivers/gpu/drm/i915/gt/intel_context.c     |  3 ++-
 drivers/gpu/drm/i915/gt/intel_context.h     |  2 ++
 drivers/gpu/drm/i915/gt/selftest_timeline.c | 10 ++++++----
 3 files changed, 10 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_context.c b/drivers/gpu/drm/i915/gt/intel_context.c
index 349e7fa1488d..b63a8eb6c1a9 100644
--- a/drivers/gpu/drm/i915/gt/intel_context.c
+++ b/drivers/gpu/drm/i915/gt/intel_context.c
@@ -495,7 +495,8 @@ struct i915_request *intel_context_create_request(struct intel_context *ce)
 	 */
 	lockdep_unpin_lock(&ce->timeline->mutex, rq->cookie);
 	mutex_release(&ce->timeline->mutex.dep_map, _RET_IP_);
-	mutex_acquire(&ce->timeline->mutex.dep_map, SINGLE_DEPTH_NESTING, 0, _RET_IP_);
+	mutex_acquire(&ce->timeline->mutex.dep_map, SELFTEST_WA_NESTING, 0,
+		      _RET_IP_);
 	rq->cookie = lockdep_pin_lock(&ce->timeline->mutex);
 
 	return rq;
diff --git a/drivers/gpu/drm/i915/gt/intel_context.h b/drivers/gpu/drm/i915/gt/intel_context.h
index fda2eba81e22..175d505951c7 100644
--- a/drivers/gpu/drm/i915/gt/intel_context.h
+++ b/drivers/gpu/drm/i915/gt/intel_context.h
@@ -25,6 +25,8 @@
 		     ##__VA_ARGS__);					\
 } while (0)
 
+#define SELFTEST_WA_NESTING SINGLE_DEPTH_NESTING
+
 struct i915_gem_ww_ctx;
 
 void intel_context_init(struct intel_context *ce,
diff --git a/drivers/gpu/drm/i915/gt/selftest_timeline.c b/drivers/gpu/drm/i915/gt/selftest_timeline.c
index e4285d5a0360..fa3fec049542 100644
--- a/drivers/gpu/drm/i915/gt/selftest_timeline.c
+++ b/drivers/gpu/drm/i915/gt/selftest_timeline.c
@@ -688,7 +688,7 @@ static int live_hwsp_wrap(void *arg)
 
 		tl->seqno = -4u;
 
-		mutex_lock_nested(&tl->mutex, SINGLE_DEPTH_NESTING);
+		mutex_lock_nested(&tl->mutex, SELFTEST_WA_NESTING + 1);
 		err = intel_timeline_get_seqno(tl, rq, &seqno[0]);
 		mutex_unlock(&tl->mutex);
 		if (err) {
@@ -705,7 +705,7 @@ static int live_hwsp_wrap(void *arg)
 		}
 		hwsp_seqno[0] = tl->hwsp_seqno;
 
-		mutex_lock_nested(&tl->mutex, SINGLE_DEPTH_NESTING);
+		mutex_lock_nested(&tl->mutex, SELFTEST_WA_NESTING + 1);
 		err = intel_timeline_get_seqno(tl, rq, &seqno[1]);
 		mutex_unlock(&tl->mutex);
 		if (err) {
@@ -1037,7 +1037,8 @@ static int live_hwsp_read(void *arg)
 				goto out;
 			}
 
-			mutex_lock(&watcher[0].rq->context->timeline->mutex);
+			mutex_lock_nested(&watcher[0].rq->context->timeline->mutex,
+					  SELFTEST_WA_NESTING + 1);
 			err = intel_timeline_read_hwsp(rq, watcher[0].rq, &hwsp);
 			if (err == 0)
 				err = emit_read_hwsp(watcher[0].rq, /* before */
@@ -1050,7 +1051,8 @@ static int live_hwsp_read(void *arg)
 				goto out;
 			}
 
-			mutex_lock(&watcher[1].rq->context->timeline->mutex);
+			mutex_lock_nested(&watcher[1].rq->context->timeline->mutex,
+					  SELFTEST_WA_NESTING + 1);
 			err = intel_timeline_read_hwsp(rq, watcher[1].rq, &hwsp);
 			if (err == 0)
 				err = emit_read_hwsp(watcher[1].rq, /* after */
-- 
2.26.2

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply related	[flat|nested] 208+ messages in thread

* [RFC PATCH 009/162] drm/i915: Introduce drm_i915_lock_isolated
  2020-11-27 12:04 [RFC PATCH 000/162] DG1 + LMEM enabling Matthew Auld
                   ` (7 preceding siblings ...)
  2020-11-27 12:04 ` [RFC PATCH 008/162] HAX drm/i915: Work around the selftest timeline lock splat workaround Matthew Auld
@ 2020-11-27 12:04 ` Matthew Auld
  2020-11-27 12:04 ` [RFC PATCH 010/162] drm/i915: Lock hwsp objects isolated for pinning at create time Matthew Auld
                   ` (152 subsequent siblings)
  161 siblings, 0 replies; 208+ messages in thread
From: Matthew Auld @ 2020-11-27 12:04 UTC (permalink / raw)
  To: intel-gfx; +Cc: Thomas Hellström, dri-devel

From: Thomas Hellström <thomas.hellstrom@intel.com>

When an object is just created and not yet put on any lists, there's
a single owner and thus trylock will always succeed. Introduce
i915_gem_object_lock_isolated() to annotate trylock in this situation.
This is similar to TTM's create_locked() functionality.
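
As a usage sketch, mirroring how a later patch in this series applies it
to hwsp objects at create time:

	obj = i915_gem_object_create_internal(i915, PAGE_SIZE);
	if (IS_ERR(obj))
		return PTR_ERR(obj);

	/* Freshly created and not yet on any lists: trylock cannot fail. */
	i915_gem_object_lock_isolated(obj);
	/* ... one-time setup that requires the object lock ... */
	i915_gem_object_unlock(obj);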

Signed-off-by: Thomas Hellström <thomas.hellstrom@intel.com>
Cc: Matthew Auld <matthew.auld@intel.com>
---
 drivers/gpu/drm/i915/gem/i915_gem_object.h | 16 ++++++++++++++++
 1 file changed, 16 insertions(+)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object.h b/drivers/gpu/drm/i915/gem/i915_gem_object.h
index be14486f63a7..d61194ef484e 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_object.h
+++ b/drivers/gpu/drm/i915/gem/i915_gem_object.h
@@ -107,6 +107,13 @@ i915_gem_object_put(struct drm_i915_gem_object *obj)
 
 #define assert_object_held(obj) dma_resv_assert_held((obj)->base.resv)
 
+#define object_is_isolated(obj)					\
+	(!IS_ENABLED(CONFIG_LOCKDEP) ||				\
+	 ((kref_read(&obj->base.refcount) == 0) ||		\
+	  ((kref_read(&obj->base.refcount) == 1) &&		\
+	   list_empty_careful(&obj->mm.link) &&			\
+	   list_empty_careful(&obj->vma.list))))
+
 static inline int __i915_gem_object_lock(struct drm_i915_gem_object *obj,
 					 struct i915_gem_ww_ctx *ww,
 					 bool intr)
@@ -147,6 +154,15 @@ static inline bool i915_gem_object_trylock(struct drm_i915_gem_object *obj)
 	return dma_resv_trylock(obj->base.resv);
 }
 
+static inline void i915_gem_object_lock_isolated(struct drm_i915_gem_object *obj)
+{
+	int ret;
+
+	WARN_ON(!object_is_isolated(obj));
+	ret = dma_resv_trylock(obj->base.resv);
+	GEM_WARN_ON(!ret);
+}
+
 static inline void i915_gem_object_unlock(struct drm_i915_gem_object *obj)
 {
 	dma_resv_unlock(obj->base.resv);
-- 
2.26.2

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply related	[flat|nested] 208+ messages in thread

* [RFC PATCH 010/162] drm/i915: Lock hwsp objects isolated for pinning at create time
  2020-11-27 12:04 [RFC PATCH 000/162] DG1 + LMEM enabling Matthew Auld
                   ` (8 preceding siblings ...)
  2020-11-27 12:04 ` [RFC PATCH 009/162] drm/i915: Introduce drm_i915_lock_isolated Matthew Auld
@ 2020-11-27 12:04 ` Matthew Auld
  2020-11-27 12:04 ` [RFC PATCH 011/162] drm/i915: Pin timeline map after first timeline pin, v5 Matthew Auld
                   ` (151 subsequent siblings)
  161 siblings, 0 replies; 208+ messages in thread
From: Matthew Auld @ 2020-11-27 12:04 UTC (permalink / raw)
  To: intel-gfx; +Cc: Thomas Hellström, dri-devel

From: Thomas Hellström <thomas.hellstrom@intel.com>

We may need to create hwsp objects at request creation time in the
middle of a ww transaction. Since we typically don't have easy
access to the ww_acquire_context, lock the hwsp objects isolated
for pinning/mapping only at create time.
For later binding to the ggtt, make sure lockdep allows
binding of already pinned pages to the ggtt without the
underlying object lock held.
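
The resulting ordering, condensed from the diff below (names as in the
patch):

	/* Create time: single owner, so the isolated trylock is safe. */
	i915_gem_object_lock_isolated(obj);
	hwsp->vaddr = i915_gem_object_pin_map(obj, I915_MAP_WB);
	i915_gem_object_unlock(obj);

	/*
	 * Later, possibly inside a ww transaction: the pages are already
	 * pinned and the ggtt performs no va range allocation, so binding
	 * needs no object lock.
	 */
	err = i915_ggtt_pin(hwsp->vma, NULL, 0, PIN_HIGH);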

Signed-off-by: Thomas Hellström <thomas.hellstrom@intel.com>
Cc: Matthew Auld <matthew.auld@intel.com>
---
 drivers/gpu/drm/i915/gt/intel_timeline.c | 58 ++++++++++++++----------
 drivers/gpu/drm/i915/i915_vma.c          | 13 ++++--
 2 files changed, 44 insertions(+), 27 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_timeline.c b/drivers/gpu/drm/i915/gt/intel_timeline.c
index 512afacd2bdc..a58228d1cd3b 100644
--- a/drivers/gpu/drm/i915/gt/intel_timeline.c
+++ b/drivers/gpu/drm/i915/gt/intel_timeline.c
@@ -24,25 +24,43 @@ struct intel_timeline_hwsp {
 	struct list_head free_link;
 	struct i915_vma *vma;
 	u64 free_bitmap;
+	void *vaddr;
 };
 
-static struct i915_vma *__hwsp_alloc(struct intel_gt *gt)
+static int __hwsp_alloc(struct intel_gt *gt, struct intel_timeline_hwsp *hwsp)
 {
 	struct drm_i915_private *i915 = gt->i915;
 	struct drm_i915_gem_object *obj;
-	struct i915_vma *vma;
+	int ret;
 
 	obj = i915_gem_object_create_internal(i915, PAGE_SIZE);
 	if (IS_ERR(obj))
-		return ERR_CAST(obj);
+		return PTR_ERR(obj);
 
+	i915_gem_object_lock_isolated(obj);
 	i915_gem_object_set_cache_coherency(obj, I915_CACHE_LLC);
 
-	vma = i915_vma_instance(obj, &gt->ggtt->vm, NULL);
-	if (IS_ERR(vma))
-		i915_gem_object_put(obj);
+	hwsp->vma = i915_vma_instance(obj, &gt->ggtt->vm, NULL);
+	if (IS_ERR(hwsp->vma)) {
+		ret = PTR_ERR(hwsp->vma);
+		goto out_unlock;
+	}
+
+	/* Pin early so we can call i915_ggtt_pin unlocked. */
+	hwsp->vaddr = i915_gem_object_pin_map(obj, I915_MAP_WB);
+	if (IS_ERR(hwsp->vaddr)) {
+		ret = PTR_ERR(hwsp->vaddr);
+		goto out_unlock;
+	}
+
+	i915_gem_object_unlock(obj);
+	return 0;
+
+out_unlock:
+	i915_gem_object_unlock(obj);
+	i915_gem_object_put(obj);
 
-	return vma;
+	return ret;
 }
 
 static struct i915_vma *
@@ -59,7 +77,7 @@ hwsp_alloc(struct intel_timeline *timeline, unsigned int *cacheline)
 	hwsp = list_first_entry_or_null(&gt->hwsp_free_list,
 					typeof(*hwsp), free_link);
 	if (!hwsp) {
-		struct i915_vma *vma;
+		int ret;
 
 		spin_unlock_irq(&gt->hwsp_lock);
 
@@ -67,17 +85,16 @@ hwsp_alloc(struct intel_timeline *timeline, unsigned int *cacheline)
 		if (!hwsp)
 			return ERR_PTR(-ENOMEM);
 
-		vma = __hwsp_alloc(timeline->gt);
-		if (IS_ERR(vma)) {
+		ret = __hwsp_alloc(timeline->gt, hwsp);
+		if (ret) {
 			kfree(hwsp);
-			return vma;
+			return ERR_PTR(ret);
 		}
 
 		GT_TRACE(timeline->gt, "new HWSP allocated\n");
 
-		vma->private = hwsp;
+		hwsp->vma->private = hwsp;
 		hwsp->gt = timeline->gt;
-		hwsp->vma = vma;
 		hwsp->free_bitmap = ~0ull;
 		hwsp->gt_timelines = gt;
 
@@ -113,9 +130,12 @@ static void __idle_hwsp_free(struct intel_timeline_hwsp *hwsp, int cacheline)
 
 	/* And if no one is left using it, give the page back to the system */
 	if (hwsp->free_bitmap == ~0ull) {
-		i915_vma_put(hwsp->vma);
 		list_del(&hwsp->free_link);
+		spin_unlock_irqrestore(&gt->hwsp_lock, flags);
+		i915_gem_object_unpin_map(hwsp->vma->obj);
+		i915_vma_put(hwsp->vma);
 		kfree(hwsp);
+		return;
 	}
 
 	spin_unlock_irqrestore(&gt->hwsp_lock, flags);
@@ -134,7 +154,6 @@ static void __idle_cacheline_free(struct intel_timeline_cacheline *cl)
 {
 	GEM_BUG_ON(!i915_active_is_idle(&cl->active));
 
-	i915_gem_object_unpin_map(cl->hwsp->vma->obj);
 	i915_vma_put(cl->hwsp->vma);
 	__idle_hwsp_free(cl->hwsp, ptr_unmask_bits(cl->vaddr, CACHELINE_BITS));
 
@@ -165,7 +184,6 @@ static struct intel_timeline_cacheline *
 cacheline_alloc(struct intel_timeline_hwsp *hwsp, unsigned int cacheline)
 {
 	struct intel_timeline_cacheline *cl;
-	void *vaddr;
 
 	GEM_BUG_ON(cacheline >= BIT(CACHELINE_BITS));
 
@@ -173,15 +191,9 @@ cacheline_alloc(struct intel_timeline_hwsp *hwsp, unsigned int cacheline)
 	if (!cl)
 		return ERR_PTR(-ENOMEM);
 
-	vaddr = i915_gem_object_pin_map(hwsp->vma->obj, I915_MAP_WB);
-	if (IS_ERR(vaddr)) {
-		kfree(cl);
-		return ERR_CAST(vaddr);
-	}
-
 	i915_vma_get(hwsp->vma);
 	cl->hwsp = hwsp;
-	cl->vaddr = page_pack_bits(vaddr, cacheline);
+	cl->vaddr = page_pack_bits(hwsp->vaddr, cacheline);
 
 	i915_active_init(&cl->active, __cacheline_active, __cacheline_retire);
 
diff --git a/drivers/gpu/drm/i915/i915_vma.c b/drivers/gpu/drm/i915/i915_vma.c
index caa9b041616b..8e8c80ccbe32 100644
--- a/drivers/gpu/drm/i915/i915_vma.c
+++ b/drivers/gpu/drm/i915/i915_vma.c
@@ -862,10 +862,15 @@ int i915_vma_pin_ww(struct i915_vma *vma, struct i915_gem_ww_ctx *ww,
 	unsigned int bound;
 	int err;
 
-#ifdef CONFIG_PROVE_LOCKING
-	if (debug_locks && lockdep_is_held(&vma->vm->i915->drm.struct_mutex))
-		WARN_ON(!ww);
-#endif
+	if (IS_ENABLED(CONFIG_PROVE_LOCKING) && debug_locks) {
+		bool pinned_bind_wo_alloc =
+			vma->obj && i915_gem_object_has_pinned_pages(vma->obj) &&
+			!vma->vm->allocate_va_range;
+
+		if (lockdep_is_held(&vma->vm->i915->drm.struct_mutex) &&
+		    !pinned_bind_wo_alloc)
+			WARN_ON(!ww);
+	}
 
 	BUILD_BUG_ON(PIN_GLOBAL != I915_VMA_GLOBAL_BIND);
 	BUILD_BUG_ON(PIN_USER != I915_VMA_LOCAL_BIND);
-- 
2.26.2

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply related	[flat|nested] 208+ messages in thread

* [RFC PATCH 011/162] drm/i915: Pin timeline map after first timeline pin, v5.
  2020-11-27 12:04 [RFC PATCH 000/162] DG1 + LMEM enabling Matthew Auld
                   ` (9 preceding siblings ...)
  2020-11-27 12:04 ` [RFC PATCH 010/162] drm/i915: Lock hwsp objects isolated for pinning at create time Matthew Auld
@ 2020-11-27 12:04 ` Matthew Auld
  2020-11-27 12:04 ` [RFC PATCH 012/162] drm/i915: Move cmd parser pinning to execbuffer Matthew Auld
                   ` (150 subsequent siblings)
  161 siblings, 0 replies; 208+ messages in thread
From: Matthew Auld @ 2020-11-27 12:04 UTC (permalink / raw)
  To: intel-gfx; +Cc: kernel test robot, dri-devel, Thomas Hellström

From: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>

We're starting to require the reservation lock for pinning, so defer
pinning the timeline map from creation time to the first timeline pin,
where we can hold that lock.

Update the selftests to handle this correctly, and ensure pin is
called in live_hwsp_rollover_user() and mock_hwsp_freelist().
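
Condensed, the first-pin path now looks like this (the hwsp_cacheline
special case from the diff below is elided for brevity):

	if (atomic_add_unless(&tl->pin_count, 1, 0))
		return 0; /* already pinned, so the map already exists */

	/*
	 * First pin: the caller holds the reservation lock here, so the
	 * backing object can be mapped before binding into the ggtt.
	 */
	err = intel_timeline_pin_map(tl);
	if (err)
		return err;

	err = i915_ggtt_pin(tl->hwsp_ggtt, ww, 0, PIN_HIGH);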

Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Reported-by: kernel test robot <lkp@intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
---
 drivers/gpu/drm/i915/gt/intel_timeline.c      | 49 ++++++++++----
 drivers/gpu/drm/i915/gt/intel_timeline.h      |  1 +
 .../gpu/drm/i915/gt/intel_timeline_types.h    |  1 +
 drivers/gpu/drm/i915/gt/mock_engine.c         | 24 ++++++-
 drivers/gpu/drm/i915/gt/selftest_timeline.c   | 64 ++++++++++---------
 drivers/gpu/drm/i915/i915_selftest.h          |  2 +
 6 files changed, 96 insertions(+), 45 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_timeline.c b/drivers/gpu/drm/i915/gt/intel_timeline.c
index a58228d1cd3b..479eb5440bc6 100644
--- a/drivers/gpu/drm/i915/gt/intel_timeline.c
+++ b/drivers/gpu/drm/i915/gt/intel_timeline.c
@@ -229,13 +229,30 @@ static void cacheline_free(struct intel_timeline_cacheline *cl)
 	i915_active_release(&cl->active);
 }
 
+I915_SELFTEST_EXPORT int
+intel_timeline_pin_map(struct intel_timeline *timeline)
+{
+	if (!timeline->hwsp_cacheline) {
+		struct drm_i915_gem_object *obj = timeline->hwsp_ggtt->obj;
+		u32 ofs = offset_in_page(timeline->hwsp_offset);
+		void *vaddr;
+
+		vaddr = i915_gem_object_pin_map(obj, I915_MAP_WB);
+		if (IS_ERR(vaddr))
+			return PTR_ERR(vaddr);
+
+		timeline->hwsp_map = vaddr;
+		timeline->hwsp_seqno = memset(vaddr + ofs, 0, CACHELINE_BYTES);
+	}
+
+	return 0;
+}
+
 static int intel_timeline_init(struct intel_timeline *timeline,
 			       struct intel_gt *gt,
 			       struct i915_vma *hwsp,
 			       unsigned int offset)
 {
-	void *vaddr;
-
 	kref_init(&timeline->kref);
 	atomic_set(&timeline->pin_count, 0);
 
@@ -260,18 +277,15 @@ static int intel_timeline_init(struct intel_timeline *timeline,
 
 		timeline->hwsp_cacheline = cl;
 		timeline->hwsp_offset = cacheline * CACHELINE_BYTES;
-
-		vaddr = page_mask_bits(cl->vaddr);
+		timeline->hwsp_map = page_mask_bits(cl->vaddr);
+		timeline->hwsp_seqno =
+			memset(timeline->hwsp_map + timeline->hwsp_offset, 0,
+			       CACHELINE_BYTES);
 	} else {
 		timeline->hwsp_offset = offset;
-		vaddr = i915_gem_object_pin_map(hwsp->obj, I915_MAP_WB);
-		if (IS_ERR(vaddr))
-			return PTR_ERR(vaddr);
+		timeline->hwsp_map = NULL;
 	}
 
-	timeline->hwsp_seqno =
-		memset(vaddr + timeline->hwsp_offset, 0, CACHELINE_BYTES);
-
 	timeline->hwsp_ggtt = i915_vma_get(hwsp);
 	GEM_BUG_ON(timeline->hwsp_offset >= hwsp->size);
 
@@ -306,7 +320,7 @@ static void intel_timeline_fini(struct intel_timeline *timeline)
 
 	if (timeline->hwsp_cacheline)
 		cacheline_free(timeline->hwsp_cacheline);
-	else
+	else if (timeline->hwsp_map)
 		i915_gem_object_unpin_map(timeline->hwsp_ggtt->obj);
 
 	i915_vma_put(timeline->hwsp_ggtt);
@@ -346,9 +360,18 @@ int intel_timeline_pin(struct intel_timeline *tl, struct i915_gem_ww_ctx *ww)
 	if (atomic_add_unless(&tl->pin_count, 1, 0))
 		return 0;
 
+	if (!tl->hwsp_cacheline) {
+		err = intel_timeline_pin_map(tl);
+		if (err)
+			return err;
+	}
+
 	err = i915_ggtt_pin(tl->hwsp_ggtt, ww, 0, PIN_HIGH);
-	if (err)
+	if (err) {
+		if (!tl->hwsp_cacheline)
+			i915_gem_object_unpin_map(tl->hwsp_ggtt->obj);
 		return err;
+	}
 
 	tl->hwsp_offset =
 		i915_ggtt_offset(tl->hwsp_ggtt) +
@@ -360,6 +383,8 @@ int intel_timeline_pin(struct intel_timeline *tl, struct i915_gem_ww_ctx *ww)
 	if (atomic_fetch_inc(&tl->pin_count)) {
 		cacheline_release(tl->hwsp_cacheline);
 		__i915_vma_unpin(tl->hwsp_ggtt);
+		if (!tl->hwsp_cacheline)
+			i915_gem_object_unpin_map(tl->hwsp_ggtt->obj);
 	}
 
 	return 0;
diff --git a/drivers/gpu/drm/i915/gt/intel_timeline.h b/drivers/gpu/drm/i915/gt/intel_timeline.h
index 634acebd0c4b..725bae16237c 100644
--- a/drivers/gpu/drm/i915/gt/intel_timeline.h
+++ b/drivers/gpu/drm/i915/gt/intel_timeline.h
@@ -114,5 +114,6 @@ void intel_gt_show_timelines(struct intel_gt *gt,
 						  const struct i915_request *rq,
 						  const char *prefix,
 						  int indent));
+I915_SELFTEST_DECLARE(int intel_timeline_pin_map(struct intel_timeline *tl));
 
 #endif
diff --git a/drivers/gpu/drm/i915/gt/intel_timeline_types.h b/drivers/gpu/drm/i915/gt/intel_timeline_types.h
index 4474f487f589..cac7fa3dfd43 100644
--- a/drivers/gpu/drm/i915/gt/intel_timeline_types.h
+++ b/drivers/gpu/drm/i915/gt/intel_timeline_types.h
@@ -45,6 +45,7 @@ struct intel_timeline {
 	atomic_t pin_count;
 	atomic_t active_count;
 
+	void *hwsp_map;
 	const u32 *hwsp_seqno;
 	struct i915_vma *hwsp_ggtt;
 	u32 hwsp_offset;
diff --git a/drivers/gpu/drm/i915/gt/mock_engine.c b/drivers/gpu/drm/i915/gt/mock_engine.c
index 2f830017c51d..016f4f345706 100644
--- a/drivers/gpu/drm/i915/gt/mock_engine.c
+++ b/drivers/gpu/drm/i915/gt/mock_engine.c
@@ -32,9 +32,22 @@
 #include "mock_engine.h"
 #include "selftests/mock_request.h"
 
-static void mock_timeline_pin(struct intel_timeline *tl)
+static int mock_timeline_pin(struct intel_timeline *tl)
 {
+	int err;
+
+	if (!tl->hwsp_cacheline) {
+		if (WARN_ON(!i915_gem_object_trylock(tl->hwsp_ggtt->obj)))
+			return -EBUSY;
+
+		err = intel_timeline_pin_map(tl);
+		i915_gem_object_unlock(tl->hwsp_ggtt->obj);
+		if (err)
+			return err;
+	}
+
 	atomic_inc(&tl->pin_count);
+	return 0;
 }
 
 static void mock_timeline_unpin(struct intel_timeline *tl)
@@ -152,6 +165,8 @@ static void mock_context_destroy(struct kref *ref)
 
 static int mock_context_alloc(struct intel_context *ce)
 {
+	int err;
+
 	ce->ring = mock_ring(ce->engine);
 	if (!ce->ring)
 		return -ENOMEM;
@@ -162,7 +177,12 @@ static int mock_context_alloc(struct intel_context *ce)
 		return PTR_ERR(ce->timeline);
 	}
 
-	mock_timeline_pin(ce->timeline);
+	err = mock_timeline_pin(ce->timeline);
+	if (err) {
+		intel_timeline_put(ce->timeline);
+		ce->timeline = NULL;
+		return err;
+	}
 
 	return 0;
 }
diff --git a/drivers/gpu/drm/i915/gt/selftest_timeline.c b/drivers/gpu/drm/i915/gt/selftest_timeline.c
index fa3fec049542..7435abf5a703 100644
--- a/drivers/gpu/drm/i915/gt/selftest_timeline.c
+++ b/drivers/gpu/drm/i915/gt/selftest_timeline.c
@@ -34,7 +34,7 @@ static unsigned long hwsp_cacheline(struct intel_timeline *tl)
 {
 	unsigned long address = (unsigned long)page_address(hwsp_page(tl));
 
-	return (address + tl->hwsp_offset) / CACHELINE_BYTES;
+	return (address + offset_in_page(tl->hwsp_offset)) / CACHELINE_BYTES;
 }
 
 #define CACHELINES_PER_PAGE (PAGE_SIZE / CACHELINE_BYTES)
@@ -58,6 +58,7 @@ static void __mock_hwsp_record(struct mock_hwsp_freelist *state,
 	tl = xchg(&state->history[idx], tl);
 	if (tl) {
 		radix_tree_delete(&state->cachelines, hwsp_cacheline(tl));
+		intel_timeline_unpin(tl);
 		intel_timeline_put(tl);
 	}
 }
@@ -77,6 +78,12 @@ static int __mock_hwsp_timeline(struct mock_hwsp_freelist *state,
 		if (IS_ERR(tl))
 			return PTR_ERR(tl);
 
+		err = intel_timeline_pin(tl, NULL);
+		if (err) {
+			intel_timeline_put(tl);
+			return err;
+		}
+
 		cacheline = hwsp_cacheline(tl);
 		err = radix_tree_insert(&state->cachelines, cacheline, tl);
 		if (err) {
@@ -84,6 +91,7 @@ static int __mock_hwsp_timeline(struct mock_hwsp_freelist *state,
 				pr_err("HWSP cacheline %lu already used; duplicate allocation!\n",
 				       cacheline);
 			}
+			intel_timeline_unpin(tl);
 			intel_timeline_put(tl);
 			return err;
 		}
@@ -451,7 +459,7 @@ static int emit_ggtt_store_dw(struct i915_request *rq, u32 addr, u32 value)
 }
 
 static struct i915_request *
-tl_write(struct intel_timeline *tl, struct intel_engine_cs *engine, u32 value)
+checked_tl_write(struct intel_timeline *tl, struct intel_engine_cs *engine, u32 value)
 {
 	struct i915_request *rq;
 	int err;
@@ -462,6 +470,13 @@ tl_write(struct intel_timeline *tl, struct intel_engine_cs *engine, u32 value)
 		goto out;
 	}
 
+	if (READ_ONCE(*tl->hwsp_seqno) != tl->seqno) {
+		pr_err("Timeline created with incorrect breadcrumb, found %x, expected %x\n",
+		       *tl->hwsp_seqno, tl->seqno);
+		intel_timeline_unpin(tl);
+		return ERR_PTR(-EINVAL);
+	}
+
 	rq = intel_engine_create_kernel_request(engine);
 	if (IS_ERR(rq))
 		goto out_unpin;
@@ -483,25 +498,6 @@ tl_write(struct intel_timeline *tl, struct intel_engine_cs *engine, u32 value)
 	return rq;
 }
 
-static struct intel_timeline *
-checked_intel_timeline_create(struct intel_gt *gt)
-{
-	struct intel_timeline *tl;
-
-	tl = intel_timeline_create(gt);
-	if (IS_ERR(tl))
-		return tl;
-
-	if (READ_ONCE(*tl->hwsp_seqno) != tl->seqno) {
-		pr_err("Timeline created with incorrect breadcrumb, found %x, expected %x\n",
-		       *tl->hwsp_seqno, tl->seqno);
-		intel_timeline_put(tl);
-		return ERR_PTR(-EINVAL);
-	}
-
-	return tl;
-}
-
 static int live_hwsp_engine(void *arg)
 {
 #define NUM_TIMELINES 4096
@@ -534,13 +530,13 @@ static int live_hwsp_engine(void *arg)
 			struct intel_timeline *tl;
 			struct i915_request *rq;
 
-			tl = checked_intel_timeline_create(gt);
+			tl = intel_timeline_create(gt);
 			if (IS_ERR(tl)) {
 				err = PTR_ERR(tl);
 				break;
 			}
 
-			rq = tl_write(tl, engine, count);
+			rq = checked_tl_write(tl, engine, count);
 			if (IS_ERR(rq)) {
 				intel_timeline_put(tl);
 				err = PTR_ERR(rq);
@@ -607,14 +603,14 @@ static int live_hwsp_alternate(void *arg)
 			if (!intel_engine_can_store_dword(engine))
 				continue;
 
-			tl = checked_intel_timeline_create(gt);
+			tl = intel_timeline_create(gt);
 			if (IS_ERR(tl)) {
 				err = PTR_ERR(tl);
 				goto out;
 			}
 
 			intel_engine_pm_get(engine);
-			rq = tl_write(tl, engine, count);
+			rq = checked_tl_write(tl, engine, count);
 			intel_engine_pm_put(engine);
 			if (IS_ERR(rq)) {
 				intel_timeline_put(tl);
@@ -1239,8 +1235,13 @@ static int live_hwsp_rollover_user(void *arg)
 		if (!tl->has_initial_breadcrumb || !tl->hwsp_cacheline)
 			goto out;
 
+		err = intel_context_pin(ce);
+		if (err)
+			goto out;
+
 		timeline_rollback(tl);
 		timeline_rollback(tl);
+
 		WRITE_ONCE(*(u32 *)tl->hwsp_seqno, tl->seqno);
 
 		for (i = 0; i < ARRAY_SIZE(rq); i++) {
@@ -1249,7 +1250,7 @@ static int live_hwsp_rollover_user(void *arg)
 			this = intel_context_create_request(ce);
 			if (IS_ERR(this)) {
 				err = PTR_ERR(this);
-				goto out;
+				goto out_unpin;
 			}
 
 			pr_debug("%s: create fence.seqnp:%d\n",
@@ -1268,17 +1269,18 @@ static int live_hwsp_rollover_user(void *arg)
 		if (i915_request_wait(rq[2], 0, HZ / 5) < 0) {
 			pr_err("Wait for timeline wrap timed out!\n");
 			err = -EIO;
-			goto out;
+			goto out_unpin;
 		}
 
 		for (i = 0; i < ARRAY_SIZE(rq); i++) {
 			if (!i915_request_completed(rq[i])) {
 				pr_err("Pre-wrap request not completed!\n");
 				err = -EINVAL;
-				goto out;
+				goto out_unpin;
 			}
 		}
-
+out_unpin:
+		intel_context_unpin(ce);
 out:
 		for (i = 0; i < ARRAY_SIZE(rq); i++)
 			i915_request_put(rq[i]);
@@ -1320,13 +1322,13 @@ static int live_hwsp_recycle(void *arg)
 			struct intel_timeline *tl;
 			struct i915_request *rq;
 
-			tl = checked_intel_timeline_create(gt);
+			tl = intel_timeline_create(gt);
 			if (IS_ERR(tl)) {
 				err = PTR_ERR(tl);
 				break;
 			}
 
-			rq = tl_write(tl, engine, count);
+			rq = checked_tl_write(tl, engine, count);
 			if (IS_ERR(rq)) {
 				intel_timeline_put(tl);
 				err = PTR_ERR(rq);
diff --git a/drivers/gpu/drm/i915/i915_selftest.h b/drivers/gpu/drm/i915/i915_selftest.h
index d53d207ab6eb..f54de0499be7 100644
--- a/drivers/gpu/drm/i915/i915_selftest.h
+++ b/drivers/gpu/drm/i915/i915_selftest.h
@@ -107,6 +107,7 @@ int __i915_subtests(const char *caller,
 
 #define I915_SELFTEST_DECLARE(x) x
 #define I915_SELFTEST_ONLY(x) unlikely(x)
+#define I915_SELFTEST_EXPORT
 
 #else /* !IS_ENABLED(CONFIG_DRM_I915_SELFTEST) */
 
@@ -116,6 +117,7 @@ static inline int i915_perf_selftests(struct pci_dev *pdev) { return 0; }
 
 #define I915_SELFTEST_DECLARE(x)
 #define I915_SELFTEST_ONLY(x) 0
+#define I915_SELFTEST_EXPORT static
 
 #endif
 
-- 
2.26.2

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply related	[flat|nested] 208+ messages in thread

* [RFC PATCH 012/162] drm/i915: Move cmd parser pinning to execbuffer
  2020-11-27 12:04 [RFC PATCH 000/162] DG1 + LMEM enabling Matthew Auld
                   ` (10 preceding siblings ...)
  2020-11-27 12:04 ` [RFC PATCH 011/162] drm/i915: Pin timeline map after first timeline pin, v5 Matthew Auld
@ 2020-11-27 12:04 ` Matthew Auld
  2020-11-27 12:04 ` [RFC PATCH 013/162] drm/i915: Add missing -EDEADLK handling to execbuf pinning, v2 Matthew Auld
                   ` (149 subsequent siblings)
  161 siblings, 0 replies; 208+ messages in thread
From: Matthew Auld @ 2020-11-27 12:04 UTC (permalink / raw)
  To: intel-gfx; +Cc: dri-devel, Thomas Hellström

From: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>

We need to get rid of allocations in the cmd parser, because it needs
to be called from a signaling context. As a first step, move all pinning
to execbuf, where we already hold all the locks.

Allocate the jump_whitelist in the execbuffer path, and add annotations
around intel_engine_cmd_parser() to ensure we only call the command
parser without allocating any memory or taking any locks we're not
supposed to.
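
For reference, the annotation pattern is roughly the following sketch;
the authoritative version is the __eb_parse() hunk below:

	bool cookie = dma_fence_begin_signalling();

	/* lockdep now treats this as fence-signalling context: it will
	 * flag memory allocations and any locks that could recurse
	 * back into fence completion */
	ret = intel_engine_cmd_parser(engine, ...);

	dma_fence_end_signalling(cookie);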

Because i915_gem_object_get_page() may also allocate memory, add an
allocation-free path to i915_gem_object_get_sg() and walk the sg list
manually there. It should be similarly fast.
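
Sketched, the two flavours of the lookup (names as in the hunks below);
with allow_alloc == false the radix-tree cache is bypassed, so no
insertion, and hence no allocation, can happen:

	/* may allocate: caches the lookup in the radix tree */
	sg = i915_gem_object_get_sg(obj, n, &offset, true);

	/* must not allocate: walks obj->mm.pages->sgl manually */
	sg = i915_gem_object_get_sg(obj, n, &offset, false);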

This has the added benefit of being able to catch all memory allocation
errors before the point of no return, and return -ENOMEM safely to the
execbuf submitter.

Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
---
 .../gpu/drm/i915/gem/i915_gem_execbuffer.c    |  74 ++++++++++++-
 drivers/gpu/drm/i915/gem/i915_gem_object.h    |  10 +-
 drivers/gpu/drm/i915/gem/i915_gem_pages.c     |  21 +++-
 drivers/gpu/drm/i915/gt/intel_ggtt.c          |   2 +-
 drivers/gpu/drm/i915/i915_cmd_parser.c        | 104 ++++++++----------
 drivers/gpu/drm/i915/i915_drv.h               |   7 +-
 drivers/gpu/drm/i915/i915_memcpy.c            |   2 +-
 drivers/gpu/drm/i915/i915_memcpy.h            |   2 +-
 8 files changed, 142 insertions(+), 80 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
index 1904e6e5ea64..60afa6f826d6 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
@@ -24,6 +24,7 @@
 #include "i915_gem_clflush.h"
 #include "i915_gem_context.h"
 #include "i915_gem_ioctls.h"
+#include "i915_memcpy.h"
 #include "i915_sw_fence_work.h"
 #include "i915_trace.h"
 #include "i915_user_extensions.h"
@@ -2273,24 +2274,45 @@ struct eb_parse_work {
 	struct i915_vma *trampoline;
 	unsigned long batch_offset;
 	unsigned long batch_length;
+	unsigned long *jump_whitelist;
+	const void *batch_map;
+	void *shadow_map;
 };
 
 static int __eb_parse(struct dma_fence_work *work)
 {
 	struct eb_parse_work *pw = container_of(work, typeof(*pw), base);
+	int ret;
+	bool cookie;
 
-	return intel_engine_cmd_parser(pw->engine,
-				       pw->batch,
-				       pw->batch_offset,
-				       pw->batch_length,
-				       pw->shadow,
-				       pw->trampoline);
+	cookie = dma_fence_begin_signalling();
+	ret = intel_engine_cmd_parser(pw->engine,
+				      pw->batch,
+				      pw->batch_offset,
+				      pw->batch_length,
+				      pw->shadow,
+				      pw->jump_whitelist,
+				      pw->shadow_map,
+				      pw->batch_map);
+	dma_fence_end_signalling(cookie);
+
+	return ret;
 }
 
 static void __eb_parse_release(struct dma_fence_work *work)
 {
 	struct eb_parse_work *pw = container_of(work, typeof(*pw), base);
 
+	if (!IS_ERR_OR_NULL(pw->jump_whitelist))
+		kfree(pw->jump_whitelist);
+
+	if (pw->batch_map)
+		i915_gem_object_unpin_map(pw->batch->obj);
+	else
+		i915_gem_object_unpin_pages(pw->batch->obj);
+
+	i915_gem_object_unpin_map(pw->shadow->obj);
+
 	if (pw->trampoline)
 		i915_active_release(&pw->trampoline->active);
 	i915_active_release(&pw->shadow->active);
@@ -2340,6 +2362,8 @@ static int eb_parse_pipeline(struct i915_execbuffer *eb,
 			     struct i915_vma *trampoline)
 {
 	struct eb_parse_work *pw;
+	struct drm_i915_gem_object *batch = eb->batch->vma->obj;
+	bool needs_clflush;
 	int err;
 
 	GEM_BUG_ON(overflows_type(eb->batch_start_offset, pw->batch_offset));
@@ -2363,6 +2387,34 @@ static int eb_parse_pipeline(struct i915_execbuffer *eb,
 			goto err_shadow;
 	}
 
+	pw->shadow_map = i915_gem_object_pin_map(shadow->obj, I915_MAP_FORCE_WB);
+	if (IS_ERR(pw->shadow_map)) {
+		err = PTR_ERR(pw->shadow_map);
+		goto err_trampoline;
+	}
+
+	needs_clflush =
+		!(batch->cache_coherent & I915_BO_CACHE_COHERENT_FOR_READ);
+
+	pw->batch_map = ERR_PTR(-ENODEV);
+	if (needs_clflush && i915_has_memcpy_from_wc())
+		pw->batch_map = i915_gem_object_pin_map(batch, I915_MAP_WC);
+
+	if (IS_ERR(pw->batch_map)) {
+		err = i915_gem_object_pin_pages(batch);
+		if (err)
+			goto err_unmap_shadow;
+		pw->batch_map = NULL;
+	}
+
+	pw->jump_whitelist =
+		intel_engine_cmd_parser_alloc_jump_whitelist(eb->batch_len,
+							     trampoline);
+	if (IS_ERR(pw->jump_whitelist)) {
+		err = PTR_ERR(pw->jump_whitelist);
+		goto err_unmap_batch;
+	}
+
 	dma_fence_work_init(&pw->base, &eb_parse_ops);
 
 	pw->engine = eb->engine;
@@ -2402,6 +2454,16 @@ static int eb_parse_pipeline(struct i915_execbuffer *eb,
 	dma_fence_work_commit_imm(&pw->base);
 	return err;
 
+err_unmap_batch:
+	if (pw->batch_map)
+		i915_gem_object_unpin_map(batch);
+	else
+		i915_gem_object_unpin_pages(batch);
+err_unmap_shadow:
+	i915_gem_object_unpin_map(shadow->obj);
+err_trampoline:
+	if (trampoline)
+		i915_active_release(&trampoline->active);
 err_shadow:
 	i915_active_release(&shadow->active);
 err_batch:
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object.h b/drivers/gpu/drm/i915/gem/i915_gem_object.h
index d61194ef484e..80c5b2b326f5 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_object.h
+++ b/drivers/gpu/drm/i915/gem/i915_gem_object.h
@@ -291,22 +291,22 @@ struct scatterlist *
 __i915_gem_object_get_sg(struct drm_i915_gem_object *obj,
 			 struct i915_gem_object_page_iter *iter,
 			 unsigned int n,
-			 unsigned int *offset);
+			 unsigned int *offset, bool allow_alloc);
 
 static inline struct scatterlist *
 i915_gem_object_get_sg(struct drm_i915_gem_object *obj,
 		       unsigned int n,
-		       unsigned int *offset)
+		       unsigned int *offset, bool allow_alloc)
 {
-	return __i915_gem_object_get_sg(obj, &obj->mm.get_page, n, offset);
+	return __i915_gem_object_get_sg(obj, &obj->mm.get_page, n, offset, allow_alloc);
 }
 
 static inline struct scatterlist *
 i915_gem_object_get_sg_dma(struct drm_i915_gem_object *obj,
 			   unsigned int n,
-			   unsigned int *offset)
+			   unsigned int *offset, bool allow_alloc)
 {
-	return __i915_gem_object_get_sg(obj, &obj->mm.get_dma_page, n, offset);
+	return __i915_gem_object_get_sg(obj, &obj->mm.get_dma_page, n, offset, allow_alloc);
 }
 
 struct page *
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_pages.c b/drivers/gpu/drm/i915/gem/i915_gem_pages.c
index e2c7b2a7895f..ca076203f5e9 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_pages.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_pages.c
@@ -445,7 +445,8 @@ struct scatterlist *
 __i915_gem_object_get_sg(struct drm_i915_gem_object *obj,
 			 struct i915_gem_object_page_iter *iter,
 			 unsigned int n,
-			 unsigned int *offset)
+			 unsigned int *offset,
+			 bool allow_alloc)
 {
 	const bool dma = iter == &obj->mm.get_dma_page;
 	struct scatterlist *sg;
@@ -467,6 +468,9 @@ __i915_gem_object_get_sg(struct drm_i915_gem_object *obj,
 	if (n < READ_ONCE(iter->sg_idx))
 		goto lookup;
 
+	if (!allow_alloc)
+		goto manual_lookup;
+
 	mutex_lock(&iter->lock);
 
 	/* We prefer to reuse the last sg so that repeated lookup of this
@@ -516,7 +520,16 @@ __i915_gem_object_get_sg(struct drm_i915_gem_object *obj,
 	if (unlikely(n < idx)) /* insertion completed by another thread */
 		goto lookup;
 
-	/* In case we failed to insert the entry into the radixtree, we need
+	goto manual_walk;
+
+manual_lookup:
+	idx = 0;
+	sg = obj->mm.pages->sgl;
+	count = __sg_page_count(sg);
+
+manual_walk:
+	/*
+	 * In case we failed to insert the entry into the radixtree, we need
 	 * to look beyond the current sg.
 	 */
 	while (idx + count <= n) {
@@ -563,7 +576,7 @@ i915_gem_object_get_page(struct drm_i915_gem_object *obj, unsigned int n)
 
 	GEM_BUG_ON(!i915_gem_object_has_struct_page(obj));
 
-	sg = i915_gem_object_get_sg(obj, n, &offset);
+	sg = i915_gem_object_get_sg(obj, n, &offset, true);
 	return nth_page(sg_page(sg), offset);
 }
 
@@ -589,7 +602,7 @@ i915_gem_object_get_dma_address_len(struct drm_i915_gem_object *obj,
 	struct scatterlist *sg;
 	unsigned int offset;
 
-	sg = i915_gem_object_get_sg_dma(obj, n, &offset);
+	sg = i915_gem_object_get_sg_dma(obj, n, &offset, true);
 
 	if (len)
 		*len = sg_dma_len(sg) - (offset << PAGE_SHIFT);
diff --git a/drivers/gpu/drm/i915/gt/intel_ggtt.c b/drivers/gpu/drm/i915/gt/intel_ggtt.c
index cf94525be2c1..60bd2c8ed8b0 100644
--- a/drivers/gpu/drm/i915/gt/intel_ggtt.c
+++ b/drivers/gpu/drm/i915/gt/intel_ggtt.c
@@ -1383,7 +1383,7 @@ intel_partial_pages(const struct i915_ggtt_view *view,
 	if (ret)
 		goto err_sg_alloc;
 
-	iter = i915_gem_object_get_sg_dma(obj, view->partial.offset, &offset);
+	iter = i915_gem_object_get_sg_dma(obj, view->partial.offset, &offset, true);
 	GEM_BUG_ON(!iter);
 
 	sg = st->sgl;
diff --git a/drivers/gpu/drm/i915/i915_cmd_parser.c b/drivers/gpu/drm/i915/i915_cmd_parser.c
index 93265951fdbb..8883a7d4964f 100644
--- a/drivers/gpu/drm/i915/i915_cmd_parser.c
+++ b/drivers/gpu/drm/i915/i915_cmd_parser.c
@@ -1136,38 +1136,19 @@ find_reg(const struct intel_engine_cs *engine, u32 addr)
 /* Returns a vmap'd pointer to dst_obj, which the caller must unmap */
 static u32 *copy_batch(struct drm_i915_gem_object *dst_obj,
 		       struct drm_i915_gem_object *src_obj,
-		       unsigned long offset, unsigned long length)
+		       unsigned long offset, unsigned long length,
+		       void *dst, const void *src)
 {
-	bool needs_clflush;
-	void *dst, *src;
-	int ret;
-
-	dst = i915_gem_object_pin_map(dst_obj, I915_MAP_FORCE_WB);
-	if (IS_ERR(dst))
-		return dst;
-
-	ret = i915_gem_object_pin_pages(src_obj);
-	if (ret) {
-		i915_gem_object_unpin_map(dst_obj);
-		return ERR_PTR(ret);
-	}
-
-	needs_clflush =
+	bool needs_clflush =
 		!(src_obj->cache_coherent & I915_BO_CACHE_COHERENT_FOR_READ);
 
-	src = ERR_PTR(-ENODEV);
-	if (needs_clflush && i915_has_memcpy_from_wc()) {
-		src = i915_gem_object_pin_map(src_obj, I915_MAP_WC);
-		if (!IS_ERR(src)) {
-			i915_unaligned_memcpy_from_wc(dst,
-						      src + offset,
-						      length);
-			i915_gem_object_unpin_map(src_obj);
-		}
-	}
-	if (IS_ERR(src)) {
-		unsigned long x, n;
+	if (src) {
+		GEM_BUG_ON(!needs_clflush);
+		i915_unaligned_memcpy_from_wc(dst, src + offset, length);
+	} else {
+		struct scatterlist *sg;
 		void *ptr;
+		unsigned int x, sg_ofs;
 
 		/*
 		 * We can avoid clflushing partial cachelines before the write
@@ -1183,23 +1164,32 @@ static u32 *copy_batch(struct drm_i915_gem_object *dst_obj,
 
 		ptr = dst;
 		x = offset_in_page(offset);
-		for (n = offset >> PAGE_SHIFT; length; n++) {
-			int len = min(length, PAGE_SIZE - x);
-
-			src = kmap_atomic(i915_gem_object_get_page(src_obj, n));
-			if (needs_clflush)
-				drm_clflush_virt_range(src + x, len);
-			memcpy(ptr, src + x, len);
-			kunmap_atomic(src);
-
-			ptr += len;
-			length -= len;
-			x = 0;
+
+		sg = i915_gem_object_get_sg(src_obj, offset >> PAGE_SHIFT, &sg_ofs, false);
+
+		while (length) {
+			unsigned long sg_max = sg->length >> PAGE_SHIFT;
+
+			for (; length && sg_ofs < sg_max; sg_ofs++) {
+				unsigned long len = min(length, PAGE_SIZE - x);
+				void *map;
+
+				map = kmap_atomic(nth_page(sg_page(sg), sg_ofs));
+				if (needs_clflush)
+					drm_clflush_virt_range(map + x, len);
+				memcpy(ptr, map + x, len);
+				kunmap_atomic(map);
+
+				ptr += len;
+				length -= len;
+				x = 0;
+			}
+
+			sg_ofs = 0;
+			sg = sg_next(sg);
 		}
 	}
 
-	i915_gem_object_unpin_pages(src_obj);
-
 	/* dst_obj is returned with vmap pinned */
 	return dst;
 }
@@ -1359,9 +1349,6 @@ static int check_bbstart(u32 *cmd, u32 offset, u32 length,
 	if (target_cmd_index == offset)
 		return 0;
 
-	if (IS_ERR(jump_whitelist))
-		return PTR_ERR(jump_whitelist);
-
 	if (!test_bit(target_cmd_index, jump_whitelist)) {
 		DRM_DEBUG("CMD: BB_START to 0x%llx not a previously executed cmd\n",
 			  jump_target);
@@ -1371,10 +1358,14 @@ static int check_bbstart(u32 *cmd, u32 offset, u32 length,
 	return 0;
 }
 
-static unsigned long *alloc_whitelist(u32 batch_length)
+unsigned long *intel_engine_cmd_parser_alloc_jump_whitelist(u32 batch_length,
+							    bool trampoline)
 {
 	unsigned long *jmp;
 
+	if (trampoline)
+		return NULL;
+
 	/*
 	 * We expect batch_length to be less than 256KiB for known users,
 	 * i.e. we need at most an 8KiB bitmap allocation which should be
@@ -1417,14 +1408,16 @@ int intel_engine_cmd_parser(struct intel_engine_cs *engine,
 			    unsigned long batch_offset,
 			    unsigned long batch_length,
 			    struct i915_vma *shadow,
-			    bool trampoline)
+			    unsigned long *jump_whitelist,
+			    void *shadow_map,
+			    const void *batch_map)
 {
 	u32 *cmd, *batch_end, offset = 0;
 	struct drm_i915_cmd_descriptor default_desc = noop_desc;
 	const struct drm_i915_cmd_descriptor *desc = &default_desc;
-	unsigned long *jump_whitelist;
 	u64 batch_addr, shadow_addr;
 	int ret = 0;
+	bool trampoline = !jump_whitelist;
 
 	GEM_BUG_ON(!IS_ALIGNED(batch_offset, sizeof(*cmd)));
 	GEM_BUG_ON(!IS_ALIGNED(batch_length, sizeof(*cmd)));
@@ -1432,16 +1425,8 @@ int intel_engine_cmd_parser(struct intel_engine_cs *engine,
 				     batch->size));
 	GEM_BUG_ON(!batch_length);
 
-	cmd = copy_batch(shadow->obj, batch->obj, batch_offset, batch_length);
-	if (IS_ERR(cmd)) {
-		DRM_DEBUG("CMD: Failed to copy batch\n");
-		return PTR_ERR(cmd);
-	}
-
-	jump_whitelist = NULL;
-	if (!trampoline)
-		/* Defer failure until attempted use */
-		jump_whitelist = alloc_whitelist(batch_length);
+	cmd = copy_batch(shadow->obj, batch->obj, batch_offset, batch_length,
+			 shadow_map, batch_map);
 
 	shadow_addr = gen8_canonical_addr(shadow->node.start);
 	batch_addr = gen8_canonical_addr(batch->node.start + batch_offset);
@@ -1549,9 +1534,6 @@ int intel_engine_cmd_parser(struct intel_engine_cs *engine,
 		drm_clflush_virt_range(ptr, (void *)(cmd + 1) - ptr);
 	}
 
-	if (!IS_ERR_OR_NULL(jump_whitelist))
-		kfree(jump_whitelist);
-	i915_gem_object_unpin_map(shadow->obj);
 	return ret;
 }
 
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 0f7bf6831633..84182a40e777 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -1951,12 +1951,17 @@ const char *i915_cache_level_str(struct drm_i915_private *i915, int type);
 int i915_cmd_parser_get_version(struct drm_i915_private *dev_priv);
 void intel_engine_init_cmd_parser(struct intel_engine_cs *engine);
 void intel_engine_cleanup_cmd_parser(struct intel_engine_cs *engine);
+unsigned long *intel_engine_cmd_parser_alloc_jump_whitelist(u32 batch_length,
+							    bool trampoline);
+
 int intel_engine_cmd_parser(struct intel_engine_cs *engine,
 			    struct i915_vma *batch,
 			    unsigned long batch_offset,
 			    unsigned long batch_length,
 			    struct i915_vma *shadow,
-			    bool trampoline);
+			    unsigned long *jump_whitelist,
+			    void *shadow_map,
+			    const void *batch_map);
 #define I915_CMD_PARSER_TRAMPOLINE_SIZE 8
 
 /* intel_device_info.c */
diff --git a/drivers/gpu/drm/i915/i915_memcpy.c b/drivers/gpu/drm/i915/i915_memcpy.c
index 7b3b83bd5ab8..1b021a4902de 100644
--- a/drivers/gpu/drm/i915/i915_memcpy.c
+++ b/drivers/gpu/drm/i915/i915_memcpy.c
@@ -135,7 +135,7 @@ bool i915_memcpy_from_wc(void *dst, const void *src, unsigned long len)
  * accepts that its arguments may not be aligned, but are valid for the
  * potential 16-byte read past the end.
  */
-void i915_unaligned_memcpy_from_wc(void *dst, void *src, unsigned long len)
+void i915_unaligned_memcpy_from_wc(void *dst, const void *src, unsigned long len)
 {
 	unsigned long addr;
 
diff --git a/drivers/gpu/drm/i915/i915_memcpy.h b/drivers/gpu/drm/i915/i915_memcpy.h
index e36d30edd987..3df063a3293b 100644
--- a/drivers/gpu/drm/i915/i915_memcpy.h
+++ b/drivers/gpu/drm/i915/i915_memcpy.h
@@ -13,7 +13,7 @@ struct drm_i915_private;
 void i915_memcpy_init_early(struct drm_i915_private *i915);
 
 bool i915_memcpy_from_wc(void *dst, const void *src, unsigned long len);
-void i915_unaligned_memcpy_from_wc(void *dst, void *src, unsigned long len);
+void i915_unaligned_memcpy_from_wc(void *dst, const void *src, unsigned long len);
 
 /* The movntdqa instructions used for memcpy-from-wc require 16-byte alignment,
  * as well as SSE4.1 support. i915_memcpy_from_wc() will report if it cannot
-- 
2.26.2

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply related	[flat|nested] 208+ messages in thread

* [RFC PATCH 013/162] drm/i915: Add missing -EDEADLK handling to execbuf pinning, v2.
  2020-11-27 12:04 [RFC PATCH 000/162] DG1 + LMEM enabling Matthew Auld
                   ` (11 preceding siblings ...)
  2020-11-27 12:04 ` [RFC PATCH 012/162] drm/i915: Move cmd parser pinning to execbuffer Matthew Auld
@ 2020-11-27 12:04 ` Matthew Auld
  2020-11-27 12:04 ` [RFC PATCH 014/162] drm/i915: Ensure we hold the object mutex in pin correctly v2 Matthew Auld
                   ` (148 subsequent siblings)
  161 siblings, 0 replies; 208+ messages in thread
From: Matthew Auld @ 2020-11-27 12:04 UTC (permalink / raw)
  To: intel-gfx; +Cc: Matthew Brost, dri-devel, Thomas Hellström

From: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>

i915_vma_pin may fail with -EDEADLK when we start locking page tables,
so ensure we handle this correctly.
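
For context, callers are expected to resolve -EDEADLK with the usual ww
backoff loop; a minimal sketch, assuming the i915_gem_ww_ctx helpers
introduced earlier in this series:

	struct i915_gem_ww_ctx ww;
	int err;

	i915_gem_ww_ctx_init(&ww, true);
retry:
	err = i915_gem_object_lock(obj, &ww);
	if (!err)
		err = i915_vma_pin_ww(vma, &ww, 0, 0, flags);
	if (err == -EDEADLK) {
		/* drop all held locks and retry in acquire order */
		err = i915_gem_ww_ctx_backoff(&ww);
		if (!err)
			goto retry;
	}
	i915_gem_ww_ctx_fini(&ww);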

Cc: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
---
 .../gpu/drm/i915/gem/i915_gem_execbuffer.c    | 35 +++++++++++++------
 1 file changed, 24 insertions(+), 11 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
index 60afa6f826d6..568c8321dc3d 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
@@ -419,13 +419,14 @@ static u64 eb_pin_flags(const struct drm_i915_gem_exec_object2 *entry,
 	return pin_flags;
 }
 
-static inline bool
+static inline int
 eb_pin_vma(struct i915_execbuffer *eb,
 	   const struct drm_i915_gem_exec_object2 *entry,
 	   struct eb_vma *ev)
 {
 	struct i915_vma *vma = ev->vma;
 	u64 pin_flags;
+	int err;
 
 	if (vma->node.size)
 		pin_flags = vma->node.start;
@@ -437,24 +438,29 @@ eb_pin_vma(struct i915_execbuffer *eb,
 		pin_flags |= PIN_GLOBAL;
 
 	/* Attempt to reuse the current location if available */
-	/* TODO: Add -EDEADLK handling here */
-	if (unlikely(i915_vma_pin_ww(vma, &eb->ww, 0, 0, pin_flags))) {
+	err = i915_vma_pin_ww(vma, &eb->ww, 0, 0, pin_flags);
+	if (err == -EDEADLK)
+		return err;
+
+	if (unlikely(err)) {
 		if (entry->flags & EXEC_OBJECT_PINNED)
-			return false;
+			return err;
 
 		/* Failing that pick any _free_ space if suitable */
-		if (unlikely(i915_vma_pin_ww(vma, &eb->ww,
+		err = i915_vma_pin_ww(vma, &eb->ww,
 					     entry->pad_to_size,
 					     entry->alignment,
 					     eb_pin_flags(entry, ev->flags) |
-					     PIN_USER | PIN_NOEVICT)))
-			return false;
+					     PIN_USER | PIN_NOEVICT);
+		if (unlikely(err))
+			return err;
 	}
 
 	if (unlikely(ev->flags & EXEC_OBJECT_NEEDS_FENCE)) {
-		if (unlikely(i915_vma_pin_fence(vma))) {
+		err = i915_vma_pin_fence(vma);
+		if (unlikely(err)) {
 			i915_vma_unpin(vma);
-			return false;
+			return err;
 		}
 
 		if (vma->fence)
@@ -462,7 +468,10 @@ eb_pin_vma(struct i915_execbuffer *eb,
 	}
 
 	ev->flags |= __EXEC_OBJECT_HAS_PIN;
-	return !eb_vma_misplaced(entry, vma, ev->flags);
+	if (eb_vma_misplaced(entry, vma, ev->flags))
+		return -EBADSLT;
+
+	return 0;
 }
 
 static inline void
@@ -900,7 +909,11 @@ static int eb_validate_vmas(struct i915_execbuffer *eb)
 		if (err)
 			return err;
 
-		if (eb_pin_vma(eb, entry, ev)) {
+		err = eb_pin_vma(eb, entry, ev);
+		if (err == -EDEADLK)
+			return err;
+
+		if (!err) {
 			if (entry->offset != vma->node.start) {
 				entry->offset = vma->node.start | UPDATE;
 				eb->args->flags |= __EXEC_HAS_RELOC;
-- 
2.26.2

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply related	[flat|nested] 208+ messages in thread

* [RFC PATCH 014/162] drm/i915: Ensure we hold the object mutex in pin correctly v2
  2020-11-27 12:04 [RFC PATCH 000/162] DG1 + LMEM enabling Matthew Auld
                   ` (12 preceding siblings ...)
  2020-11-27 12:04 ` [RFC PATCH 013/162] drm/i915: Add missing -EDEADLK handling to execbuf pinning, v2 Matthew Auld
@ 2020-11-27 12:04 ` Matthew Auld
  2020-11-27 12:04 ` [RFC PATCH 015/162] drm/i915: Add gem object locking to madvise Matthew Auld
                   ` (147 subsequent siblings)
  161 siblings, 0 replies; 208+ messages in thread
From: Matthew Auld @ 2020-11-27 12:04 UTC (permalink / raw)
  To: intel-gfx; +Cc: dri-devel, Thomas Hellström

From: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>

Currently there are a lot of places where we hold the gem object lock
but that haven't yet been converted to the ww dance. Complain loudly
about those places.

i915_vma_pin() should be called without the object lock held, so that
it can still do the ww dance itself, while i915_vma_pin_ww() should be
called with it held.
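
The checks boil down to the following (see the i915_vma.c hunks below):

	/* i915_vma_pin_ww(), when a ww ctx is supplied: */
	if (ww && vma->resv)
		assert_vma_held(vma);

	/* i915_vma_pin(), which must be free to take the lock itself: */
	WARN_ON_ONCE(vma->resv && dma_resv_held(vma->resv));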

Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
---
 drivers/gpu/drm/i915/gt/intel_renderstate.c |  2 +-
 drivers/gpu/drm/i915/gt/intel_timeline.c    |  4 +-
 drivers/gpu/drm/i915/i915_vma.c             | 46 +++++++++++++++++++--
 drivers/gpu/drm/i915/i915_vma.h             |  5 +++
 4 files changed, 50 insertions(+), 7 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_renderstate.c b/drivers/gpu/drm/i915/gt/intel_renderstate.c
index ea2a77c7b469..a68e5c23a67c 100644
--- a/drivers/gpu/drm/i915/gt/intel_renderstate.c
+++ b/drivers/gpu/drm/i915/gt/intel_renderstate.c
@@ -196,7 +196,7 @@ int intel_renderstate_init(struct intel_renderstate *so,
 	if (err)
 		goto err_context;
 
-	err = i915_vma_pin(so->vma, 0, 0, PIN_GLOBAL | PIN_HIGH);
+	err = i915_vma_pin_ww(so->vma, &so->ww, 0, 0, PIN_GLOBAL | PIN_HIGH);
 	if (err)
 		goto err_context;
 
diff --git a/drivers/gpu/drm/i915/gt/intel_timeline.c b/drivers/gpu/drm/i915/gt/intel_timeline.c
index 479eb5440bc6..b2d04717db20 100644
--- a/drivers/gpu/drm/i915/gt/intel_timeline.c
+++ b/drivers/gpu/drm/i915/gt/intel_timeline.c
@@ -46,7 +46,7 @@ static int __hwsp_alloc(struct intel_gt *gt, struct intel_timeline_hwsp *hwsp)
 		goto out_unlock;
 	}
 
-	/* Pin early so we can call i915_ggtt_pin unlocked. */
+	/* Pin early so we can call i915_ggtt_pin_unlocked(). */
 	hwsp->vaddr = i915_gem_object_pin_map(obj, I915_MAP_WB);
 	if (IS_ERR(hwsp->vaddr)) {
 		ret = PTR_ERR(hwsp->vaddr);
@@ -514,7 +514,7 @@ __intel_timeline_get_seqno(struct intel_timeline *tl,
 		goto err_rollback;
 	}
 
-	err = i915_ggtt_pin(vma, NULL, 0, PIN_HIGH);
+	err = i915_ggtt_pin_unlocked(vma, 0, PIN_HIGH);
 	if (err) {
 		__idle_hwsp_free(vma->private, cacheline);
 		goto err_rollback;
diff --git a/drivers/gpu/drm/i915/i915_vma.c b/drivers/gpu/drm/i915/i915_vma.c
index 8e8c80ccbe32..e07621825da9 100644
--- a/drivers/gpu/drm/i915/i915_vma.c
+++ b/drivers/gpu/drm/i915/i915_vma.c
@@ -862,7 +862,8 @@ int i915_vma_pin_ww(struct i915_vma *vma, struct i915_gem_ww_ctx *ww,
 	unsigned int bound;
 	int err;
 
-	if (IS_ENABLED(CONFIG_PROVE_LOCKING) && debug_locks) {
+#ifdef CONFIG_PROVE_LOCKING
+	if (debug_locks) {
 		bool pinned_bind_wo_alloc =
 			vma->obj && i915_gem_object_has_pinned_pages(vma->obj) &&
 			!vma->vm->allocate_va_range;
@@ -870,7 +871,10 @@ int i915_vma_pin_ww(struct i915_vma *vma, struct i915_gem_ww_ctx *ww,
 		if (lockdep_is_held(&vma->vm->i915->drm.struct_mutex) &&
 		    !pinned_bind_wo_alloc)
 			WARN_ON(!ww);
+		if (ww && vma->resv)
+			assert_vma_held(vma);
 	}
+#endif
 
 	BUILD_BUG_ON(PIN_GLOBAL != I915_VMA_GLOBAL_BIND);
 	BUILD_BUG_ON(PIN_USER != I915_VMA_LOCAL_BIND);
@@ -1017,8 +1021,8 @@ static void flush_idle_contexts(struct intel_gt *gt)
 	intel_gt_wait_for_idle(gt, MAX_SCHEDULE_TIMEOUT);
 }
 
-int i915_ggtt_pin(struct i915_vma *vma, struct i915_gem_ww_ctx *ww,
-		  u32 align, unsigned int flags)
+static int __i915_ggtt_pin(struct i915_vma *vma, struct i915_gem_ww_ctx *ww,
+			   u32 align, unsigned int flags, bool unlocked)
 {
 	struct i915_address_space *vm = vma->vm;
 	int err;
@@ -1026,7 +1030,10 @@ int i915_ggtt_pin(struct i915_vma *vma, struct i915_gem_ww_ctx *ww,
 	GEM_BUG_ON(!i915_vma_is_ggtt(vma));
 
 	do {
-		err = i915_vma_pin_ww(vma, ww, 0, align, flags | PIN_GLOBAL);
+		if (ww || unlocked)
+			err = i915_vma_pin_ww(vma, ww, 0, align, flags | PIN_GLOBAL);
+		else
+			err = i915_vma_pin(vma, 0, align, flags | PIN_GLOBAL);
 		if (err != -ENOSPC) {
 			if (!err) {
 				err = i915_vma_wait_for_bind(vma);
@@ -1045,6 +1052,37 @@ int i915_ggtt_pin(struct i915_vma *vma, struct i915_gem_ww_ctx *ww,
 	} while (1);
 }
 
+int i915_ggtt_pin(struct i915_vma *vma, struct i915_gem_ww_ctx *ww,
+		  u32 align, unsigned int flags)
+{
+#ifdef CONFIG_LOCKDEP
+	WARN_ON(!ww && vma->resv && dma_resv_held(vma->resv));
+#endif
+
+	return __i915_ggtt_pin(vma, ww, align, flags, false);
+}
+
+/**
+ * i915_ggtt_pin_unlocked - Pin a vma to ggtt without the underlying
+ * object's dma-resv held, but with object pages pinned.
+ *
+ * @vma: The vma to pin.
+ * @align: ggtt alignment.
+ * @flags: Pinning flags
+ *
+ * RETURN: Zero on success, negative error code on error.
+ *
+ * This function relies on the fact that object pages are already pinned,
+ * and that ggtt pinning doesn't require any page table page allocations
+ * to pin a vma without dma_resv lock and ww acquire context.
+ */
+int i915_ggtt_pin_unlocked(struct i915_vma *vma, u32 align, unsigned int flags)
+{
+	if (IS_ENABLED(CONFIG_LOCKDEP))
+		WARN_ON(vma->obj && !i915_gem_object_has_pinned_pages(vma->obj));
+	return __i915_ggtt_pin(vma, NULL, align, flags, true);
+}
+
 static void __vma_close(struct i915_vma *vma, struct intel_gt *gt)
 {
 	/*
diff --git a/drivers/gpu/drm/i915/i915_vma.h b/drivers/gpu/drm/i915/i915_vma.h
index 5b3a3c653454..22387a361999 100644
--- a/drivers/gpu/drm/i915/i915_vma.h
+++ b/drivers/gpu/drm/i915/i915_vma.h
@@ -243,12 +243,17 @@ i915_vma_pin_ww(struct i915_vma *vma, struct i915_gem_ww_ctx *ww,
 static inline int __must_check
 i915_vma_pin(struct i915_vma *vma, u64 size, u64 alignment, u64 flags)
 {
+#ifdef CONFIG_LOCKDEP
+	WARN_ON_ONCE(vma->resv && dma_resv_held(vma->resv));
+#endif
 	return i915_vma_pin_ww(vma, NULL, size, alignment, flags);
 }
 
 int i915_ggtt_pin(struct i915_vma *vma, struct i915_gem_ww_ctx *ww,
 		  u32 align, unsigned int flags);
 
+int i915_ggtt_pin_unlocked(struct i915_vma *vma, u32 align, unsigned int flags);
+
 static inline int i915_vma_pin_count(const struct i915_vma *vma)
 {
 	return atomic_read(&vma->flags) & I915_VMA_PIN_MASK;
-- 
2.26.2

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply related	[flat|nested] 208+ messages in thread

* [RFC PATCH 015/162] drm/i915: Add gem object locking to madvise.
  2020-11-27 12:04 [RFC PATCH 000/162] DG1 + LMEM enabling Matthew Auld
                   ` (13 preceding siblings ...)
  2020-11-27 12:04 ` [RFC PATCH 014/162] drm/i915: Ensure we hold the object mutex in pin correctly v2 Matthew Auld
@ 2020-11-27 12:04 ` Matthew Auld
  2020-11-27 12:04 ` [RFC PATCH 016/162] drm/i915: Move HAS_STRUCT_PAGE to obj->flags Matthew Auld
                   ` (146 subsequent siblings)
  161 siblings, 0 replies; 208+ messages in thread
From: Matthew Auld @ 2020-11-27 12:04 UTC (permalink / raw)
  To: intel-gfx; +Cc: dri-devel, Thomas Hellström

From: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>

madvise doesn't need the full ww dance; taking the plain object lock is
enough for checking whether pages are bound.
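
The resulting lock nesting in i915_gem_madvise_ioctl(), sketched from
the hunk below:

	err = i915_gem_object_lock_interruptible(obj, NULL); /* outer */
	if (err)
		goto out;

	err = mutex_lock_interruptible(&obj->mm.lock);       /* inner */
	if (err)
		goto out_ww;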

Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
---
 drivers/gpu/drm/i915/i915_gem.c | 8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 58276694c848..b03e245640c0 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -1051,10 +1051,14 @@ i915_gem_madvise_ioctl(struct drm_device *dev, void *data,
 	if (!obj)
 		return -ENOENT;
 
-	err = mutex_lock_interruptible(&obj->mm.lock);
+	err = i915_gem_object_lock_interruptible(obj, NULL);
 	if (err)
 		goto out;
 
+	err = mutex_lock_interruptible(&obj->mm.lock);
+	if (err)
+		goto out_ww;
+
 	if (i915_gem_object_has_pages(obj) &&
 	    i915_gem_object_is_tiled(obj) &&
 	    i915->quirks & QUIRK_PIN_SWIZZLED_PAGES) {
@@ -1099,6 +1103,8 @@ i915_gem_madvise_ioctl(struct drm_device *dev, void *data,
 	args->retained = obj->mm.madv != __I915_MADV_PURGED;
 	mutex_unlock(&obj->mm.lock);
 
+out_ww:
+	i915_gem_object_unlock(obj);
 out:
 	i915_gem_object_put(obj);
 	return err;
-- 
2.26.2

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply related	[flat|nested] 208+ messages in thread

* [RFC PATCH 016/162] drm/i915: Move HAS_STRUCT_PAGE to obj->flags
  2020-11-27 12:04 [RFC PATCH 000/162] DG1 + LMEM enabling Matthew Auld
                   ` (14 preceding siblings ...)
  2020-11-27 12:04 ` [RFC PATCH 015/162] drm/i915: Add gem object locking to madvise Matthew Auld
@ 2020-11-27 12:04 ` Matthew Auld
  2020-11-27 12:04 ` [RFC PATCH 017/162] drm/i915: Rework struct phys attachment handling Matthew Auld
                   ` (145 subsequent siblings)
  161 siblings, 0 replies; 208+ messages in thread
From: Matthew Auld @ 2020-11-27 12:04 UTC (permalink / raw)
  To: intel-gfx; +Cc: dri-devel, Thomas Hellström

From: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>

We want to stop swapping the ops structure when attaching phys pages,
so we need to kill off HAS_STRUCT_PAGE in ops->flags and track it on
the bo instead.

This removes a potential race where we dereference the wrong obj->ops
without the ww mutex held.
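
Sketch of the new flow; use_page_based_path() is a made-up placeholder
for any code that needs struct pages:

	/* decided once at creation time... */
	i915_gem_object_init(obj, &i915_gem_shmem_ops, &lock_class,
			     I915_BO_ALLOC_STRUCT_PAGE);

	/* ...and queried from obj->flags instead of obj->ops->flags */
	if (i915_gem_object_has_struct_page(obj))
		use_page_based_path(obj); /* hypothetical caller */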

Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
---
 drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c           |  2 +-
 drivers/gpu/drm/i915/gem/i915_gem_internal.c         |  6 +++---
 drivers/gpu/drm/i915/gem/i915_gem_lmem.c             |  4 ++--
 drivers/gpu/drm/i915/gem/i915_gem_mman.c             |  7 +++----
 drivers/gpu/drm/i915/gem/i915_gem_object.c           |  4 +++-
 drivers/gpu/drm/i915/gem/i915_gem_object.h           |  5 +++--
 drivers/gpu/drm/i915/gem/i915_gem_object_types.h     |  8 +++++---
 drivers/gpu/drm/i915/gem/i915_gem_pages.c            |  5 ++---
 drivers/gpu/drm/i915/gem/i915_gem_phys.c             |  2 ++
 drivers/gpu/drm/i915/gem/i915_gem_region.c           |  4 +---
 drivers/gpu/drm/i915/gem/i915_gem_region.h           |  3 +--
 drivers/gpu/drm/i915/gem/i915_gem_shmem.c            |  8 ++++----
 drivers/gpu/drm/i915/gem/i915_gem_stolen.c           |  4 ++--
 drivers/gpu/drm/i915/gem/i915_gem_userptr.c          |  6 +++---
 drivers/gpu/drm/i915/gem/selftests/huge_gem_object.c |  4 ++--
 drivers/gpu/drm/i915/gem/selftests/huge_pages.c      | 10 +++++-----
 drivers/gpu/drm/i915/gem/selftests/i915_gem_mman.c   | 11 ++++-------
 drivers/gpu/drm/i915/gem/selftests/i915_gem_phys.c   | 12 ++++++++++++
 drivers/gpu/drm/i915/gvt/dmabuf.c                    |  2 +-
 drivers/gpu/drm/i915/selftests/i915_gem_gtt.c        |  2 +-
 drivers/gpu/drm/i915/selftests/mock_region.c         |  4 ++--
 21 files changed, 62 insertions(+), 51 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c b/drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c
index 04e9c04545ad..36e3c2765f4c 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c
@@ -258,7 +258,7 @@ struct drm_gem_object *i915_gem_prime_import(struct drm_device *dev,
 	}
 
 	drm_gem_private_object_init(dev, &obj->base, dma_buf->size);
-	i915_gem_object_init(obj, &i915_gem_object_dmabuf_ops, &lock_class);
+	i915_gem_object_init(obj, &i915_gem_object_dmabuf_ops, &lock_class, 0);
 	obj->base.import_attach = attach;
 	obj->base.resv = dma_buf->resv;
 
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_internal.c b/drivers/gpu/drm/i915/gem/i915_gem_internal.c
index ad22f42541bd..21cc40897ca8 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_internal.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_internal.c
@@ -138,8 +138,7 @@ static void i915_gem_object_put_pages_internal(struct drm_i915_gem_object *obj,
 
 static const struct drm_i915_gem_object_ops i915_gem_object_internal_ops = {
 	.name = "i915_gem_object_internal",
-	.flags = I915_GEM_OBJECT_HAS_STRUCT_PAGE |
-		 I915_GEM_OBJECT_IS_SHRINKABLE,
+	.flags = I915_GEM_OBJECT_IS_SHRINKABLE,
 	.get_pages = i915_gem_object_get_pages_internal,
 	.put_pages = i915_gem_object_put_pages_internal,
 };
@@ -178,7 +177,8 @@ i915_gem_object_create_internal(struct drm_i915_private *i915,
 		return ERR_PTR(-ENOMEM);
 
 	drm_gem_private_object_init(&i915->drm, &obj->base, size);
-	i915_gem_object_init(obj, &i915_gem_object_internal_ops, &lock_class);
+	i915_gem_object_init(obj, &i915_gem_object_internal_ops, &lock_class,
+			     I915_BO_ALLOC_STRUCT_PAGE);
 
 	/*
 	 * Mark the object as volatile, such that the pages are marked as
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_lmem.c b/drivers/gpu/drm/i915/gem/i915_gem_lmem.c
index 932ee21e6609..e953965f8263 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_lmem.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_lmem.c
@@ -45,13 +45,13 @@ __i915_gem_lmem_object_create(struct intel_memory_region *mem,
 		return ERR_PTR(-ENOMEM);
 
 	drm_gem_private_object_init(&i915->drm, &obj->base, size);
-	i915_gem_object_init(obj, &i915_gem_lmem_obj_ops, &lock_class);
+	i915_gem_object_init(obj, &i915_gem_lmem_obj_ops, &lock_class, flags);
 
 	obj->read_domains = I915_GEM_DOMAIN_WC | I915_GEM_DOMAIN_GTT;
 
 	i915_gem_object_set_cache_coherency(obj, I915_CACHE_NONE);
 
-	i915_gem_object_init_memory_region(obj, mem, flags);
+	i915_gem_object_init_memory_region(obj, mem);
 
 	return obj;
 }
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_mman.c b/drivers/gpu/drm/i915/gem/i915_gem_mman.c
index ec28a6cde49b..c0034d811e50 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_mman.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_mman.c
@@ -251,7 +251,7 @@ static vm_fault_t vm_fault_cpu(struct vm_fault *vmf)
 		goto out;
 
 	iomap = -1;
-	if (!i915_gem_object_type_has(obj, I915_GEM_OBJECT_HAS_STRUCT_PAGE)) {
+	if (!i915_gem_object_has_struct_page(obj)) {
 		iomap = obj->mm.region->iomap.base;
 		iomap -= obj->mm.region->region.start;
 	}
@@ -653,9 +653,8 @@ __assign_mmap_offset(struct drm_file *file,
 	}
 
 	if (mmap_type != I915_MMAP_TYPE_GTT &&
-	    !i915_gem_object_type_has(obj,
-				      I915_GEM_OBJECT_HAS_STRUCT_PAGE |
-				      I915_GEM_OBJECT_HAS_IOMEM)) {
+	    !i915_gem_object_has_struct_page(obj) &&
+	    !i915_gem_object_type_has(obj, I915_GEM_OBJECT_HAS_IOMEM)) {
 		err = -ENODEV;
 		goto out;
 	}
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object.c b/drivers/gpu/drm/i915/gem/i915_gem_object.c
index 00d24000b5e8..1393988bd5af 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_object.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_object.c
@@ -60,7 +60,7 @@ void i915_gem_object_free(struct drm_i915_gem_object *obj)
 
 void i915_gem_object_init(struct drm_i915_gem_object *obj,
 			  const struct drm_i915_gem_object_ops *ops,
-			  struct lock_class_key *key)
+			  struct lock_class_key *key, unsigned flags)
 {
 	__mutex_init(&obj->mm.lock, ops->name ?: "obj->mm.lock", key);
 
@@ -78,6 +78,8 @@ void i915_gem_object_init(struct drm_i915_gem_object *obj,
 	init_rcu_head(&obj->rcu);
 
 	obj->ops = ops;
+	GEM_BUG_ON(flags & ~I915_BO_ALLOC_FLAGS);
+	obj->flags = flags;
 
 	obj->mm.madv = I915_MADV_WILLNEED;
 	INIT_RADIX_TREE(&obj->mm.get_page.radix, GFP_KERNEL | __GFP_NOWARN);
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object.h b/drivers/gpu/drm/i915/gem/i915_gem_object.h
index 80c5b2b326f5..16608bf7a4e9 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_object.h
+++ b/drivers/gpu/drm/i915/gem/i915_gem_object.h
@@ -23,7 +23,8 @@ void i915_gem_object_free(struct drm_i915_gem_object *obj);
 
 void i915_gem_object_init(struct drm_i915_gem_object *obj,
 			  const struct drm_i915_gem_object_ops *ops,
-			  struct lock_class_key *key);
+			  struct lock_class_key *key,
+			  unsigned alloc_flags);
 struct drm_i915_gem_object *
 i915_gem_object_create_shmem(struct drm_i915_private *i915,
 			     resource_size_t size);
@@ -213,7 +214,7 @@ i915_gem_object_type_has(const struct drm_i915_gem_object *obj,
 static inline bool
 i915_gem_object_has_struct_page(const struct drm_i915_gem_object *obj)
 {
-	return i915_gem_object_type_has(obj, I915_GEM_OBJECT_HAS_STRUCT_PAGE);
+	return obj->flags & I915_BO_ALLOC_STRUCT_PAGE;
 }
 
 static inline bool
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object_types.h b/drivers/gpu/drm/i915/gem/i915_gem_object_types.h
index e2d9b7e1e152..b53e44b06b09 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_object_types.h
+++ b/drivers/gpu/drm/i915/gem/i915_gem_object_types.h
@@ -30,7 +30,6 @@ struct i915_lut_handle {
 
 struct drm_i915_gem_object_ops {
 	unsigned int flags;
-#define I915_GEM_OBJECT_HAS_STRUCT_PAGE	BIT(0)
 #define I915_GEM_OBJECT_HAS_IOMEM	BIT(1)
 #define I915_GEM_OBJECT_IS_SHRINKABLE	BIT(2)
 #define I915_GEM_OBJECT_IS_PROXY	BIT(3)
@@ -165,8 +164,11 @@ struct drm_i915_gem_object {
 	unsigned long flags;
 #define I915_BO_ALLOC_CONTIGUOUS BIT(0)
 #define I915_BO_ALLOC_VOLATILE   BIT(1)
-#define I915_BO_ALLOC_FLAGS (I915_BO_ALLOC_CONTIGUOUS | I915_BO_ALLOC_VOLATILE)
-#define I915_BO_READONLY         BIT(2)
+#define I915_BO_ALLOC_STRUCT_PAGE BIT(2)
+#define I915_BO_ALLOC_FLAGS (I915_BO_ALLOC_CONTIGUOUS | \
+			     I915_BO_ALLOC_VOLATILE | \
+			     I915_BO_ALLOC_STRUCT_PAGE)
+#define I915_BO_READONLY         BIT(3)
 
 	/*
 	 * Is the object to be mapped as read-only to the GPU
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_pages.c b/drivers/gpu/drm/i915/gem/i915_gem_pages.c
index ca076203f5e9..7983423237e3 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_pages.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_pages.c
@@ -328,13 +328,12 @@ void *i915_gem_object_pin_map(struct drm_i915_gem_object *obj,
 			      enum i915_map_type type)
 {
 	enum i915_map_type has_type;
-	unsigned int flags;
 	bool pinned;
 	void *ptr;
 	int err;
 
-	flags = I915_GEM_OBJECT_HAS_STRUCT_PAGE | I915_GEM_OBJECT_HAS_IOMEM;
-	if (!i915_gem_object_type_has(obj, flags))
+	if (!i915_gem_object_has_struct_page(obj) &&
+	    !i915_gem_object_type_has(obj, I915_GEM_OBJECT_HAS_IOMEM))
 		return ERR_PTR(-ENXIO);
 
 	err = mutex_lock_interruptible_nested(&obj->mm.lock, I915_MM_GET_PAGES);
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_phys.c b/drivers/gpu/drm/i915/gem/i915_gem_phys.c
index 3a4dfe2ef1da..965590d3a570 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_phys.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_phys.c
@@ -240,6 +240,7 @@ int i915_gem_object_attach_phys(struct drm_i915_gem_object *obj, int align)
 	pages = __i915_gem_object_unset_pages(obj);
 
 	obj->ops = &i915_gem_phys_ops;
+	obj->flags &= ~I915_BO_ALLOC_STRUCT_PAGE;
 
 	err = ____i915_gem_object_get_pages(obj);
 	if (err)
@@ -258,6 +259,7 @@ int i915_gem_object_attach_phys(struct drm_i915_gem_object *obj, int align)
 
 err_xfer:
 	obj->ops = &i915_gem_shmem_ops;
+	obj->flags |= I915_BO_ALLOC_STRUCT_PAGE;
 	if (!IS_ERR_OR_NULL(pages)) {
 		unsigned int sg_page_sizes = i915_sg_page_sizes(pages->sgl);
 
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_region.c b/drivers/gpu/drm/i915/gem/i915_gem_region.c
index 1515384d7e0e..6a96741253b3 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_region.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_region.c
@@ -102,13 +102,11 @@ i915_gem_object_get_pages_buddy(struct drm_i915_gem_object *obj)
 }
 
 void i915_gem_object_init_memory_region(struct drm_i915_gem_object *obj,
-					struct intel_memory_region *mem,
-					unsigned long flags)
+					struct intel_memory_region *mem)
 {
 	INIT_LIST_HEAD(&obj->mm.blocks);
 	obj->mm.region = intel_memory_region_get(mem);
 
-	obj->flags |= flags;
 	if (obj->base.size <= mem->min_page_size)
 		obj->flags |= I915_BO_ALLOC_CONTIGUOUS;
 
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_region.h b/drivers/gpu/drm/i915/gem/i915_gem_region.h
index f2ff6f8bff74..ebddc86d78f7 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_region.h
+++ b/drivers/gpu/drm/i915/gem/i915_gem_region.h
@@ -17,8 +17,7 @@ void i915_gem_object_put_pages_buddy(struct drm_i915_gem_object *obj,
 				     struct sg_table *pages);
 
 void i915_gem_object_init_memory_region(struct drm_i915_gem_object *obj,
-					struct intel_memory_region *mem,
-					unsigned long flags);
+					struct intel_memory_region *mem);
 void i915_gem_object_release_memory_region(struct drm_i915_gem_object *obj);
 
 struct drm_i915_gem_object *
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_shmem.c b/drivers/gpu/drm/i915/gem/i915_gem_shmem.c
index 75e8b71c18b9..31c617a1115f 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_shmem.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_shmem.c
@@ -430,8 +430,7 @@ static void shmem_release(struct drm_i915_gem_object *obj)
 
 const struct drm_i915_gem_object_ops i915_gem_shmem_ops = {
 	.name = "i915_gem_object_shmem",
-	.flags = I915_GEM_OBJECT_HAS_STRUCT_PAGE |
-		 I915_GEM_OBJECT_IS_SHRINKABLE,
+	.flags = I915_GEM_OBJECT_IS_SHRINKABLE,
 
 	.get_pages = shmem_get_pages,
 	.put_pages = shmem_put_pages,
@@ -496,7 +495,8 @@ create_shmem(struct intel_memory_region *mem,
 	mapping_set_gfp_mask(mapping, mask);
 	GEM_BUG_ON(!(mapping_gfp_mask(mapping) & __GFP_RECLAIM));
 
-	i915_gem_object_init(obj, &i915_gem_shmem_ops, &lock_class);
+	i915_gem_object_init(obj, &i915_gem_shmem_ops, &lock_class,
+			     I915_BO_ALLOC_STRUCT_PAGE);
 
 	obj->write_domain = I915_GEM_DOMAIN_CPU;
 	obj->read_domains = I915_GEM_DOMAIN_CPU;
@@ -520,7 +520,7 @@ create_shmem(struct intel_memory_region *mem,
 
 	i915_gem_object_set_cache_coherency(obj, cache_level);
 
-	i915_gem_object_init_memory_region(obj, mem, 0);
+	i915_gem_object_init_memory_region(obj, mem);
 
 	return obj;
 
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_stolen.c b/drivers/gpu/drm/i915/gem/i915_gem_stolen.c
index 29bffc6afcc1..5372b888ba01 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_stolen.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_stolen.c
@@ -636,7 +636,7 @@ __i915_gem_object_create_stolen(struct intel_memory_region *mem,
 		goto err;
 
 	drm_gem_private_object_init(&mem->i915->drm, &obj->base, stolen->size);
-	i915_gem_object_init(obj, &i915_gem_object_stolen_ops, &lock_class);
+	i915_gem_object_init(obj, &i915_gem_object_stolen_ops, &lock_class, 0);
 
 	obj->stolen = stolen;
 	obj->read_domains = I915_GEM_DOMAIN_CPU | I915_GEM_DOMAIN_GTT;
@@ -647,7 +647,7 @@ __i915_gem_object_create_stolen(struct intel_memory_region *mem,
 	if (err)
 		goto cleanup;
 
-	i915_gem_object_init_memory_region(obj, mem, 0);
+	i915_gem_object_init_memory_region(obj, mem);
 
 	return obj;
 
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_userptr.c b/drivers/gpu/drm/i915/gem/i915_gem_userptr.c
index f2eaed6aca3d..30edc5a0a54e 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_userptr.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_userptr.c
@@ -702,8 +702,7 @@ i915_gem_userptr_dmabuf_export(struct drm_i915_gem_object *obj)
 
 static const struct drm_i915_gem_object_ops i915_gem_userptr_ops = {
 	.name = "i915_gem_object_userptr",
-	.flags = I915_GEM_OBJECT_HAS_STRUCT_PAGE |
-		 I915_GEM_OBJECT_IS_SHRINKABLE |
+	.flags = I915_GEM_OBJECT_IS_SHRINKABLE |
 		 I915_GEM_OBJECT_NO_MMAP |
 		 I915_GEM_OBJECT_ASYNC_CANCEL,
 	.get_pages = i915_gem_userptr_get_pages,
@@ -810,7 +809,8 @@ i915_gem_userptr_ioctl(struct drm_device *dev,
 		return -ENOMEM;
 
 	drm_gem_private_object_init(dev, &obj->base, args->user_size);
-	i915_gem_object_init(obj, &i915_gem_userptr_ops, &lock_class);
+	i915_gem_object_init(obj, &i915_gem_userptr_ops, &lock_class,
+			     I915_BO_ALLOC_STRUCT_PAGE);
 	obj->read_domains = I915_GEM_DOMAIN_CPU;
 	obj->write_domain = I915_GEM_DOMAIN_CPU;
 	i915_gem_object_set_cache_coherency(obj, I915_CACHE_LLC);
diff --git a/drivers/gpu/drm/i915/gem/selftests/huge_gem_object.c b/drivers/gpu/drm/i915/gem/selftests/huge_gem_object.c
index a768ec61e966..dfad86d74dd0 100644
--- a/drivers/gpu/drm/i915/gem/selftests/huge_gem_object.c
+++ b/drivers/gpu/drm/i915/gem/selftests/huge_gem_object.c
@@ -89,7 +89,6 @@ static void huge_put_pages(struct drm_i915_gem_object *obj,
 
 static const struct drm_i915_gem_object_ops huge_ops = {
 	.name = "huge-gem",
-	.flags = I915_GEM_OBJECT_HAS_STRUCT_PAGE,
 	.get_pages = huge_get_pages,
 	.put_pages = huge_put_pages,
 };
@@ -115,7 +114,8 @@ huge_gem_object(struct drm_i915_private *i915,
 		return ERR_PTR(-ENOMEM);
 
 	drm_gem_private_object_init(&i915->drm, &obj->base, dma_size);
-	i915_gem_object_init(obj, &huge_ops, &lock_class);
+	i915_gem_object_init(obj, &huge_ops, &lock_class,
+			     I915_BO_ALLOC_STRUCT_PAGE);
 
 	obj->read_domains = I915_GEM_DOMAIN_CPU;
 	obj->write_domain = I915_GEM_DOMAIN_CPU;
diff --git a/drivers/gpu/drm/i915/gem/selftests/huge_pages.c b/drivers/gpu/drm/i915/gem/selftests/huge_pages.c
index 77a13527a7e6..709c63b9cfc4 100644
--- a/drivers/gpu/drm/i915/gem/selftests/huge_pages.c
+++ b/drivers/gpu/drm/i915/gem/selftests/huge_pages.c
@@ -140,8 +140,7 @@ static void put_huge_pages(struct drm_i915_gem_object *obj,
 
 static const struct drm_i915_gem_object_ops huge_page_ops = {
 	.name = "huge-gem",
-	.flags = I915_GEM_OBJECT_HAS_STRUCT_PAGE |
-		 I915_GEM_OBJECT_IS_SHRINKABLE,
+	.flags = I915_GEM_OBJECT_IS_SHRINKABLE,
 	.get_pages = get_huge_pages,
 	.put_pages = put_huge_pages,
 };
@@ -168,7 +167,8 @@ huge_pages_object(struct drm_i915_private *i915,
 		return ERR_PTR(-ENOMEM);
 
 	drm_gem_private_object_init(&i915->drm, &obj->base, size);
-	i915_gem_object_init(obj, &huge_page_ops, &lock_class);
+	i915_gem_object_init(obj, &huge_page_ops, &lock_class,
+			     I915_BO_ALLOC_STRUCT_PAGE);
 
 	i915_gem_object_set_volatile(obj);
 
@@ -319,9 +319,9 @@ fake_huge_pages_object(struct drm_i915_private *i915, u64 size, bool single)
 	drm_gem_private_object_init(&i915->drm, &obj->base, size);
 
 	if (single)
-		i915_gem_object_init(obj, &fake_ops_single, &lock_class);
+		i915_gem_object_init(obj, &fake_ops_single, &lock_class, 0);
 	else
-		i915_gem_object_init(obj, &fake_ops, &lock_class);
+		i915_gem_object_init(obj, &fake_ops, &lock_class, 0);
 
 	i915_gem_object_set_volatile(obj);
 
diff --git a/drivers/gpu/drm/i915/gem/selftests/i915_gem_mman.c b/drivers/gpu/drm/i915/gem/selftests/i915_gem_mman.c
index d27d87a678c8..3ac7628f3bc4 100644
--- a/drivers/gpu/drm/i915/gem/selftests/i915_gem_mman.c
+++ b/drivers/gpu/drm/i915/gem/selftests/i915_gem_mman.c
@@ -834,9 +834,8 @@ static bool can_mmap(struct drm_i915_gem_object *obj, enum i915_mmap_type type)
 		return false;
 
 	if (type != I915_MMAP_TYPE_GTT &&
-	    !i915_gem_object_type_has(obj,
-				      I915_GEM_OBJECT_HAS_STRUCT_PAGE |
-				      I915_GEM_OBJECT_HAS_IOMEM))
+	    !i915_gem_object_has_struct_page(obj) &&
+	    !i915_gem_object_type_has(obj, I915_GEM_OBJECT_HAS_IOMEM))
 		return false;
 
 	return true;
@@ -976,10 +975,8 @@ static const char *repr_mmap_type(enum i915_mmap_type type)
 
 static bool can_access(const struct drm_i915_gem_object *obj)
 {
-	unsigned int flags =
-		I915_GEM_OBJECT_HAS_STRUCT_PAGE | I915_GEM_OBJECT_HAS_IOMEM;
-
-	return i915_gem_object_type_has(obj, flags);
+	return i915_gem_object_has_struct_page(obj) ||
+	       i915_gem_object_type_has(obj, I915_GEM_OBJECT_HAS_IOMEM);
 }
 
 static int __igt_mmap_access(struct drm_i915_private *i915,
diff --git a/drivers/gpu/drm/i915/gem/selftests/i915_gem_phys.c b/drivers/gpu/drm/i915/gem/selftests/i915_gem_phys.c
index 8cee68c6a6dc..fb6a17701310 100644
--- a/drivers/gpu/drm/i915/gem/selftests/i915_gem_phys.c
+++ b/drivers/gpu/drm/i915/gem/selftests/i915_gem_phys.c
@@ -25,12 +25,24 @@ static int mock_phys_object(void *arg)
 		goto out;
 	}
 
+	if (!i915_gem_object_has_struct_page(obj)) {
+		err = -EINVAL;
+		pr_err("shmem has no struct page\n");
+		goto out_obj;
+	}
+
 	err = i915_gem_object_attach_phys(obj, PAGE_SIZE);
 	if (err) {
 		pr_err("i915_gem_object_attach_phys failed, err=%d\n", err);
 		goto out_obj;
 	}
 
+	if (i915_gem_object_has_struct_page(obj)) {
+		err = -EINVAL;
+		pr_err("shmem has a struct page\n");
+		goto out_obj;
+	}
+
 	if (obj->ops != &i915_gem_phys_ops) {
 		pr_err("i915_gem_object_attach_phys did not create a phys object\n");
 		err = -EINVAL;
diff --git a/drivers/gpu/drm/i915/gvt/dmabuf.c b/drivers/gpu/drm/i915/gvt/dmabuf.c
index c3eb3838fe88..d4f883f35b95 100644
--- a/drivers/gpu/drm/i915/gvt/dmabuf.c
+++ b/drivers/gpu/drm/i915/gvt/dmabuf.c
@@ -218,7 +218,7 @@ static struct drm_i915_gem_object *vgpu_create_gem(struct drm_device *dev,
 
 	drm_gem_private_object_init(dev, &obj->base,
 		roundup(info->size, PAGE_SIZE));
-	i915_gem_object_init(obj, &intel_vgpu_gem_ops, &lock_class);
+	i915_gem_object_init(obj, &intel_vgpu_gem_ops, &lock_class, 0);
 	i915_gem_object_set_readonly(obj);
 
 	obj->read_domains = I915_GEM_DOMAIN_GTT;
diff --git a/drivers/gpu/drm/i915/selftests/i915_gem_gtt.c b/drivers/gpu/drm/i915/selftests/i915_gem_gtt.c
index c53a222e3dec..2cfe99c79034 100644
--- a/drivers/gpu/drm/i915/selftests/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/selftests/i915_gem_gtt.c
@@ -120,7 +120,7 @@ fake_dma_object(struct drm_i915_private *i915, u64 size)
 		goto err;
 
 	drm_gem_private_object_init(&i915->drm, &obj->base, size);
-	i915_gem_object_init(obj, &fake_ops, &lock_class);
+	i915_gem_object_init(obj, &fake_ops, &lock_class, 0);
 
 	i915_gem_object_set_volatile(obj);
 
diff --git a/drivers/gpu/drm/i915/selftests/mock_region.c b/drivers/gpu/drm/i915/selftests/mock_region.c
index 979d96f27c43..b046bd1a9ad3 100644
--- a/drivers/gpu/drm/i915/selftests/mock_region.c
+++ b/drivers/gpu/drm/i915/selftests/mock_region.c
@@ -32,13 +32,13 @@ mock_object_create(struct intel_memory_region *mem,
 		return ERR_PTR(-ENOMEM);
 
 	drm_gem_private_object_init(&i915->drm, &obj->base, size);
-	i915_gem_object_init(obj, &mock_region_obj_ops, &lock_class);
+	i915_gem_object_init(obj, &mock_region_obj_ops, &lock_class, flags);
 
 	obj->read_domains = I915_GEM_DOMAIN_CPU | I915_GEM_DOMAIN_GTT;
 
 	i915_gem_object_set_cache_coherency(obj, I915_CACHE_NONE);
 
-	i915_gem_object_init_memory_region(obj, mem, flags);
+	i915_gem_object_init_memory_region(obj, mem);
 
 	return obj;
 }
-- 
2.26.2


* [RFC PATCH 017/162] drm/i915: Rework struct phys attachment handling
  2020-11-27 12:04 [RFC PATCH 000/162] DG1 + LMEM enabling Matthew Auld
                   ` (15 preceding siblings ...)
  2020-11-27 12:04 ` [RFC PATCH 016/162] drm/i915: Move HAS_STRUCT_PAGE to obj->flags Matthew Auld
@ 2020-11-27 12:04 ` Matthew Auld
  2020-11-27 12:04 ` [RFC PATCH 018/162] drm/i915: Convert i915_gem_object_attach_phys() to ww locking, v2 Matthew Auld
                   ` (144 subsequent siblings)
  161 siblings, 0 replies; 208+ messages in thread
From: Matthew Auld @ 2020-11-27 12:04 UTC (permalink / raw)
  To: intel-gfx; +Cc: dri-devel, Thomas Hellström

From: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>

Instead of creating a separate object type, make the changes within
the shmem type itself, clearing the struct page backing in place. This
ensures we can never race against another thread observing obj->ops
while we exchange it for a different set of function pointers.
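
To illustrate, a minimal user-space model of the flag-based dispatch
(names are ours, not the driver's): since obj->ops never changes, there
is no window in which another thread can observe a half-exchanged set
of function pointers.

#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-in for the GEM object: one type, one flag. */
struct gem_object {
	bool has_struct_page;	/* models I915_BO_ALLOC_STRUCT_PAGE */
};

static void put_pages_shmem(struct gem_object *obj) { puts("shmem path"); }
static void put_pages_phys(struct gem_object *obj) { puts("phys path"); }

/*
 * One ops table, dispatch on a flag: the vtable itself is immutable,
 * only the per-object flag changes under the object's locks.
 */
static void put_pages(struct gem_object *obj)
{
	if (obj->has_struct_page)
		put_pages_shmem(obj);
	else
		put_pages_phys(obj);
}

int main(void)
{
	struct gem_object obj = { .has_struct_page = true };

	put_pages(&obj);		/* shmem path */
	obj.has_struct_page = false;	/* attach_phys cleared the flag */
	put_pages(&obj);		/* phys path */
	return 0;
}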

Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
---
 drivers/gpu/drm/i915/gem/i915_gem_object.h    |   8 ++
 drivers/gpu/drm/i915/gem/i915_gem_phys.c      | 102 +++++++++---------
 drivers/gpu/drm/i915/gem/i915_gem_shmem.c     |  22 +++-
 .../drm/i915/gem/selftests/i915_gem_phys.c    |   6 --
 4 files changed, 78 insertions(+), 60 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object.h b/drivers/gpu/drm/i915/gem/i915_gem_object.h
index 16608bf7a4e9..e549b88693a2 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_object.h
+++ b/drivers/gpu/drm/i915/gem/i915_gem_object.h
@@ -37,7 +37,15 @@ void __i915_gem_object_release_shmem(struct drm_i915_gem_object *obj,
 				     struct sg_table *pages,
 				     bool needs_clflush);
 
+int i915_gem_object_pwrite_phys(struct drm_i915_gem_object *obj,
+				const struct drm_i915_gem_pwrite *args);
+int i915_gem_object_pread_phys(struct drm_i915_gem_object *obj,
+			       const struct drm_i915_gem_pread *args);
+
 int i915_gem_object_attach_phys(struct drm_i915_gem_object *obj, int align);
+void i915_gem_object_put_pages_phys(struct drm_i915_gem_object *obj,
+				    struct sg_table *pages);
+
 
 void i915_gem_flush_free_objects(struct drm_i915_private *i915);
 
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_phys.c b/drivers/gpu/drm/i915/gem/i915_gem_phys.c
index 965590d3a570..4bdd0429c08b 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_phys.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_phys.c
@@ -76,6 +76,8 @@ static int i915_gem_object_get_pages_phys(struct drm_i915_gem_object *obj)
 
 	intel_gt_chipset_flush(&to_i915(obj->base.dev)->gt);
 
+	/* We're no longer struct page backed */
+	obj->flags &= ~I915_BO_ALLOC_STRUCT_PAGE;
 	__i915_gem_object_set_pages(obj, st, sg->length);
 
 	return 0;
@@ -89,7 +91,7 @@ static int i915_gem_object_get_pages_phys(struct drm_i915_gem_object *obj)
 	return -ENOMEM;
 }
 
-static void
+void
 i915_gem_object_put_pages_phys(struct drm_i915_gem_object *obj,
 			       struct sg_table *pages)
 {
@@ -134,9 +136,8 @@ i915_gem_object_put_pages_phys(struct drm_i915_gem_object *obj,
 			  vaddr, dma);
 }
 
-static int
-phys_pwrite(struct drm_i915_gem_object *obj,
-	    const struct drm_i915_gem_pwrite *args)
+int i915_gem_object_pwrite_phys(struct drm_i915_gem_object *obj,
+				const struct drm_i915_gem_pwrite *args)
 {
 	void *vaddr = sg_page(obj->mm.pages->sgl) + args->offset;
 	char __user *user_data = u64_to_user_ptr(args->data_ptr);
@@ -165,9 +166,8 @@ phys_pwrite(struct drm_i915_gem_object *obj,
 	return 0;
 }
 
-static int
-phys_pread(struct drm_i915_gem_object *obj,
-	   const struct drm_i915_gem_pread *args)
+int i915_gem_object_pread_phys(struct drm_i915_gem_object *obj,
+			       const struct drm_i915_gem_pread *args)
 {
 	void *vaddr = sg_page(obj->mm.pages->sgl) + args->offset;
 	char __user *user_data = u64_to_user_ptr(args->data_ptr);
@@ -186,86 +186,82 @@ phys_pread(struct drm_i915_gem_object *obj,
 	return 0;
 }
 
-static void phys_release(struct drm_i915_gem_object *obj)
+static int i915_gem_object_shmem_to_phys(struct drm_i915_gem_object *obj)
 {
-	fput(obj->base.filp);
-}
+	struct sg_table *pages;
+	int err;
 
-static const struct drm_i915_gem_object_ops i915_gem_phys_ops = {
-	.name = "i915_gem_object_phys",
-	.get_pages = i915_gem_object_get_pages_phys,
-	.put_pages = i915_gem_object_put_pages_phys,
+	pages = __i915_gem_object_unset_pages(obj);
+
+	err = i915_gem_object_get_pages_phys(obj);
+	if (err)
+		goto err_xfer;
 
-	.pread  = phys_pread,
-	.pwrite = phys_pwrite,
+	/* Perma-pin (until release) the physical set of pages */
+	__i915_gem_object_pin_pages(obj);
 
-	.release = phys_release,
-};
+	if (!IS_ERR_OR_NULL(pages))
+		i915_gem_shmem_ops.put_pages(obj, pages);
+
+	i915_gem_object_release_memory_region(obj);
+	return 0;
+
+err_xfer:
+	if (!IS_ERR_OR_NULL(pages)) {
+		unsigned int sg_page_sizes = i915_sg_page_sizes(pages->sgl);
+
+		__i915_gem_object_set_pages(obj, pages, sg_page_sizes);
+	}
+	return err;
+}
 
 int i915_gem_object_attach_phys(struct drm_i915_gem_object *obj, int align)
 {
-	struct sg_table *pages;
 	int err;
 
 	if (align > obj->base.size)
 		return -EINVAL;
 
-	if (obj->ops == &i915_gem_phys_ops)
-		return 0;
-
 	if (obj->ops != &i915_gem_shmem_ops)
 		return -EINVAL;
 
+	if (!i915_gem_object_has_struct_page(obj))
+		return 0;
+
 	err = i915_gem_object_unbind(obj, I915_GEM_OBJECT_UNBIND_ACTIVE);
 	if (err)
 		return err;
 
 	mutex_lock_nested(&obj->mm.lock, I915_MM_GET_PAGES);
 
+	if (unlikely(!i915_gem_object_has_struct_page(obj)))
+		goto out;
+
 	if (obj->mm.madv != I915_MADV_WILLNEED) {
 		err = -EFAULT;
-		goto err_unlock;
+		goto out;
 	}
 
 	if (obj->mm.quirked) {
 		err = -EFAULT;
-		goto err_unlock;
+		goto out;
 	}
 
-	if (obj->mm.mapping) {
+	if (obj->mm.mapping || i915_gem_object_has_pinned_pages(obj)) {
 		err = -EBUSY;
-		goto err_unlock;
+		goto out;
 	}
 
-	pages = __i915_gem_object_unset_pages(obj);
-
-	obj->ops = &i915_gem_phys_ops;
-	obj->flags &= ~I915_BO_ALLOC_STRUCT_PAGE;
-
-	err = ____i915_gem_object_get_pages(obj);
-	if (err)
-		goto err_xfer;
-
-	/* Perma-pin (until release) the physical set of pages */
-	__i915_gem_object_pin_pages(obj);
-
-	if (!IS_ERR_OR_NULL(pages))
-		i915_gem_shmem_ops.put_pages(obj, pages);
-
-	i915_gem_object_release_memory_region(obj);
-
-	mutex_unlock(&obj->mm.lock);
-	return 0;
+	if (unlikely(obj->mm.madv != I915_MADV_WILLNEED)) {
+		drm_dbg(obj->base.dev,
+			"Attempting to obtain a purgeable object\n");
+		err = -EFAULT;
+		goto out;
+	}
 
-err_xfer:
-	obj->ops = &i915_gem_shmem_ops;
-	obj->flags |= I915_BO_ALLOC_STRUCT_PAGE;
-	if (!IS_ERR_OR_NULL(pages)) {
-		unsigned int sg_page_sizes = i915_sg_page_sizes(pages->sgl);
+	err = i915_gem_object_shmem_to_phys(obj);
 
-		__i915_gem_object_set_pages(obj, pages, sg_page_sizes);
-	}
-err_unlock:
+out:
 	mutex_unlock(&obj->mm.lock);
 	return err;
 }
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_shmem.c b/drivers/gpu/drm/i915/gem/i915_gem_shmem.c
index 31c617a1115f..d590e0c3bd00 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_shmem.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_shmem.c
@@ -303,6 +303,11 @@ shmem_put_pages(struct drm_i915_gem_object *obj, struct sg_table *pages)
 	struct pagevec pvec;
 	struct page *page;
 
+	if (unlikely(!i915_gem_object_has_struct_page(obj))) {
+		i915_gem_object_put_pages_phys(obj, pages);
+		return;
+	}
+
 	__i915_gem_object_release_shmem(obj, pages, true);
 
 	i915_gem_gtt_finish_pages(obj, pages);
@@ -343,6 +348,9 @@ shmem_pwrite(struct drm_i915_gem_object *obj,
 	/* Caller already validated user args */
 	GEM_BUG_ON(!access_ok(user_data, arg->size));
 
+	if (!i915_gem_object_has_struct_page(obj))
+		return i915_gem_object_pwrite_phys(obj, arg);
+
 	/*
 	 * Before we instantiate/pin the backing store for our use, we
 	 * can prepopulate the shmemfs filp efficiently using a write into
@@ -421,9 +429,20 @@ shmem_pwrite(struct drm_i915_gem_object *obj,
 	return 0;
 }
 
+static int
+shmem_pread(struct drm_i915_gem_object *obj,
+	    const struct drm_i915_gem_pread *arg)
+{
+	if (!i915_gem_object_has_struct_page(obj))
+		return i915_gem_object_pread_phys(obj, arg);
+
+	return -ENODEV;
+}
+
 static void shmem_release(struct drm_i915_gem_object *obj)
 {
-	i915_gem_object_release_memory_region(obj);
+	if (obj->flags & I915_BO_ALLOC_STRUCT_PAGE)
+		i915_gem_object_release_memory_region(obj);
 
 	fput(obj->base.filp);
 }
@@ -438,6 +457,7 @@ const struct drm_i915_gem_object_ops i915_gem_shmem_ops = {
 	.writeback = shmem_writeback,
 
 	.pwrite = shmem_pwrite,
+	.pread = shmem_pread,
 
 	.release = shmem_release,
 };
diff --git a/drivers/gpu/drm/i915/gem/selftests/i915_gem_phys.c b/drivers/gpu/drm/i915/gem/selftests/i915_gem_phys.c
index fb6a17701310..0cfa082047fe 100644
--- a/drivers/gpu/drm/i915/gem/selftests/i915_gem_phys.c
+++ b/drivers/gpu/drm/i915/gem/selftests/i915_gem_phys.c
@@ -38,12 +38,6 @@ static int mock_phys_object(void *arg)
 	}
 
 	if (i915_gem_object_has_struct_page(obj)) {
-		err = -EINVAL;
-		pr_err("shmem has a struct page\n");
-		goto out_obj;
-	}
-
-	if (obj->ops != &i915_gem_phys_ops) {
 		pr_err("i915_gem_object_attach_phys did not create a phys object\n");
 		err = -EINVAL;
 		goto out_obj;
-- 
2.26.2


* [RFC PATCH 018/162] drm/i915: Convert i915_gem_object_attach_phys() to ww locking, v2.
  2020-11-27 12:04 [RFC PATCH 000/162] DG1 + LMEM enabling Matthew Auld
                   ` (16 preceding siblings ...)
  2020-11-27 12:04 ` [RFC PATCH 017/162] drm/i915: Rework struct phys attachment handling Matthew Auld
@ 2020-11-27 12:04 ` Matthew Auld
  2020-11-27 12:04 ` [RFC PATCH 019/162] drm/i915: make lockdep slightly happier about execbuf Matthew Auld
                   ` (143 subsequent siblings)
  161 siblings, 0 replies; 208+ messages in thread
From: Matthew Auld @ 2020-11-27 12:04 UTC (permalink / raw)
  To: intel-gfx; +Cc: dri-devel, Thomas Hellström

From: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>

Simply add i915_gem_object_lock. We may start passing a ww context to
get_pages() in the future, but that won't be the case here: we override
shmem's get_pages() handling by calling i915_gem_object_get_pages_phys()
directly, so no ww context is needed.
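
As a rough model of the resulting lock nesting, with plain pthread
mutexes standing in for the dma-resv ww lock and obj->mm.lock (names
are illustrative): the object lock is taken first, the pages lock
second, and the error paths unwind in strict reverse order.

#include <pthread.h>

/* Illustrative stand-ins only; not the driver's types. */
struct obj_locks {
	pthread_mutex_t ww_lock;	/* models the dma-resv ww lock */
	pthread_mutex_t pages_lock;	/* models obj->mm.lock */
};

static int attach_phys_model(struct obj_locks *o)
{
	int err;

	err = pthread_mutex_lock(&o->ww_lock);	/* object lock, outer */
	if (err)
		return -err;

	err = pthread_mutex_lock(&o->pages_lock);	/* pages lock, inner */
	if (err) {
		pthread_mutex_unlock(&o->ww_lock);
		return -err;
	}

	/* ... migrate the shmem pages to the phys backing store ... */

	pthread_mutex_unlock(&o->pages_lock);
	pthread_mutex_unlock(&o->ww_lock);
	return 0;
}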

Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
---
 drivers/gpu/drm/i915/gem/i915_gem_object.h |  2 ++
 drivers/gpu/drm/i915/gem/i915_gem_phys.c   | 12 ++++++++++--
 drivers/gpu/drm/i915/gem/i915_gem_shmem.c  | 17 ++++++++++-------
 3 files changed, 22 insertions(+), 9 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object.h b/drivers/gpu/drm/i915/gem/i915_gem_object.h
index e549b88693a2..47da3aff2a79 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_object.h
+++ b/drivers/gpu/drm/i915/gem/i915_gem_object.h
@@ -43,6 +43,8 @@ int i915_gem_object_pread_phys(struct drm_i915_gem_object *obj,
 			       const struct drm_i915_gem_pread *args);
 
 int i915_gem_object_attach_phys(struct drm_i915_gem_object *obj, int align);
+void i915_gem_object_put_pages_shmem(struct drm_i915_gem_object *obj,
+				     struct sg_table *pages);
 void i915_gem_object_put_pages_phys(struct drm_i915_gem_object *obj,
 				    struct sg_table *pages);
 
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_phys.c b/drivers/gpu/drm/i915/gem/i915_gem_phys.c
index 4bdd0429c08b..144e4940eede 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_phys.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_phys.c
@@ -201,7 +201,7 @@ static int i915_gem_object_shmem_to_phys(struct drm_i915_gem_object *obj)
 	__i915_gem_object_pin_pages(obj);
 
 	if (!IS_ERR_OR_NULL(pages))
-		i915_gem_shmem_ops.put_pages(obj, pages);
+		i915_gem_object_put_pages_shmem(obj, pages);
 
 	i915_gem_object_release_memory_region(obj);
 	return 0;
@@ -232,7 +232,13 @@ int i915_gem_object_attach_phys(struct drm_i915_gem_object *obj, int align)
 	if (err)
 		return err;
 
-	mutex_lock_nested(&obj->mm.lock, I915_MM_GET_PAGES);
+	err = i915_gem_object_lock_interruptible(obj, NULL);
+	if (err)
+		return err;
+
+	err = mutex_lock_interruptible_nested(&obj->mm.lock, I915_MM_GET_PAGES);
+	if (err)
+		goto err_unlock;
 
 	if (unlikely(!i915_gem_object_has_struct_page(obj)))
 		goto out;
@@ -263,6 +269,8 @@ int i915_gem_object_attach_phys(struct drm_i915_gem_object *obj, int align)
 
 out:
 	mutex_unlock(&obj->mm.lock);
+err_unlock:
+	i915_gem_object_unlock(obj);
 	return err;
 }
 
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_shmem.c b/drivers/gpu/drm/i915/gem/i915_gem_shmem.c
index d590e0c3bd00..7a59fd1ea4e5 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_shmem.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_shmem.c
@@ -296,18 +296,12 @@ __i915_gem_object_release_shmem(struct drm_i915_gem_object *obj,
 	__start_cpu_write(obj);
 }
 
-static void
-shmem_put_pages(struct drm_i915_gem_object *obj, struct sg_table *pages)
+void i915_gem_object_put_pages_shmem(struct drm_i915_gem_object *obj, struct sg_table *pages)
 {
 	struct sgt_iter sgt_iter;
 	struct pagevec pvec;
 	struct page *page;
 
-	if (unlikely(!i915_gem_object_has_struct_page(obj))) {
-		i915_gem_object_put_pages_phys(obj, pages);
-		return;
-	}
-
 	__i915_gem_object_release_shmem(obj, pages, true);
 
 	i915_gem_gtt_finish_pages(obj, pages);
@@ -336,6 +330,15 @@ shmem_put_pages(struct drm_i915_gem_object *obj, struct sg_table *pages)
 	kfree(pages);
 }
 
+static void
+shmem_put_pages(struct drm_i915_gem_object *obj, struct sg_table *pages)
+{
+	if (likely(i915_gem_object_has_struct_page(obj)))
+		i915_gem_object_put_pages_shmem(obj, pages);
+	else
+		i915_gem_object_put_pages_phys(obj, pages);
+}
+
 static int
 shmem_pwrite(struct drm_i915_gem_object *obj,
 	     const struct drm_i915_gem_pwrite *arg)
-- 
2.26.2


* [RFC PATCH 019/162] drm/i915: make lockdep slightly happier about execbuf.
  2020-11-27 12:04 [RFC PATCH 000/162] DG1 + LMEM enabling Matthew Auld
                   ` (17 preceding siblings ...)
  2020-11-27 12:04 ` [RFC PATCH 018/162] drm/i915: Convert i915_gem_object_attach_phys() to ww locking, v2 Matthew Auld
@ 2020-11-27 12:04 ` Matthew Auld
  2020-11-27 12:04 ` [RFC PATCH 020/162] drm/i915: Disable userptr pread/pwrite support Matthew Auld
                   ` (142 subsequent siblings)
  161 siblings, 0 replies; 208+ messages in thread
From: Matthew Auld @ 2020-11-27 12:04 UTC (permalink / raw)
  To: intel-gfx; +Cc: dri-devel, Thomas Hellström

From: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>

As soon as we install fences, we should stop allocating memory
in order to prevent any potential deadlocks.

This is required later on, when we start adding support for
dma-fence annotations.
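
A toy model of the rule this enforces (illustrative names, not the
dma_resv API): every fence slot is reserved up front, while allocation
is still allowed; the publish step that installs fences can then
neither fail nor allocate.

#include <errno.h>
#include <stdlib.h>

/* Illustrative reservation object; not the kernel's struct dma_resv. */
struct resv {
	void **slots;
	unsigned int reserved, used;
};

/* Reservation phase: allocating is fine, nothing is committed yet. */
static int resv_reserve(struct resv *r, unsigned int count)
{
	void **slots = realloc(r->slots,
			       (r->reserved + count) * sizeof(*slots));
	if (!slots)
		return -ENOMEM;
	r->slots = slots;
	r->reserved += count;
	return 0;
}

/* Publish phase: never allocates, so it cannot deadlock against
 * memory reclaim once the fences are live. */
static void resv_add_fence(struct resv *r, void *fence)
{
	if (r->used < r->reserved)
		r->slots[r->used++] = fence;
}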

Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
---
 .../gpu/drm/i915/gem/i915_gem_execbuffer.c    | 24 ++++++++++++++-----
 drivers/gpu/drm/i915/i915_active.c            | 20 ++++++++--------
 drivers/gpu/drm/i915/i915_vma.c               |  8 ++++---
 drivers/gpu/drm/i915/i915_vma.h               |  3 +++
 4 files changed, 36 insertions(+), 19 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
index 568c8321dc3d..31e412e5c68a 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
@@ -49,11 +49,12 @@ enum {
 #define DBG_FORCE_RELOC 0 /* choose one of the above! */
 };
 
-#define __EXEC_OBJECT_HAS_PIN		BIT(31)
-#define __EXEC_OBJECT_HAS_FENCE		BIT(30)
-#define __EXEC_OBJECT_NEEDS_MAP		BIT(29)
-#define __EXEC_OBJECT_NEEDS_BIAS	BIT(28)
-#define __EXEC_OBJECT_INTERNAL_FLAGS	(~0u << 28) /* all of the above */
+/* __EXEC_OBJECT_NO_RESERVE is BIT(31), defined in i915_vma.h */
+#define __EXEC_OBJECT_HAS_PIN		BIT(30)
+#define __EXEC_OBJECT_HAS_FENCE		BIT(29)
+#define __EXEC_OBJECT_NEEDS_MAP		BIT(28)
+#define __EXEC_OBJECT_NEEDS_BIAS	BIT(27)
+#define __EXEC_OBJECT_INTERNAL_FLAGS	(~0u << 27) /* all of the above + */
 #define __EXEC_OBJECT_RESERVED (__EXEC_OBJECT_HAS_PIN | __EXEC_OBJECT_HAS_FENCE)
 
 #define __EXEC_HAS_RELOC	BIT(31)
@@ -929,6 +930,12 @@ static int eb_validate_vmas(struct i915_execbuffer *eb)
 			}
 		}
 
+		if (!(ev->flags & EXEC_OBJECT_WRITE)) {
+			err = dma_resv_reserve_shared(vma->resv, 1);
+			if (err)
+				return err;
+		}
+
 		GEM_BUG_ON(drm_mm_node_allocated(&vma->node) &&
 			   eb_vma_misplaced(&eb->exec[i], vma, ev->flags));
 	}
@@ -2194,7 +2201,8 @@ static int eb_move_to_gpu(struct i915_execbuffer *eb)
 		}
 
 		if (err == 0)
-			err = i915_vma_move_to_active(vma, eb->request, flags);
+			err = i915_vma_move_to_active(vma, eb->request,
+						      flags | __EXEC_OBJECT_NO_RESERVE);
 	}
 
 	if (unlikely(err))
@@ -2446,6 +2454,10 @@ static int eb_parse_pipeline(struct i915_execbuffer *eb,
 	if (err)
 		goto err_commit;
 
+	err = dma_resv_reserve_shared(shadow->resv, 1);
+	if (err)
+		goto err_commit;
+
 	/* Wait for all writes (and relocs) into the batch to complete */
 	err = i915_sw_fence_await_reservation(&pw->base.chain,
 					      pw->batch->resv, NULL, false,
diff --git a/drivers/gpu/drm/i915/i915_active.c b/drivers/gpu/drm/i915/i915_active.c
index 10a865f3dc09..6ba4f878ab0e 100644
--- a/drivers/gpu/drm/i915/i915_active.c
+++ b/drivers/gpu/drm/i915/i915_active.c
@@ -296,18 +296,13 @@ static struct active_node *__active_lookup(struct i915_active *ref, u64 idx)
 static struct i915_active_fence *
 active_instance(struct i915_active *ref, u64 idx)
 {
-	struct active_node *node, *prealloc;
+	struct active_node *node;
 	struct rb_node **p, *parent;
 
 	node = __active_lookup(ref, idx);
 	if (likely(node))
 		return &node->base;
 
-	/* Preallocate a replacement, just in case */
-	prealloc = kmem_cache_alloc(global.slab_cache, GFP_KERNEL);
-	if (!prealloc)
-		return NULL;
-
 	spin_lock_irq(&ref->tree_lock);
 	GEM_BUG_ON(i915_active_is_idle(ref));
 
@@ -317,10 +312,8 @@ active_instance(struct i915_active *ref, u64 idx)
 		parent = *p;
 
 		node = rb_entry(parent, struct active_node, node);
-		if (node->timeline == idx) {
-			kmem_cache_free(global.slab_cache, prealloc);
+		if (node->timeline == idx)
 			goto out;
-		}
 
 		if (node->timeline < idx)
 			p = &parent->rb_right;
@@ -328,7 +321,14 @@ active_instance(struct i915_active *ref, u64 idx)
 			p = &parent->rb_left;
 	}
 
-	node = prealloc;
+	/*
+	 * XXX: We should preallocate this before i915_active_ref() is ever
+	 *  called, but we cannot call into fs_reclaim() anyway, so use GFP_ATOMIC.
+	 */
+	node = kmem_cache_alloc(global.slab_cache, GFP_ATOMIC);
+	if (!node)
+		goto out;
+
 	__i915_active_fence_init(&node->base, NULL, node_retire);
 	node->ref = ref;
 	node->timeline = idx;
diff --git a/drivers/gpu/drm/i915/i915_vma.c b/drivers/gpu/drm/i915/i915_vma.c
index e07621825da9..5b1d78fa748e 100644
--- a/drivers/gpu/drm/i915/i915_vma.c
+++ b/drivers/gpu/drm/i915/i915_vma.c
@@ -1281,9 +1281,11 @@ int i915_vma_move_to_active(struct i915_vma *vma,
 		obj->write_domain = I915_GEM_DOMAIN_RENDER;
 		obj->read_domains = 0;
 	} else {
-		err = dma_resv_reserve_shared(vma->resv, 1);
-		if (unlikely(err))
-			return err;
+		if (!(flags & __EXEC_OBJECT_NO_RESERVE)) {
+			err = dma_resv_reserve_shared(vma->resv, 1);
+			if (unlikely(err))
+				return err;
+		}
 
 		dma_resv_add_shared_fence(vma->resv, &rq->fence);
 		obj->write_domain = 0;
diff --git a/drivers/gpu/drm/i915/i915_vma.h b/drivers/gpu/drm/i915/i915_vma.h
index 22387a361999..a2e7b58b70ca 100644
--- a/drivers/gpu/drm/i915/i915_vma.h
+++ b/drivers/gpu/drm/i915/i915_vma.h
@@ -52,6 +52,9 @@ static inline bool i915_vma_is_active(const struct i915_vma *vma)
 	return !i915_active_is_idle(&vma->active);
 }
 
+/* do not reserve memory to prevent deadlocks */
+#define __EXEC_OBJECT_NO_RESERVE BIT(31)
+
 int __must_check __i915_vma_move_to_active(struct i915_vma *vma,
 					   struct i915_request *rq);
 int __must_check i915_vma_move_to_active(struct i915_vma *vma,
-- 
2.26.2


* [RFC PATCH 020/162] drm/i915: Disable userptr pread/pwrite support.
  2020-11-27 12:04 [RFC PATCH 000/162] DG1 + LMEM enabling Matthew Auld
                   ` (18 preceding siblings ...)
  2020-11-27 12:04 ` [RFC PATCH 019/162] drm/i915: make lockdep slightly happier about execbuf Matthew Auld
@ 2020-11-27 12:04 ` Matthew Auld
  2020-11-27 12:04 ` [RFC PATCH 021/162] drm/i915: No longer allow exporting userptr through dma-buf Matthew Auld
                   ` (141 subsequent siblings)
  161 siblings, 0 replies; 208+ messages in thread
From: Matthew Auld @ 2020-11-27 12:04 UTC (permalink / raw)
  To: intel-gfx; +Cc: dri-devel, Thomas Hellström

From: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>

Userptr should not need the kernel for what amounts to a userspace
memcpy; userspace can simply call memcpy on its own pointer directly.

Specifically, disable i915_gem_pwrite_ioctl() and i915_gem_pread_ioctl().

This still needs an ack from the relevant userspace that it won't
break, but it should be fine.
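
For reference, the userspace-side replacement is trivial, since the
backing store of a userptr object is the application's own allocation
(GPU synchronisation, e.g. via gem_wait, remains the caller's
responsibility); a sketch:

#include <string.h>

/* What userspace does instead of pwrite on a userptr BO: write
 * straight through its own pointer. */
static void userptr_write(void *userptr_base, size_t offset,
			  const void *src, size_t len)
{
	memcpy((char *)userptr_base + offset, src, len);
}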

Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
---
 drivers/gpu/drm/i915/gem/i915_gem_userptr.c | 20 ++++++++++++++++++++
 1 file changed, 20 insertions(+)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_userptr.c b/drivers/gpu/drm/i915/gem/i915_gem_userptr.c
index 30edc5a0a54e..8c3d1eb2f96a 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_userptr.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_userptr.c
@@ -700,6 +700,24 @@ i915_gem_userptr_dmabuf_export(struct drm_i915_gem_object *obj)
 	return i915_gem_userptr_init__mmu_notifier(obj, 0);
 }
 
+static int
+i915_gem_userptr_pwrite(struct drm_i915_gem_object *obj,
+			const struct drm_i915_gem_pwrite *args)
+{
+	drm_dbg(obj->base.dev, "pwrite to userptr no longer allowed\n");
+
+	return -EINVAL;
+}
+
+static int
+i915_gem_userptr_pread(struct drm_i915_gem_object *obj,
+		       const struct drm_i915_gem_pread *args)
+{
+	drm_dbg(obj->base.dev, "pread from userptr no longer allowed\n");
+
+	return -EINVAL;
+}
+
 static const struct drm_i915_gem_object_ops i915_gem_userptr_ops = {
 	.name = "i915_gem_object_userptr",
 	.flags = I915_GEM_OBJECT_IS_SHRINKABLE |
@@ -708,6 +726,8 @@ static const struct drm_i915_gem_object_ops i915_gem_userptr_ops = {
 	.get_pages = i915_gem_userptr_get_pages,
 	.put_pages = i915_gem_userptr_put_pages,
 	.dmabuf_export = i915_gem_userptr_dmabuf_export,
+	.pwrite = i915_gem_userptr_pwrite,
+	.pread = i915_gem_userptr_pread,
 	.release = i915_gem_userptr_release,
 };
 
-- 
2.26.2


* [RFC PATCH 021/162] drm/i915: No longer allow exporting userptr through dma-buf
  2020-11-27 12:04 [RFC PATCH 000/162] DG1 + LMEM enabling Matthew Auld
                   ` (19 preceding siblings ...)
  2020-11-27 12:04 ` [RFC PATCH 020/162] drm/i915: Disable userptr pread/pwrite support Matthew Auld
@ 2020-11-27 12:04 ` Matthew Auld
  2020-11-27 12:04 ` [RFC PATCH 022/162] drm/i915: Reject more ioctls for userptr Matthew Auld
                   ` (140 subsequent siblings)
  161 siblings, 0 replies; 208+ messages in thread
From: Matthew Auld @ 2020-11-27 12:04 UTC (permalink / raw)
  To: intel-gfx; +Cc: dri-devel, Thomas Hellström

From: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>

It doesn't make sense to export a raw memory address; the userptr
rework will prevent this kind of access across address spaces anyway,
so it is best to explicitly disable it now.

Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
---
 drivers/gpu/drm/i915/gem/i915_gem_userptr.c | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_userptr.c b/drivers/gpu/drm/i915/gem/i915_gem_userptr.c
index 8c3d1eb2f96a..44af6265948d 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_userptr.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_userptr.c
@@ -694,10 +694,9 @@ i915_gem_userptr_release(struct drm_i915_gem_object *obj)
 static int
 i915_gem_userptr_dmabuf_export(struct drm_i915_gem_object *obj)
 {
-	if (obj->userptr.mmu_object)
-		return 0;
+	drm_dbg(obj->base.dev, "Exporting userptr no longer allowed\n");
 
-	return i915_gem_userptr_init__mmu_notifier(obj, 0);
+	return -EINVAL;
 }
 
 static int
-- 
2.26.2


* [RFC PATCH 022/162] drm/i915: Reject more ioctls for userptr
  2020-11-27 12:04 [RFC PATCH 000/162] DG1 + LMEM enabling Matthew Auld
                   ` (20 preceding siblings ...)
  2020-11-27 12:04 ` [RFC PATCH 021/162] drm/i915: No longer allow exporting userptr through dma-buf Matthew Auld
@ 2020-11-27 12:04 ` Matthew Auld
  2020-11-27 12:04 ` [RFC PATCH 023/162] drm/i915: Reject UNSYNCHRONIZED for userptr, v2 Matthew Auld
                   ` (139 subsequent siblings)
  161 siblings, 0 replies; 208+ messages in thread
From: Matthew Auld @ 2020-11-27 12:04 UTC (permalink / raw)
  To: intel-gfx; +Cc: dri-devel, Thomas Hellström

From: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>

There are a couple of ioctls related to tiling and cache placement
that make no sense for userptr; reject those:
- i915_gem_set_tiling_ioctl()
    Tiling should always be linear for userptr. Changing placement will
    fail with -ENXIO.
- i915_gem_set_caching_ioctl()
    Userptr memory should always be cached. Changing will fail with
    -ENXIO.
- i915_gem_set_domain_ioctl()
    Changed to be equivalent to gem_wait, which is correct for the
    cached linear userptr pointers. This is required because we
    cannot grab a reference to the pages in the rework, but waiting
    for idle will do the same.
This still needs an ack from the relevant userspace that it won't
break, but it should be fine; a sketch of the userspace-visible
behaviour follows below.
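
A sketch of probing the new behaviour from userspace, assuming an open
DRM fd and a userptr BO handle (the helper name is ours; the ioctl and
struct are the stock i915 uAPI):

#include <errno.h>
#include <sys/ioctl.h>
#include <drm/i915_drm.h>

/* Expected to fail with -ENXIO on a userptr handle after this patch. */
static int try_set_caching(int fd, unsigned int handle)
{
	struct drm_i915_gem_caching arg = {
		.handle = handle,
		.caching = I915_CACHING_NONE,
	};

	if (ioctl(fd, DRM_IOCTL_I915_GEM_SET_CACHING, &arg) == -1)
		return -errno;
	return 0;
}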

Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
---
 drivers/gpu/drm/i915/display/intel_display.c | 2 +-
 drivers/gpu/drm/i915/gem/i915_gem_domain.c   | 4 +++-
 drivers/gpu/drm/i915/gem/i915_gem_object.h   | 6 ++++++
 drivers/gpu/drm/i915/gem/i915_gem_userptr.c  | 3 ++-
 4 files changed, 12 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/i915/display/intel_display.c b/drivers/gpu/drm/i915/display/intel_display.c
index ba26545392bc..f36921a3c4bc 100644
--- a/drivers/gpu/drm/i915/display/intel_display.c
+++ b/drivers/gpu/drm/i915/display/intel_display.c
@@ -17854,7 +17854,7 @@ static int intel_user_framebuffer_create_handle(struct drm_framebuffer *fb,
 	struct drm_i915_gem_object *obj = intel_fb_obj(fb);
 	struct drm_i915_private *i915 = to_i915(obj->base.dev);
 
-	if (obj->userptr.mm) {
+	if (i915_gem_object_is_userptr(obj)) {
 		drm_dbg(&i915->drm,
 			"attempting to use a userptr for a framebuffer, denied\n");
 		return -EINVAL;
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_domain.c b/drivers/gpu/drm/i915/gem/i915_gem_domain.c
index fcce6909f201..c1d4bf62b3ea 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_domain.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_domain.c
@@ -528,7 +528,9 @@ i915_gem_set_domain_ioctl(struct drm_device *dev, void *data,
 	 * considered to be outside of any cache domain.
 	 */
 	if (i915_gem_object_is_proxy(obj)) {
-		err = -ENXIO;
+		/* silently allow userptr to complete */
+		if (!i915_gem_object_is_userptr(obj))
+			err = -ENXIO;
 		goto out;
 	}
 
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object.h b/drivers/gpu/drm/i915/gem/i915_gem_object.h
index 47da3aff2a79..95907b8eb4c4 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_object.h
+++ b/drivers/gpu/drm/i915/gem/i915_gem_object.h
@@ -551,6 +551,12 @@ void __i915_gem_object_flush_frontbuffer(struct drm_i915_gem_object *obj,
 void __i915_gem_object_invalidate_frontbuffer(struct drm_i915_gem_object *obj,
 					      enum fb_op_origin origin);
 
+static inline bool
+i915_gem_object_is_userptr(struct drm_i915_gem_object *obj)
+{
+	return obj->userptr.mm;
+}
+
 static inline void
 i915_gem_object_flush_frontbuffer(struct drm_i915_gem_object *obj,
 				  enum fb_op_origin origin)
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_userptr.c b/drivers/gpu/drm/i915/gem/i915_gem_userptr.c
index 44af6265948d..64a946d5f753 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_userptr.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_userptr.c
@@ -721,7 +721,8 @@ static const struct drm_i915_gem_object_ops i915_gem_userptr_ops = {
 	.name = "i915_gem_object_userptr",
 	.flags = I915_GEM_OBJECT_IS_SHRINKABLE |
 		 I915_GEM_OBJECT_NO_MMAP |
-		 I915_GEM_OBJECT_ASYNC_CANCEL,
+		 I915_GEM_OBJECT_ASYNC_CANCEL |
+		 I915_GEM_OBJECT_IS_PROXY,
 	.get_pages = i915_gem_userptr_get_pages,
 	.put_pages = i915_gem_userptr_put_pages,
 	.dmabuf_export = i915_gem_userptr_dmabuf_export,
-- 
2.26.2


* [RFC PATCH 023/162] drm/i915: Reject UNSYNCHRONIZED for userptr, v2.
  2020-11-27 12:04 [RFC PATCH 000/162] DG1 + LMEM enabling Matthew Auld
                   ` (21 preceding siblings ...)
  2020-11-27 12:04 ` [RFC PATCH 022/162] drm/i915: Reject more ioctls for userptr Matthew Auld
@ 2020-11-27 12:04 ` Matthew Auld
  2020-11-27 12:05 ` [RFC PATCH 024/162] drm/i915: Make compilation of userptr code depend on MMU_NOTIFIER Matthew Auld
                   ` (138 subsequent siblings)
  161 siblings, 0 replies; 208+ messages in thread
From: Matthew Auld @ 2020-11-27 12:04 UTC (permalink / raw)
  To: intel-gfx; +Cc: dri-devel, Thomas Hellström

From: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>

We should not allow this any more, as it will break with the new
userptr implementation. It could still be made to work, but there is
no point in doing so.

Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
---
 drivers/gpu/drm/i915/gem/i915_gem_userptr.c | 10 ++--------
 1 file changed, 2 insertions(+), 8 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_userptr.c b/drivers/gpu/drm/i915/gem/i915_gem_userptr.c
index 64a946d5f753..241f865077b9 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_userptr.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_userptr.c
@@ -224,7 +224,7 @@ i915_gem_userptr_init__mmu_notifier(struct drm_i915_gem_object *obj,
 	struct i915_mmu_object *mo;
 
 	if (flags & I915_USERPTR_UNSYNCHRONIZED)
-		return capable(CAP_SYS_ADMIN) ? 0 : -EPERM;
+		return -ENODEV;
 
 	if (GEM_WARN_ON(!obj->userptr.mm))
 		return -EINVAL;
@@ -274,13 +274,7 @@ static int
 i915_gem_userptr_init__mmu_notifier(struct drm_i915_gem_object *obj,
 				    unsigned flags)
 {
-	if ((flags & I915_USERPTR_UNSYNCHRONIZED) == 0)
-		return -ENODEV;
-
-	if (!capable(CAP_SYS_ADMIN))
-		return -EPERM;
-
-	return 0;
+	return -ENODEV;
 }
 
 static void
-- 
2.26.2


* [RFC PATCH 024/162] drm/i915: Make compilation of userptr code depend on MMU_NOTIFIER.
  2020-11-27 12:04 [RFC PATCH 000/162] DG1 + LMEM enabling Matthew Auld
                   ` (22 preceding siblings ...)
  2020-11-27 12:04 ` [RFC PATCH 023/162] drm/i915: Reject UNSYNCHRONIZED for userptr, v2 Matthew Auld
@ 2020-11-27 12:05 ` Matthew Auld
  2020-11-27 12:05 ` [RFC PATCH 025/162] drm/i915: Fix userptr so we do not have to worry about obj->mm.lock, v5 Matthew Auld
                   ` (137 subsequent siblings)
  161 siblings, 0 replies; 208+ messages in thread
From: Matthew Auld @ 2020-11-27 12:05 UTC (permalink / raw)
  To: intel-gfx; +Cc: dri-devel, Thomas Hellström

From: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>

Now that unsynchronized mappings are removed, the only time userptr
works is when the MMU notifier is enabled. Put all of the userptr
code behind an MMU notifier ifdef.
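
The pattern, sketched below with an illustrative prototype, is to
build the real implementation only under CONFIG_MMU_NOTIFIER and
provide trivial inline stubs otherwise, so that most callers need no
ifdefs of their own; the one explicit ifdef left in the execbuffer
workqueue flush is removed again by the next patch.

#include <stdbool.h>

struct drm_i915_gem_object;	/* opaque in this sketch */

#ifdef CONFIG_MMU_NOTIFIER
/* Real implementation compiled only with notifier support. */
bool i915_gem_object_is_userptr(struct drm_i915_gem_object *obj);
#else
/* Stub: without the notifier, userptr objects cannot exist at all. */
static inline bool
i915_gem_object_is_userptr(struct drm_i915_gem_object *obj)
{
	return false;
}
#endif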

Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
---
 .../gpu/drm/i915/gem/i915_gem_execbuffer.c    |  2 +
 drivers/gpu/drm/i915/gem/i915_gem_object.h    |  4 ++
 .../gpu/drm/i915/gem/i915_gem_object_types.h  |  2 +
 drivers/gpu/drm/i915/gem/i915_gem_userptr.c   | 58 +++++++------------
 drivers/gpu/drm/i915/i915_drv.h               |  2 +
 5 files changed, 31 insertions(+), 37 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
index 31e412e5c68a..064285a5009b 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
@@ -1970,8 +1970,10 @@ static noinline int eb_relocate_parse_slow(struct i915_execbuffer *eb,
 		err = 0;
 	}
 
+#ifdef CONFIG_MMU_NOTIFIER
 	if (!err)
 		flush_workqueue(eb->i915->mm.userptr_wq);
+#endif
 
 err_relock:
 	i915_gem_ww_ctx_init(&eb->ww, true);
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object.h b/drivers/gpu/drm/i915/gem/i915_gem_object.h
index 95907b8eb4c4..7b3a84f98b42 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_object.h
+++ b/drivers/gpu/drm/i915/gem/i915_gem_object.h
@@ -554,7 +554,11 @@ void __i915_gem_object_invalidate_frontbuffer(struct drm_i915_gem_object *obj,
 static inline bool
 i915_gem_object_is_userptr(struct drm_i915_gem_object *obj)
 {
+#ifdef CONFIG_MMU_NOTIFIER
 	return obj->userptr.mm;
+#else
+	return false;
+#endif
 }
 
 static inline void
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object_types.h b/drivers/gpu/drm/i915/gem/i915_gem_object_types.h
index b53e44b06b09..6d3f451c15c6 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_object_types.h
+++ b/drivers/gpu/drm/i915/gem/i915_gem_object_types.h
@@ -289,6 +289,7 @@ struct drm_i915_gem_object {
 	unsigned long *bit_17;
 
 	union {
+#ifdef CONFIG_MMU_NOTIFIER
 		struct i915_gem_userptr {
 			uintptr_t ptr;
 
@@ -296,6 +297,7 @@ struct drm_i915_gem_object {
 			struct i915_mmu_object *mmu_object;
 			struct work_struct *work;
 		} userptr;
+#endif
 
 		unsigned long scratch;
 		u64 encode;
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_userptr.c b/drivers/gpu/drm/i915/gem/i915_gem_userptr.c
index 241f865077b9..1183b28c084b 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_userptr.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_userptr.c
@@ -15,6 +15,8 @@
 #include "i915_gem_object.h"
 #include "i915_scatterlist.h"
 
+#if defined(CONFIG_MMU_NOTIFIER)
+
 struct i915_mm_struct {
 	struct mm_struct *mm;
 	struct drm_i915_private *i915;
@@ -24,7 +26,6 @@ struct i915_mm_struct {
 	struct rcu_work work;
 };
 
-#if defined(CONFIG_MMU_NOTIFIER)
 #include <linux/interval_tree.h>
 
 struct i915_mmu_notifier {
@@ -217,15 +218,11 @@ i915_mmu_notifier_find(struct i915_mm_struct *mm)
 }
 
 static int
-i915_gem_userptr_init__mmu_notifier(struct drm_i915_gem_object *obj,
-				    unsigned flags)
+i915_gem_userptr_init__mmu_notifier(struct drm_i915_gem_object *obj)
 {
 	struct i915_mmu_notifier *mn;
 	struct i915_mmu_object *mo;
 
-	if (flags & I915_USERPTR_UNSYNCHRONIZED)
-		return -ENODEV;
-
 	if (GEM_WARN_ON(!obj->userptr.mm))
 		return -EINVAL;
 
@@ -258,32 +255,6 @@ i915_mmu_notifier_free(struct i915_mmu_notifier *mn,
 	kfree(mn);
 }
 
-#else
-
-static void
-__i915_gem_userptr_set_active(struct drm_i915_gem_object *obj, bool value)
-{
-}
-
-static void
-i915_gem_userptr_release__mmu_notifier(struct drm_i915_gem_object *obj)
-{
-}
-
-static int
-i915_gem_userptr_init__mmu_notifier(struct drm_i915_gem_object *obj,
-				    unsigned flags)
-{
-	return -ENODEV;
-}
-
-static void
-i915_mmu_notifier_free(struct i915_mmu_notifier *mn,
-		       struct mm_struct *mm)
-{
-}
-
-#endif
 
 static struct i915_mm_struct *
 __i915_mm_struct_find(struct drm_i915_private *i915, struct mm_struct *real)
@@ -725,6 +696,8 @@ static const struct drm_i915_gem_object_ops i915_gem_userptr_ops = {
 	.release = i915_gem_userptr_release,
 };
 
+#endif
+
 /*
  * Creates a new mm object that wraps some normal memory from the process
  * context - user memory.
@@ -765,12 +738,12 @@ i915_gem_userptr_ioctl(struct drm_device *dev,
 		       void *data,
 		       struct drm_file *file)
 {
-	static struct lock_class_key lock_class;
+	static struct lock_class_key __maybe_unused lock_class;
 	struct drm_i915_private *dev_priv = to_i915(dev);
 	struct drm_i915_gem_userptr *args = data;
-	struct drm_i915_gem_object *obj;
-	int ret;
-	u32 handle;
+	struct drm_i915_gem_object __maybe_unused *obj;
+	int __maybe_unused ret;
+	u32 __maybe_unused handle;
 
 	if (!HAS_LLC(dev_priv) && !HAS_SNOOP(dev_priv)) {
 		/* We cannot support coherent userptr objects on hw without
@@ -809,6 +782,9 @@ i915_gem_userptr_ioctl(struct drm_device *dev,
 	if (!access_ok((char __user *)(unsigned long)args->user_ptr, args->user_size))
 		return -EFAULT;
 
+	if (args->flags & I915_USERPTR_UNSYNCHRONIZED)
+		return -ENODEV;
+
 	if (args->flags & I915_USERPTR_READ_ONLY) {
 		/*
 		 * On almost all of the older hw, we cannot tell the GPU that
@@ -818,6 +794,7 @@ i915_gem_userptr_ioctl(struct drm_device *dev,
 			return -ENODEV;
 	}
 
+#ifdef CONFIG_MMU_NOTIFIER
 	obj = i915_gem_object_alloc();
 	if (obj == NULL)
 		return -ENOMEM;
@@ -839,7 +816,7 @@ i915_gem_userptr_ioctl(struct drm_device *dev,
 	 */
 	ret = i915_gem_userptr_init__mm_struct(obj);
 	if (ret == 0)
-		ret = i915_gem_userptr_init__mmu_notifier(obj, args->flags);
+		ret = i915_gem_userptr_init__mmu_notifier(obj);
 	if (ret == 0)
 		ret = drm_gem_handle_create(file, &obj->base, &handle);
 
@@ -850,10 +827,14 @@ i915_gem_userptr_ioctl(struct drm_device *dev,
 
 	args->handle = handle;
 	return 0;
+#else
+	return -ENODEV;
+#endif
 }
 
 int i915_gem_init_userptr(struct drm_i915_private *dev_priv)
 {
+#ifdef CONFIG_MMU_NOTIFIER
 	spin_lock_init(&dev_priv->mm_lock);
 	hash_init(dev_priv->mm_structs);
 
@@ -863,11 +844,14 @@ int i915_gem_init_userptr(struct drm_i915_private *dev_priv)
 				0);
 	if (!dev_priv->mm.userptr_wq)
 		return -ENOMEM;
+#endif
 
 	return 0;
 }
 
 void i915_gem_cleanup_userptr(struct drm_i915_private *dev_priv)
 {
+#ifdef CONFIG_MMU_NOTIFIER
 	destroy_workqueue(dev_priv->mm.userptr_wq);
+#endif
 }
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 84182a40e777..d3c67e17cd02 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -589,12 +589,14 @@ struct i915_gem_mm {
 	struct notifier_block vmap_notifier;
 	struct shrinker shrinker;
 
+#ifdef CONFIG_MMU_NOTIFIER
 	/**
 	 * Workqueue to fault in userptr pages, flushed by the execbuf
 	 * when required but otherwise left to userspace to try again
 	 * on EAGAIN.
 	 */
 	struct workqueue_struct *userptr_wq;
+#endif
 
 	/* shrinker accounting, also useful for userland debugging */
 	u64 shrink_memory;
-- 
2.26.2


* [RFC PATCH 025/162] drm/i915: Fix userptr so we do not have to worry about obj->mm.lock, v5.
  2020-11-27 12:04 [RFC PATCH 000/162] DG1 + LMEM enabling Matthew Auld
                   ` (23 preceding siblings ...)
  2020-11-27 12:05 ` [RFC PATCH 024/162] drm/i915: Make compilation of userptr code depend on MMU_NOTIFIER Matthew Auld
@ 2020-11-27 12:05 ` Matthew Auld
  2020-11-27 12:05 ` [RFC PATCH 026/162] drm/i915: Flatten obj->mm.lock Matthew Auld
                   ` (136 subsequent siblings)
  161 siblings, 0 replies; 208+ messages in thread
From: Matthew Auld @ 2020-11-27 12:05 UTC (permalink / raw)
  To: intel-gfx; +Cc: dri-devel, Thomas Hellström

From: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>

Instead of the current approach, which will never work with
PROVE_LOCKING, do the same as AMD does, similar to the relocation
slowpath: while all locks are dropped, we acquire the pages for
pinning; once the locks are taken again, we transfer those pages to
the bo in .get_pages(). As a final check before installing the fences,
we verify that the mmu notifier was not invoked in the meantime; if it
was, we return -EAGAIN to userspace to signal that it has to start over.
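
The retry protocol can be modelled in isolation as below; the struct
and helper names are ours, while the real code uses
mmu_interval_notifier sequence numbers and the
i915_gem_object_userptr_submit_init()/_done() pair visible in the diff.

#include <errno.h>

/* Toy single-threaded model of the seq-check protocol. */
struct userptr_model {
	unsigned long notifier_seq;	/* bumped by the mmu notifier */
	unsigned long submit_seq;	/* sampled while unlocked */
};

/* Unlocked phase: sample the sequence, then pin the pages. */
static void submit_init(struct userptr_model *u)
{
	u->submit_seq = u->notifier_seq;
	/* ... get_user_pages() and build the sg_table here ... */
}

/* Locked phase, just before the fences are installed: if the
 * notifier fired in between, the pinned pages are stale and the
 * whole submission must be retried. */
static int submit_done(const struct userptr_model *u)
{
	if (u->submit_seq != u->notifier_seq)
		return -EAGAIN;	/* execbuf/userspace starts over */
	return 0;
}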

Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
---
 .../gpu/drm/i915/gem/i915_gem_execbuffer.c    | 101 ++-
 drivers/gpu/drm/i915/gem/i915_gem_object.h    |  35 +-
 .../gpu/drm/i915/gem/i915_gem_object_types.h  |  10 +-
 drivers/gpu/drm/i915/gem/i915_gem_pages.c     |   2 +-
 drivers/gpu/drm/i915/gem/i915_gem_userptr.c   | 764 ++++++------------
 drivers/gpu/drm/i915/i915_drv.h               |   9 +-
 drivers/gpu/drm/i915/i915_gem.c               |   5 +-
 7 files changed, 344 insertions(+), 582 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
index 064285a5009b..f5ea49e244ca 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
@@ -52,14 +52,16 @@ enum {
 /* __EXEC_OBJECT_NO_RESERVE is BIT(31), defined in i915_vma.h */
 #define __EXEC_OBJECT_HAS_PIN		BIT(30)
 #define __EXEC_OBJECT_HAS_FENCE		BIT(29)
-#define __EXEC_OBJECT_NEEDS_MAP		BIT(28)
-#define __EXEC_OBJECT_NEEDS_BIAS	BIT(27)
-#define __EXEC_OBJECT_INTERNAL_FLAGS	(~0u << 27) /* all of the above + */
+#define __EXEC_OBJECT_USERPTR_INIT	BIT(28)
+#define __EXEC_OBJECT_NEEDS_MAP		BIT(27)
+#define __EXEC_OBJECT_NEEDS_BIAS	BIT(26)
+#define __EXEC_OBJECT_INTERNAL_FLAGS	(~0u << 26) /* all of the above + */
 #define __EXEC_OBJECT_RESERVED (__EXEC_OBJECT_HAS_PIN | __EXEC_OBJECT_HAS_FENCE)
 
 #define __EXEC_HAS_RELOC	BIT(31)
 #define __EXEC_ENGINE_PINNED	BIT(30)
-#define __EXEC_INTERNAL_FLAGS	(~0u << 30)
+#define __EXEC_USERPTR_USED	BIT(29)
+#define __EXEC_INTERNAL_FLAGS	(~0u << 29)
 #define UPDATE			PIN_OFFSET_FIXED
 
 #define BATCH_OFFSET_BIAS (256*1024)
@@ -865,6 +867,26 @@ static int eb_lookup_vmas(struct i915_execbuffer *eb)
 		}
 
 		eb_add_vma(eb, i, batch, vma);
+
+		if (i915_gem_object_is_userptr(vma->obj)) {
+			err = i915_gem_object_userptr_submit_init(vma->obj);
+			if (err) {
+				if (i + 1 < eb->buffer_count) {
+					/*
+					 * Execbuffer code expects last vma entry to be NULL,
+					 * since we already initialized this entry,
+					 * set the next value to NULL or we mess up
+					 * cleanup handling.
+					 */
+					eb->vma[i + 1].vma = NULL;
+				}
+
+				return err;
+			}
+
+			eb->vma[i].flags |= __EXEC_OBJECT_USERPTR_INIT;
+			eb->args->flags |= __EXEC_USERPTR_USED;
+		}
 	}
 
 	if (unlikely(eb->batch->flags & EXEC_OBJECT_WRITE)) {
@@ -966,7 +988,7 @@ eb_get_vma(const struct i915_execbuffer *eb, unsigned long handle)
 	}
 }
 
-static void eb_release_vmas(struct i915_execbuffer *eb, bool final)
+static void eb_release_vmas(struct i915_execbuffer *eb, bool final, bool release_userptr)
 {
 	const unsigned int count = eb->buffer_count;
 	unsigned int i;
@@ -980,6 +1002,11 @@ static void eb_release_vmas(struct i915_execbuffer *eb, bool final)
 
 		eb_unreserve_vma(ev);
 
+		if (release_userptr && ev->flags & __EXEC_OBJECT_USERPTR_INIT) {
+			ev->flags &= ~__EXEC_OBJECT_USERPTR_INIT;
+			i915_gem_object_userptr_submit_fini(vma->obj);
+		}
+
 		if (final)
 			i915_vma_put(vma);
 	}
@@ -1915,6 +1942,31 @@ static int eb_prefault_relocations(const struct i915_execbuffer *eb)
 	return 0;
 }
 
+static int eb_reinit_userptr(struct i915_execbuffer *eb)
+{
+	const unsigned int count = eb->buffer_count;
+	unsigned int i;
+	int ret;
+
+	if (likely(!(eb->args->flags & __EXEC_USERPTR_USED)))
+		return 0;
+
+	for (i = 0; i < count; i++) {
+		struct eb_vma *ev = &eb->vma[i];
+
+		if (!i915_gem_object_is_userptr(ev->vma->obj))
+			continue;
+
+		ret = i915_gem_object_userptr_submit_init(ev->vma->obj);
+		if (ret)
+			return ret;
+
+		ev->flags |= __EXEC_OBJECT_USERPTR_INIT;
+	}
+
+	return 0;
+}
+
 static noinline int eb_relocate_parse_slow(struct i915_execbuffer *eb,
 					   struct i915_request *rq)
 {
@@ -1929,7 +1981,7 @@ static noinline int eb_relocate_parse_slow(struct i915_execbuffer *eb,
 	}
 
 	/* We may process another execbuffer during the unlock... */
-	eb_release_vmas(eb, false);
+	eb_release_vmas(eb, false, true);
 	i915_gem_ww_ctx_fini(&eb->ww);
 
 	if (rq) {
@@ -1970,10 +2022,8 @@ static noinline int eb_relocate_parse_slow(struct i915_execbuffer *eb,
 		err = 0;
 	}
 
-#ifdef CONFIG_MMU_NOTIFIER
 	if (!err)
-		flush_workqueue(eb->i915->mm.userptr_wq);
-#endif
+		err = eb_reinit_userptr(eb);
 
 err_relock:
 	i915_gem_ww_ctx_init(&eb->ww, true);
@@ -2035,7 +2085,7 @@ static noinline int eb_relocate_parse_slow(struct i915_execbuffer *eb,
 
 err:
 	if (err == -EDEADLK) {
-		eb_release_vmas(eb, false);
+		eb_release_vmas(eb, false, false);
 		err = i915_gem_ww_ctx_backoff(&eb->ww);
 		if (!err)
 			goto repeat_validate;
@@ -2132,7 +2182,7 @@ static int eb_relocate_parse(struct i915_execbuffer *eb)
 
 err:
 	if (err == -EDEADLK) {
-		eb_release_vmas(eb, false);
+		eb_release_vmas(eb, false, false);
 		err = i915_gem_ww_ctx_backoff(&eb->ww);
 		if (!err)
 			goto retry;
@@ -2207,6 +2257,30 @@ static int eb_move_to_gpu(struct i915_execbuffer *eb)
 						      flags | __EXEC_OBJECT_NO_RESERVE);
 	}
 
+#ifdef CONFIG_MMU_NOTIFIER
+	if (!err && (eb->args->flags & __EXEC_USERPTR_USED)) {
+		spin_lock(&eb->i915->mm.notifier_lock);
+
+		/*
+		 * count is always at least 1, otherwise __EXEC_USERPTR_USED
+		 * could not have been set
+		 */
+		for (i = 0; i < count; i++) {
+			struct eb_vma *ev = &eb->vma[i];
+			struct drm_i915_gem_object *obj = ev->vma->obj;
+
+			if (!i915_gem_object_is_userptr(obj))
+				continue;
+
+			err = i915_gem_object_userptr_submit_done(obj);
+			if (err)
+				break;
+		}
+
+		spin_unlock(&eb->i915->mm.notifier_lock);
+	}
+#endif
+
 	if (unlikely(err))
 		goto err_skip;
 
@@ -3347,7 +3421,7 @@ i915_gem_do_execbuffer(struct drm_device *dev,
 
 	err = eb_lookup_vmas(&eb);
 	if (err) {
-		eb_release_vmas(&eb, true);
+		eb_release_vmas(&eb, true, true);
 		goto err_engine;
 	}
 
@@ -3419,6 +3493,7 @@ i915_gem_do_execbuffer(struct drm_device *dev,
 
 	trace_i915_request_queue(eb.request, eb.batch_flags);
 	err = eb_submit(&eb, batch);
+
 err_request:
 	i915_request_get(eb.request);
 	eb_request_add(&eb);
@@ -3439,7 +3514,7 @@ i915_gem_do_execbuffer(struct drm_device *dev,
 	i915_request_put(eb.request);
 
 err_vma:
-	eb_release_vmas(&eb, true);
+	eb_release_vmas(&eb, true, true);
 	if (eb.trampoline)
 		i915_vma_unpin(eb.trampoline);
 	WARN_ON(err == -EDEADLK);
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object.h b/drivers/gpu/drm/i915/gem/i915_gem_object.h
index 7b3a84f98b42..33412248f6df 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_object.h
+++ b/drivers/gpu/drm/i915/gem/i915_gem_object.h
@@ -33,6 +33,7 @@ i915_gem_object_create_shmem_from_data(struct drm_i915_private *i915,
 				       const void *data, resource_size_t size);
 
 extern const struct drm_i915_gem_object_ops i915_gem_shmem_ops;
+
 void __i915_gem_object_release_shmem(struct drm_i915_gem_object *obj,
 				     struct sg_table *pages,
 				     bool needs_clflush);
@@ -245,12 +246,6 @@ i915_gem_object_never_mmap(const struct drm_i915_gem_object *obj)
 	return i915_gem_object_type_has(obj, I915_GEM_OBJECT_NO_MMAP);
 }
 
-static inline bool
-i915_gem_object_needs_async_cancel(const struct drm_i915_gem_object *obj)
-{
-	return i915_gem_object_type_has(obj, I915_GEM_OBJECT_ASYNC_CANCEL);
-}
-
 static inline bool
 i915_gem_object_is_framebuffer(const struct drm_i915_gem_object *obj)
 {
@@ -551,16 +546,6 @@ void __i915_gem_object_flush_frontbuffer(struct drm_i915_gem_object *obj,
 void __i915_gem_object_invalidate_frontbuffer(struct drm_i915_gem_object *obj,
 					      enum fb_op_origin origin);
 
-static inline bool
-i915_gem_object_is_userptr(struct drm_i915_gem_object *obj)
-{
-#ifdef CONFIG_MMU_NOTIFIER
-	return obj->userptr.mm;
-#else
-	return false;
-#endif
-}
-
 static inline void
 i915_gem_object_flush_frontbuffer(struct drm_i915_gem_object *obj,
 				  enum fb_op_origin origin)
@@ -577,4 +562,22 @@ i915_gem_object_invalidate_frontbuffer(struct drm_i915_gem_object *obj,
 		__i915_gem_object_invalidate_frontbuffer(obj, origin);
 }
 
+#ifdef CONFIG_MMU_NOTIFIER
+static inline bool
+i915_gem_object_is_userptr(struct drm_i915_gem_object *obj)
+{
+	return obj->userptr.notifier.mm;
+}
+
+int i915_gem_object_userptr_submit_init(struct drm_i915_gem_object *obj);
+int i915_gem_object_userptr_submit_done(struct drm_i915_gem_object *obj);
+void i915_gem_object_userptr_submit_fini(struct drm_i915_gem_object *obj);
+#else
+static inline bool i915_gem_object_is_userptr(struct drm_i915_gem_object *obj) { return false; }
+
+static inline int i915_gem_object_userptr_submit_init(struct drm_i915_gem_object *obj) { GEM_BUG_ON(1); return -ENODEV; }
+static inline int i915_gem_object_userptr_submit_done(struct drm_i915_gem_object *obj) { GEM_BUG_ON(1); return -ENODEV; }
+static inline void i915_gem_object_userptr_submit_fini(struct drm_i915_gem_object *obj) { GEM_BUG_ON(1); }
+#endif
+
 #endif
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object_types.h b/drivers/gpu/drm/i915/gem/i915_gem_object_types.h
index 6d3f451c15c6..5234c1ed62d4 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_object_types.h
+++ b/drivers/gpu/drm/i915/gem/i915_gem_object_types.h
@@ -7,6 +7,8 @@
 #ifndef __I915_GEM_OBJECT_TYPES_H__
 #define __I915_GEM_OBJECT_TYPES_H__
 
+#include <linux/mmu_notifier.h>
+
 #include <drm/drm_gem.h>
 #include <uapi/drm/i915_drm.h>
 
@@ -34,7 +36,6 @@ struct drm_i915_gem_object_ops {
 #define I915_GEM_OBJECT_IS_SHRINKABLE	BIT(2)
 #define I915_GEM_OBJECT_IS_PROXY	BIT(3)
 #define I915_GEM_OBJECT_NO_MMAP		BIT(4)
-#define I915_GEM_OBJECT_ASYNC_CANCEL	BIT(5)
 
 	/* Interface between the GEM object and its backing storage.
 	 * get_pages() is called once prior to the use of the associated set
@@ -292,10 +293,11 @@ struct drm_i915_gem_object {
 #ifdef CONFIG_MMU_NOTIFIER
 		struct i915_gem_userptr {
 			uintptr_t ptr;
+			unsigned long notifier_seq;
 
-			struct i915_mm_struct *mm;
-			struct i915_mmu_object *mmu_object;
-			struct work_struct *work;
+			struct mmu_interval_notifier notifier;
+			struct page **pvec;
+			int page_ref;
 		} userptr;
 #endif
 
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_pages.c b/drivers/gpu/drm/i915/gem/i915_gem_pages.c
index 7983423237e3..60149cad6080 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_pages.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_pages.c
@@ -223,7 +223,7 @@ int __i915_gem_object_put_pages(struct drm_i915_gem_object *obj)
 	 * get_pages backends we should be better able to handle the
 	 * cancellation of the async task in a more uniform manner.
 	 */
-	if (!pages && !i915_gem_object_needs_async_cancel(obj))
+	if (!pages)
 		pages = ERR_PTR(-EINVAL);
 
 	if (!IS_ERR(pages))
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_userptr.c b/drivers/gpu/drm/i915/gem/i915_gem_userptr.c
index 1183b28c084b..9ea9aa65ade1 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_userptr.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_userptr.c
@@ -2,10 +2,39 @@
  * SPDX-License-Identifier: MIT
  *
  * Copyright © 2012-2014 Intel Corporation
+ *
+ * Based on amdgpu_mn, which bears the following notice:
+ *
+ * Copyright 2014 Advanced Micro Devices, Inc.
+ * All Rights Reserved.
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the
+ * "Software"), to deal in the Software without restriction, including
+ * without limitation the rights to use, copy, modify, merge, publish,
+ * distribute, sub license, and/or sell copies of the Software, and to
+ * permit persons to whom the Software is furnished to do so, subject to
+ * the following conditions:
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NON-INFRINGEMENT. IN NO EVENT SHALL
+ * THE COPYRIGHT HOLDERS, AUTHORS AND/OR ITS SUPPLIERS BE LIABLE FOR ANY CLAIM,
+ * DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR
+ * OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE
+ * USE OR OTHER DEALINGS IN THE SOFTWARE.
+ *
+ * The above copyright notice and this permission notice (including the
+ * next paragraph) shall be included in all copies or substantial portions
+ * of the Software.
+ *
+ */
+/*
+ * Authors:
+ *    Christian König <christian.koenig@amd.com>
  */
 
 #include <linux/mmu_context.h>
-#include <linux/mmu_notifier.h>
 #include <linux/mempolicy.h>
 #include <linux/swap.h>
 #include <linux/sched/mm.h>
@@ -15,365 +44,106 @@
 #include "i915_gem_object.h"
 #include "i915_scatterlist.h"
 
-#if defined(CONFIG_MMU_NOTIFIER)
-
-struct i915_mm_struct {
-	struct mm_struct *mm;
-	struct drm_i915_private *i915;
-	struct i915_mmu_notifier *mn;
-	struct hlist_node node;
-	struct kref kref;
-	struct rcu_work work;
-};
-
-#include <linux/interval_tree.h>
-
-struct i915_mmu_notifier {
-	spinlock_t lock;
-	struct hlist_node node;
-	struct mmu_notifier mn;
-	struct rb_root_cached objects;
-	struct i915_mm_struct *mm;
-};
-
-struct i915_mmu_object {
-	struct i915_mmu_notifier *mn;
-	struct drm_i915_gem_object *obj;
-	struct interval_tree_node it;
-};
-
-static void add_object(struct i915_mmu_object *mo)
-{
-	GEM_BUG_ON(!RB_EMPTY_NODE(&mo->it.rb));
-	interval_tree_insert(&mo->it, &mo->mn->objects);
-}
-
-static void del_object(struct i915_mmu_object *mo)
-{
-	if (RB_EMPTY_NODE(&mo->it.rb))
-		return;
-
-	interval_tree_remove(&mo->it, &mo->mn->objects);
-	RB_CLEAR_NODE(&mo->it.rb);
-}
+#ifdef CONFIG_MMU_NOTIFIER
 
-static void
-__i915_gem_userptr_set_active(struct drm_i915_gem_object *obj, bool value)
+/**
+ * i915_gem_userptr_invalidate - callback to notify about mm change
+ *
+ * @mni: the range (mm) is about to update
+ * @range: details on the invalidation
+ * @cur_seq: Value to pass to mmu_interval_set_seq()
+ *
+ * Block for operations on BOs to finish and mark pages as accessed and
+ * potentially dirty.
+ */
+static bool i915_gem_userptr_invalidate(struct mmu_interval_notifier *mni,
+					const struct mmu_notifier_range *range,
+					unsigned long cur_seq)
 {
-	struct i915_mmu_object *mo = obj->userptr.mmu_object;
-
-	/*
-	 * During mm_invalidate_range we need to cancel any userptr that
-	 * overlaps the range being invalidated. Doing so requires the
-	 * struct_mutex, and that risks recursion. In order to cause
-	 * recursion, the user must alias the userptr address space with
-	 * a GTT mmapping (possible with a MAP_FIXED) - then when we have
-	 * to invalidate that mmaping, mm_invalidate_range is called with
-	 * the userptr address *and* the struct_mutex held.  To prevent that
-	 * we set a flag under the i915_mmu_notifier spinlock to indicate
-	 * whether this object is valid.
-	 */
-	if (!mo)
-		return;
+	struct drm_i915_gem_object *obj = container_of(mni, struct drm_i915_gem_object, userptr.notifier);
+	struct drm_i915_private *i915 = to_i915(obj->base.dev);
+	long r;
 
-	spin_lock(&mo->mn->lock);
-	if (value)
-		add_object(mo);
-	else
-		del_object(mo);
-	spin_unlock(&mo->mn->lock);
-}
+	if (!mmu_notifier_range_blockable(range))
+		return false;
 
-static int
-userptr_mn_invalidate_range_start(struct mmu_notifier *_mn,
-				  const struct mmu_notifier_range *range)
-{
-	struct i915_mmu_notifier *mn =
-		container_of(_mn, struct i915_mmu_notifier, mn);
-	struct interval_tree_node *it;
-	unsigned long end;
-	int ret = 0;
-
-	if (RB_EMPTY_ROOT(&mn->objects.rb_root))
-		return 0;
-
-	/* interval ranges are inclusive, but invalidate range is exclusive */
-	end = range->end - 1;
-
-	spin_lock(&mn->lock);
-	it = interval_tree_iter_first(&mn->objects, range->start, end);
-	while (it) {
-		struct drm_i915_gem_object *obj;
-
-		if (!mmu_notifier_range_blockable(range)) {
-			ret = -EAGAIN;
-			break;
-		}
+	spin_lock(&i915->mm.notifier_lock);
 
-		/*
-		 * The mmu_object is released late when destroying the
-		 * GEM object so it is entirely possible to gain a
-		 * reference on an object in the process of being freed
-		 * since our serialisation is via the spinlock and not
-		 * the struct_mutex - and consequently use it after it
-		 * is freed and then double free it. To prevent that
-		 * use-after-free we only acquire a reference on the
-		 * object if it is not in the process of being destroyed.
-		 */
-		obj = container_of(it, struct i915_mmu_object, it)->obj;
-		if (!kref_get_unless_zero(&obj->base.refcount)) {
-			it = interval_tree_iter_next(it, range->start, end);
-			continue;
-		}
-		spin_unlock(&mn->lock);
+	mmu_interval_set_seq(mni, cur_seq);
 
-		ret = i915_gem_object_unbind(obj,
-					     I915_GEM_OBJECT_UNBIND_ACTIVE |
-					     I915_GEM_OBJECT_UNBIND_BARRIER);
-		if (ret == 0)
-			ret = __i915_gem_object_put_pages(obj);
-		i915_gem_object_put(obj);
-		if (ret)
-			return ret;
+	spin_unlock(&i915->mm.notifier_lock);
 
-		spin_lock(&mn->lock);
+	/* During exit there's no need to wait */
+	if (current->flags & PF_EXITING)
+		return true;
 
-		/*
-		 * As we do not (yet) protect the mmu from concurrent insertion
-		 * over this range, there is no guarantee that this search will
-		 * terminate given a pathologic workload.
-		 */
-		it = interval_tree_iter_first(&mn->objects, range->start, end);
-	}
-	spin_unlock(&mn->lock);
-
-	return ret;
+	/* we will unbind on next submission, still have userptr pins */
+	r = dma_resv_wait_timeout_rcu(obj->base.resv, true, false,
+				      MAX_SCHEDULE_TIMEOUT);
+	if (r <= 0)
+		drm_err(&i915->drm, "(%ld) failed to wait for idle\n", r);
 
+	return true;
 }
 
-static const struct mmu_notifier_ops i915_gem_userptr_notifier = {
-	.invalidate_range_start = userptr_mn_invalidate_range_start,
+static const struct mmu_interval_notifier_ops i915_gem_userptr_notifier_ops = {
+	.invalidate = i915_gem_userptr_invalidate,
 };
 
-static struct i915_mmu_notifier *
-i915_mmu_notifier_create(struct i915_mm_struct *mm)
-{
-	struct i915_mmu_notifier *mn;
-
-	mn = kmalloc(sizeof(*mn), GFP_KERNEL);
-	if (mn == NULL)
-		return ERR_PTR(-ENOMEM);
-
-	spin_lock_init(&mn->lock);
-	mn->mn.ops = &i915_gem_userptr_notifier;
-	mn->objects = RB_ROOT_CACHED;
-	mn->mm = mm;
-
-	return mn;
-}
-
-static void
-i915_gem_userptr_release__mmu_notifier(struct drm_i915_gem_object *obj)
-{
-	struct i915_mmu_object *mo;
-
-	mo = fetch_and_zero(&obj->userptr.mmu_object);
-	if (!mo)
-		return;
-
-	spin_lock(&mo->mn->lock);
-	del_object(mo);
-	spin_unlock(&mo->mn->lock);
-	kfree(mo);
-}
-
-static struct i915_mmu_notifier *
-i915_mmu_notifier_find(struct i915_mm_struct *mm)
-{
-	struct i915_mmu_notifier *mn, *old;
-	int err;
-
-	mn = READ_ONCE(mm->mn);
-	if (likely(mn))
-		return mn;
-
-	mn = i915_mmu_notifier_create(mm);
-	if (IS_ERR(mn))
-		return mn;
-
-	err = mmu_notifier_register(&mn->mn, mm->mm);
-	if (err) {
-		kfree(mn);
-		return ERR_PTR(err);
-	}
-
-	old = cmpxchg(&mm->mn, NULL, mn);
-	if (old) {
-		mmu_notifier_unregister(&mn->mn, mm->mm);
-		kfree(mn);
-		mn = old;
-	}
-
-	return mn;
-}
-
 static int
 i915_gem_userptr_init__mmu_notifier(struct drm_i915_gem_object *obj)
 {
-	struct i915_mmu_notifier *mn;
-	struct i915_mmu_object *mo;
-
-	if (GEM_WARN_ON(!obj->userptr.mm))
-		return -EINVAL;
-
-	mn = i915_mmu_notifier_find(obj->userptr.mm);
-	if (IS_ERR(mn))
-		return PTR_ERR(mn);
-
-	mo = kzalloc(sizeof(*mo), GFP_KERNEL);
-	if (!mo)
-		return -ENOMEM;
-
-	mo->mn = mn;
-	mo->obj = obj;
-	mo->it.start = obj->userptr.ptr;
-	mo->it.last = obj->userptr.ptr + obj->base.size - 1;
-	RB_CLEAR_NODE(&mo->it.rb);
-
-	obj->userptr.mmu_object = mo;
-	return 0;
+	return mmu_interval_notifier_insert(&obj->userptr.notifier, current->mm,
+					    obj->userptr.ptr, obj->base.size,
+					    &i915_gem_userptr_notifier_ops);
 }
 
-static void
-i915_mmu_notifier_free(struct i915_mmu_notifier *mn,
-		       struct mm_struct *mm)
-{
-	if (mn == NULL)
-		return;
-
-	mmu_notifier_unregister(&mn->mn, mm);
-	kfree(mn);
-}
-
-
-static struct i915_mm_struct *
-__i915_mm_struct_find(struct drm_i915_private *i915, struct mm_struct *real)
-{
-	struct i915_mm_struct *it, *mm = NULL;
-
-	rcu_read_lock();
-	hash_for_each_possible_rcu(i915->mm_structs,
-				   it, node,
-				   (unsigned long)real)
-		if (it->mm == real && kref_get_unless_zero(&it->kref)) {
-			mm = it;
-			break;
-		}
-	rcu_read_unlock();
-
-	return mm;
-}
-
-static int
-i915_gem_userptr_init__mm_struct(struct drm_i915_gem_object *obj)
+static void i915_gem_object_userptr_drop_ref(struct drm_i915_gem_object *obj)
 {
 	struct drm_i915_private *i915 = to_i915(obj->base.dev);
-	struct i915_mm_struct *mm, *new;
-	int ret = 0;
-
-	/* During release of the GEM object we hold the struct_mutex. This
-	 * precludes us from calling mmput() at that time as that may be
-	 * the last reference and so call exit_mmap(). exit_mmap() will
-	 * attempt to reap the vma, and if we were holding a GTT mmap
-	 * would then call drm_gem_vm_close() and attempt to reacquire
-	 * the struct mutex. So in order to avoid that recursion, we have
-	 * to defer releasing the mm reference until after we drop the
-	 * struct_mutex, i.e. we need to schedule a worker to do the clean
-	 * up.
-	 */
-	mm = __i915_mm_struct_find(i915, current->mm);
-	if (mm)
-		goto out;
+	struct page **pvec = NULL;
 
-	new = kmalloc(sizeof(*mm), GFP_KERNEL);
-	if (!new)
-		return -ENOMEM;
-
-	kref_init(&new->kref);
-	new->i915 = to_i915(obj->base.dev);
-	new->mm = current->mm;
-	new->mn = NULL;
-
-	spin_lock(&i915->mm_lock);
-	mm = __i915_mm_struct_find(i915, current->mm);
-	if (!mm) {
-		hash_add_rcu(i915->mm_structs,
-			     &new->node,
-			     (unsigned long)new->mm);
-		mmgrab(current->mm);
-		mm = new;
+	spin_lock(&i915->mm.notifier_lock);
+	if (!--obj->userptr.page_ref) {
+		pvec = obj->userptr.pvec;
+		obj->userptr.pvec = NULL;
 	}
-	spin_unlock(&i915->mm_lock);
-	if (mm != new)
-		kfree(new);
-
-out:
-	obj->userptr.mm = mm;
-	return ret;
-}
-
-static void
-__i915_mm_struct_free__worker(struct work_struct *work)
-{
-	struct i915_mm_struct *mm = container_of(work, typeof(*mm), work.work);
-
-	i915_mmu_notifier_free(mm->mn, mm->mm);
-	mmdrop(mm->mm);
-	kfree(mm);
-}
-
-static void
-__i915_mm_struct_free(struct kref *kref)
-{
-	struct i915_mm_struct *mm = container_of(kref, typeof(*mm), kref);
-
-	spin_lock(&mm->i915->mm_lock);
-	hash_del_rcu(&mm->node);
-	spin_unlock(&mm->i915->mm_lock);
-
-	INIT_RCU_WORK(&mm->work, __i915_mm_struct_free__worker);
-	queue_rcu_work(system_wq, &mm->work);
-}
+	GEM_BUG_ON(obj->userptr.page_ref < 0);
+	spin_unlock(&i915->mm.notifier_lock);
 
-static void
-i915_gem_userptr_release__mm_struct(struct drm_i915_gem_object *obj)
-{
-	if (obj->userptr.mm == NULL)
-		return;
+	if (pvec) {
+		const unsigned long num_pages = obj->base.size >> PAGE_SHIFT;
 
-	kref_put(&obj->userptr.mm->kref, __i915_mm_struct_free);
-	obj->userptr.mm = NULL;
+		unpin_user_pages(pvec, num_pages);
+		kfree(pvec);
+	}
 }
 
-struct get_pages_work {
-	struct work_struct work;
-	struct drm_i915_gem_object *obj;
-	struct task_struct *task;
-};
-
-static struct sg_table *
-__i915_gem_userptr_alloc_pages(struct drm_i915_gem_object *obj,
-			       struct page **pvec, unsigned long num_pages)
+static int i915_gem_userptr_get_pages(struct drm_i915_gem_object *obj)
 {
+	struct drm_i915_private *i915 = to_i915(obj->base.dev);
+	const unsigned long num_pages = obj->base.size >> PAGE_SHIFT;
 	unsigned int max_segment = i915_sg_segment_size();
 	struct sg_table *st;
 	unsigned int sg_page_sizes;
 	struct scatterlist *sg;
+	struct page **pvec;
 	int ret;
 
 	st = kmalloc(sizeof(*st), GFP_KERNEL);
 	if (!st)
-		return ERR_PTR(-ENOMEM);
+		return -ENOMEM;
+
+	spin_lock(&i915->mm.notifier_lock);
+	if (GEM_WARN_ON(!obj->userptr.page_ref)) {
+		spin_unlock(&i915->mm.notifier_lock);
+		ret = -EFAULT;
+		goto err_free;
+	}
+
+	obj->userptr.page_ref++;
+	pvec = obj->userptr.pvec;
+	spin_unlock(&i915->mm.notifier_lock);
 
 alloc_table:
 	sg = __sg_alloc_table_from_pages(st, pvec, num_pages, 0,
@@ -381,7 +151,8 @@ __i915_gem_userptr_alloc_pages(struct drm_i915_gem_object *obj,
 					 NULL, 0, GFP_KERNEL);
 	if (IS_ERR(sg)) {
 		kfree(st);
-		return ERR_CAST(sg);
+		ret = PTR_ERR(sg);
+		goto err;
 	}
 
 	ret = i915_gem_gtt_prepare_pages(obj, st);
@@ -393,203 +164,20 @@ __i915_gem_userptr_alloc_pages(struct drm_i915_gem_object *obj,
 			goto alloc_table;
 		}
 
-		kfree(st);
-		return ERR_PTR(ret);
+		goto err;
 	}
 
 	sg_page_sizes = i915_sg_page_sizes(st->sgl);
 
 	__i915_gem_object_set_pages(obj, st, sg_page_sizes);
 
-	return st;
-}
-
-static void
-__i915_gem_userptr_get_pages_worker(struct work_struct *_work)
-{
-	struct get_pages_work *work = container_of(_work, typeof(*work), work);
-	struct drm_i915_gem_object *obj = work->obj;
-	const unsigned long npages = obj->base.size >> PAGE_SHIFT;
-	unsigned long pinned;
-	struct page **pvec;
-	int ret;
-
-	ret = -ENOMEM;
-	pinned = 0;
-
-	pvec = kvmalloc_array(npages, sizeof(struct page *), GFP_KERNEL);
-	if (pvec != NULL) {
-		struct mm_struct *mm = obj->userptr.mm->mm;
-		unsigned int flags = 0;
-		int locked = 0;
-
-		if (!i915_gem_object_is_readonly(obj))
-			flags |= FOLL_WRITE;
-
-		ret = -EFAULT;
-		if (mmget_not_zero(mm)) {
-			while (pinned < npages) {
-				if (!locked) {
-					mmap_read_lock(mm);
-					locked = 1;
-				}
-				ret = pin_user_pages_remote
-					(mm,
-					 obj->userptr.ptr + pinned * PAGE_SIZE,
-					 npages - pinned,
-					 flags,
-					 pvec + pinned, NULL, &locked);
-				if (ret < 0)
-					break;
-
-				pinned += ret;
-			}
-			if (locked)
-				mmap_read_unlock(mm);
-			mmput(mm);
-		}
-	}
-
-	mutex_lock_nested(&obj->mm.lock, I915_MM_GET_PAGES);
-	if (obj->userptr.work == &work->work) {
-		struct sg_table *pages = ERR_PTR(ret);
-
-		if (pinned == npages) {
-			pages = __i915_gem_userptr_alloc_pages(obj, pvec,
-							       npages);
-			if (!IS_ERR(pages)) {
-				pinned = 0;
-				pages = NULL;
-			}
-		}
-
-		obj->userptr.work = ERR_CAST(pages);
-		if (IS_ERR(pages))
-			__i915_gem_userptr_set_active(obj, false);
-	}
-	mutex_unlock(&obj->mm.lock);
-
-	unpin_user_pages(pvec, pinned);
-	kvfree(pvec);
-
-	i915_gem_object_put(obj);
-	put_task_struct(work->task);
-	kfree(work);
-}
-
-static struct sg_table *
-__i915_gem_userptr_get_pages_schedule(struct drm_i915_gem_object *obj)
-{
-	struct get_pages_work *work;
-
-	/* Spawn a worker so that we can acquire the
-	 * user pages without holding our mutex. Access
-	 * to the user pages requires mmap_lock, and we have
-	 * a strict lock ordering of mmap_lock, struct_mutex -
-	 * we already hold struct_mutex here and so cannot
-	 * call gup without encountering a lock inversion.
-	 *
-	 * Userspace will keep on repeating the operation
-	 * (thanks to EAGAIN) until either we hit the fast
-	 * path or the worker completes. If the worker is
-	 * cancelled or superseded, the task is still run
-	 * but the results ignored. (This leads to
-	 * complications that we may have a stray object
-	 * refcount that we need to be wary of when
-	 * checking for existing objects during creation.)
-	 * If the worker encounters an error, it reports
-	 * that error back to this function through
-	 * obj->userptr.work = ERR_PTR.
-	 */
-	work = kmalloc(sizeof(*work), GFP_KERNEL);
-	if (work == NULL)
-		return ERR_PTR(-ENOMEM);
-
-	obj->userptr.work = &work->work;
-
-	work->obj = i915_gem_object_get(obj);
-
-	work->task = current;
-	get_task_struct(work->task);
-
-	INIT_WORK(&work->work, __i915_gem_userptr_get_pages_worker);
-	queue_work(to_i915(obj->base.dev)->mm.userptr_wq, &work->work);
-
-	return ERR_PTR(-EAGAIN);
-}
-
-static int i915_gem_userptr_get_pages(struct drm_i915_gem_object *obj)
-{
-	const unsigned long num_pages = obj->base.size >> PAGE_SHIFT;
-	struct mm_struct *mm = obj->userptr.mm->mm;
-	struct page **pvec;
-	struct sg_table *pages;
-	bool active;
-	int pinned;
-	unsigned int gup_flags = 0;
-
-	/* If userspace should engineer that these pages are replaced in
-	 * the vma between us binding this page into the GTT and completion
-	 * of rendering... Their loss. If they change the mapping of their
-	 * pages they need to create a new bo to point to the new vma.
-	 *
-	 * However, that still leaves open the possibility of the vma
-	 * being copied upon fork. Which falls under the same userspace
-	 * synchronisation issue as a regular bo, except that this time
-	 * the process may not be expecting that a particular piece of
-	 * memory is tied to the GPU.
-	 *
-	 * Fortunately, we can hook into the mmu_notifier in order to
-	 * discard the page references prior to anything nasty happening
-	 * to the vma (discard or cloning) which should prevent the more
-	 * egregious cases from causing harm.
-	 */
-
-	if (obj->userptr.work) {
-		/* active flag should still be held for the pending work */
-		if (IS_ERR(obj->userptr.work))
-			return PTR_ERR(obj->userptr.work);
-		else
-			return -EAGAIN;
-	}
-
-	pvec = NULL;
-	pinned = 0;
-
-	if (mm == current->mm) {
-		pvec = kvmalloc_array(num_pages, sizeof(struct page *),
-				      GFP_KERNEL |
-				      __GFP_NORETRY |
-				      __GFP_NOWARN);
-		if (pvec) {
-			/* defer to worker if malloc fails */
-			if (!i915_gem_object_is_readonly(obj))
-				gup_flags |= FOLL_WRITE;
-			pinned = pin_user_pages_fast_only(obj->userptr.ptr,
-							  num_pages, gup_flags,
-							  pvec);
-		}
-	}
-
-	active = false;
-	if (pinned < 0) {
-		pages = ERR_PTR(pinned);
-		pinned = 0;
-	} else if (pinned < num_pages) {
-		pages = __i915_gem_userptr_get_pages_schedule(obj);
-		active = pages == ERR_PTR(-EAGAIN);
-	} else {
-		pages = __i915_gem_userptr_alloc_pages(obj, pvec, num_pages);
-		active = !IS_ERR(pages);
-	}
-	if (active)
-		__i915_gem_userptr_set_active(obj, true);
-
-	if (IS_ERR(pages))
-		unpin_user_pages(pvec, pinned);
-	kvfree(pvec);
+	return 0;
 
-	return PTR_ERR_OR_ZERO(pages);
+err:
+	i915_gem_object_userptr_drop_ref(obj);
+err_free:
+	kfree(st);
+	return ret;
 }
 
 static void
@@ -599,9 +187,6 @@ i915_gem_userptr_put_pages(struct drm_i915_gem_object *obj,
 	struct sgt_iter sgt_iter;
 	struct page *page;
 
-	/* Cancel any inflight work and force them to restart their gup */
-	obj->userptr.work = NULL;
-	__i915_gem_userptr_set_active(obj, false);
 	if (!pages)
 		return;
 
@@ -641,19 +226,135 @@ i915_gem_userptr_put_pages(struct drm_i915_gem_object *obj,
 		}
 
 		mark_page_accessed(page);
-		unpin_user_page(page);
 	}
 	obj->mm.dirty = false;
 
 	sg_free_table(pages);
 	kfree(pages);
+
+	i915_gem_object_userptr_drop_ref(obj);
+}
+
+static int i915_gem_object_userptr_unbind(struct drm_i915_gem_object *obj, bool get_pages)
+{
+	struct sg_table *pages;
+	int err;
+
+	err = i915_gem_object_unbind(obj, I915_GEM_OBJECT_UNBIND_ACTIVE);
+	if (err)
+		return err;
+
+	if (GEM_WARN_ON(i915_gem_object_has_pinned_pages(obj)))
+		return -EBUSY;
+
+	mutex_lock_nested(&obj->mm.lock, I915_MM_GET_PAGES);
+
+	pages = __i915_gem_object_unset_pages(obj);
+	if (!IS_ERR_OR_NULL(pages))
+		i915_gem_userptr_put_pages(obj, pages);
+
+	if (get_pages)
+		err = ____i915_gem_object_get_pages(obj);
+	mutex_unlock(&obj->mm.lock);
+
+	return err;
+}
+
+int i915_gem_object_userptr_submit_init(struct drm_i915_gem_object *obj)
+{
+	struct drm_i915_private *i915 = to_i915(obj->base.dev);
+	const unsigned long num_pages = obj->base.size >> PAGE_SHIFT;
+	struct page **pvec;
+	unsigned int gup_flags = 0;
+	unsigned long notifier_seq;
+	int pinned, ret;
+
+	if (obj->userptr.notifier.mm != current->mm)
+		return -EFAULT;
+
+	ret = i915_gem_object_lock_interruptible(obj, NULL);
+	if (ret)
+		return ret;
+
+	/* Make sure userptr is unbound for next attempt, so we don't use stale pages. */
+	ret = i915_gem_object_userptr_unbind(obj, false);
+	i915_gem_object_unlock(obj);
+	if (ret)
+		return ret;
+
+	notifier_seq = mmu_interval_read_begin(&obj->userptr.notifier);
+
+	pvec = kvmalloc_array(num_pages, sizeof(struct page *), GFP_KERNEL);
+	if (!pvec)
+		return -ENOMEM;
+
+	if (!i915_gem_object_is_readonly(obj))
+		gup_flags |= FOLL_WRITE;
+
+	pinned = ret = 0;
+	while (pinned < num_pages) {
+		ret = pin_user_pages_fast(obj->userptr.ptr + pinned * PAGE_SIZE,
+					  num_pages - pinned, gup_flags,
+					  &pvec[pinned]);
+		if (ret < 0)
+			goto out;
+
+		pinned += ret;
+	}
+	ret = 0;
+
+	spin_lock(&i915->mm.notifier_lock);
+
+	if (mmu_interval_read_retry(&obj->userptr.notifier,
+		!obj->userptr.page_ref ? notifier_seq :
+		obj->userptr.notifier_seq)) {
+		ret = -EAGAIN;
+		goto out_unlock;
+	}
+
+	if (!obj->userptr.page_ref++) {
+		obj->userptr.pvec = pvec;
+		obj->userptr.notifier_seq = notifier_seq;
+
+		pvec = NULL;
+	}
+
+out_unlock:
+	spin_unlock(&i915->mm.notifier_lock);
+
+out:
+	if (pvec) {
+		unpin_user_pages(pvec, pinned);
+		kvfree(pvec);
+	}
+
+	return ret;
+}
+
+int i915_gem_object_userptr_submit_done(struct drm_i915_gem_object *obj)
+{
+	if (mmu_interval_read_retry(&obj->userptr.notifier,
+				    obj->userptr.notifier_seq)) {
+		/* We collided with the mmu notifier, need to retry */
+
+		return -EAGAIN;
+	}
+
+	return 0;
+}
+
+void i915_gem_object_userptr_submit_fini(struct drm_i915_gem_object *obj)
+{
+	i915_gem_object_userptr_drop_ref(obj);
 }
 
 static void
 i915_gem_userptr_release(struct drm_i915_gem_object *obj)
 {
-	i915_gem_userptr_release__mmu_notifier(obj);
-	i915_gem_userptr_release__mm_struct(obj);
+	GEM_WARN_ON(obj->userptr.page_ref);
+
+	mmu_interval_notifier_remove(&obj->userptr.notifier);
+	obj->userptr.notifier.mm = NULL;
 }
 
 static int
@@ -686,7 +387,6 @@ static const struct drm_i915_gem_object_ops i915_gem_userptr_ops = {
 	.name = "i915_gem_object_userptr",
 	.flags = I915_GEM_OBJECT_IS_SHRINKABLE |
 		 I915_GEM_OBJECT_NO_MMAP |
-		 I915_GEM_OBJECT_ASYNC_CANCEL |
 		 I915_GEM_OBJECT_IS_PROXY,
 	.get_pages = i915_gem_userptr_get_pages,
 	.put_pages = i915_gem_userptr_put_pages,
@@ -807,6 +507,7 @@ i915_gem_userptr_ioctl(struct drm_device *dev,
 	i915_gem_object_set_cache_coherency(obj, I915_CACHE_LLC);
 
 	obj->userptr.ptr = args->user_ptr;
+	obj->userptr.notifier_seq = ULONG_MAX;
 	if (args->flags & I915_USERPTR_READ_ONLY)
 		i915_gem_object_set_readonly(obj);
 
@@ -814,9 +515,7 @@ i915_gem_userptr_ioctl(struct drm_device *dev,
 	 * at binding. This means that we need to hook into the mmu_notifier
 	 * in order to detect if the mmu is destroyed.
 	 */
-	ret = i915_gem_userptr_init__mm_struct(obj);
-	if (ret == 0)
-		ret = i915_gem_userptr_init__mmu_notifier(obj);
+	ret = i915_gem_userptr_init__mmu_notifier(obj);
 	if (ret == 0)
 		ret = drm_gem_handle_create(file, &obj->base, &handle);
 
@@ -835,15 +534,7 @@ i915_gem_userptr_ioctl(struct drm_device *dev,
 int i915_gem_init_userptr(struct drm_i915_private *dev_priv)
 {
 #ifdef CONFIG_MMU_NOTIFIER
-	spin_lock_init(&dev_priv->mm_lock);
-	hash_init(dev_priv->mm_structs);
-
-	dev_priv->mm.userptr_wq =
-		alloc_workqueue("i915-userptr-acquire",
-				WQ_HIGHPRI | WQ_UNBOUND,
-				0);
-	if (!dev_priv->mm.userptr_wq)
-		return -ENOMEM;
+	spin_lock_init(&dev_priv->mm.notifier_lock);
 #endif
 
 	return 0;
@@ -851,7 +542,4 @@ int i915_gem_init_userptr(struct drm_i915_private *dev_priv)
 
 void i915_gem_cleanup_userptr(struct drm_i915_private *dev_priv)
 {
-#ifdef CONFIG_MMU_NOTIFIER
-	destroy_workqueue(dev_priv->mm.userptr_wq);
-#endif
 }
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index d3c67e17cd02..ce8d5ff8b9f4 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -591,11 +591,10 @@ struct i915_gem_mm {
 
 #ifdef CONFIG_MMU_NOTIFIER
 	/**
-	 * Workqueue to fault in userptr pages, flushed by the execbuf
-	 * when required but otherwise left to userspace to try again
-	 * on EAGAIN.
+	 * notifier_lock for mmu notifiers, memory may not be allocated
+	 * while holding this lock.
 	 */
-	struct workqueue_struct *userptr_wq;
+	spinlock_t notifier_lock;
 #endif
 
 	/* shrinker accounting, also useful for userland debugging */
@@ -978,8 +977,6 @@ struct drm_i915_private {
 	struct i915_ggtt ggtt; /* VM representing the global address space */
 
 	struct i915_gem_mm mm;
-	DECLARE_HASHTABLE(mm_structs, 7);
-	spinlock_t mm_lock;
 
 	/* Kernel Modesetting */
 
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index b03e245640c0..0b9eab66511c 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -1158,10 +1158,8 @@ int i915_gem_init(struct drm_i915_private *dev_priv)
 err_unlock:
 	i915_gem_drain_workqueue(dev_priv);
 
-	if (ret != -EIO) {
+	if (ret != -EIO)
 		intel_uc_cleanup_firmwares(&dev_priv->gt.uc);
-		i915_gem_cleanup_userptr(dev_priv);
-	}
 
 	if (ret == -EIO) {
 		/*
@@ -1220,7 +1218,6 @@ void i915_gem_driver_release(struct drm_i915_private *dev_priv)
 	intel_wa_list_free(&dev_priv->gt_wa_list);
 
 	intel_uc_cleanup_firmwares(&dev_priv->gt.uc);
-	i915_gem_cleanup_userptr(dev_priv);
 
 	i915_gem_drain_freed_objects(dev_priv);
 
-- 
2.26.2
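
For reference, the conversion above follows the standard
mmu_interval_notifier pattern: sample a sequence count, pin the pages
outside any lock, then commit the result under the notifier lock only
if no invalidation ran in between. A condensed sketch of that pattern
(userptr_pin_sketch is a hypothetical helper; FOLL_WRITE is assumed
for brevity where the real code checks for read-only objects, and the
page_ref sharing is trimmed):

static int userptr_pin_sketch(struct drm_i915_gem_object *obj,
			      struct page **pvec, unsigned long num_pages)
{
	struct drm_i915_private *i915 = to_i915(obj->base.dev);
	unsigned long seq, pinned = 0;
	int ret;

	/* 1. Sample the notifier sequence before pinning anything. */
	seq = mmu_interval_read_begin(&obj->userptr.notifier);

	/* 2. Pin the pages; this can race with an invalidation. */
	while (pinned < num_pages) {
		ret = pin_user_pages_fast(obj->userptr.ptr + pinned * PAGE_SIZE,
					  num_pages - pinned, FOLL_WRITE,
					  &pvec[pinned]);
		if (ret < 0) {
			unpin_user_pages(pvec, pinned);
			return ret;
		}
		pinned += ret;
	}

	/* 3. Commit only if no invalidation ran since read_begin(). */
	spin_lock(&i915->mm.notifier_lock);
	if (mmu_interval_read_retry(&obj->userptr.notifier, seq)) {
		spin_unlock(&i915->mm.notifier_lock);
		unpin_user_pages(pvec, pinned);
		return -EAGAIN;	/* caller retries from step 1 */
	}
	obj->userptr.pvec = pvec;
	obj->userptr.notifier_seq = seq;
	spin_unlock(&i915->mm.notifier_lock);

	return 0;
}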

* [RFC PATCH 026/162] drm/i915: Flatten obj->mm.lock
  2020-11-27 12:04 [RFC PATCH 000/162] DG1 + LMEM enabling Matthew Auld
                   ` (24 preceding siblings ...)
  2020-11-27 12:05 ` [RFC PATCH 025/162] drm/i915: Fix userptr so we do not have to worry about obj->mm.lock, v5 Matthew Auld
@ 2020-11-27 12:05 ` Matthew Auld
  2020-11-27 12:05 ` [RFC PATCH 027/162] drm/i915: Populate logical context during first pin Matthew Auld
                   ` (135 subsequent siblings)
  161 siblings, 0 replies; 208+ messages in thread
From: Matthew Auld @ 2020-11-27 12:05 UTC (permalink / raw)
  To: intel-gfx; +Cc: dri-devel, Thomas Hellström

From: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>

With userptr fixed, there is no longer any need for the separate
lockdep classes, and we can remove all the lockdep tricks that were
used. A trylock in the shrinker is now all we need to flatten the
locking hierarchy.

Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
---
 drivers/gpu/drm/i915/gem/i915_gem_object.c   |  6 +---
 drivers/gpu/drm/i915/gem/i915_gem_object.h   | 20 ++----------
 drivers/gpu/drm/i915/gem/i915_gem_pages.c    | 34 ++++++++++----------
 drivers/gpu/drm/i915/gem/i915_gem_phys.c     |  2 +-
 drivers/gpu/drm/i915/gem/i915_gem_shrinker.c | 10 +++---
 drivers/gpu/drm/i915/gem/i915_gem_userptr.c  |  2 +-
 6 files changed, 27 insertions(+), 47 deletions(-)
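
The shrinker side of the flattened hierarchy boils down to a trylock:
the shrinker can be entered from fs_reclaim while another thread holds
obj->mm.lock for a get_pages() allocation, so blocking there could
deadlock. A minimal sketch of the resulting pattern (shrink_one is a
hypothetical helper; the real logic lives in i915_gem_shrink() in the
diff below):

static unsigned long shrink_one(struct drm_i915_gem_object *obj,
				unsigned int shrink)
{
	unsigned long count = 0;

	/*
	 * Never block here: mm.lock may be held by a get_pages() on
	 * another path that is itself waiting on reclaim.  Skipping a
	 * busy object is always safe; waiting for it is not.
	 */
	if (!mutex_trylock(&obj->mm.lock))
		return 0;

	if (!__i915_gem_object_put_pages_locked(obj)) {
		try_to_writeback(obj, shrink);
		count = obj->base.size >> PAGE_SHIFT;
	}

	mutex_unlock(&obj->mm.lock);
	return count;
}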

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object.c b/drivers/gpu/drm/i915/gem/i915_gem_object.c
index 1393988bd5af..028a556ab1a5 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_object.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_object.c
@@ -62,7 +62,7 @@ void i915_gem_object_init(struct drm_i915_gem_object *obj,
 			  const struct drm_i915_gem_object_ops *ops,
 			  struct lock_class_key *key, unsigned flags)
 {
-	__mutex_init(&obj->mm.lock, ops->name ?: "obj->mm.lock", key);
+	mutex_init(&obj->mm.lock);
 
 	spin_lock_init(&obj->vma.lock);
 	INIT_LIST_HEAD(&obj->vma.list);
@@ -86,10 +86,6 @@ void i915_gem_object_init(struct drm_i915_gem_object *obj,
 	mutex_init(&obj->mm.get_page.lock);
 	INIT_RADIX_TREE(&obj->mm.get_dma_page.radix, GFP_KERNEL | __GFP_NOWARN);
 	mutex_init(&obj->mm.get_dma_page.lock);
-
-	if (IS_ENABLED(CONFIG_LOCKDEP) && i915_gem_object_is_shrinkable(obj))
-		i915_gem_shrinker_taints_mutex(to_i915(obj->base.dev),
-					       &obj->mm.lock);
 }
 
 /**
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object.h b/drivers/gpu/drm/i915/gem/i915_gem_object.h
index 33412248f6df..1b85f51c6ddd 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_object.h
+++ b/drivers/gpu/drm/i915/gem/i915_gem_object.h
@@ -339,27 +339,10 @@ void __i915_gem_object_set_pages(struct drm_i915_gem_object *obj,
 int ____i915_gem_object_get_pages(struct drm_i915_gem_object *obj);
 int __i915_gem_object_get_pages(struct drm_i915_gem_object *obj);
 
-enum i915_mm_subclass { /* lockdep subclass for obj->mm.lock/struct_mutex */
-	I915_MM_NORMAL = 0,
-	/*
-	 * Only used by struct_mutex, when called "recursively" from
-	 * direct-reclaim-esque. Safe because there is only every one
-	 * struct_mutex in the entire system.
-	 */
-	I915_MM_SHRINKER = 1,
-	/*
-	 * Used for obj->mm.lock when allocating pages. Safe because the object
-	 * isn't yet on any LRU, and therefore the shrinker can't deadlock on
-	 * it. As soon as the object has pages, obj->mm.lock nests within
-	 * fs_reclaim.
-	 */
-	I915_MM_GET_PAGES = 1,
-};
-
 static inline int __must_check
 i915_gem_object_pin_pages(struct drm_i915_gem_object *obj)
 {
-	might_lock_nested(&obj->mm.lock, I915_MM_GET_PAGES);
+	might_lock(&obj->mm.lock);
 
 	if (atomic_inc_not_zero(&obj->mm.pages_pin_count))
 		return 0;
@@ -403,6 +386,7 @@ i915_gem_object_unpin_pages(struct drm_i915_gem_object *obj)
 }
 
 int __i915_gem_object_put_pages(struct drm_i915_gem_object *obj);
+int __i915_gem_object_put_pages_locked(struct drm_i915_gem_object *obj);
 void i915_gem_object_truncate(struct drm_i915_gem_object *obj);
 void i915_gem_object_writeback(struct drm_i915_gem_object *obj);
 
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_pages.c b/drivers/gpu/drm/i915/gem/i915_gem_pages.c
index 60149cad6080..5bcd21a8fc4e 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_pages.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_pages.c
@@ -111,7 +111,7 @@ int __i915_gem_object_get_pages(struct drm_i915_gem_object *obj)
 {
 	int err;
 
-	err = mutex_lock_interruptible_nested(&obj->mm.lock, I915_MM_GET_PAGES);
+	err = mutex_lock_interruptible(&obj->mm.lock);
 	if (err)
 		return err;
 
@@ -193,21 +193,13 @@ __i915_gem_object_unset_pages(struct drm_i915_gem_object *obj)
 	return pages;
 }
 
-int __i915_gem_object_put_pages(struct drm_i915_gem_object *obj)
+int __i915_gem_object_put_pages_locked(struct drm_i915_gem_object *obj)
 {
 	struct sg_table *pages;
-	int err;
 
 	if (i915_gem_object_has_pinned_pages(obj))
 		return -EBUSY;
 
-	/* May be called by shrinker from within get_pages() (on another bo) */
-	mutex_lock(&obj->mm.lock);
-	if (unlikely(atomic_read(&obj->mm.pages_pin_count))) {
-		err = -EBUSY;
-		goto unlock;
-	}
-
 	i915_gem_object_release_mmap_offset(obj);
 
 	/*
@@ -223,14 +215,22 @@ int __i915_gem_object_put_pages(struct drm_i915_gem_object *obj)
 	 * get_pages backends we should be better able to handle the
 	 * cancellation of the async task in a more uniform manner.
 	 */
-	if (!pages)
-		pages = ERR_PTR(-EINVAL);
-
-	if (!IS_ERR(pages))
+	if (!IS_ERR_OR_NULL(pages))
 		obj->ops->put_pages(obj, pages);
 
-	err = 0;
-unlock:
+	return 0;
+}
+
+int __i915_gem_object_put_pages(struct drm_i915_gem_object *obj)
+{
+	int err;
+
+	if (i915_gem_object_has_pinned_pages(obj))
+		return -EBUSY;
+
+	/* May be called by shrinker from within get_pages() (on another bo) */
+	mutex_lock(&obj->mm.lock);
+	err = __i915_gem_object_put_pages_locked(obj);
 	mutex_unlock(&obj->mm.lock);
 
 	return err;
@@ -336,7 +336,7 @@ void *i915_gem_object_pin_map(struct drm_i915_gem_object *obj,
 	    !i915_gem_object_type_has(obj, I915_GEM_OBJECT_HAS_IOMEM))
 		return ERR_PTR(-ENXIO);
 
-	err = mutex_lock_interruptible_nested(&obj->mm.lock, I915_MM_GET_PAGES);
+	err = mutex_lock_interruptible(&obj->mm.lock);
 	if (err)
 		return ERR_PTR(err);
 
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_phys.c b/drivers/gpu/drm/i915/gem/i915_gem_phys.c
index 144e4940eede..0d176bf06405 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_phys.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_phys.c
@@ -236,7 +236,7 @@ int i915_gem_object_attach_phys(struct drm_i915_gem_object *obj, int align)
 	if (err)
 		return err;
 
-	err = mutex_lock_interruptible_nested(&obj->mm.lock, I915_MM_GET_PAGES);
+	err = mutex_lock_interruptible(&obj->mm.lock);
 	if (err)
 		goto err_unlock;
 
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_shrinker.c b/drivers/gpu/drm/i915/gem/i915_gem_shrinker.c
index dc8f052a0ffe..afc6e5b4dcf1 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_shrinker.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_shrinker.c
@@ -48,9 +48,9 @@ static bool unsafe_drop_pages(struct drm_i915_gem_object *obj,
 		flags = I915_GEM_OBJECT_UNBIND_TEST;
 
 	if (i915_gem_object_unbind(obj, flags) == 0)
-		__i915_gem_object_put_pages(obj);
+		return true;
 
-	return !i915_gem_object_has_pages(obj);
+	return false;
 }
 
 static void try_to_writeback(struct drm_i915_gem_object *obj,
@@ -199,10 +199,10 @@ i915_gem_shrink(struct drm_i915_private *i915,
 
 			spin_unlock_irqrestore(&i915->mm.obj_lock, flags);
 
-			if (unsafe_drop_pages(obj, shrink)) {
+			if (unsafe_drop_pages(obj, shrink) &&
+			    mutex_trylock(&obj->mm.lock)) {
 				/* May arrive from get_pages on another bo */
-				mutex_lock(&obj->mm.lock);
-				if (!i915_gem_object_has_pages(obj)) {
+				if (!__i915_gem_object_put_pages_locked(obj)) {
 					try_to_writeback(obj, shrink);
 					count += obj->base.size >> PAGE_SHIFT;
 				}
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_userptr.c b/drivers/gpu/drm/i915/gem/i915_gem_userptr.c
index 9ea9aa65ade1..0cab9da6669e 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_userptr.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_userptr.c
@@ -247,7 +247,7 @@ static int i915_gem_object_userptr_unbind(struct drm_i915_gem_object *obj, bool
 	if (GEM_WARN_ON(i915_gem_object_has_pinned_pages(obj)))
 		return -EBUSY;
 
-	mutex_lock_nested(&obj->mm.lock, I915_MM_GET_PAGES);
+	mutex_lock(&obj->mm.lock);
 
 	pages = __i915_gem_object_unset_pages(obj);
 	if (!IS_ERR_OR_NULL(pages))
-- 
2.26.2

* [RFC PATCH 027/162] drm/i915: Populate logical context during first pin.
  2020-11-27 12:04 [RFC PATCH 000/162] DG1 + LMEM enabling Matthew Auld
                   ` (25 preceding siblings ...)
  2020-11-27 12:05 ` [RFC PATCH 026/162] drm/i915: Flatten obj->mm.lock Matthew Auld
@ 2020-11-27 12:05 ` Matthew Auld
  2020-11-27 12:05 ` [RFC PATCH 028/162] drm/i915: Make ring submission compatible with obj->mm.lock removal, v2 Matthew Auld
                   ` (134 subsequent siblings)
  161 siblings, 0 replies; 208+ messages in thread
From: Matthew Auld @ 2020-11-27 12:05 UTC (permalink / raw)
  To: intel-gfx; +Cc: dri-devel, Thomas Hellström

From: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>

Populating the logical context during first pin allows us to remove
pin_map from state allocation, which saves us a few retry loops. We
won't need the mapping until first pin anyway.

Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
---
 drivers/gpu/drm/i915/gt/intel_context_types.h |  13 ++-
 .../drm/i915/gt/intel_execlists_submission.c  | 107 +++++++++---------
 2 files changed, 62 insertions(+), 58 deletions(-)
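
The population is keyed off a new CONTEXT_INIT_BIT so the context
image is written exactly once, on the first pin. A rough sketch of the
pre-pin hook (condensed from the diff below; the map type is
simplified to I915_MAP_WB, and pinning is already serialised, hence
the non-atomic __test_and_set_bit()):

static int pre_pin_sketch(struct intel_context *ce,
			  struct intel_engine_cs *engine, void **vaddr)
{
	*vaddr = i915_gem_object_pin_map(ce->state->obj, I915_MAP_WB);
	if (IS_ERR(*vaddr))
		return PTR_ERR(*vaddr);

	/* Only the first pinner populates the context image. */
	if (!__test_and_set_bit(CONTEXT_INIT_BIT, &ce->flags))
		populate_lr_context(ce, engine, *vaddr);

	return 0;
}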

diff --git a/drivers/gpu/drm/i915/gt/intel_context_types.h b/drivers/gpu/drm/i915/gt/intel_context_types.h
index 52fa9c132746..a593c98398a7 100644
--- a/drivers/gpu/drm/i915/gt/intel_context_types.h
+++ b/drivers/gpu/drm/i915/gt/intel_context_types.h
@@ -81,12 +81,13 @@ struct intel_context {
 	unsigned long flags;
 #define CONTEXT_BARRIER_BIT		0
 #define CONTEXT_ALLOC_BIT		1
-#define CONTEXT_VALID_BIT		2
-#define CONTEXT_CLOSED_BIT		3
-#define CONTEXT_USE_SEMAPHORES		4
-#define CONTEXT_BANNED			5
-#define CONTEXT_FORCE_SINGLE_SUBMISSION	6
-#define CONTEXT_NOPREEMPT		7
+#define CONTEXT_INIT_BIT		2
+#define CONTEXT_VALID_BIT		3
+#define CONTEXT_CLOSED_BIT		4
+#define CONTEXT_USE_SEMAPHORES		5
+#define CONTEXT_BANNED			6
+#define CONTEXT_FORCE_SINGLE_SUBMISSION	7
+#define CONTEXT_NOPREEMPT		8
 
 	u32 *lrc_reg_state;
 	union {
diff --git a/drivers/gpu/drm/i915/gt/intel_execlists_submission.c b/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
index 1cc93ea6b7f0..7eec42b27bc1 100644
--- a/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
+++ b/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
@@ -3497,9 +3497,39 @@ __execlists_update_reg_state(const struct intel_context *ce,
 	}
 }
 
+static void populate_lr_context(struct intel_context *ce,
+				struct intel_engine_cs *engine,
+				void *vaddr)
+{
+	bool inhibit = true;
+	struct drm_i915_gem_object *ctx_obj = ce->state->obj;
+
+	set_redzone(vaddr, engine);
+
+	if (engine->default_state) {
+		shmem_read(engine->default_state, 0,
+			   vaddr, engine->context_size);
+		__set_bit(CONTEXT_VALID_BIT, &ce->flags);
+		inhibit = false;
+	}
+
+	/* Clear the ppHWSP (inc. per-context counters) */
+	memset(vaddr, 0, PAGE_SIZE);
+
+	/*
+	 * The second page of the context object contains some registers which
+	 * must be set up prior to the first execution.
+	 */
+	execlists_init_reg_state(vaddr + LRC_STATE_OFFSET,
+				 ce, engine, ce->ring, inhibit);
+
+	__i915_gem_object_flush_map(ctx_obj, 0, engine->context_size);
+}
+
 static int
-execlists_context_pre_pin(struct intel_context *ce,
-			  struct i915_gem_ww_ctx *ww, void **vaddr)
+__execlists_context_pre_pin(struct intel_context *ce,
+			    struct intel_engine_cs *engine,
+			    struct i915_gem_ww_ctx *ww, void **vaddr)
 {
 	GEM_BUG_ON(!ce->state);
 	GEM_BUG_ON(!i915_vma_is_pinned(ce->state));
@@ -3507,8 +3537,20 @@ execlists_context_pre_pin(struct intel_context *ce,
 	*vaddr = i915_gem_object_pin_map(ce->state->obj,
 					i915_coherent_map_type(ce->engine->i915) |
 					I915_MAP_OVERRIDE);
+	if (IS_ERR(*vaddr))
+		return PTR_ERR(*vaddr);
+
+	if (!__test_and_set_bit(CONTEXT_INIT_BIT, &ce->flags))
+		populate_lr_context(ce, engine, *vaddr);
+
+	return 0;
+}
 
-	return PTR_ERR_OR_ZERO(*vaddr);
+static int
+execlists_context_pre_pin(struct intel_context *ce,
+			  struct i915_gem_ww_ctx *ww, void **vaddr)
+{
+	return __execlists_context_pre_pin(ce, ce->engine, ww, vaddr);
 }
 
 static int
@@ -4610,45 +4652,6 @@ static void execlists_init_reg_state(u32 *regs,
 	__reset_stop_ring(regs, engine);
 }
 
-static int
-populate_lr_context(struct intel_context *ce,
-		    struct drm_i915_gem_object *ctx_obj,
-		    struct intel_engine_cs *engine,
-		    struct intel_ring *ring)
-{
-	bool inhibit = true;
-	void *vaddr;
-
-	vaddr = i915_gem_object_pin_map(ctx_obj, I915_MAP_WB);
-	if (IS_ERR(vaddr)) {
-		drm_dbg(&engine->i915->drm, "Could not map object pages!\n");
-		return PTR_ERR(vaddr);
-	}
-
-	set_redzone(vaddr, engine);
-
-	if (engine->default_state) {
-		shmem_read(engine->default_state, 0,
-			   vaddr, engine->context_size);
-		__set_bit(CONTEXT_VALID_BIT, &ce->flags);
-		inhibit = false;
-	}
-
-	/* Clear the ppHWSP (inc. per-context counters) */
-	memset(vaddr, 0, PAGE_SIZE);
-
-	/*
-	 * The second page of the context object contains some registers which
-	 * must be set up prior to the first execution.
-	 */
-	execlists_init_reg_state(vaddr + LRC_STATE_OFFSET,
-				 ce, engine, ring, inhibit);
-
-	__i915_gem_object_flush_map(ctx_obj, 0, engine->context_size);
-	i915_gem_object_unpin_map(ctx_obj);
-	return 0;
-}
-
 static struct intel_timeline *pinned_timeline(struct intel_context *ce)
 {
 	struct intel_timeline *tl = fetch_and_zero(&ce->timeline);
@@ -4712,20 +4715,11 @@ static int __execlists_context_alloc(struct intel_context *ce,
 		goto error_deref_obj;
 	}
 
-	ret = populate_lr_context(ce, ctx_obj, engine, ring);
-	if (ret) {
-		drm_dbg(&engine->i915->drm,
-			"Failed to populate LRC: %d\n", ret);
-		goto error_ring_free;
-	}
-
 	ce->ring = ring;
 	ce->state = vma;
 
 	return 0;
 
-error_ring_free:
-	intel_ring_put(ring);
 error_deref_obj:
 	i915_gem_object_put(ctx_obj);
 	return ret;
@@ -4849,6 +4843,15 @@ static int virtual_context_alloc(struct intel_context *ce)
 	return __execlists_context_alloc(ce, ve->siblings[0]);
 }
 
+static int
+virtual_context_pre_pin(struct intel_context *ce,
+			  struct i915_gem_ww_ctx *ww, void **vaddr)
+{
+	struct virtual_engine *ve = container_of(ce, typeof(*ve), context);
+
+	return __execlists_context_pre_pin(ce, ve->siblings[0], ww, vaddr);
+}
+
 static int virtual_context_pin(struct intel_context *ce, void *vaddr)
 {
 	struct virtual_engine *ve = container_of(ce, typeof(*ve), context);
@@ -4882,7 +4885,7 @@ static void virtual_context_exit(struct intel_context *ce)
 static const struct intel_context_ops virtual_context_ops = {
 	.alloc = virtual_context_alloc,
 
-	.pre_pin = execlists_context_pre_pin,
+	.pre_pin = virtual_context_pre_pin,
 	.pin = virtual_context_pin,
 	.unpin = execlists_context_unpin,
 	.post_unpin = execlists_context_post_unpin,
-- 
2.26.2

* [RFC PATCH 028/162] drm/i915: Make ring submission compatible with obj->mm.lock removal, v2.
  2020-11-27 12:04 [RFC PATCH 000/162] DG1 + LMEM enabling Matthew Auld
                   ` (26 preceding siblings ...)
  2020-11-27 12:05 ` [RFC PATCH 027/162] drm/i915: Populate logical context during first pin Matthew Auld
@ 2020-11-27 12:05 ` Matthew Auld
  2020-11-27 12:05 ` [RFC PATCH 029/162] drm/i915: Handle ww locking in init_status_page Matthew Auld
                   ` (133 subsequent siblings)
  161 siblings, 0 replies; 208+ messages in thread
From: Matthew Auld @ 2020-11-27 12:05 UTC (permalink / raw)
  To: intel-gfx
  Cc: Thomas Hellström, kernel test robot, dri-devel, Dan Carpenter

From: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>

We map the initial context during first pin.

This allows us to remove pin_map from state allocation, which saves
us a few retry loops. We won't need the mapping until first pin
anyway.

intel_ring_submission_setup() is also reworked slightly to do all
pinning in a single ww loop.

Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Reported-by: kernel test robot <lkp@intel.com>
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
---
 .../gpu/drm/i915/gt/intel_ring_submission.c   | 184 +++++++++++-------
 1 file changed, 118 insertions(+), 66 deletions(-)
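
All of the pinning now sits inside one ww transaction, so a contended
lock anywhere in the sequence backs off and retries the whole thing
rather than deadlocking. The shape of that loop, stripped of the
engine-specific details (lock_and_pin_all is a hypothetical stand-in
for the locking and pinning done in the diff below):

	struct i915_gem_ww_ctx ww;
	int err;

	i915_gem_ww_ctx_init(&ww, false);	/* false: uninterruptible */
retry:
	err = lock_and_pin_all(engine, &ww);
	if (err == -EDEADLK) {
		/* Drop every lock taken so far, wait on the contended
		 * one, then restart the transaction from scratch. */
		err = i915_gem_ww_ctx_backoff(&ww);
		if (!err)
			goto retry;
	}
	i915_gem_ww_ctx_fini(&ww);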

diff --git a/drivers/gpu/drm/i915/gt/intel_ring_submission.c b/drivers/gpu/drm/i915/gt/intel_ring_submission.c
index a41b43f445b8..6b280904db43 100644
--- a/drivers/gpu/drm/i915/gt/intel_ring_submission.c
+++ b/drivers/gpu/drm/i915/gt/intel_ring_submission.c
@@ -478,6 +478,26 @@ static void ring_context_destroy(struct kref *ref)
 	intel_context_free(ce);
 }
 
+static int ring_context_init_default_state(struct intel_context *ce,
+					   struct i915_gem_ww_ctx *ww)
+{
+	struct drm_i915_gem_object *obj = ce->state->obj;
+	void *vaddr;
+
+	vaddr = i915_gem_object_pin_map(obj, I915_MAP_WB);
+	if (IS_ERR(vaddr))
+		return PTR_ERR(vaddr);
+
+	shmem_read(ce->engine->default_state, 0,
+		   vaddr, ce->engine->context_size);
+
+	i915_gem_object_flush_map(obj);
+	__i915_gem_object_release_map(obj);
+
+	__set_bit(CONTEXT_VALID_BIT, &ce->flags);
+	return 0;
+}
+
 static int ring_context_pre_pin(struct intel_context *ce,
 				struct i915_gem_ww_ctx *ww,
 				void **unused)
@@ -485,6 +505,13 @@ static int ring_context_pre_pin(struct intel_context *ce,
 	struct i915_address_space *vm;
 	int err = 0;
 
+	if (ce->engine->default_state &&
+	    !test_bit(CONTEXT_VALID_BIT, &ce->flags)) {
+		err = ring_context_init_default_state(ce, ww);
+		if (err)
+			return err;
+	}
+
 	vm = vm_alias(ce->vm);
 	if (vm)
 		err = gen6_ppgtt_pin(i915_vm_to_ppgtt((vm)), ww);
@@ -540,22 +567,6 @@ alloc_context_vma(struct intel_engine_cs *engine)
 	if (IS_IVYBRIDGE(i915))
 		i915_gem_object_set_cache_coherency(obj, I915_CACHE_L3_LLC);
 
-	if (engine->default_state) {
-		void *vaddr;
-
-		vaddr = i915_gem_object_pin_map(obj, I915_MAP_WB);
-		if (IS_ERR(vaddr)) {
-			err = PTR_ERR(vaddr);
-			goto err_obj;
-		}
-
-		shmem_read(engine->default_state, 0,
-			   vaddr, engine->context_size);
-
-		i915_gem_object_flush_map(obj);
-		__i915_gem_object_release_map(obj);
-	}
-
 	vma = i915_vma_instance(obj, &engine->gt->ggtt->vm, NULL);
 	if (IS_ERR(vma)) {
 		err = PTR_ERR(vma);
@@ -587,8 +598,6 @@ static int ring_context_alloc(struct intel_context *ce)
 			return PTR_ERR(vma);
 
 		ce->state = vma;
-		if (engine->default_state)
-			__set_bit(CONTEXT_VALID_BIT, &ce->flags);
 	}
 
 	return 0;
@@ -1184,37 +1193,15 @@ static int gen7_ctx_switch_bb_setup(struct intel_engine_cs * const engine,
 	return gen7_setup_clear_gpr_bb(engine, vma);
 }
 
-static int gen7_ctx_switch_bb_init(struct intel_engine_cs *engine)
+static int gen7_ctx_switch_bb_init(struct intel_engine_cs *engine,
+				   struct i915_gem_ww_ctx *ww,
+				   struct i915_vma *vma)
 {
-	struct drm_i915_gem_object *obj;
-	struct i915_vma *vma;
-	int size;
 	int err;
 
-	size = gen7_ctx_switch_bb_setup(engine, NULL /* probe size */);
-	if (size <= 0)
-		return size;
-
-	size = ALIGN(size, PAGE_SIZE);
-	obj = i915_gem_object_create_internal(engine->i915, size);
-	if (IS_ERR(obj))
-		return PTR_ERR(obj);
-
-	vma = i915_vma_instance(obj, engine->gt->vm, NULL);
-	if (IS_ERR(vma)) {
-		err = PTR_ERR(vma);
-		goto err_obj;
-	}
-
-	vma->private = intel_context_create(engine); /* dummy residuals */
-	if (IS_ERR(vma->private)) {
-		err = PTR_ERR(vma->private);
-		goto err_obj;
-	}
-
-	err = i915_vma_pin(vma, 0, 0, PIN_USER | PIN_HIGH);
+	err = i915_vma_pin_ww(vma, ww, 0, 0, PIN_USER | PIN_HIGH);
 	if (err)
-		goto err_private;
+		return err;
 
 	err = i915_vma_sync(vma);
 	if (err)
@@ -1229,17 +1216,53 @@ static int gen7_ctx_switch_bb_init(struct intel_engine_cs *engine)
 
 err_unpin:
 	i915_vma_unpin(vma);
-err_private:
-	intel_context_put(vma->private);
-err_obj:
-	i915_gem_object_put(obj);
 	return err;
 }
 
+static struct i915_vma *gen7_ctx_vma(struct intel_engine_cs *engine)
+{
+	struct drm_i915_gem_object *obj;
+	struct i915_vma *vma;
+	int size, err;
+
+	if (!IS_HASWELL(engine->i915) || engine->class != RENDER_CLASS)
+		return NULL;
+
+	err = gen7_ctx_switch_bb_setup(engine, NULL /* probe size */);
+	if (err < 0)
+		return ERR_PTR(err);
+	if (!err)
+		return NULL;
+
+	size = ALIGN(err, PAGE_SIZE);
+
+	obj = i915_gem_object_create_internal(engine->i915, size);
+	if (IS_ERR(obj))
+		return ERR_CAST(obj);
+
+	vma = i915_vma_instance(obj, engine->gt->vm, NULL);
+	if (IS_ERR(vma)) {
+		i915_gem_object_put(obj);
+		return ERR_CAST(vma);
+	}
+
+	vma->private = intel_context_create(engine); /* dummy residuals */
+	if (IS_ERR(vma->private)) {
+		err = PTR_ERR(vma->private);
+		vma->private = NULL;
+		i915_gem_object_put(obj);
+		return ERR_PTR(err);
+	}
+
+	return vma;
+}
+
 int intel_ring_submission_setup(struct intel_engine_cs *engine)
 {
+	struct i915_gem_ww_ctx ww;
 	struct intel_timeline *timeline;
 	struct intel_ring *ring;
+	struct i915_vma *gen7_wa_vma;
 	int err;
 
 	setup_common(engine);
@@ -1270,43 +1293,72 @@ int intel_ring_submission_setup(struct intel_engine_cs *engine)
 	}
 	GEM_BUG_ON(timeline->has_initial_breadcrumb);
 
-	err = intel_timeline_pin(timeline, NULL);
-	if (err)
-		goto err_timeline;
-
 	ring = intel_engine_create_ring(engine, SZ_16K);
 	if (IS_ERR(ring)) {
 		err = PTR_ERR(ring);
-		goto err_timeline_unpin;
+		goto err_timeline;
 	}
 
-	err = intel_ring_pin(ring, NULL);
-	if (err)
-		goto err_ring;
-
 	GEM_BUG_ON(engine->legacy.ring);
 	engine->legacy.ring = ring;
 	engine->legacy.timeline = timeline;
 
-	GEM_BUG_ON(timeline->hwsp_ggtt != engine->status_page.vma);
+	gen7_wa_vma = gen7_ctx_vma(engine);
+	if (IS_ERR(gen7_wa_vma)) {
+		err = PTR_ERR(gen7_wa_vma);
+		goto err_ring;
+	}
 
-	if (IS_HASWELL(engine->i915) && engine->class == RENDER_CLASS) {
-		err = gen7_ctx_switch_bb_init(engine);
+	i915_gem_ww_ctx_init(&ww, false);
+
+retry:
+	err = i915_gem_object_lock(timeline->hwsp_ggtt->obj, &ww);
+	if (!err && gen7_wa_vma)
+		err = i915_gem_object_lock(gen7_wa_vma->obj, &ww);
+	if (!err && engine->legacy.ring->vma->obj)
+		err = i915_gem_object_lock(engine->legacy.ring->vma->obj, &ww);
+	if (!err)
+		err = intel_timeline_pin(timeline, &ww);
+	if (!err) {
+		err = intel_ring_pin(ring, &ww);
 		if (err)
-			goto err_ring_unpin;
+			intel_timeline_unpin(timeline);
 	}
+	if (err)
+		goto out;
+
+	GEM_BUG_ON(timeline->hwsp_ggtt != engine->status_page.vma);
+
+	if (gen7_wa_vma) {
+		err = gen7_ctx_switch_bb_init(engine, &ww, gen7_wa_vma);
+		if (err) {
+			intel_ring_unpin(ring);
+			intel_timeline_unpin(timeline);
+		}
+	}
+
+out:
+	if (err == -EDEADLK) {
+		err = i915_gem_ww_ctx_backoff(&ww);
+		if (!err)
+			goto retry;
+	}
+	i915_gem_ww_ctx_fini(&ww);
+	if (err)
+		goto err_gen7_put;
 
 	/* Finally, take ownership and responsibility for cleanup! */
 	engine->release = ring_release;
 
 	return 0;
 
-err_ring_unpin:
-	intel_ring_unpin(ring);
+err_gen7_put:
+	if (gen7_wa_vma) {
+		intel_context_put(gen7_wa_vma->private);
+		i915_gem_object_put(gen7_wa_vma->obj);
+	}
 err_ring:
 	intel_ring_put(ring);
-err_timeline_unpin:
-	intel_timeline_unpin(timeline);
 err_timeline:
 	intel_timeline_put(timeline);
 err:
-- 
2.26.2

* [RFC PATCH 029/162] drm/i915: Handle ww locking in init_status_page
  2020-11-27 12:04 [RFC PATCH 000/162] DG1 + LMEM enabling Matthew Auld
                   ` (27 preceding siblings ...)
  2020-11-27 12:05 ` [RFC PATCH 028/162] drm/i915: Make ring submission compatible with obj->mm.lock removal, v2 Matthew Auld
@ 2020-11-27 12:05 ` Matthew Auld
  2020-11-27 12:05 ` [RFC PATCH 030/162] drm/i915: Rework clflush to work correctly without obj->mm.lock Matthew Auld
                   ` (132 subsequent siblings)
  161 siblings, 0 replies; 208+ messages in thread
From: Matthew Auld @ 2020-11-27 12:05 UTC (permalink / raw)
  To: intel-gfx; +Cc: dri-devel, Thomas Hellström

From: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>

Try to pin the status page to the ggtt first, and use a full ww loop
to handle eviction correctly.

Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
---
 drivers/gpu/drm/i915/gt/intel_engine_cs.c | 37 +++++++++++++++--------
 1 file changed, 24 insertions(+), 13 deletions(-)
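
Note the ordering change: the status page is now pinned into the ggtt
before it is mapped, all under a single object lock, so a failed pin
never leaves a dangling CPU mapping behind. In outline (variables as
in init_status_page() in the diff below; the HWS_NEEDS_PHYSICAL case
and error unwinding are trimmed):

	i915_gem_ww_ctx_init(&ww, true);	/* true: interruptible */
retry:
	ret = i915_gem_object_lock(obj, &ww);
	if (!ret)
		ret = pin_ggtt_status_page(engine, &ww, vma);
	if (!ret) {
		vaddr = i915_gem_object_pin_map(obj, I915_MAP_WB);
		if (IS_ERR(vaddr)) {
			ret = PTR_ERR(vaddr);
			i915_vma_unpin(vma);
		}
	}
	if (ret == -EDEADLK) {
		ret = i915_gem_ww_ctx_backoff(&ww);
		if (!ret)
			goto retry;
	}
	i915_gem_ww_ctx_fini(&ww);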

diff --git a/drivers/gpu/drm/i915/gt/intel_engine_cs.c b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
index 97ceaf7116e8..420c6a35f3ed 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine_cs.c
+++ b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
@@ -618,6 +618,7 @@ static void cleanup_status_page(struct intel_engine_cs *engine)
 }
 
 static int pin_ggtt_status_page(struct intel_engine_cs *engine,
+				struct i915_gem_ww_ctx *ww,
 				struct i915_vma *vma)
 {
 	unsigned int flags;
@@ -638,12 +639,13 @@ static int pin_ggtt_status_page(struct intel_engine_cs *engine,
 	else
 		flags = PIN_HIGH;
 
-	return i915_ggtt_pin(vma, NULL, 0, flags);
+	return i915_ggtt_pin(vma, ww, 0, flags);
 }
 
 static int init_status_page(struct intel_engine_cs *engine)
 {
 	struct drm_i915_gem_object *obj;
+	struct i915_gem_ww_ctx ww;
 	struct i915_vma *vma;
 	void *vaddr;
 	int ret;
@@ -667,30 +669,39 @@ static int init_status_page(struct intel_engine_cs *engine)
 	vma = i915_vma_instance(obj, &engine->gt->ggtt->vm, NULL);
 	if (IS_ERR(vma)) {
 		ret = PTR_ERR(vma);
-		goto err;
+		goto err_put;
 	}
 
+	i915_gem_ww_ctx_init(&ww, true);
+retry:
+	ret = i915_gem_object_lock(obj, &ww);
+	if (!ret && !HWS_NEEDS_PHYSICAL(engine->i915))
+		ret = pin_ggtt_status_page(engine, &ww, vma);
+	if (ret)
+		goto err;
+
 	vaddr = i915_gem_object_pin_map(obj, I915_MAP_WB);
 	if (IS_ERR(vaddr)) {
 		ret = PTR_ERR(vaddr);
-		goto err;
+		goto err_unpin;
 	}
 
 	engine->status_page.addr = memset(vaddr, 0, PAGE_SIZE);
 	engine->status_page.vma = vma;
 
-	if (!HWS_NEEDS_PHYSICAL(engine->i915)) {
-		ret = pin_ggtt_status_page(engine, vma);
-		if (ret)
-			goto err_unpin;
-	}
-
-	return 0;
-
 err_unpin:
-	i915_gem_object_unpin_map(obj);
+	if (ret)
+		i915_vma_unpin(vma);
 err:
-	i915_gem_object_put(obj);
+	if (ret == -EDEADLK) {
+		ret = i915_gem_ww_ctx_backoff(&ww);
+		if (!ret)
+			goto retry;
+	}
+	i915_gem_ww_ctx_fini(&ww);
+err_put:
+	if (ret)
+		i915_gem_object_put(obj);
 	return ret;
 }
 
-- 
2.26.2

* [RFC PATCH 030/162] drm/i915: Rework clflush to work correctly without obj->mm.lock.
  2020-11-27 12:04 [RFC PATCH 000/162] DG1 + LMEM enabling Matthew Auld
                   ` (28 preceding siblings ...)
  2020-11-27 12:05 ` [RFC PATCH 029/162] drm/i915: Handle ww locking in init_status_page Matthew Auld
@ 2020-11-27 12:05 ` Matthew Auld
  2020-11-27 12:05 ` [RFC PATCH 031/162] drm/i915: Pass ww ctx to intel_pin_to_display_plane Matthew Auld
                   ` (131 subsequent siblings)
  161 siblings, 0 replies; 208+ messages in thread
From: Matthew Auld @ 2020-11-27 12:05 UTC (permalink / raw)
  To: intel-gfx; +Cc: dri-devel, Thomas Hellström

From: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>

Pin the object's pages in the caller, not in the work itself. This
should also work better for dma-fence annotations.

Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
---
 drivers/gpu/drm/i915/gem/i915_gem_clflush.c | 15 +++++++--------
 1 file changed, 7 insertions(+), 8 deletions(-)
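
The pin now spans the lifetime of the fence work rather than the work
callback: pages are acquired in process context when the work is
created (where sleeping and allocating are fine) and released from the
fence's release hook, so the callback itself does nothing that is
unsafe in fence-signalling context. In outline (condensed from
clflush_work_create() and clflush_release() in the diff below):

	/* At creation, still in process context: */
	if (__i915_gem_object_get_pages(obj) < 0) {
		kfree(clflush);
		return NULL;	/* no async work is queued */
	}
	clflush->obj = i915_gem_object_get(obj);

	/* Later, from the dma_fence_work release callback: */
	i915_gem_object_unpin_pages(clflush->obj);
	i915_gem_object_put(clflush->obj);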

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_clflush.c b/drivers/gpu/drm/i915/gem/i915_gem_clflush.c
index bc0223716906..daf9284ef1f5 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_clflush.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_clflush.c
@@ -27,15 +27,8 @@ static void __do_clflush(struct drm_i915_gem_object *obj)
 static int clflush_work(struct dma_fence_work *base)
 {
 	struct clflush *clflush = container_of(base, typeof(*clflush), base);
-	struct drm_i915_gem_object *obj = clflush->obj;
-	int err;
 
-	err = i915_gem_object_pin_pages(obj);
-	if (err)
-		return err;
-
-	__do_clflush(obj);
-	i915_gem_object_unpin_pages(obj);
+	__do_clflush(clflush->obj);
 
 	return 0;
 }
@@ -44,6 +37,7 @@ static void clflush_release(struct dma_fence_work *base)
 {
 	struct clflush *clflush = container_of(base, typeof(*clflush), base);
 
+	i915_gem_object_unpin_pages(clflush->obj);
 	i915_gem_object_put(clflush->obj);
 }
 
@@ -63,6 +57,11 @@ static struct clflush *clflush_work_create(struct drm_i915_gem_object *obj)
 	if (!clflush)
 		return NULL;
 
+	if (__i915_gem_object_get_pages(obj) < 0) {
+		kfree(clflush);
+		return NULL;
+	}
+
 	dma_fence_work_init(&clflush->base, &clflush_ops);
 	clflush->obj = i915_gem_object_get(obj); /* obj <-> clflush cycle */
 
-- 
2.26.2

* [RFC PATCH 031/162] drm/i915: Pass ww ctx to intel_pin_to_display_plane
  2020-11-27 12:04 [RFC PATCH 000/162] DG1 + LMEM enabling Matthew Auld
                   ` (29 preceding siblings ...)
  2020-11-27 12:05 ` [RFC PATCH 030/162] drm/i915: Rework clflush to work correctly without obj->mm.lock Matthew Auld
@ 2020-11-27 12:05 ` Matthew Auld
  2020-11-27 12:05 ` [RFC PATCH 032/162] drm/i915: Add object locking to vm_fault_cpu Matthew Auld
                   ` (130 subsequent siblings)
  161 siblings, 0 replies; 208+ messages in thread
From: Matthew Auld @ 2020-11-27 12:05 UTC (permalink / raw)
  To: intel-gfx; +Cc: dri-devel, Thomas Hellström

From: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>

Instead of taking the object lock multiple times, lock the object
once and perform the ww dance around attach_phys and pin_pages.

Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
---
 drivers/gpu/drm/i915/display/intel_display.c  | 69 ++++++++++++-------
 drivers/gpu/drm/i915/display/intel_display.h  |  2 +-
 drivers/gpu/drm/i915/display/intel_fbdev.c    |  2 +-
 drivers/gpu/drm/i915/display/intel_overlay.c  | 34 +++++++--
 drivers/gpu/drm/i915/gem/i915_gem_domain.c    | 30 ++------
 drivers/gpu/drm/i915/gem/i915_gem_object.h    |  1 +
 drivers/gpu/drm/i915/gem/i915_gem_phys.c      | 10 +--
 .../drm/i915/gem/selftests/i915_gem_phys.c    |  2 +
 8 files changed, 86 insertions(+), 64 deletions(-)
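
The consolidated flow takes the object lock once and keeps it across
attach_phys, pin_pages and the display-plane pin, retrying the whole
sequence on ww contention. Roughly (condensed from
intel_pin_and_fence_fb_obj() in the diff below; fencing and error
unwinding are trimmed):

	i915_gem_ww_ctx_init(&ww, true);
retry:
	ret = i915_gem_object_lock(obj, &ww);
	if (!ret && phys_cursor)
		ret = i915_gem_object_attach_phys(obj, alignment);
	if (!ret)
		ret = i915_gem_object_pin_pages(obj);
	if (!ret) {
		vma = i915_gem_object_pin_to_display_plane(obj, &ww,
							   alignment, view,
							   pinctl);
		if (IS_ERR(vma))
			ret = PTR_ERR(vma);
	}
	if (ret == -EDEADLK) {
		ret = i915_gem_ww_ctx_backoff(&ww);
		if (!ret)
			goto retry;
	}
	i915_gem_ww_ctx_fini(&ww);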

diff --git a/drivers/gpu/drm/i915/display/intel_display.c b/drivers/gpu/drm/i915/display/intel_display.c
index f36921a3c4bc..8a7945f55278 100644
--- a/drivers/gpu/drm/i915/display/intel_display.c
+++ b/drivers/gpu/drm/i915/display/intel_display.c
@@ -2232,6 +2232,7 @@ static bool intel_plane_uses_fence(const struct intel_plane_state *plane_state)
 
 struct i915_vma *
 intel_pin_and_fence_fb_obj(struct drm_framebuffer *fb,
+			   bool phys_cursor,
 			   const struct i915_ggtt_view *view,
 			   bool uses_fence,
 			   unsigned long *out_flags)
@@ -2240,14 +2241,19 @@ intel_pin_and_fence_fb_obj(struct drm_framebuffer *fb,
 	struct drm_i915_private *dev_priv = to_i915(dev);
 	struct drm_i915_gem_object *obj = intel_fb_obj(fb);
 	intel_wakeref_t wakeref;
+	struct i915_gem_ww_ctx ww;
 	struct i915_vma *vma;
 	unsigned int pinctl;
 	u32 alignment;
+	int ret;
 
 	if (drm_WARN_ON(dev, !i915_gem_object_is_framebuffer(obj)))
 		return ERR_PTR(-EINVAL);
 
-	alignment = intel_surf_alignment(fb, 0);
+	if (phys_cursor)
+		alignment = intel_cursor_alignment(dev_priv);
+	else
+		alignment = intel_surf_alignment(fb, 0);
 	if (drm_WARN_ON(dev, alignment && !is_power_of_2(alignment)))
 		return ERR_PTR(-EINVAL);
 
@@ -2282,14 +2288,26 @@ intel_pin_and_fence_fb_obj(struct drm_framebuffer *fb,
 	if (HAS_GMCH(dev_priv))
 		pinctl |= PIN_MAPPABLE;
 
-	vma = i915_gem_object_pin_to_display_plane(obj,
-						   alignment, view, pinctl);
-	if (IS_ERR(vma))
+	i915_gem_ww_ctx_init(&ww, true);
+retry:
+	ret = i915_gem_object_lock(obj, &ww);
+	if (!ret && phys_cursor)
+		ret = i915_gem_object_attach_phys(obj, alignment);
+	if (!ret)
+		ret = i915_gem_object_pin_pages(obj);
+	if (ret)
 		goto err;
 
-	if (uses_fence && i915_vma_is_map_and_fenceable(vma)) {
-		int ret;
+	if (!ret) {
+		vma = i915_gem_object_pin_to_display_plane(obj, &ww, alignment,
+							   view, pinctl);
+		if (IS_ERR(vma)) {
+			ret = PTR_ERR(vma);
+			goto err_unpin;
+		}
+	}
 
+	if (uses_fence && i915_vma_is_map_and_fenceable(vma)) {
 		/*
 		 * Install a fence for tiled scan-out. Pre-i965 always needs a
 		 * fence, whereas 965+ only requires a fence if using
@@ -2310,16 +2328,28 @@ intel_pin_and_fence_fb_obj(struct drm_framebuffer *fb,
 		ret = i915_vma_pin_fence(vma);
 		if (ret != 0 && INTEL_GEN(dev_priv) < 4) {
 			i915_gem_object_unpin_from_display_plane(vma);
-			vma = ERR_PTR(ret);
-			goto err;
+			goto err_unpin;
 		}
+		ret = 0;
 
-		if (ret == 0 && vma->fence)
+		if (vma->fence)
 			*out_flags |= PLANE_HAS_FENCE;
 	}
 
 	i915_vma_get(vma);
+
+err_unpin:
+	i915_gem_object_unpin_pages(obj);
 err:
+	if (ret == -EDEADLK) {
+		ret = i915_gem_ww_ctx_backoff(&ww);
+		if (!ret)
+			goto retry;
+	}
+	i915_gem_ww_ctx_fini(&ww);
+	if (ret)
+		vma = ERR_PTR(ret);
+
 	atomic_dec(&dev_priv->gpu_error.pending_fb_pin);
 	intel_runtime_pm_put(&dev_priv->runtime_pm, wakeref);
 	return vma;
@@ -16626,19 +16656,11 @@ static int intel_plane_pin_fb(struct intel_plane_state *plane_state)
 	struct drm_i915_private *dev_priv = to_i915(plane->base.dev);
 	struct drm_framebuffer *fb = plane_state->hw.fb;
 	struct i915_vma *vma;
+	bool phys_cursor =
+		plane->id == PLANE_CURSOR &&
+		INTEL_INFO(dev_priv)->display.cursor_needs_physical;
 
-	if (plane->id == PLANE_CURSOR &&
-	    INTEL_INFO(dev_priv)->display.cursor_needs_physical) {
-		struct drm_i915_gem_object *obj = intel_fb_obj(fb);
-		const int align = intel_cursor_alignment(dev_priv);
-		int err;
-
-		err = i915_gem_object_attach_phys(obj, align);
-		if (err)
-			return err;
-	}
-
-	vma = intel_pin_and_fence_fb_obj(fb,
+	vma = intel_pin_and_fence_fb_obj(fb, phys_cursor,
 					 &plane_state->view,
 					 intel_plane_uses_fence(plane_state),
 					 &plane_state->flags);
@@ -16734,13 +16756,8 @@ intel_prepare_plane_fb(struct drm_plane *_plane,
 	if (!obj)
 		return 0;
 
-	ret = i915_gem_object_pin_pages(obj);
-	if (ret)
-		return ret;
 
 	ret = intel_plane_pin_fb(new_plane_state);
-
-	i915_gem_object_unpin_pages(obj);
 	if (ret)
 		return ret;
 
diff --git a/drivers/gpu/drm/i915/display/intel_display.h b/drivers/gpu/drm/i915/display/intel_display.h
index 5e0d42d82c11..5f5e632e216b 100644
--- a/drivers/gpu/drm/i915/display/intel_display.h
+++ b/drivers/gpu/drm/i915/display/intel_display.h
@@ -569,7 +569,7 @@ void intel_release_load_detect_pipe(struct drm_connector *connector,
 				    struct intel_load_detect_pipe *old,
 				    struct drm_modeset_acquire_ctx *ctx);
 struct i915_vma *
-intel_pin_and_fence_fb_obj(struct drm_framebuffer *fb,
+intel_pin_and_fence_fb_obj(struct drm_framebuffer *fb, bool phys_cursor,
 			   const struct i915_ggtt_view *view,
 			   bool uses_fence,
 			   unsigned long *out_flags);
diff --git a/drivers/gpu/drm/i915/display/intel_fbdev.c b/drivers/gpu/drm/i915/display/intel_fbdev.c
index 842c04e63214..bdf44e923cc0 100644
--- a/drivers/gpu/drm/i915/display/intel_fbdev.c
+++ b/drivers/gpu/drm/i915/display/intel_fbdev.c
@@ -211,7 +211,7 @@ static int intelfb_create(struct drm_fb_helper *helper,
 	 * This also validates that any existing fb inherited from the
 	 * BIOS is suitable for own access.
 	 */
-	vma = intel_pin_and_fence_fb_obj(&ifbdev->fb->base,
+	vma = intel_pin_and_fence_fb_obj(&ifbdev->fb->base, false,
 					 &view, false, &flags);
 	if (IS_ERR(vma)) {
 		ret = PTR_ERR(vma);
diff --git a/drivers/gpu/drm/i915/display/intel_overlay.c b/drivers/gpu/drm/i915/display/intel_overlay.c
index 52b4f6193b4c..9cf634cc7084 100644
--- a/drivers/gpu/drm/i915/display/intel_overlay.c
+++ b/drivers/gpu/drm/i915/display/intel_overlay.c
@@ -755,6 +755,32 @@ static u32 overlay_cmd_reg(struct drm_intel_overlay_put_image *params)
 	return cmd;
 }
 
+static struct i915_vma *intel_overlay_pin_fb(struct drm_i915_gem_object *new_bo)
+{
+	struct i915_gem_ww_ctx ww;
+	struct i915_vma *vma;
+	int ret;
+
+	i915_gem_ww_ctx_init(&ww, true);
+retry:
+	ret = i915_gem_object_lock(new_bo, &ww);
+	if (!ret) {
+		vma = i915_gem_object_pin_to_display_plane(new_bo, &ww, 0,
+							   NULL, PIN_MAPPABLE);
+		ret = PTR_ERR_OR_ZERO(vma);
+	}
+	if (ret == -EDEADLK) {
+		ret = i915_gem_ww_ctx_backoff(&ww);
+		if (!ret)
+			goto retry;
+	}
+	i915_gem_ww_ctx_fini(&ww);
+	if (ret)
+		return ERR_PTR(ret);
+
+	return vma;
+}
+
 static int intel_overlay_do_put_image(struct intel_overlay *overlay,
 				      struct drm_i915_gem_object *new_bo,
 				      struct drm_intel_overlay_put_image *params)
@@ -776,12 +802,10 @@ static int intel_overlay_do_put_image(struct intel_overlay *overlay,
 
 	atomic_inc(&dev_priv->gpu_error.pending_fb_pin);
 
-	vma = i915_gem_object_pin_to_display_plane(new_bo,
-						   0, NULL, PIN_MAPPABLE);
-	if (IS_ERR(vma)) {
-		ret = PTR_ERR(vma);
+	vma = intel_overlay_pin_fb(new_bo);
+	if (IS_ERR(vma))
 		goto out_pin_section;
-	}
+
 	i915_gem_object_flush_frontbuffer(new_bo, ORIGIN_DIRTYFB);
 
 	if (!overlay->active) {
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_domain.c b/drivers/gpu/drm/i915/gem/i915_gem_domain.c
index c1d4bf62b3ea..51a33c4f61d0 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_domain.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_domain.c
@@ -313,12 +313,12 @@ int i915_gem_set_caching_ioctl(struct drm_device *dev, void *data,
  */
 struct i915_vma *
 i915_gem_object_pin_to_display_plane(struct drm_i915_gem_object *obj,
+				     struct i915_gem_ww_ctx *ww,
 				     u32 alignment,
 				     const struct i915_ggtt_view *view,
 				     unsigned int flags)
 {
 	struct drm_i915_private *i915 = to_i915(obj->base.dev);
-	struct i915_gem_ww_ctx ww;
 	struct i915_vma *vma;
 	int ret;
 
@@ -326,11 +326,6 @@ i915_gem_object_pin_to_display_plane(struct drm_i915_gem_object *obj,
 	if (HAS_LMEM(i915) && !i915_gem_object_is_lmem(obj))
 		return ERR_PTR(-EINVAL);
 
-	i915_gem_ww_ctx_init(&ww, true);
-retry:
-	ret = i915_gem_object_lock(obj, &ww);
-	if (ret)
-		goto err;
 	/*
 	 * The display engine is not coherent with the LLC cache on gen6.  As
 	 * a result, we make sure that the pinning that is about to occur is
@@ -345,7 +340,7 @@ i915_gem_object_pin_to_display_plane(struct drm_i915_gem_object *obj,
 					      HAS_WT(i915) ?
 					      I915_CACHE_WT : I915_CACHE_NONE);
 	if (ret)
-		goto err;
+		return ERR_PTR(ret);
 
 	/*
 	 * As the user may map the buffer once pinned in the display plane
@@ -358,32 +353,19 @@ i915_gem_object_pin_to_display_plane(struct drm_i915_gem_object *obj,
 	vma = ERR_PTR(-ENOSPC);
 	if ((flags & PIN_MAPPABLE) == 0 &&
 	    (!view || view->type == I915_GGTT_VIEW_NORMAL))
-		vma = i915_gem_object_ggtt_pin_ww(obj, &ww, view, 0, alignment,
+		vma = i915_gem_object_ggtt_pin_ww(obj, ww, view, 0, alignment,
 						  flags | PIN_MAPPABLE |
 						  PIN_NONBLOCK);
 	if (IS_ERR(vma) && vma != ERR_PTR(-EDEADLK))
-		vma = i915_gem_object_ggtt_pin_ww(obj, &ww, view, 0,
+		vma = i915_gem_object_ggtt_pin_ww(obj, ww, view, 0,
 						  alignment, flags);
-	if (IS_ERR(vma)) {
-		ret = PTR_ERR(vma);
-		goto err;
-	}
+	if (IS_ERR(vma))
+		return vma;
 
 	vma->display_alignment = max_t(u64, vma->display_alignment, alignment);
 
 	i915_gem_object_flush_if_display_locked(obj);
 
-err:
-	if (ret == -EDEADLK) {
-		ret = i915_gem_ww_ctx_backoff(&ww);
-		if (!ret)
-			goto retry;
-	}
-	i915_gem_ww_ctx_fini(&ww);
-
-	if (ret)
-		return ERR_PTR(ret);
-
 	return vma;
 }
 
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object.h b/drivers/gpu/drm/i915/gem/i915_gem_object.h
index 1b85f51c6ddd..0fec91ad6f62 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_object.h
+++ b/drivers/gpu/drm/i915/gem/i915_gem_object.h
@@ -489,6 +489,7 @@ int __must_check
 i915_gem_object_set_to_cpu_domain(struct drm_i915_gem_object *obj, bool write);
 struct i915_vma * __must_check
 i915_gem_object_pin_to_display_plane(struct drm_i915_gem_object *obj,
+				     struct i915_gem_ww_ctx *ww,
 				     u32 alignment,
 				     const struct i915_ggtt_view *view,
 				     unsigned int flags);
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_phys.c b/drivers/gpu/drm/i915/gem/i915_gem_phys.c
index 0d176bf06405..f317be5f5e34 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_phys.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_phys.c
@@ -219,6 +219,8 @@ int i915_gem_object_attach_phys(struct drm_i915_gem_object *obj, int align)
 {
 	int err;
 
+	assert_object_held(obj);
+
 	if (align > obj->base.size)
 		return -EINVAL;
 
@@ -232,13 +234,9 @@ int i915_gem_object_attach_phys(struct drm_i915_gem_object *obj, int align)
 	if (err)
 		return err;
 
-	err = i915_gem_object_lock_interruptible(obj, NULL);
-	if (err)
-		return err;
-
 	err = mutex_lock_interruptible(&obj->mm.lock);
 	if (err)
-		goto err_unlock;
+		return err;
 
 	if (unlikely(!i915_gem_object_has_struct_page(obj)))
 		goto out;
@@ -269,8 +267,6 @@ int i915_gem_object_attach_phys(struct drm_i915_gem_object *obj, int align)
 
 out:
 	mutex_unlock(&obj->mm.lock);
-err_unlock:
-	i915_gem_object_unlock(obj);
 	return err;
 }
 
diff --git a/drivers/gpu/drm/i915/gem/selftests/i915_gem_phys.c b/drivers/gpu/drm/i915/gem/selftests/i915_gem_phys.c
index 0cfa082047fe..3a6ce87f8b52 100644
--- a/drivers/gpu/drm/i915/gem/selftests/i915_gem_phys.c
+++ b/drivers/gpu/drm/i915/gem/selftests/i915_gem_phys.c
@@ -31,7 +31,9 @@ static int mock_phys_object(void *arg)
 		goto out_obj;
 	}
 
+	i915_gem_object_lock(obj, NULL);
 	err = i915_gem_object_attach_phys(obj, PAGE_SIZE);
+	i915_gem_object_unlock(obj);
 	if (err) {
 		pr_err("i915_gem_object_attach_phys failed, err=%d\n", err);
 		goto out_obj;
-- 
2.26.2


* [RFC PATCH 032/162] drm/i915: Add object locking to vm_fault_cpu
  2020-11-27 12:04 [RFC PATCH 000/162] DG1 + LMEM enabling Matthew Auld
                   ` (30 preceding siblings ...)
  2020-11-27 12:05 ` [RFC PATCH 031/162] drm/i915: Pass ww ctx to intel_pin_to_display_plane Matthew Auld
@ 2020-11-27 12:05 ` Matthew Auld
  2020-11-27 12:05 ` [RFC PATCH 033/162] drm/i915: Move pinning to inside engine_wa_list_verify() Matthew Auld
                   ` (129 subsequent siblings)
  161 siblings, 0 replies; 208+ messages in thread
From: Matthew Auld @ 2020-11-27 12:05 UTC (permalink / raw)
  To: intel-gfx; +Cc: dri-devel, Thomas Hellström

From: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>

Take the object lock so we hold the ww lock around (un)pin_pages as
needed.

Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
---
 drivers/gpu/drm/i915/gem/i915_gem_mman.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_mman.c b/drivers/gpu/drm/i915/gem/i915_gem_mman.c
index c0034d811e50..163208a6260d 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_mman.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_mman.c
@@ -246,6 +246,9 @@ static vm_fault_t vm_fault_cpu(struct vm_fault *vmf)
 		     area->vm_flags & VM_WRITE))
 		return VM_FAULT_SIGBUS;
 
+	if (i915_gem_object_lock_interruptible(obj, NULL))
+		return VM_FAULT_NOPAGE;
+
 	err = i915_gem_object_pin_pages(obj);
 	if (err)
 		goto out;
@@ -269,6 +272,7 @@ static vm_fault_t vm_fault_cpu(struct vm_fault *vmf)
 	i915_gem_object_unpin_pages(obj);
 
 out:
+	i915_gem_object_unlock(obj);
 	return i915_error_to_vmf_fault(err);
 }
 
-- 
2.26.2


* [RFC PATCH 033/162] drm/i915: Move pinning to inside engine_wa_list_verify()
  2020-11-27 12:04 [RFC PATCH 000/162] DG1 + LMEM enabling Matthew Auld
                   ` (31 preceding siblings ...)
  2020-11-27 12:05 ` [RFC PATCH 032/162] drm/i915: Add object locking to vm_fault_cpu Matthew Auld
@ 2020-11-27 12:05 ` Matthew Auld
  2020-11-27 12:05 ` [RFC PATCH 034/162] drm/i915: Take reservation lock around i915_vma_pin Matthew Auld
                   ` (128 subsequent siblings)
  161 siblings, 0 replies; 208+ messages in thread
From: Matthew Auld @ 2020-11-27 12:05 UTC (permalink / raw)
  To: intel-gfx; +Cc: dri-devel, Thomas Hellström

From: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>

This should be done as part of the ww loop, in order to remove an
i915_vma_pin() call that needs the ww lock held.

Now only the i915_ggtt_pin() callers remain.
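
A sketch of the resulting split, using only names from the hunks
below: create_scratch() only creates the vma, engine_wa_list_verify()
pins it under its existing ww context, and selftests that want the
old behaviour use a small pinned wrapper:

        vma = create_scratch(vm, count);        /* no longer pins */

        err = i915_vma_pin_ww(vma, &ww, 0, 0,   /* inside the ww loop */
                              i915_vma_is_ggtt(vma) ? PIN_GLOBAL : PIN_USER);

        scratch = create_scratch_pinned(ce->vm, count); /* selftests */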

Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
---
 drivers/gpu/drm/i915/gt/intel_workarounds.c   | 24 ++++++++----------
 .../gpu/drm/i915/gt/selftest_workarounds.c    | 25 ++++++++++++++++---
 2 files changed, 32 insertions(+), 17 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_workarounds.c b/drivers/gpu/drm/i915/gt/intel_workarounds.c
index a82554baa6ac..de50b7c47ea3 100644
--- a/drivers/gpu/drm/i915/gt/intel_workarounds.c
+++ b/drivers/gpu/drm/i915/gt/intel_workarounds.c
@@ -2073,7 +2073,6 @@ create_scratch(struct i915_address_space *vm, int count)
 	struct drm_i915_gem_object *obj;
 	struct i915_vma *vma;
 	unsigned int size;
-	int err;
 
 	size = round_up(count * sizeof(u32), PAGE_SIZE);
 	obj = i915_gem_object_create_internal(vm->i915, size);
@@ -2084,20 +2083,11 @@ create_scratch(struct i915_address_space *vm, int count)
 
 	vma = i915_vma_instance(obj, vm, NULL);
 	if (IS_ERR(vma)) {
-		err = PTR_ERR(vma);
-		goto err_obj;
+		i915_gem_object_put(obj);
+		return vma;
 	}
 
-	err = i915_vma_pin(vma, 0, 0,
-			   i915_vma_is_ggtt(vma) ? PIN_GLOBAL : PIN_USER);
-	if (err)
-		goto err_obj;
-
 	return vma;
-
-err_obj:
-	i915_gem_object_put(obj);
-	return ERR_PTR(err);
 }
 
 struct mcr_range {
@@ -2215,10 +2205,15 @@ static int engine_wa_list_verify(struct intel_context *ce,
 	if (err)
 		goto err_pm;
 
+	err = i915_vma_pin_ww(vma, &ww, 0, 0,
+			   i915_vma_is_ggtt(vma) ? PIN_GLOBAL : PIN_USER);
+	if (err)
+		goto err_unpin;
+
 	rq = i915_request_create(ce);
 	if (IS_ERR(rq)) {
 		err = PTR_ERR(rq);
-		goto err_unpin;
+		goto err_vma;
 	}
 
 	err = i915_request_await_object(rq, vma->obj, true);
@@ -2259,6 +2254,8 @@ static int engine_wa_list_verify(struct intel_context *ce,
 
 err_rq:
 	i915_request_put(rq);
+err_vma:
+	i915_vma_unpin(vma);
 err_unpin:
 	intel_context_unpin(ce);
 err_pm:
@@ -2269,7 +2266,6 @@ static int engine_wa_list_verify(struct intel_context *ce,
 	}
 	i915_gem_ww_ctx_fini(&ww);
 	intel_engine_pm_put(ce->engine);
-	i915_vma_unpin(vma);
 	i915_vma_put(vma);
 	return err;
 }
diff --git a/drivers/gpu/drm/i915/gt/selftest_workarounds.c b/drivers/gpu/drm/i915/gt/selftest_workarounds.c
index 61a0532d0f3d..810ab026a55e 100644
--- a/drivers/gpu/drm/i915/gt/selftest_workarounds.c
+++ b/drivers/gpu/drm/i915/gt/selftest_workarounds.c
@@ -386,6 +386,25 @@ static struct i915_vma *create_batch(struct i915_address_space *vm)
 	return ERR_PTR(err);
 }
 
+static struct i915_vma *
+create_scratch_pinned(struct i915_address_space *vm, int count)
+{
+	struct i915_vma *vma = create_scratch(vm, count);
+	int err;
+
+	if (IS_ERR(vma))
+		return vma;
+
+	err = i915_vma_pin(vma, 0, 0,
+			   i915_vma_is_ggtt(vma) ? PIN_GLOBAL : PIN_USER);
+	if (err) {
+		i915_vma_put(vma);
+		return ERR_PTR(err);
+	}
+
+	return vma;
+}
+
 static u32 reg_write(u32 old, u32 new, u32 rsvd)
 {
 	if (rsvd == 0x0000ffff) {
@@ -489,7 +508,7 @@ static int check_dirty_whitelist(struct intel_context *ce)
 	int err = 0, i, v;
 	u32 *cs, *results;
 
-	scratch = create_scratch(ce->vm, 2 * ARRAY_SIZE(values) + 1);
+	scratch = create_scratch_pinned(ce->vm, 2 * ARRAY_SIZE(values) + 1);
 	if (IS_ERR(scratch))
 		return PTR_ERR(scratch);
 
@@ -1043,7 +1062,7 @@ static int live_isolated_whitelist(void *arg)
 
 		vm = i915_gem_context_get_vm_rcu(c);
 
-		client[i].scratch[0] = create_scratch(vm, 1024);
+		client[i].scratch[0] = create_scratch_pinned(vm, 1024);
 		if (IS_ERR(client[i].scratch[0])) {
 			err = PTR_ERR(client[i].scratch[0]);
 			i915_vm_put(vm);
@@ -1051,7 +1070,7 @@ static int live_isolated_whitelist(void *arg)
 			goto err;
 		}
 
-		client[i].scratch[1] = create_scratch(vm, 1024);
+		client[i].scratch[1] = create_scratch_pinned(vm, 1024);
 		if (IS_ERR(client[i].scratch[1])) {
 			err = PTR_ERR(client[i].scratch[1]);
 			i915_vma_unpin_and_release(&client[i].scratch[0], 0);
-- 
2.26.2


* [RFC PATCH 034/162] drm/i915: Take reservation lock around i915_vma_pin.
  2020-11-27 12:04 [RFC PATCH 000/162] DG1 + LMEM enabling Matthew Auld
                   ` (32 preceding siblings ...)
  2020-11-27 12:05 ` [RFC PATCH 033/162] drm/i915: Move pinning to inside engine_wa_list_verify() Matthew Auld
@ 2020-11-27 12:05 ` Matthew Auld
  2020-11-27 12:05 ` [RFC PATCH 035/162] drm/i915: Make intel_init_workaround_bb more compatible with ww locking Matthew Auld
                   ` (127 subsequent siblings)
  161 siblings, 0 replies; 208+ messages in thread
From: Matthew Auld @ 2020-11-27 12:05 UTC (permalink / raw)
  To: intel-gfx; +Cc: dri-devel, Thomas Hellström

From: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>

We previously complained when ww == NULL.

This function is now only used by selftests to pin an object, and ww
locking is now fixed, so i915_vma_pin() can take the reservation lock
itself.

Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
---
 .../i915/gem/selftests/i915_gem_coherency.c   | 14 +++++--------
 drivers/gpu/drm/i915/i915_gem.c               |  6 +++++-
 drivers/gpu/drm/i915/i915_vma.c               |  3 +--
 drivers/gpu/drm/i915/i915_vma.h               | 20 +++++++++++++++----
 4 files changed, 27 insertions(+), 16 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/selftests/i915_gem_coherency.c b/drivers/gpu/drm/i915/gem/selftests/i915_gem_coherency.c
index 7049a6bbc03d..2e439bb269d6 100644
--- a/drivers/gpu/drm/i915/gem/selftests/i915_gem_coherency.c
+++ b/drivers/gpu/drm/i915/gem/selftests/i915_gem_coherency.c
@@ -199,16 +199,14 @@ static int gpu_set(struct context *ctx, unsigned long offset, u32 v)
 	u32 *cs;
 	int err;
 
+	vma = i915_gem_object_ggtt_pin(ctx->obj, NULL, 0, 0, 0);
+	if (IS_ERR(vma))
+		return PTR_ERR(vma);
+
 	i915_gem_object_lock(ctx->obj, NULL);
 	err = i915_gem_object_set_to_gtt_domain(ctx->obj, true);
 	if (err)
-		goto out_unlock;
-
-	vma = i915_gem_object_ggtt_pin(ctx->obj, NULL, 0, 0, 0);
-	if (IS_ERR(vma)) {
-		err = PTR_ERR(vma);
-		goto out_unlock;
-	}
+		goto out_unpin;
 
 	rq = intel_engine_create_kernel_request(ctx->engine);
 	if (IS_ERR(rq)) {
@@ -248,9 +246,7 @@ static int gpu_set(struct context *ctx, unsigned long offset, u32 v)
 	i915_request_add(rq);
 out_unpin:
 	i915_vma_unpin(vma);
-out_unlock:
 	i915_gem_object_unlock(ctx->obj);
-
 	return err;
 }
 
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 0b9eab66511c..b5311f7ad870 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -1011,7 +1011,11 @@ i915_gem_object_ggtt_pin_ww(struct drm_i915_gem_object *obj,
 			return ERR_PTR(ret);
 	}
 
-	ret = i915_vma_pin_ww(vma, ww, size, alignment, flags | PIN_GLOBAL);
+	if (ww)
+		ret = i915_vma_pin_ww(vma, ww, size, alignment, flags | PIN_GLOBAL);
+	else
+		ret = i915_vma_pin(vma, size, alignment, flags | PIN_GLOBAL);
+
 	if (ret)
 		return ERR_PTR(ret);
 
diff --git a/drivers/gpu/drm/i915/i915_vma.c b/drivers/gpu/drm/i915/i915_vma.c
index 5b1d78fa748e..63bdb0cc981e 100644
--- a/drivers/gpu/drm/i915/i915_vma.c
+++ b/drivers/gpu/drm/i915/i915_vma.c
@@ -868,8 +868,7 @@ int i915_vma_pin_ww(struct i915_vma *vma, struct i915_gem_ww_ctx *ww,
 			vma->obj && i915_gem_object_has_pinned_pages(vma->obj) &&
 			!vma->vm->allocate_va_range;
 
-		if (lockdep_is_held(&vma->vm->i915->drm.struct_mutex) &&
-		    !pinned_bind_wo_alloc)
+		if (!pinned_bind_wo_alloc)
 			WARN_ON(!ww);
 		if (ww && vma->resv)
 			assert_vma_held(vma);
diff --git a/drivers/gpu/drm/i915/i915_vma.h b/drivers/gpu/drm/i915/i915_vma.h
index a2e7b58b70ca..2db4f25b8d5f 100644
--- a/drivers/gpu/drm/i915/i915_vma.h
+++ b/drivers/gpu/drm/i915/i915_vma.h
@@ -246,10 +246,22 @@ i915_vma_pin_ww(struct i915_vma *vma, struct i915_gem_ww_ctx *ww,
 static inline int __must_check
 i915_vma_pin(struct i915_vma *vma, u64 size, u64 alignment, u64 flags)
 {
-#ifdef CONFIG_LOCKDEP
-	WARN_ON_ONCE(vma->resv && dma_resv_held(vma->resv));
-#endif
-	return i915_vma_pin_ww(vma, NULL, size, alignment, flags);
+	struct i915_gem_ww_ctx ww;
+	int err;
+
+	i915_gem_ww_ctx_init(&ww, true);
+retry:
+	err = i915_gem_object_lock(vma->obj, &ww);
+	if (!err)
+		err = i915_vma_pin_ww(vma, &ww, size, alignment, flags);
+	if (err == -EDEADLK) {
+		err = i915_gem_ww_ctx_backoff(&ww);
+		if (!err)
+			goto retry;
+	}
+	i915_gem_ww_ctx_fini(&ww);
+
+	return err;
 }
 
 int i915_ggtt_pin(struct i915_vma *vma, struct i915_gem_ww_ctx *ww,
-- 
2.26.2


* [RFC PATCH 035/162] drm/i915: Make intel_init_workaround_bb more compatible with ww locking.
  2020-11-27 12:04 [RFC PATCH 000/162] DG1 + LMEM enabling Matthew Auld
                   ` (33 preceding siblings ...)
  2020-11-27 12:05 ` [RFC PATCH 034/162] drm/i915: Take reservation lock around i915_vma_pin Matthew Auld
@ 2020-11-27 12:05 ` Matthew Auld
  2020-11-27 12:05 ` [RFC PATCH 036/162] drm/i915: Make __engine_unpark() compatible with ww locking v2 Matthew Auld
                   ` (126 subsequent siblings)
  161 siblings, 0 replies; 208+ messages in thread
From: Matthew Auld @ 2020-11-27 12:05 UTC (permalink / raw)
  To: intel-gfx; +Cc: Daniele Ceraolo Spurio, dri-devel, Thomas Hellström

From: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>

Separate creation from pinning, so that the lock is taken only once
and the mapping is pinned with the lock held.
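
A sketch of the new flow, using only names from the hunks below:
lrc_init_wa_ctx() now only creates the vma, and the pin plus map
happen under a single ww context:

        ret = lrc_init_wa_ctx(engine);          /* create only */

        i915_gem_ww_ctx_init(&ww, true);
retry:
        ret = i915_gem_object_lock(wa_ctx->vma->obj, &ww);
        if (!ret)
                ret = i915_ggtt_pin(wa_ctx->vma, &ww, 0, PIN_HIGH);
        /* ... emit the wa batches with the lock held ... */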

Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
---
 .../drm/i915/gt/intel_engine_workaround_bb.c  | 45 +++++++++++++++----
 1 file changed, 37 insertions(+), 8 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_engine_workaround_bb.c b/drivers/gpu/drm/i915/gt/intel_engine_workaround_bb.c
index b03bdfc92bb2..f3636b73cc10 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine_workaround_bb.c
+++ b/drivers/gpu/drm/i915/gt/intel_engine_workaround_bb.c
@@ -229,7 +229,7 @@ gen10_init_indirectctx_bb(struct intel_engine_cs *engine, u32 *batch)
 
 #define CTX_WA_BB_OBJ_SIZE (PAGE_SIZE)
 
-static int lrc_setup_wa_ctx(struct intel_engine_cs *engine)
+static int lrc_init_wa_ctx(struct intel_engine_cs *engine)
 {
 	struct drm_i915_gem_object *obj;
 	struct i915_vma *vma;
@@ -245,10 +245,6 @@ static int lrc_setup_wa_ctx(struct intel_engine_cs *engine)
 		goto err;
 	}
 
-	err = i915_ggtt_pin(vma, NULL, 0, PIN_HIGH);
-	if (err)
-		goto err;
-
 	engine->wa_ctx.vma = vma;
 	return 0;
 
@@ -257,6 +253,18 @@ static int lrc_setup_wa_ctx(struct intel_engine_cs *engine)
 	return err;
 }
 
+static void lrc_destroy_wa_ctx(struct intel_engine_cs *engine, bool unpin)
+{
+	if (!engine->wa_ctx.vma)
+		return;
+
+	if (unpin)
+		i915_vma_unpin(engine->wa_ctx.vma);
+
+	i915_vma_put(engine->wa_ctx.vma);
+	engine->wa_ctx.vma = NULL;
+}
+
 typedef u32 *(*wa_bb_func_t)(struct intel_engine_cs *engine, u32 *batch);
 
 int intel_init_workaround_bb(struct intel_engine_cs *engine)
@@ -266,6 +274,7 @@ int intel_init_workaround_bb(struct intel_engine_cs *engine)
 					    &wa_ctx->per_ctx };
 	wa_bb_func_t wa_bb_fn[2];
 	void *batch, *batch_ptr;
+	struct i915_gem_ww_ctx ww;
 	unsigned int i;
 	int ret;
 
@@ -293,13 +302,21 @@ int intel_init_workaround_bb(struct intel_engine_cs *engine)
 		return 0;
 	}
 
-	ret = lrc_setup_wa_ctx(engine);
+	ret = lrc_init_wa_ctx(engine);
 	if (ret) {
 		drm_dbg(&engine->i915->drm,
 			"Failed to setup context WA page: %d\n", ret);
 		return ret;
 	}
 
+	i915_gem_ww_ctx_init(&ww, true);
+retry:
+	ret = i915_gem_object_lock(wa_ctx->vma->obj, &ww);
+	if (!ret)
+		ret = i915_ggtt_pin(wa_ctx->vma, &ww, 0, PIN_HIGH);
+	if (ret)
+		goto err;
+
 	batch = i915_gem_object_pin_map(wa_ctx->vma->obj, I915_MAP_WB);
 
 	/*
@@ -323,13 +340,25 @@ int intel_init_workaround_bb(struct intel_engine_cs *engine)
 
 	__i915_gem_object_flush_map(wa_ctx->vma->obj, 0, batch_ptr - batch);
 	__i915_gem_object_release_map(wa_ctx->vma->obj);
+
+	if (ret)
+		i915_vma_unpin(wa_ctx->vma);
+
+err:
+	if (ret == -EDEADLK) {
+		ret = i915_gem_ww_ctx_backoff(&ww);
+		if (!ret)
+			goto retry;
+	}
+	i915_gem_ww_ctx_fini(&ww);
 	if (ret)
-		intel_fini_workaround_bb(engine);
+		lrc_destroy_wa_ctx(engine, false);
 
 	return ret;
 }
 
+
 void intel_fini_workaround_bb(struct intel_engine_cs *engine)
 {
-	i915_vma_unpin_and_release(&engine->wa_ctx.vma, 0);
+	lrc_destroy_wa_ctx(engine, true);
 }
-- 
2.26.2


* [RFC PATCH 036/162] drm/i915: Make __engine_unpark() compatible with ww locking v2
  2020-11-27 12:04 [RFC PATCH 000/162] DG1 + LMEM enabling Matthew Auld
                   ` (34 preceding siblings ...)
  2020-11-27 12:05 ` [RFC PATCH 035/162] drm/i915: Make intel_init_workaround_bb more compatible with ww locking Matthew Auld
@ 2020-11-27 12:05 ` Matthew Auld
  2020-11-27 12:05 ` [RFC PATCH 037/162] drm/i915: Take obj lock around set_domain ioctl Matthew Auld
                   ` (125 subsequent siblings)
  161 siblings, 0 replies; 208+ messages in thread
From: Matthew Auld @ 2020-11-27 12:05 UTC (permalink / raw)
  To: intel-gfx; +Cc: dri-devel, Thomas Hellström

From: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>

Take the ww lock around engine_unpark. Because rpm is used in so many
places, I chose the safest option and used a trylock to
opportunistically take this lock for __engine_unpark.
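
The opportunistic pattern, distilled from the hunk below: if the
trylock fails we simply skip the debug poisoning instead of blocking
in an rpm path:

        if (!i915_gem_object_trylock(ce->state->obj))
                return; /* best effort, skip the poisoning */

        map = i915_gem_object_pin_map(obj, type);
        if (!IS_ERR(map)) {
                /* ... poison, flush and unmap ... */
        }
        i915_gem_object_unlock(obj);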

Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
---
 drivers/gpu/drm/i915/gt/intel_engine_pm.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/drivers/gpu/drm/i915/gt/intel_engine_pm.c b/drivers/gpu/drm/i915/gt/intel_engine_pm.c
index 499b09cb4acf..5d51144ef074 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine_pm.c
+++ b/drivers/gpu/drm/i915/gt/intel_engine_pm.c
@@ -27,12 +27,16 @@ static void dbg_poison_ce(struct intel_context *ce)
 		int type = i915_coherent_map_type(ce->engine->i915);
 		void *map;
 
+		if (!i915_gem_object_trylock(ce->state->obj))
+			return;
+
 		map = i915_gem_object_pin_map(obj, type);
 		if (!IS_ERR(map)) {
 			memset(map, CONTEXT_REDZONE, obj->base.size);
 			i915_gem_object_flush_map(obj);
 			i915_gem_object_unpin_map(obj);
 		}
+		i915_gem_object_unlock(obj);
 	}
 }
 
-- 
2.26.2


* [RFC PATCH 037/162] drm/i915: Take obj lock around set_domain ioctl
  2020-11-27 12:04 [RFC PATCH 000/162] DG1 + LMEM enabling Matthew Auld
                   ` (35 preceding siblings ...)
  2020-11-27 12:05 ` [RFC PATCH 036/162] drm/i915: Make __engine_unpark() compatible with ww locking v2 Matthew Auld
@ 2020-11-27 12:05 ` Matthew Auld
  2020-11-27 12:05 ` [RFC PATCH 038/162] drm/i915: Defer pin calls in buffer pool until first use by caller Matthew Auld
                   ` (124 subsequent siblings)
  161 siblings, 0 replies; 208+ messages in thread
From: Matthew Auld @ 2020-11-27 12:05 UTC (permalink / raw)
  To: intel-gfx; +Cc: dri-devel, Thomas Hellström

From: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>

We need to lock the object to move it to the correct domain; add the
missing lock.
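
The resulting ordering in the ioctl, distilled from the hunks below
(error handling trimmed): lock first, then pin pages, change the
domain, and unwind in reverse:

        err = i915_gem_object_lock_interruptible(obj, NULL);
        err = i915_gem_object_pin_pages(obj);
        err = i915_gem_object_set_to_wc_domain(obj, write_domain);
        i915_gem_object_unpin_pages(obj);
        i915_gem_object_unlock(obj);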

Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
---
 drivers/gpu/drm/i915/gem/i915_gem_domain.c | 18 ++++++++++--------
 1 file changed, 10 insertions(+), 8 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_domain.c b/drivers/gpu/drm/i915/gem/i915_gem_domain.c
index 51a33c4f61d0..e62f9e8dd339 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_domain.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_domain.c
@@ -516,6 +516,10 @@ i915_gem_set_domain_ioctl(struct drm_device *dev, void *data,
 		goto out;
 	}
 
+	err = i915_gem_object_lock_interruptible(obj, NULL);
+	if (err)
+		goto out;
+
 	/*
 	 * Flush and acquire obj->pages so that we are coherent through
 	 * direct access in memory with previous cached writes through
@@ -527,7 +531,7 @@ i915_gem_set_domain_ioctl(struct drm_device *dev, void *data,
 	 */
 	err = i915_gem_object_pin_pages(obj);
 	if (err)
-		goto out;
+		goto out_unlock;
 
 	/*
 	 * Already in the desired write domain? Nothing for us to do!
@@ -542,10 +546,6 @@ i915_gem_set_domain_ioctl(struct drm_device *dev, void *data,
 	if (READ_ONCE(obj->write_domain) == read_domains)
 		goto out_unpin;
 
-	err = i915_gem_object_lock_interruptible(obj, NULL);
-	if (err)
-		goto out_unpin;
-
 	if (read_domains & I915_GEM_DOMAIN_WC)
 		err = i915_gem_object_set_to_wc_domain(obj, write_domain);
 	else if (read_domains & I915_GEM_DOMAIN_GTT)
@@ -556,13 +556,15 @@ i915_gem_set_domain_ioctl(struct drm_device *dev, void *data,
 	/* And bump the LRU for this access */
 	i915_gem_object_bump_inactive_ggtt(obj);
 
+out_unpin:
+	i915_gem_object_unpin_pages(obj);
+
+out_unlock:
 	i915_gem_object_unlock(obj);
 
-	if (write_domain)
+	if (!err && write_domain)
 		i915_gem_object_invalidate_frontbuffer(obj, ORIGIN_CPU);
 
-out_unpin:
-	i915_gem_object_unpin_pages(obj);
 out:
 	i915_gem_object_put(obj);
 	return err;
-- 
2.26.2


* [RFC PATCH 038/162] drm/i915: Defer pin calls in buffer pool until first use by caller.
  2020-11-27 12:04 [RFC PATCH 000/162] DG1 + LMEM enabling Matthew Auld
                   ` (36 preceding siblings ...)
  2020-11-27 12:05 ` [RFC PATCH 037/162] drm/i915: Take obj lock around set_domain ioctl Matthew Auld
@ 2020-11-27 12:05 ` Matthew Auld
  2020-11-27 12:05 ` [RFC PATCH 039/162] drm/i915: Fix pread/pwrite to work with new locking rules Matthew Auld
                   ` (123 subsequent siblings)
  161 siblings, 0 replies; 208+ messages in thread
From: Matthew Auld @ 2020-11-27 12:05 UTC (permalink / raw)
  To: intel-gfx; +Cc: dri-devel, Thomas Hellström

From: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>

We need to take the obj lock to pin pages, so wait until the callers
have done so before making the object unshrinkable.
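
Callers now follow the pattern below (a sketch using only the helpers
from this patch): with the object lock held and the pages pinned,
mark the node as used, which hides it from the shrinker until retire:

        /* caller holds the object lock and has pinned the pages */
        intel_gt_buffer_pool_mark_used(pool);

        /* mark_active now warns if mark_used was forgotten */
        err = intel_gt_buffer_pool_mark_active(pool, rq);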

Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
---
 .../gpu/drm/i915/gem/i915_gem_execbuffer.c    |  2 +
 .../gpu/drm/i915/gem/i915_gem_object_blt.c    |  6 +++
 .../gpu/drm/i915/gt/intel_gt_buffer_pool.c    | 47 +++++++++----------
 .../gpu/drm/i915/gt/intel_gt_buffer_pool.h    |  5 ++
 .../drm/i915/gt/intel_gt_buffer_pool_types.h  |  1 +
 5 files changed, 35 insertions(+), 26 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
index f5ea49e244ca..91f0c3fd9a4b 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
@@ -1343,6 +1343,7 @@ static int __reloc_gpu_alloc(struct i915_execbuffer *eb,
 		err = PTR_ERR(cmd);
 		goto err_pool;
 	}
+	intel_gt_buffer_pool_mark_used(pool);
 
 	batch = i915_vma_instance(pool->obj, vma->vm, NULL);
 	if (IS_ERR(batch)) {
@@ -2635,6 +2636,7 @@ static int eb_parse(struct i915_execbuffer *eb)
 		err = PTR_ERR(shadow);
 		goto err;
 	}
+	intel_gt_buffer_pool_mark_used(pool);
 	i915_gem_object_set_readonly(shadow->obj);
 	shadow->private = pool;
 
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object_blt.c b/drivers/gpu/drm/i915/gem/i915_gem_object_blt.c
index aee7ad3cc3c6..e0b873c3f46a 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_object_blt.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_object_blt.c
@@ -54,6 +54,9 @@ struct i915_vma *intel_emit_vma_fill_blt(struct intel_context *ce,
 	if (unlikely(err))
 		goto out_put;
 
+	/* we pinned the pool, mark it as such */
+	intel_gt_buffer_pool_mark_used(pool);
+
 	cmd = i915_gem_object_pin_map(pool->obj, I915_MAP_WC);
 	if (IS_ERR(cmd)) {
 		err = PTR_ERR(cmd);
@@ -276,6 +279,9 @@ struct i915_vma *intel_emit_vma_copy_blt(struct intel_context *ce,
 	if (unlikely(err))
 		goto out_put;
 
+	/* we pinned the pool, mark it as such */
+	intel_gt_buffer_pool_mark_used(pool);
+
 	cmd = i915_gem_object_pin_map(pool->obj, I915_MAP_WC);
 	if (IS_ERR(cmd)) {
 		err = PTR_ERR(cmd);
diff --git a/drivers/gpu/drm/i915/gt/intel_gt_buffer_pool.c b/drivers/gpu/drm/i915/gt/intel_gt_buffer_pool.c
index 104cb30e8c13..030759305196 100644
--- a/drivers/gpu/drm/i915/gt/intel_gt_buffer_pool.c
+++ b/drivers/gpu/drm/i915/gt/intel_gt_buffer_pool.c
@@ -98,28 +98,6 @@ static void pool_free_work(struct work_struct *wrk)
 				      round_jiffies_up_relative(HZ));
 }
 
-static int pool_active(struct i915_active *ref)
-{
-	struct intel_gt_buffer_pool_node *node =
-		container_of(ref, typeof(*node), active);
-	struct dma_resv *resv = node->obj->base.resv;
-	int err;
-
-	if (dma_resv_trylock(resv)) {
-		dma_resv_add_excl_fence(resv, NULL);
-		dma_resv_unlock(resv);
-	}
-
-	err = i915_gem_object_pin_pages(node->obj);
-	if (err)
-		return err;
-
-	/* Hide this pinned object from the shrinker until retired */
-	i915_gem_object_make_unshrinkable(node->obj);
-
-	return 0;
-}
-
 __i915_active_call
 static void pool_retire(struct i915_active *ref)
 {
@@ -129,10 +107,13 @@ static void pool_retire(struct i915_active *ref)
 	struct list_head *list = bucket_for_size(pool, node->obj->base.size);
 	unsigned long flags;
 
-	i915_gem_object_unpin_pages(node->obj);
+	if (node->pinned) {
+		i915_gem_object_unpin_pages(node->obj);
 
-	/* Return this object to the shrinker pool */
-	i915_gem_object_make_purgeable(node->obj);
+		/* Return this object to the shrinker pool */
+		i915_gem_object_make_purgeable(node->obj);
+		node->pinned = false;
+	}
 
 	GEM_BUG_ON(node->age);
 	spin_lock_irqsave(&pool->lock, flags);
@@ -144,6 +125,19 @@ static void pool_retire(struct i915_active *ref)
 			      round_jiffies_up_relative(HZ));
 }
 
+void intel_gt_buffer_pool_mark_used(struct intel_gt_buffer_pool_node *node)
+{
+	assert_object_held(node->obj);
+
+	if (node->pinned)
+		return;
+
+	__i915_gem_object_pin_pages(node->obj);
+	/* Hide this pinned object from the shrinker until retired */
+	i915_gem_object_make_unshrinkable(node->obj);
+	node->pinned = true;
+}
+
 static struct intel_gt_buffer_pool_node *
 node_create(struct intel_gt_buffer_pool *pool, size_t sz)
 {
@@ -158,7 +152,8 @@ node_create(struct intel_gt_buffer_pool *pool, size_t sz)
 
 	node->age = 0;
 	node->pool = pool;
-	i915_active_init(&node->active, pool_active, pool_retire);
+	node->pinned = false;
+	i915_active_init(&node->active, NULL, pool_retire);
 
 	obj = i915_gem_object_create_internal(gt->i915, sz);
 	if (IS_ERR(obj)) {
diff --git a/drivers/gpu/drm/i915/gt/intel_gt_buffer_pool.h b/drivers/gpu/drm/i915/gt/intel_gt_buffer_pool.h
index 42cbac003e8a..9878ce9a07ab 100644
--- a/drivers/gpu/drm/i915/gt/intel_gt_buffer_pool.h
+++ b/drivers/gpu/drm/i915/gt/intel_gt_buffer_pool.h
@@ -17,10 +17,15 @@ struct i915_request;
 struct intel_gt_buffer_pool_node *
 intel_gt_get_buffer_pool(struct intel_gt *gt, size_t size);
 
+void intel_gt_buffer_pool_mark_used(struct intel_gt_buffer_pool_node *node);
+
 static inline int
 intel_gt_buffer_pool_mark_active(struct intel_gt_buffer_pool_node *node,
 				 struct i915_request *rq)
 {
+	/* did we call mark_used? */
+	GEM_WARN_ON(!node->pinned);
+
 	return i915_active_add_request(&node->active, rq);
 }
 
diff --git a/drivers/gpu/drm/i915/gt/intel_gt_buffer_pool_types.h b/drivers/gpu/drm/i915/gt/intel_gt_buffer_pool_types.h
index bcf1658c9633..0401825e829d 100644
--- a/drivers/gpu/drm/i915/gt/intel_gt_buffer_pool_types.h
+++ b/drivers/gpu/drm/i915/gt/intel_gt_buffer_pool_types.h
@@ -31,6 +31,7 @@ struct intel_gt_buffer_pool_node {
 		struct rcu_head rcu;
 	};
 	unsigned long age;
+	bool pinned;
 };
 
 #endif /* INTEL_GT_BUFFER_POOL_TYPES_H */
-- 
2.26.2


* [RFC PATCH 039/162] drm/i915: Fix pread/pwrite to work with new locking rules.
  2020-11-27 12:04 [RFC PATCH 000/162] DG1 + LMEM enabling Matthew Auld
                   ` (37 preceding siblings ...)
  2020-11-27 12:05 ` [RFC PATCH 038/162] drm/i915: Defer pin calls in buffer pool until first use by caller Matthew Auld
@ 2020-11-27 12:05 ` Matthew Auld
  2020-11-27 12:05 ` [RFC PATCH 040/162] drm/i915: Fix workarounds selftest, part 1 Matthew Auld
                   ` (122 subsequent siblings)
  161 siblings, 0 replies; 208+ messages in thread
From: Matthew Auld @ 2020-11-27 12:05 UTC (permalink / raw)
  To: intel-gfx; +Cc: dri-devel, Thomas Hellström

From: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>

We are removing obj->mm.lock, and need to take the reservation lock
before we can pin pages. Move the page pinning into the helper, and
merge the gtt pwrite/pread preparation and cleanup paths.

The fence lock is also removed; it will conflict with fence annotations,
because of memory allocations done when pagefaulting inside copy_*_user.
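
Both fast paths now share the same shape, built from the two new
helpers below (error handling trimmed):

        vma = i915_gem_gtt_prepare(obj, &node, write);

        /* ... copy through the GGTT aperture ... */

        i915_gem_gtt_cleanup(obj, &node, vma);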

Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
---
 drivers/gpu/drm/i915/Makefile              |   1 -
 drivers/gpu/drm/i915/gem/i915_gem_fence.c  |  95 ---------
 drivers/gpu/drm/i915/gem/i915_gem_object.h |   5 -
 drivers/gpu/drm/i915/i915_gem.c            | 224 +++++++++++----------
 4 files changed, 114 insertions(+), 211 deletions(-)
 delete mode 100644 drivers/gpu/drm/i915/gem/i915_gem_fence.c

diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile
index 2445cc990e15..5112e5d79316 100644
--- a/drivers/gpu/drm/i915/Makefile
+++ b/drivers/gpu/drm/i915/Makefile
@@ -136,7 +136,6 @@ gem-y += \
 	gem/i915_gem_dmabuf.o \
 	gem/i915_gem_domain.o \
 	gem/i915_gem_execbuffer.o \
-	gem/i915_gem_fence.o \
 	gem/i915_gem_internal.o \
 	gem/i915_gem_object.o \
 	gem/i915_gem_object_blt.o \
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_fence.c b/drivers/gpu/drm/i915/gem/i915_gem_fence.c
deleted file mode 100644
index 8ab842c80f99..000000000000
--- a/drivers/gpu/drm/i915/gem/i915_gem_fence.c
+++ /dev/null
@@ -1,95 +0,0 @@
-/*
- * SPDX-License-Identifier: MIT
- *
- * Copyright © 2019 Intel Corporation
- */
-
-#include "i915_drv.h"
-#include "i915_gem_object.h"
-
-struct stub_fence {
-	struct dma_fence dma;
-	struct i915_sw_fence chain;
-};
-
-static int __i915_sw_fence_call
-stub_notify(struct i915_sw_fence *fence, enum i915_sw_fence_notify state)
-{
-	struct stub_fence *stub = container_of(fence, typeof(*stub), chain);
-
-	switch (state) {
-	case FENCE_COMPLETE:
-		dma_fence_signal(&stub->dma);
-		break;
-
-	case FENCE_FREE:
-		dma_fence_put(&stub->dma);
-		break;
-	}
-
-	return NOTIFY_DONE;
-}
-
-static const char *stub_driver_name(struct dma_fence *fence)
-{
-	return DRIVER_NAME;
-}
-
-static const char *stub_timeline_name(struct dma_fence *fence)
-{
-	return "object";
-}
-
-static void stub_release(struct dma_fence *fence)
-{
-	struct stub_fence *stub = container_of(fence, typeof(*stub), dma);
-
-	i915_sw_fence_fini(&stub->chain);
-
-	BUILD_BUG_ON(offsetof(typeof(*stub), dma));
-	dma_fence_free(&stub->dma);
-}
-
-static const struct dma_fence_ops stub_fence_ops = {
-	.get_driver_name = stub_driver_name,
-	.get_timeline_name = stub_timeline_name,
-	.release = stub_release,
-};
-
-struct dma_fence *
-i915_gem_object_lock_fence(struct drm_i915_gem_object *obj)
-{
-	struct stub_fence *stub;
-
-	assert_object_held(obj);
-
-	stub = kmalloc(sizeof(*stub), GFP_KERNEL);
-	if (!stub)
-		return NULL;
-
-	i915_sw_fence_init(&stub->chain, stub_notify);
-	dma_fence_init(&stub->dma, &stub_fence_ops, &stub->chain.wait.lock,
-		       0, 0);
-
-	if (i915_sw_fence_await_reservation(&stub->chain,
-					    obj->base.resv, NULL, true,
-					    i915_fence_timeout(to_i915(obj->base.dev)),
-					    I915_FENCE_GFP) < 0)
-		goto err;
-
-	dma_resv_add_excl_fence(obj->base.resv, &stub->dma);
-
-	return &stub->dma;
-
-err:
-	stub_release(&stub->dma);
-	return NULL;
-}
-
-void i915_gem_object_unlock_fence(struct drm_i915_gem_object *obj,
-				  struct dma_fence *fence)
-{
-	struct stub_fence *stub = container_of(fence, typeof(*stub), dma);
-
-	i915_sw_fence_commit(&stub->chain);
-}
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object.h b/drivers/gpu/drm/i915/gem/i915_gem_object.h
index 0fec91ad6f62..9a81a80ca849 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_object.h
+++ b/drivers/gpu/drm/i915/gem/i915_gem_object.h
@@ -180,11 +180,6 @@ static inline void i915_gem_object_unlock(struct drm_i915_gem_object *obj)
 	dma_resv_unlock(obj->base.resv);
 }
 
-struct dma_fence *
-i915_gem_object_lock_fence(struct drm_i915_gem_object *obj);
-void i915_gem_object_unlock_fence(struct drm_i915_gem_object *obj,
-				  struct dma_fence *fence);
-
 static inline void
 i915_gem_object_set_readonly(struct drm_i915_gem_object *obj)
 {
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index b5311f7ad870..b81fbd907775 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -306,7 +306,6 @@ i915_gem_shmem_pread(struct drm_i915_gem_object *obj,
 {
 	unsigned int needs_clflush;
 	unsigned int idx, offset;
-	struct dma_fence *fence;
 	char __user *user_data;
 	u64 remain;
 	int ret;
@@ -315,19 +314,17 @@ i915_gem_shmem_pread(struct drm_i915_gem_object *obj,
 	if (ret)
 		return ret;
 
+	ret = i915_gem_object_pin_pages(obj);
+	if (ret)
+		goto err_unlock;
+
 	ret = i915_gem_object_prepare_read(obj, &needs_clflush);
-	if (ret) {
-		i915_gem_object_unlock(obj);
-		return ret;
-	}
+	if (ret)
+		goto err_unpin;
 
-	fence = i915_gem_object_lock_fence(obj);
 	i915_gem_object_finish_access(obj);
 	i915_gem_object_unlock(obj);
 
-	if (!fence)
-		return -ENOMEM;
-
 	remain = args->size;
 	user_data = u64_to_user_ptr(args->data_ptr);
 	offset = offset_in_page(args->offset);
@@ -345,7 +342,13 @@ i915_gem_shmem_pread(struct drm_i915_gem_object *obj,
 		offset = 0;
 	}
 
-	i915_gem_object_unlock_fence(obj, fence);
+	i915_gem_object_unpin_pages(obj);
+	return ret;
+
+err_unpin:
+	i915_gem_object_unpin_pages(obj);
+err_unlock:
+	i915_gem_object_unlock(obj);
 	return ret;
 }
 
@@ -373,52 +376,102 @@ gtt_user_read(struct io_mapping *mapping,
 	return unwritten;
 }
 
-static int
-i915_gem_gtt_pread(struct drm_i915_gem_object *obj,
-		   const struct drm_i915_gem_pread *args)
+static struct i915_vma *i915_gem_gtt_prepare(struct drm_i915_gem_object *obj,
+					     struct drm_mm_node *node,
+					     bool write)
 {
 	struct drm_i915_private *i915 = to_i915(obj->base.dev);
 	struct i915_ggtt *ggtt = &i915->ggtt;
-	intel_wakeref_t wakeref;
-	struct drm_mm_node node;
-	struct dma_fence *fence;
-	void __user *user_data;
 	struct i915_vma *vma;
-	u64 remain, offset;
+	struct i915_gem_ww_ctx ww;
 	int ret;
 
-	wakeref = intel_runtime_pm_get(&i915->runtime_pm);
+	i915_gem_ww_ctx_init(&ww, true);
+retry:
 	vma = ERR_PTR(-ENODEV);
+	ret = i915_gem_object_lock(obj, &ww);
+	if (ret)
+		goto err_ww;
+
+	ret = i915_gem_object_set_to_gtt_domain(obj, write);
+	if (ret)
+		goto err_ww;
+
 	if (!i915_gem_object_is_tiled(obj))
-		vma = i915_gem_object_ggtt_pin(obj, NULL, 0, 0,
-					       PIN_MAPPABLE |
-					       PIN_NONBLOCK /* NOWARN */ |
-					       PIN_NOEVICT);
-	if (!IS_ERR(vma)) {
-		node.start = i915_ggtt_offset(vma);
-		node.flags = 0;
+		vma = i915_gem_object_ggtt_pin_ww(obj, &ww, NULL, 0, 0,
+						  PIN_MAPPABLE |
+						  PIN_NONBLOCK /* NOWARN */ |
+						  PIN_NOEVICT);
+	if (vma == ERR_PTR(-EDEADLK)) {
+		ret = -EDEADLK;
+		goto err_ww;
+	} else if (!IS_ERR(vma)) {
+		node->start = i915_ggtt_offset(vma);
+		node->flags = 0;
 	} else {
-		ret = insert_mappable_node(ggtt, &node, PAGE_SIZE);
+		ret = insert_mappable_node(ggtt, node, PAGE_SIZE);
 		if (ret)
-			goto out_rpm;
-		GEM_BUG_ON(!drm_mm_node_allocated(&node));
+			goto err_ww;
+		GEM_BUG_ON(!drm_mm_node_allocated(node));
+		vma = NULL;
 	}
 
-	ret = i915_gem_object_lock_interruptible(obj, NULL);
-	if (ret)
-		goto out_unpin;
-
-	ret = i915_gem_object_set_to_gtt_domain(obj, false);
+	ret = i915_gem_object_pin_pages(obj);
 	if (ret) {
-		i915_gem_object_unlock(obj);
-		goto out_unpin;
+		if (drm_mm_node_allocated(node)) {
+			ggtt->vm.clear_range(&ggtt->vm, node->start, node->size);
+			remove_mappable_node(ggtt, node);
+		} else {
+			i915_vma_unpin(vma);
+		}
 	}
 
-	fence = i915_gem_object_lock_fence(obj);
-	i915_gem_object_unlock(obj);
-	if (!fence) {
-		ret = -ENOMEM;
-		goto out_unpin;
+err_ww:
+	if (ret == -EDEADLK) {
+		ret = i915_gem_ww_ctx_backoff(&ww);
+		if (!ret)
+			goto retry;
+	}
+	i915_gem_ww_ctx_fini(&ww);
+
+	return ret ? ERR_PTR(ret) : vma;
+}
+
+static void i915_gem_gtt_cleanup(struct drm_i915_gem_object *obj,
+				 struct drm_mm_node *node,
+				 struct i915_vma *vma)
+{
+	struct drm_i915_private *i915 = to_i915(obj->base.dev);
+	struct i915_ggtt *ggtt = &i915->ggtt;
+
+	i915_gem_object_unpin_pages(obj);
+	if (drm_mm_node_allocated(node)) {
+		ggtt->vm.clear_range(&ggtt->vm, node->start, node->size);
+		remove_mappable_node(ggtt, node);
+	} else {
+		i915_vma_unpin(vma);
+	}
+}
+
+static int
+i915_gem_gtt_pread(struct drm_i915_gem_object *obj,
+		   const struct drm_i915_gem_pread *args)
+{
+	struct drm_i915_private *i915 = to_i915(obj->base.dev);
+	struct i915_ggtt *ggtt = &i915->ggtt;
+	intel_wakeref_t wakeref;
+	struct drm_mm_node node;
+	void __user *user_data;
+	struct i915_vma *vma;
+	u64 remain, offset;
+	int ret = 0;
+
+	wakeref = intel_runtime_pm_get(&i915->runtime_pm);
+
+	vma = i915_gem_gtt_prepare(obj, &node, false);
+	if (IS_ERR(vma)) {
+		ret = PTR_ERR(vma);
+		goto out_rpm;
 	}
 
 	user_data = u64_to_user_ptr(args->data_ptr);
@@ -455,14 +508,7 @@ i915_gem_gtt_pread(struct drm_i915_gem_object *obj,
 		offset += page_length;
 	}
 
-	i915_gem_object_unlock_fence(obj, fence);
-out_unpin:
-	if (drm_mm_node_allocated(&node)) {
-		ggtt->vm.clear_range(&ggtt->vm, node.start, node.size);
-		remove_mappable_node(ggtt, &node);
-	} else {
-		i915_vma_unpin(vma);
-	}
+	i915_gem_gtt_cleanup(obj, &node, vma);
 out_rpm:
 	intel_runtime_pm_put(&i915->runtime_pm, wakeref);
 	return ret;
@@ -515,15 +561,10 @@ i915_gem_pread_ioctl(struct drm_device *dev, void *data,
 	if (ret)
 		goto out;
 
-	ret = i915_gem_object_pin_pages(obj);
-	if (ret)
-		goto out;
-
 	ret = i915_gem_shmem_pread(obj, args);
 	if (ret == -EFAULT || ret == -ENODEV)
 		ret = i915_gem_gtt_pread(obj, args);
 
-	i915_gem_object_unpin_pages(obj);
 out:
 	i915_gem_object_put(obj);
 	return ret;
@@ -571,11 +612,10 @@ i915_gem_gtt_pwrite_fast(struct drm_i915_gem_object *obj,
 	struct intel_runtime_pm *rpm = &i915->runtime_pm;
 	intel_wakeref_t wakeref;
 	struct drm_mm_node node;
-	struct dma_fence *fence;
 	struct i915_vma *vma;
 	u64 remain, offset;
 	void __user *user_data;
-	int ret;
+	int ret = 0;
 
 	if (i915_gem_object_has_struct_page(obj)) {
 		/*
@@ -593,37 +633,10 @@ i915_gem_gtt_pwrite_fast(struct drm_i915_gem_object *obj,
 		wakeref = intel_runtime_pm_get(rpm);
 	}
 
-	vma = ERR_PTR(-ENODEV);
-	if (!i915_gem_object_is_tiled(obj))
-		vma = i915_gem_object_ggtt_pin(obj, NULL, 0, 0,
-					       PIN_MAPPABLE |
-					       PIN_NONBLOCK /* NOWARN */ |
-					       PIN_NOEVICT);
-	if (!IS_ERR(vma)) {
-		node.start = i915_ggtt_offset(vma);
-		node.flags = 0;
-	} else {
-		ret = insert_mappable_node(ggtt, &node, PAGE_SIZE);
-		if (ret)
-			goto out_rpm;
-		GEM_BUG_ON(!drm_mm_node_allocated(&node));
-	}
-
-	ret = i915_gem_object_lock_interruptible(obj, NULL);
-	if (ret)
-		goto out_unpin;
-
-	ret = i915_gem_object_set_to_gtt_domain(obj, true);
-	if (ret) {
-		i915_gem_object_unlock(obj);
-		goto out_unpin;
-	}
-
-	fence = i915_gem_object_lock_fence(obj);
-	i915_gem_object_unlock(obj);
-	if (!fence) {
-		ret = -ENOMEM;
-		goto out_unpin;
+	vma = i915_gem_gtt_prepare(obj, &node, true);
+	if (IS_ERR(vma)) {
+		ret = PTR_ERR(vma);
+		goto out_rpm;
 	}
 
 	i915_gem_object_invalidate_frontbuffer(obj, ORIGIN_CPU);
@@ -672,14 +685,7 @@ i915_gem_gtt_pwrite_fast(struct drm_i915_gem_object *obj,
 	intel_gt_flush_ggtt_writes(ggtt->vm.gt);
 	i915_gem_object_flush_frontbuffer(obj, ORIGIN_CPU);
 
-	i915_gem_object_unlock_fence(obj, fence);
-out_unpin:
-	if (drm_mm_node_allocated(&node)) {
-		ggtt->vm.clear_range(&ggtt->vm, node.start, node.size);
-		remove_mappable_node(ggtt, &node);
-	} else {
-		i915_vma_unpin(vma);
-	}
+	i915_gem_gtt_cleanup(obj, &node, vma);
 out_rpm:
 	intel_runtime_pm_put(rpm, wakeref);
 	return ret;
@@ -719,7 +725,6 @@ i915_gem_shmem_pwrite(struct drm_i915_gem_object *obj,
 	unsigned int partial_cacheline_write;
 	unsigned int needs_clflush;
 	unsigned int offset, idx;
-	struct dma_fence *fence;
 	void __user *user_data;
 	u64 remain;
 	int ret;
@@ -728,19 +733,17 @@ i915_gem_shmem_pwrite(struct drm_i915_gem_object *obj,
 	if (ret)
 		return ret;
 
+	ret = i915_gem_object_pin_pages(obj);
+	if (ret)
+		goto err_unlock;
+
 	ret = i915_gem_object_prepare_write(obj, &needs_clflush);
-	if (ret) {
-		i915_gem_object_unlock(obj);
-		return ret;
-	}
+	if (ret)
+		goto err_unpin;
 
-	fence = i915_gem_object_lock_fence(obj);
 	i915_gem_object_finish_access(obj);
 	i915_gem_object_unlock(obj);
 
-	if (!fence)
-		return -ENOMEM;
-
 	/* If we don't overwrite a cacheline completely we need to be
 	 * careful to have up-to-date data by first clflushing. Don't
 	 * overcomplicate things and flush the entire patch.
@@ -768,8 +771,14 @@ i915_gem_shmem_pwrite(struct drm_i915_gem_object *obj,
 	}
 
 	i915_gem_object_flush_frontbuffer(obj, ORIGIN_CPU);
-	i915_gem_object_unlock_fence(obj, fence);
 
+	i915_gem_object_unpin_pages(obj);
+	return ret;
+
+err_unpin:
+	i915_gem_object_unpin_pages(obj);
+err_unlock:
+	i915_gem_object_unlock(obj);
 	return ret;
 }
 
@@ -826,10 +835,6 @@ i915_gem_pwrite_ioctl(struct drm_device *dev, void *data,
 	if (ret)
 		goto err;
 
-	ret = i915_gem_object_pin_pages(obj);
-	if (ret)
-		goto err;
-
 	ret = -EFAULT;
 	/* We can only do the GTT pwrite on untiled buffers, as otherwise
 	 * it would end up going through the fenced access, and we'll get
@@ -850,7 +855,6 @@ i915_gem_pwrite_ioctl(struct drm_device *dev, void *data,
 			ret = i915_gem_shmem_pwrite(obj, args);
 	}
 
-	i915_gem_object_unpin_pages(obj);
 err:
 	i915_gem_object_put(obj);
 	return ret;
-- 
2.26.2


* [RFC PATCH 040/162] drm/i915: Fix workarounds selftest, part 1
  2020-11-27 12:04 [RFC PATCH 000/162] DG1 + LMEM enabling Matthew Auld
                   ` (38 preceding siblings ...)
  2020-11-27 12:05 ` [RFC PATCH 039/162] drm/i915: Fix pread/pwrite to work with new locking rules Matthew Auld
@ 2020-11-27 12:05 ` Matthew Auld
  2020-11-27 12:05 ` [RFC PATCH 041/162] drm/i915: Prepare for obj->mm.lock removal Matthew Auld
                   ` (121 subsequent siblings)
  161 siblings, 0 replies; 208+ messages in thread
From: Matthew Auld @ 2020-11-27 12:05 UTC (permalink / raw)
  To: intel-gfx; +Cc: dri-devel, Thomas Hellström

From: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>

pin_map needs the ww lock, so ensure we pin both the batch and the
scratch objects before submission.
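
The selftest now takes both objects under a single ww context before
creating the request, roughly (a sketch from the hunk below; the new
i915_gem_object_pin_map_unlocked() helper covers the one-off cases):

        i915_gem_ww_ctx_init(&ww, false);
retry:
        err = i915_gem_object_lock(scratch->obj, &ww);
        if (!err)
                err = i915_gem_object_lock(batch->obj, &ww);
        if (!err)
                err = intel_context_pin_ww(ce, &ww);

        cs = i915_gem_object_pin_map(batch->obj, I915_MAP_WC);
        results = i915_gem_object_pin_map(scratch->obj, I915_MAP_WB);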

Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
---
 drivers/gpu/drm/i915/gem/i915_gem_object.h    |  3 +
 drivers/gpu/drm/i915/gem/i915_gem_pages.c     | 12 +++
 .../gpu/drm/i915/gt/selftest_workarounds.c    | 76 ++++++++++++-------
 3 files changed, 64 insertions(+), 27 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object.h b/drivers/gpu/drm/i915/gem/i915_gem_object.h
index 9a81a80ca849..da7fd301fc8d 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_object.h
+++ b/drivers/gpu/drm/i915/gem/i915_gem_object.h
@@ -412,6 +412,9 @@ enum i915_map_type {
 void *__must_check i915_gem_object_pin_map(struct drm_i915_gem_object *obj,
 					   enum i915_map_type type);
 
+void *__must_check i915_gem_object_pin_map_unlocked(struct drm_i915_gem_object *obj,
+						    enum i915_map_type type);
+
 void __i915_gem_object_flush_map(struct drm_i915_gem_object *obj,
 				 unsigned long offset,
 				 unsigned long size);
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_pages.c b/drivers/gpu/drm/i915/gem/i915_gem_pages.c
index 5bcd21a8fc4e..b03e58106516 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_pages.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_pages.c
@@ -397,6 +397,18 @@ void *i915_gem_object_pin_map(struct drm_i915_gem_object *obj,
 	goto out_unlock;
 }
 
+void *i915_gem_object_pin_map_unlocked(struct drm_i915_gem_object *obj,
+				       enum i915_map_type type)
+{
+	void *ret;
+
+	i915_gem_object_lock(obj, NULL);
+	ret = i915_gem_object_pin_map(obj, type);
+	i915_gem_object_unlock(obj);
+
+	return ret;
+}
+
 void __i915_gem_object_flush_map(struct drm_i915_gem_object *obj,
 				 unsigned long offset,
 				 unsigned long size)
diff --git a/drivers/gpu/drm/i915/gt/selftest_workarounds.c b/drivers/gpu/drm/i915/gt/selftest_workarounds.c
index 810ab026a55e..69da2147ed3b 100644
--- a/drivers/gpu/drm/i915/gt/selftest_workarounds.c
+++ b/drivers/gpu/drm/i915/gt/selftest_workarounds.c
@@ -111,7 +111,7 @@ read_nonprivs(struct i915_gem_context *ctx, struct intel_engine_cs *engine)
 
 	i915_gem_object_set_cache_coherency(result, I915_CACHE_LLC);
 
-	cs = i915_gem_object_pin_map(result, I915_MAP_WB);
+	cs = i915_gem_object_pin_map_unlocked(result, I915_MAP_WB);
 	if (IS_ERR(cs)) {
 		err = PTR_ERR(cs);
 		goto err_obj;
@@ -217,7 +217,7 @@ static int check_whitelist(struct i915_gem_context *ctx,
 	i915_gem_object_lock(results, NULL);
 	intel_wedge_on_timeout(&wedge, engine->gt, HZ / 5) /* safety net! */
 		err = i915_gem_object_set_to_cpu_domain(results, false);
-	i915_gem_object_unlock(results);
+
 	if (intel_gt_is_wedged(engine->gt))
 		err = -EIO;
 	if (err)
@@ -245,6 +245,7 @@ static int check_whitelist(struct i915_gem_context *ctx,
 
 	i915_gem_object_unpin_map(results);
 out_put:
+	i915_gem_object_unlock(results);
 	i915_gem_object_put(results);
 	return err;
 }
@@ -520,6 +521,7 @@ static int check_dirty_whitelist(struct intel_context *ce)
 
 	for (i = 0; i < engine->whitelist.count; i++) {
 		u32 reg = i915_mmio_reg_offset(engine->whitelist.list[i].reg);
+		struct i915_gem_ww_ctx ww;
 		u64 addr = scratch->node.start;
 		struct i915_request *rq;
 		u32 srm, lrm, rsvd;
@@ -535,6 +537,29 @@ static int check_dirty_whitelist(struct intel_context *ce)
 
 		ro_reg = ro_register(reg);
 
+		i915_gem_ww_ctx_init(&ww, false);
+retry:
+		cs = NULL;
+		err = i915_gem_object_lock(scratch->obj, &ww);
+		if (!err)
+			err = i915_gem_object_lock(batch->obj, &ww);
+		if (!err)
+			err = intel_context_pin_ww(ce, &ww);
+		if (err)
+			goto out;
+
+		cs = i915_gem_object_pin_map(batch->obj, I915_MAP_WC);
+		if (IS_ERR(cs)) {
+			err = PTR_ERR(cs);
+			goto out_ctx;
+		}
+
+		results = i915_gem_object_pin_map(scratch->obj, I915_MAP_WB);
+		if (IS_ERR(results)) {
+			err = PTR_ERR(results);
+			goto out_unmap_batch;
+		}
+
 		/* Clear non priv flags */
 		reg &= RING_FORCE_TO_NONPRIV_ADDRESS_MASK;
 
@@ -546,12 +571,6 @@ static int check_dirty_whitelist(struct intel_context *ce)
 		pr_debug("%s: Writing garbage to %x\n",
 			 engine->name, reg);
 
-		cs = i915_gem_object_pin_map(batch->obj, I915_MAP_WC);
-		if (IS_ERR(cs)) {
-			err = PTR_ERR(cs);
-			goto out_batch;
-		}
-
 		/* SRM original */
 		*cs++ = srm;
 		*cs++ = reg;
@@ -598,11 +617,12 @@ static int check_dirty_whitelist(struct intel_context *ce)
 		i915_gem_object_flush_map(batch->obj);
 		i915_gem_object_unpin_map(batch->obj);
 		intel_gt_chipset_flush(engine->gt);
+		cs = NULL;
 
-		rq = intel_context_create_request(ce);
+		rq = i915_request_create(ce);
 		if (IS_ERR(rq)) {
 			err = PTR_ERR(rq);
-			goto out_batch;
+			goto out_unmap_scratch;
 		}
 
 		if (engine->emit_init_breadcrumb) { /* Be nice if we hang */
@@ -611,20 +631,16 @@ static int check_dirty_whitelist(struct intel_context *ce)
 				goto err_request;
 		}
 
-		i915_vma_lock(batch);
 		err = i915_request_await_object(rq, batch->obj, false);
 		if (err == 0)
 			err = i915_vma_move_to_active(batch, rq, 0);
-		i915_vma_unlock(batch);
 		if (err)
 			goto err_request;
 
-		i915_vma_lock(scratch);
 		err = i915_request_await_object(rq, scratch->obj, true);
 		if (err == 0)
 			err = i915_vma_move_to_active(scratch, rq,
 						      EXEC_OBJECT_WRITE);
-		i915_vma_unlock(scratch);
 		if (err)
 			goto err_request;
 
@@ -640,13 +656,7 @@ static int check_dirty_whitelist(struct intel_context *ce)
 			pr_err("%s: Futzing %x timedout; cancelling test\n",
 			       engine->name, reg);
 			intel_gt_set_wedged(engine->gt);
-			goto out_batch;
-		}
-
-		results = i915_gem_object_pin_map(scratch->obj, I915_MAP_WB);
-		if (IS_ERR(results)) {
-			err = PTR_ERR(results);
-			goto out_batch;
+			goto out_unmap_scratch;
 		}
 
 		GEM_BUG_ON(values[ARRAY_SIZE(values) - 1] != 0xffffffff);
@@ -657,7 +667,7 @@ static int check_dirty_whitelist(struct intel_context *ce)
 				pr_err("%s: Unable to write to whitelisted register %x\n",
 				       engine->name, reg);
 				err = -EINVAL;
-				goto out_unpin;
+				goto out_unmap_scratch;
 			}
 		} else {
 			rsvd = 0;
@@ -723,15 +733,27 @@ static int check_dirty_whitelist(struct intel_context *ce)
 
 			err = -EINVAL;
 		}
-out_unpin:
+out_unmap_scratch:
 		i915_gem_object_unpin_map(scratch->obj);
+out_unmap_batch:
+		if (cs)
+			i915_gem_object_unpin_map(batch->obj);
+out_ctx:
+		intel_context_unpin(ce);
+out:
+		if (err == -EDEADLK) {
+			err = i915_gem_ww_ctx_backoff(&ww);
+			if (!err)
+				goto retry;
+		}
+		i915_gem_ww_ctx_fini(&ww);
 		if (err)
 			break;
 	}
 
 	if (igt_flush_test(engine->i915))
 		err = -EIO;
-out_batch:
+
 	i915_vma_unpin_and_release(&batch, 0);
 out_scratch:
 	i915_vma_unpin_and_release(&scratch, 0);
@@ -868,7 +890,7 @@ static int scrub_whitelisted_registers(struct i915_gem_context *ctx,
 	if (IS_ERR(batch))
 		return PTR_ERR(batch);
 
-	cs = i915_gem_object_pin_map(batch->obj, I915_MAP_WC);
+	cs = i915_gem_object_pin_map_unlocked(batch->obj, I915_MAP_WC);
 	if (IS_ERR(cs)) {
 		err = PTR_ERR(cs);
 		goto err_batch;
@@ -1003,11 +1025,11 @@ check_whitelisted_registers(struct intel_engine_cs *engine,
 	u32 *a, *b;
 	int i, err;
 
-	a = i915_gem_object_pin_map(A->obj, I915_MAP_WB);
+	a = i915_gem_object_pin_map_unlocked(A->obj, I915_MAP_WB);
 	if (IS_ERR(a))
 		return PTR_ERR(a);
 
-	b = i915_gem_object_pin_map(B->obj, I915_MAP_WB);
+	b = i915_gem_object_pin_map_unlocked(B->obj, I915_MAP_WB);
 	if (IS_ERR(b)) {
 		err = PTR_ERR(b);
 		goto err_a;
-- 
2.26.2


* [RFC PATCH 041/162] drm/i915: Prepare for obj->mm.lock removal
  2020-11-27 12:04 [RFC PATCH 000/162] DG1 + LMEM enabling Matthew Auld
                   ` (39 preceding siblings ...)
  2020-11-27 12:05 ` [RFC PATCH 040/162] drm/i915: Fix workarounds selftest, part 1 Matthew Auld
@ 2020-11-27 12:05 ` Matthew Auld
  2020-11-27 12:05 ` [RFC PATCH 042/162] drm/i915: Add igt_spinner_pin() to allow for ww locking around spinner Matthew Auld
                   ` (120 subsequent siblings)
  161 siblings, 0 replies; 208+ messages in thread
From: Matthew Auld @ 2020-11-27 12:05 UTC (permalink / raw)
  To: intel-gfx; +Cc: Thomas Hellström, dri-devel

From: Thomas Hellström <thomas.hellstrom@intel.com>

Stolen objects need to take the object lock, and we may call put_pages
when the refcount drops to 0; ensure all calls are handled correctly.
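
As a rough sketch (not part of this patch; it matches the later
selftest conversions in this series), callers that explicitly drop
pages are then expected to do so under the object lock:

	/* Sketch only: drop the pages under the object lock. Once the
	 * refcount has hit 0 no other locker can exist, which is what
	 * assert_object_held_shared() below encodes.
	 */
	i915_gem_object_lock(obj, NULL);
	i915_gem_object_unpin_pages(obj);
	__i915_gem_object_put_pages(obj);
	i915_gem_object_unlock(obj);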

Idea-from: Thomas Hellström <thomas.hellstrom@intel.com>
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Signed-off-by: Thomas Hellström <thomas.hellstrom@intel.com>
---
 drivers/gpu/drm/i915/gem/i915_gem_object.h | 13 +++++++++++++
 drivers/gpu/drm/i915/gem/i915_gem_pages.c  | 14 ++++++++++++--
 drivers/gpu/drm/i915/gem/i915_gem_stolen.c | 10 +++++++++-
 3 files changed, 34 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object.h b/drivers/gpu/drm/i915/gem/i915_gem_object.h
index da7fd301fc8d..26ef37532f81 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_object.h
+++ b/drivers/gpu/drm/i915/gem/i915_gem_object.h
@@ -125,6 +125,19 @@ i915_gem_object_put(struct drm_i915_gem_object *obj)
 	  ((kref_read(&obj->base.refcount) == 1) &&		\
 	   list_empty_careful(&obj->mm.link) &&			\
 	   list_empty_careful(&obj->vma.list))))
+/*
+ * If more than one potential simultaneous locker, assert held.
+ */
+static inline void assert_object_held_shared(struct drm_i915_gem_object *obj)
+{
+	/*
+	 * Note mm list lookup is protected by
+	 * kref_get_unless_zero().
+	 */
+	if (IS_ENABLED(CONFIG_LOCKDEP) &&
+	    kref_read(&obj->base.refcount) > 0)
+		lockdep_assert_held(&obj->mm.lock);
+}
 
 static inline int __i915_gem_object_lock(struct drm_i915_gem_object *obj,
 					 struct i915_gem_ww_ctx *ww,
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_pages.c b/drivers/gpu/drm/i915/gem/i915_gem_pages.c
index b03e58106516..183aae046b68 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_pages.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_pages.c
@@ -18,7 +18,7 @@ void __i915_gem_object_set_pages(struct drm_i915_gem_object *obj,
 	unsigned long supported = INTEL_INFO(i915)->page_sizes;
 	int i;
 
-	lockdep_assert_held(&obj->mm.lock);
+	assert_object_held_shared(obj);
 
 	if (i915_gem_object_is_volatile(obj))
 		obj->mm.madv = I915_MADV_DONTNEED;
@@ -67,6 +67,7 @@ void __i915_gem_object_set_pages(struct drm_i915_gem_object *obj,
 		struct list_head *list;
 		unsigned long flags;
 
+		lockdep_assert_held(&obj->mm.lock);
 		spin_lock_irqsave(&i915->mm.obj_lock, flags);
 
 		i915->mm.shrink_count++;
@@ -88,6 +89,8 @@ int ____i915_gem_object_get_pages(struct drm_i915_gem_object *obj)
 	struct drm_i915_private *i915 = to_i915(obj->base.dev);
 	int err;
 
+	assert_object_held_shared(obj);
+
 	if (unlikely(obj->mm.madv != I915_MADV_WILLNEED)) {
 		drm_dbg(&i915->drm,
 			"Attempting to obtain a purgeable object\n");
@@ -115,6 +118,8 @@ int __i915_gem_object_get_pages(struct drm_i915_gem_object *obj)
 	if (err)
 		return err;
 
+	assert_object_held_shared(obj);
+
 	if (unlikely(!i915_gem_object_has_pages(obj))) {
 		GEM_BUG_ON(i915_gem_object_has_pinned_pages(obj));
 
@@ -142,7 +147,7 @@ void i915_gem_object_truncate(struct drm_i915_gem_object *obj)
 /* Try to discard unwanted pages */
 void i915_gem_object_writeback(struct drm_i915_gem_object *obj)
 {
-	lockdep_assert_held(&obj->mm.lock);
+	assert_object_held_shared(obj);
 	GEM_BUG_ON(i915_gem_object_has_pages(obj));
 
 	if (obj->ops->writeback)
@@ -173,6 +178,8 @@ __i915_gem_object_unset_pages(struct drm_i915_gem_object *obj)
 {
 	struct sg_table *pages;
 
+	assert_object_held_shared(obj);
+
 	pages = fetch_and_zero(&obj->mm.pages);
 	if (IS_ERR_OR_NULL(pages))
 		return pages;
@@ -200,6 +207,9 @@ int __i915_gem_object_put_pages_locked(struct drm_i915_gem_object *obj)
 	if (i915_gem_object_has_pinned_pages(obj))
 		return -EBUSY;
 
+	/* May be called by shrinker from within get_pages() (on another bo) */
+	assert_object_held_shared(obj);
+
 	i915_gem_object_release_mmap_offset(obj);
 
 	/*
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_stolen.c b/drivers/gpu/drm/i915/gem/i915_gem_stolen.c
index 5372b888ba01..ce9086d3a647 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_stolen.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_stolen.c
@@ -643,11 +643,19 @@ __i915_gem_object_create_stolen(struct intel_memory_region *mem,
 	cache_level = HAS_LLC(mem->i915) ? I915_CACHE_LLC : I915_CACHE_NONE;
 	i915_gem_object_set_cache_coherency(obj, cache_level);
 
+	if (WARN_ON(!i915_gem_object_trylock(obj))) {
+		err = -EBUSY;
+		goto cleanup;
+	}
+
 	err = i915_gem_object_pin_pages(obj);
-	if (err)
+	if (err) {
+		i915_gem_object_unlock(obj);
 		goto cleanup;
+	}
 
 	i915_gem_object_init_memory_region(obj, mem);
+	i915_gem_object_unlock(obj);
 
 	return obj;
 
-- 
2.26.2


* [RFC PATCH 042/162] drm/i915: Add igt_spinner_pin() to allow for ww locking around spinner.
  2020-11-27 12:04 [RFC PATCH 000/162] DG1 + LMEM enabling Matthew Auld
                   ` (40 preceding siblings ...)
  2020-11-27 12:05 ` [RFC PATCH 041/162] drm/i915: Prepare for obj->mm.lock removal Matthew Auld
@ 2020-11-27 12:05 ` Matthew Auld
  2020-11-27 12:05 ` [RFC PATCH 043/162] drm/i915: Add ww locking around vm_access() Matthew Auld
                   ` (119 subsequent siblings)
  161 siblings, 0 replies; 208+ messages in thread
From: Matthew Auld @ 2020-11-27 12:05 UTC (permalink / raw)
  To: intel-gfx; +Cc: dri-devel, Thomas Hellström

From: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>

By default, we assume that igt_spinner_pin() is called from inside
igt_spinner_create_request() to keep existing selftests working, but
allow for manual pinning when a ww context is passed.
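
A minimal usage sketch (spin and ce are assumed to be set up by the
caller; the surrounding loop is the usual i915 ww dance):

	struct i915_gem_ww_ctx ww;
	int err;

	i915_gem_ww_ctx_init(&ww, false);
retry:
	err = igt_spinner_pin(&spin, ce, &ww);
	if (err == -EDEADLK) {
		err = i915_gem_ww_ctx_backoff(&ww);
		if (!err)
			goto retry;
	}
	i915_gem_ww_ctx_fini(&ww);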

Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
---
 drivers/gpu/drm/i915/selftests/igt_spinner.c | 136 ++++++++++++-------
 drivers/gpu/drm/i915/selftests/igt_spinner.h |   5 +
 2 files changed, 95 insertions(+), 46 deletions(-)

diff --git a/drivers/gpu/drm/i915/selftests/igt_spinner.c b/drivers/gpu/drm/i915/selftests/igt_spinner.c
index ec0ecb4e4ca6..9c461edb0b73 100644
--- a/drivers/gpu/drm/i915/selftests/igt_spinner.c
+++ b/drivers/gpu/drm/i915/selftests/igt_spinner.c
@@ -11,8 +11,6 @@
 
 int igt_spinner_init(struct igt_spinner *spin, struct intel_gt *gt)
 {
-	unsigned int mode;
-	void *vaddr;
 	int err;
 
 	memset(spin, 0, sizeof(*spin));
@@ -23,6 +21,7 @@ int igt_spinner_init(struct igt_spinner *spin, struct intel_gt *gt)
 		err = PTR_ERR(spin->hws);
 		goto err;
 	}
+	i915_gem_object_set_cache_coherency(spin->hws, I915_CACHE_LLC);
 
 	spin->obj = i915_gem_object_create_internal(gt->i915, PAGE_SIZE);
 	if (IS_ERR(spin->obj)) {
@@ -30,34 +29,83 @@ int igt_spinner_init(struct igt_spinner *spin, struct intel_gt *gt)
 		goto err_hws;
 	}
 
-	i915_gem_object_set_cache_coherency(spin->hws, I915_CACHE_LLC);
-	vaddr = i915_gem_object_pin_map(spin->hws, I915_MAP_WB);
-	if (IS_ERR(vaddr)) {
-		err = PTR_ERR(vaddr);
-		goto err_obj;
-	}
-	spin->seqno = memset(vaddr, 0xff, PAGE_SIZE);
-
-	mode = i915_coherent_map_type(gt->i915);
-	vaddr = i915_gem_object_pin_map(spin->obj, mode);
-	if (IS_ERR(vaddr)) {
-		err = PTR_ERR(vaddr);
-		goto err_unpin_hws;
-	}
-	spin->batch = vaddr;
-
 	return 0;
 
-err_unpin_hws:
-	i915_gem_object_unpin_map(spin->hws);
-err_obj:
-	i915_gem_object_put(spin->obj);
 err_hws:
 	i915_gem_object_put(spin->hws);
 err:
 	return err;
 }
 
+static void *igt_spinner_pin_obj(struct intel_context *ce,
+				 struct i915_gem_ww_ctx *ww,
+				 struct drm_i915_gem_object *obj,
+				 unsigned int mode, struct i915_vma **vma)
+{
+	void *vaddr;
+	int ret;
+
+	*vma = i915_vma_instance(obj, ce->vm, NULL);
+	if (IS_ERR(*vma))
+		return ERR_CAST(*vma);
+
+	ret = i915_gem_object_lock(obj, ww);
+	if (ret)
+		return ERR_PTR(ret);
+
+	vaddr = i915_gem_object_pin_map(obj, mode);
+
+	if (!ww)
+		i915_gem_object_unlock(obj);
+
+	if (IS_ERR(vaddr))
+		return vaddr;
+
+	if (ww)
+		ret = i915_vma_pin_ww(*vma, ww, 0, 0, PIN_USER);
+	else
+		ret = i915_vma_pin(*vma, 0, 0, PIN_USER);
+
+	if (ret) {
+		i915_gem_object_unpin_map(obj);
+		return ERR_PTR(ret);
+	}
+
+	return vaddr;
+}
+
+int igt_spinner_pin(struct igt_spinner *spin,
+		    struct intel_context *ce,
+		    struct i915_gem_ww_ctx *ww)
+{
+	void *vaddr;
+
+	if (spin->ce && WARN_ON(spin->ce != ce))
+		return -ENODEV;
+	spin->ce = ce;
+
+	if (!spin->seqno) {
+		vaddr = igt_spinner_pin_obj(ce, ww, spin->hws, I915_MAP_WB, &spin->hws_vma);
+		if (IS_ERR(vaddr))
+			return PTR_ERR(vaddr);
+
+		spin->seqno = memset(vaddr, 0xff, PAGE_SIZE);
+	}
+
+	if (!spin->batch) {
+		unsigned int mode =
+			i915_coherent_map_type(spin->gt->i915);
+
+		vaddr = igt_spinner_pin_obj(ce, ww, spin->obj, mode, &spin->batch_vma);
+		if (IS_ERR(vaddr))
+			return PTR_ERR(vaddr);
+
+		spin->batch = vaddr;
+	}
+
+	return 0;
+}
+
 static unsigned int seqno_offset(u64 fence)
 {
 	return offset_in_page(sizeof(u32) * fence);
@@ -102,27 +150,18 @@ igt_spinner_create_request(struct igt_spinner *spin,
 	if (!intel_engine_can_store_dword(ce->engine))
 		return ERR_PTR(-ENODEV);
 
-	vma = i915_vma_instance(spin->obj, ce->vm, NULL);
-	if (IS_ERR(vma))
-		return ERR_CAST(vma);
-
-	hws = i915_vma_instance(spin->hws, ce->vm, NULL);
-	if (IS_ERR(hws))
-		return ERR_CAST(hws);
+	if (!spin->batch) {
+		err = igt_spinner_pin(spin, ce, NULL);
+		if (err)
+			return ERR_PTR(err);
+	}
 
-	err = i915_vma_pin(vma, 0, 0, PIN_USER);
-	if (err)
-		return ERR_PTR(err);
-
-	err = i915_vma_pin(hws, 0, 0, PIN_USER);
-	if (err)
-		goto unpin_vma;
+	hws = spin->hws_vma;
+	vma = spin->batch_vma;
 
 	rq = intel_context_create_request(ce);
-	if (IS_ERR(rq)) {
-		err = PTR_ERR(rq);
-		goto unpin_hws;
-	}
+	if (IS_ERR(rq))
+		return ERR_CAST(rq);
 
 	err = move_to_active(vma, rq, 0);
 	if (err)
@@ -185,10 +224,6 @@ igt_spinner_create_request(struct igt_spinner *spin,
 		i915_request_set_error_once(rq, err);
 		i915_request_add(rq);
 	}
-unpin_hws:
-	i915_vma_unpin(hws);
-unpin_vma:
-	i915_vma_unpin(vma);
 	return err ? ERR_PTR(err) : rq;
 }
 
@@ -202,6 +237,9 @@ hws_seqno(const struct igt_spinner *spin, const struct i915_request *rq)
 
 void igt_spinner_end(struct igt_spinner *spin)
 {
+	if (!spin->batch)
+		return;
+
 	*spin->batch = MI_BATCH_BUFFER_END;
 	intel_gt_chipset_flush(spin->gt);
 }
@@ -210,10 +248,16 @@ void igt_spinner_fini(struct igt_spinner *spin)
 {
 	igt_spinner_end(spin);
 
-	i915_gem_object_unpin_map(spin->obj);
+	if (spin->batch) {
+		i915_vma_unpin(spin->batch_vma);
+		i915_gem_object_unpin_map(spin->obj);
+	}
 	i915_gem_object_put(spin->obj);
 
-	i915_gem_object_unpin_map(spin->hws);
+	if (spin->seqno) {
+		i915_vma_unpin(spin->hws_vma);
+		i915_gem_object_unpin_map(spin->hws);
+	}
 	i915_gem_object_put(spin->hws);
 }
 
diff --git a/drivers/gpu/drm/i915/selftests/igt_spinner.h b/drivers/gpu/drm/i915/selftests/igt_spinner.h
index ec62c9ef320b..fbe5b1625b05 100644
--- a/drivers/gpu/drm/i915/selftests/igt_spinner.h
+++ b/drivers/gpu/drm/i915/selftests/igt_spinner.h
@@ -20,11 +20,16 @@ struct igt_spinner {
 	struct intel_gt *gt;
 	struct drm_i915_gem_object *hws;
 	struct drm_i915_gem_object *obj;
+	struct intel_context *ce;
+	struct i915_vma *hws_vma, *batch_vma;
 	u32 *batch;
 	void *seqno;
 };
 
 int igt_spinner_init(struct igt_spinner *spin, struct intel_gt *gt);
+int igt_spinner_pin(struct igt_spinner *spin,
+		    struct intel_context *ce,
+		    struct i915_gem_ww_ctx *ww);
 void igt_spinner_fini(struct igt_spinner *spin);
 
 struct i915_request *
-- 
2.26.2


* [RFC PATCH 043/162] drm/i915: Add ww locking around vm_access()
  2020-11-27 12:04 [RFC PATCH 000/162] DG1 + LMEM enabling Matthew Auld
                   ` (41 preceding siblings ...)
  2020-11-27 12:05 ` [RFC PATCH 042/162] drm/i915: Add igt_spinner_pin() to allow for ww locking around spinner Matthew Auld
@ 2020-11-27 12:05 ` Matthew Auld
  2020-11-27 12:05 ` [RFC PATCH 044/162] drm/i915: Increase ww locking for perf Matthew Auld
                   ` (118 subsequent siblings)
  161 siblings, 0 replies; 208+ messages in thread
From: Matthew Auld @ 2020-11-27 12:05 UTC (permalink / raw)
  To: intel-gfx; +Cc: dri-devel, Thomas Hellström

From: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>

i915_gem_object_pin_map() potentially needs a ww context, so ensure we
have one we can revoke, i.e. back off and retry on -EDEADLK.
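
The conversion follows the ww dance used throughout this series;
distilled to its core it looks like this (sketch, with the actual
pin_map/access step elided):

	i915_gem_ww_ctx_init(&ww, true);
retry:
	err = i915_gem_object_lock(obj, &ww);
	if (!err) {
		/* pin_map, copy to/from the mapping, unpin */
	}
	if (err == -EDEADLK) {
		err = i915_gem_ww_ctx_backoff(&ww);
		if (!err)
			goto retry;
	}
	i915_gem_ww_ctx_fini(&ww);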

Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
---
 drivers/gpu/drm/i915/gem/i915_gem_mman.c | 24 ++++++++++++++++++++++--
 1 file changed, 22 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_mman.c b/drivers/gpu/drm/i915/gem/i915_gem_mman.c
index 163208a6260d..2561a2f1e54f 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_mman.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_mman.c
@@ -421,7 +421,9 @@ vm_access(struct vm_area_struct *area, unsigned long addr,
 {
 	struct i915_mmap_offset *mmo = area->vm_private_data;
 	struct drm_i915_gem_object *obj = mmo->obj;
+	struct i915_gem_ww_ctx ww;
 	void *vaddr;
+	int err = 0;
 
 	if (i915_gem_object_is_readonly(obj) && write)
 		return -EACCES;
@@ -430,10 +432,18 @@ vm_access(struct vm_area_struct *area, unsigned long addr,
 	if (addr >= obj->base.size)
 		return -EINVAL;
 
+	i915_gem_ww_ctx_init(&ww, true);
+retry:
+	err = i915_gem_object_lock(obj, &ww);
+	if (err)
+		goto out;
+
 	/* As this is primarily for debugging, let's focus on simplicity */
 	vaddr = i915_gem_object_pin_map(obj, I915_MAP_FORCE_WC);
-	if (IS_ERR(vaddr))
-		return PTR_ERR(vaddr);
+	if (IS_ERR(vaddr)) {
+		err = PTR_ERR(vaddr);
+		goto out;
+	}
 
 	if (write) {
 		memcpy(vaddr + addr, buf, len);
@@ -443,6 +453,16 @@ vm_access(struct vm_area_struct *area, unsigned long addr,
 	}
 
 	i915_gem_object_unpin_map(obj);
+out:
+	if (err == -EDEADLK) {
+		err = i915_gem_ww_ctx_backoff(&ww);
+		if (!err)
+			goto retry;
+	}
+	i915_gem_ww_ctx_fini(&ww);
+
+	if (err)
+		return err;
 
 	return len;
 }
-- 
2.26.2


* [RFC PATCH 044/162] drm/i915: Increase ww locking for perf.
  2020-11-27 12:04 [RFC PATCH 000/162] DG1 + LMEM enabling Matthew Auld
                   ` (42 preceding siblings ...)
  2020-11-27 12:05 ` [RFC PATCH 043/162] drm/i915: Add ww locking around vm_access() Matthew Auld
@ 2020-11-27 12:05 ` Matthew Auld
  2020-11-27 12:05 ` [RFC PATCH 045/162] drm/i915: Lock ww in ucode objects correctly Matthew Auld
                   ` (117 subsequent siblings)
  161 siblings, 0 replies; 208+ messages in thread
From: Matthew Auld @ 2020-11-27 12:05 UTC (permalink / raw)
  To: intel-gfx; +Cc: dri-devel, Thomas Hellström

From: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>

We need to lock a few more objects, some of them only temporarily;
add ww locking where needed.

Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
---
 drivers/gpu/drm/i915/i915_perf.c | 56 ++++++++++++++++++++++++--------
 1 file changed, 43 insertions(+), 13 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_perf.c b/drivers/gpu/drm/i915/i915_perf.c
index 0b300e0d9561..1f574d29ece5 100644
--- a/drivers/gpu/drm/i915/i915_perf.c
+++ b/drivers/gpu/drm/i915/i915_perf.c
@@ -1587,7 +1587,7 @@ static int alloc_oa_buffer(struct i915_perf_stream *stream)
 	stream->oa_buffer.vma = vma;
 
 	stream->oa_buffer.vaddr =
-		i915_gem_object_pin_map(bo, I915_MAP_WB);
+		i915_gem_object_pin_map_unlocked(bo, I915_MAP_WB);
 	if (IS_ERR(stream->oa_buffer.vaddr)) {
 		ret = PTR_ERR(stream->oa_buffer.vaddr);
 		goto err_unpin;
@@ -1640,6 +1640,7 @@ static int alloc_noa_wait(struct i915_perf_stream *stream)
 	const u32 base = stream->engine->mmio_base;
 #define CS_GPR(x) GEN8_RING_CS_GPR(base, x)
 	u32 *batch, *ts0, *cs, *jump;
+	struct i915_gem_ww_ctx ww;
 	int ret, i;
 	enum {
 		START_TS,
@@ -1657,15 +1658,21 @@ static int alloc_noa_wait(struct i915_perf_stream *stream)
 		return PTR_ERR(bo);
 	}
 
+	i915_gem_ww_ctx_init(&ww, true);
+retry:
+	ret = i915_gem_object_lock(bo, &ww);
+	if (ret)
+		goto out_ww;
+
 	/*
 	 * We pin in GGTT because we jump into this buffer now because
 	 * multiple OA config BOs will have a jump to this address and it
 	 * needs to be fixed during the lifetime of the i915/perf stream.
 	 */
-	vma = i915_gem_object_ggtt_pin(bo, NULL, 0, 0, PIN_HIGH);
+	vma = i915_gem_object_ggtt_pin_ww(bo, &ww, NULL, 0, 0, PIN_HIGH);
 	if (IS_ERR(vma)) {
 		ret = PTR_ERR(vma);
-		goto err_unref;
+		goto out_ww;
 	}
 
 	batch = cs = i915_gem_object_pin_map(bo, I915_MAP_WB);
@@ -1799,12 +1806,19 @@ static int alloc_noa_wait(struct i915_perf_stream *stream)
 	__i915_gem_object_release_map(bo);
 
 	stream->noa_wait = vma;
-	return 0;
+	goto out_ww;
 
 err_unpin:
 	i915_vma_unpin_and_release(&vma, 0);
-err_unref:
-	i915_gem_object_put(bo);
+out_ww:
+	if (ret == -EDEADLK) {
+		ret = i915_gem_ww_ctx_backoff(&ww);
+		if (!ret)
+			goto retry;
+	}
+	i915_gem_ww_ctx_fini(&ww);
+	if (ret)
+		i915_gem_object_put(bo);
 	return ret;
 }
 
@@ -1847,6 +1861,7 @@ alloc_oa_config_buffer(struct i915_perf_stream *stream,
 {
 	struct drm_i915_gem_object *obj;
 	struct i915_oa_config_bo *oa_bo;
+	struct i915_gem_ww_ctx ww;
 	size_t config_length = 0;
 	u32 *cs;
 	int err;
@@ -1867,10 +1882,16 @@ alloc_oa_config_buffer(struct i915_perf_stream *stream,
 		goto err_free;
 	}
 
+	i915_gem_ww_ctx_init(&ww, true);
+retry:
+	err = i915_gem_object_lock(obj, &ww);
+	if (err)
+		goto out_ww;
+
 	cs = i915_gem_object_pin_map(obj, I915_MAP_WB);
 	if (IS_ERR(cs)) {
 		err = PTR_ERR(cs);
-		goto err_oa_bo;
+		goto out_ww;
 	}
 
 	cs = write_cs_mi_lri(cs,
@@ -1898,19 +1919,28 @@ alloc_oa_config_buffer(struct i915_perf_stream *stream,
 				       NULL);
 	if (IS_ERR(oa_bo->vma)) {
 		err = PTR_ERR(oa_bo->vma);
-		goto err_oa_bo;
+		goto out_ww;
 	}
 
 	oa_bo->oa_config = i915_oa_config_get(oa_config);
 	llist_add(&oa_bo->node, &stream->oa_config_bos);
 
-	return oa_bo;
+out_ww:
+	if (err == -EDEADLK) {
+		err = i915_gem_ww_ctx_backoff(&ww);
+		if (!err)
+			goto retry;
+	}
+	i915_gem_ww_ctx_fini(&ww);
 
-err_oa_bo:
-	i915_gem_object_put(obj);
+	if (err)
+		i915_gem_object_put(obj);
 err_free:
-	kfree(oa_bo);
-	return ERR_PTR(err);
+	if (err) {
+		kfree(oa_bo);
+		return ERR_PTR(err);
+	}
+	return oa_bo;
 }
 
 static struct i915_vma *
-- 
2.26.2


* [RFC PATCH 045/162] drm/i915: Lock ww in ucode objects correctly
  2020-11-27 12:04 [RFC PATCH 000/162] DG1 + LMEM enabling Matthew Auld
                   ` (43 preceding siblings ...)
  2020-11-27 12:05 ` [RFC PATCH 044/162] drm/i915: Increase ww locking for perf Matthew Auld
@ 2020-11-27 12:05 ` Matthew Auld
  2020-11-27 12:05 ` [RFC PATCH 046/162] drm/i915: Add ww locking to dma-buf ops Matthew Auld
                   ` (116 subsequent siblings)
  161 siblings, 0 replies; 208+ messages in thread
From: Matthew Auld @ 2020-11-27 12:05 UTC (permalink / raw)
  To: intel-gfx; +Cc: dri-devel, Thomas Hellström

From: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>

In the ucode functions, the calls are done before userspace runs,
when debugging via debugfs, or when creating semi-permanent mappings;
we can safely use the unlocked versions that do the ww dance for us.

Because there is no pin_pages_unlocked yet, add it as a convenience function.

This removes possible lockdep splats about missing resv lock for ucode.
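
With the helper in place a caller reduces to a single call (sketch;
error handling simplified compared to intel_uc_fw_init() below):

	err = i915_gem_object_pin_pages_unlocked(uc_fw->obj);
	if (err)
		return err;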

Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
---
 drivers/gpu/drm/i915/gem/i915_gem_object.h |  2 ++
 drivers/gpu/drm/i915/gem/i915_gem_pages.c  | 20 ++++++++++++++++++++
 drivers/gpu/drm/i915/gt/uc/intel_guc.c     |  2 +-
 drivers/gpu/drm/i915/gt/uc/intel_guc_log.c |  4 ++--
 drivers/gpu/drm/i915/gt/uc/intel_huc.c     |  2 +-
 drivers/gpu/drm/i915/gt/uc/intel_uc_fw.c   |  2 +-
 6 files changed, 27 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object.h b/drivers/gpu/drm/i915/gem/i915_gem_object.h
index 26ef37532f81..1d4b44151e0c 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_object.h
+++ b/drivers/gpu/drm/i915/gem/i915_gem_object.h
@@ -358,6 +358,8 @@ i915_gem_object_pin_pages(struct drm_i915_gem_object *obj)
 	return __i915_gem_object_get_pages(obj);
 }
 
+int i915_gem_object_pin_pages_unlocked(struct drm_i915_gem_object *obj);
+
 static inline bool
 i915_gem_object_has_pages(struct drm_i915_gem_object *obj)
 {
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_pages.c b/drivers/gpu/drm/i915/gem/i915_gem_pages.c
index 183aae046b68..79336735a6e4 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_pages.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_pages.c
@@ -136,6 +136,26 @@ int __i915_gem_object_get_pages(struct drm_i915_gem_object *obj)
 	return err;
 }
 
+int i915_gem_object_pin_pages_unlocked(struct drm_i915_gem_object *obj)
+{
+	struct i915_gem_ww_ctx ww;
+	int err;
+
+	i915_gem_ww_ctx_init(&ww, true);
+retry:
+	err = i915_gem_object_lock(obj, &ww);
+	if (!err)
+		err = i915_gem_object_pin_pages(obj);
+
+	if (err == -EDEADLK) {
+		err = i915_gem_ww_ctx_backoff(&ww);
+		if (!err)
+			goto retry;
+	}
+	i915_gem_ww_ctx_fini(&ww);
+	return err;
+}
+
 /* Immediately discard the backing storage */
 void i915_gem_object_truncate(struct drm_i915_gem_object *obj)
 {
diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc.c b/drivers/gpu/drm/i915/gt/uc/intel_guc.c
index 2a343a977987..a65661eb5d5d 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc.c
@@ -694,7 +694,7 @@ int intel_guc_allocate_and_map_vma(struct intel_guc *guc, u32 size,
 	if (IS_ERR(vma))
 		return PTR_ERR(vma);
 
-	vaddr = i915_gem_object_pin_map(vma->obj, I915_MAP_WB);
+	vaddr = i915_gem_object_pin_map_unlocked(vma->obj, I915_MAP_WB);
 	if (IS_ERR(vaddr)) {
 		i915_vma_unpin_and_release(&vma, 0);
 		return PTR_ERR(vaddr);
diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_log.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_log.c
index 9bbe8a795cb8..8dc8678e7ab0 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc_log.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_log.c
@@ -335,7 +335,7 @@ static int guc_log_map(struct intel_guc_log *log)
 	 * buffer pages, so that we can directly get the data
 	 * (up-to-date) from memory.
 	 */
-	vaddr = i915_gem_object_pin_map(log->vma->obj, I915_MAP_WC);
+	vaddr = i915_gem_object_pin_map_unlocked(log->vma->obj, I915_MAP_WC);
 	if (IS_ERR(vaddr))
 		return PTR_ERR(vaddr);
 
@@ -744,7 +744,7 @@ int intel_guc_log_dump(struct intel_guc_log *log, struct drm_printer *p,
 	if (!obj)
 		return 0;
 
-	map = i915_gem_object_pin_map(obj, I915_MAP_WC);
+	map = i915_gem_object_pin_map_unlocked(obj, I915_MAP_WC);
 	if (IS_ERR(map)) {
 		DRM_DEBUG("Failed to pin object\n");
 		drm_puts(p, "(log data unaccessible)\n");
diff --git a/drivers/gpu/drm/i915/gt/uc/intel_huc.c b/drivers/gpu/drm/i915/gt/uc/intel_huc.c
index 65eeb44b397d..2126dd81ac38 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_huc.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_huc.c
@@ -82,7 +82,7 @@ static int intel_huc_rsa_data_create(struct intel_huc *huc)
 	if (IS_ERR(vma))
 		return PTR_ERR(vma);
 
-	vaddr = i915_gem_object_pin_map(vma->obj, I915_MAP_WB);
+	vaddr = i915_gem_object_pin_map_unlocked(vma->obj, I915_MAP_WB);
 	if (IS_ERR(vaddr)) {
 		i915_vma_unpin_and_release(&vma, 0);
 		return PTR_ERR(vaddr);
diff --git a/drivers/gpu/drm/i915/gt/uc/intel_uc_fw.c b/drivers/gpu/drm/i915/gt/uc/intel_uc_fw.c
index 180c23e2e25e..b05076d190cc 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_uc_fw.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_uc_fw.c
@@ -541,7 +541,7 @@ int intel_uc_fw_init(struct intel_uc_fw *uc_fw)
 	if (!intel_uc_fw_is_available(uc_fw))
 		return -ENOEXEC;
 
-	err = i915_gem_object_pin_pages(uc_fw->obj);
+	err = i915_gem_object_pin_pages_unlocked(uc_fw->obj);
 	if (err) {
 		DRM_DEBUG_DRIVER("%s fw pin-pages err=%d\n",
 				 intel_uc_fw_type_repr(uc_fw->type), err);
-- 
2.26.2


* [RFC PATCH 046/162] drm/i915: Add ww locking to dma-buf ops.
  2020-11-27 12:04 [RFC PATCH 000/162] DG1 + LMEM enabling Matthew Auld
                   ` (44 preceding siblings ...)
  2020-11-27 12:05 ` [RFC PATCH 045/162] drm/i915: Lock ww in ucode objects correctly Matthew Auld
@ 2020-11-27 12:05 ` Matthew Auld
  2020-11-27 12:05 ` [RFC PATCH 047/162] drm/i915: Add missing ww lock in intel_dsb_prepare Matthew Auld
                   ` (115 subsequent siblings)
  161 siblings, 0 replies; 208+ messages in thread
From: Matthew Auld @ 2020-11-27 12:05 UTC (permalink / raw)
  To: intel-gfx; +Cc: dri-devel, Thomas Hellström

From: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>

vmap is using pin_pages, but needs to use ww locking; add
pin_pages_unlocked to correctly lock the mapping.

Also add ww locking to begin/end cpu access.

Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
---
 drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c | 60 ++++++++++++----------
 1 file changed, 33 insertions(+), 27 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c b/drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c
index 36e3c2765f4c..c4b01e819786 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c
@@ -82,7 +82,7 @@ static int i915_gem_dmabuf_vmap(struct dma_buf *dma_buf, struct dma_buf_map *map
 	struct drm_i915_gem_object *obj = dma_buf_to_obj(dma_buf);
 	void *vaddr;
 
-	vaddr = i915_gem_object_pin_map(obj, I915_MAP_WB);
+	vaddr = i915_gem_object_pin_map_unlocked(obj, I915_MAP_WB);
 	if (IS_ERR(vaddr))
 		return PTR_ERR(vaddr);
 
@@ -123,42 +123,48 @@ static int i915_gem_begin_cpu_access(struct dma_buf *dma_buf, enum dma_data_dire
 {
 	struct drm_i915_gem_object *obj = dma_buf_to_obj(dma_buf);
 	bool write = (direction == DMA_BIDIRECTIONAL || direction == DMA_TO_DEVICE);
+	struct i915_gem_ww_ctx ww;
 	int err;
 
-	err = i915_gem_object_pin_pages(obj);
-	if (err)
-		return err;
-
-	err = i915_gem_object_lock_interruptible(obj, NULL);
-	if (err)
-		goto out;
-
-	err = i915_gem_object_set_to_cpu_domain(obj, write);
-	i915_gem_object_unlock(obj);
-
-out:
-	i915_gem_object_unpin_pages(obj);
+	i915_gem_ww_ctx_init(&ww, true);
+retry:
+	err = i915_gem_object_lock(obj, &ww);
+	if (!err)
+		err = i915_gem_object_pin_pages(obj);
+	if (!err) {
+		err = i915_gem_object_set_to_cpu_domain(obj, write);
+		i915_gem_object_unpin_pages(obj);
+	}
+	if (err == -EDEADLK) {
+		err = i915_gem_ww_ctx_backoff(&ww);
+		if (!err)
+			goto retry;
+	}
+	i915_gem_ww_ctx_fini(&ww);
 	return err;
 }
 
 static int i915_gem_end_cpu_access(struct dma_buf *dma_buf, enum dma_data_direction direction)
 {
 	struct drm_i915_gem_object *obj = dma_buf_to_obj(dma_buf);
+	struct i915_gem_ww_ctx ww;
 	int err;
 
-	err = i915_gem_object_pin_pages(obj);
-	if (err)
-		return err;
-
-	err = i915_gem_object_lock_interruptible(obj, NULL);
-	if (err)
-		goto out;
-
-	err = i915_gem_object_set_to_gtt_domain(obj, false);
-	i915_gem_object_unlock(obj);
-
-out:
-	i915_gem_object_unpin_pages(obj);
+	i915_gem_ww_ctx_init(&ww, true);
+retry:
+	err = i915_gem_object_lock(obj, &ww);
+	if (!err)
+		err = i915_gem_object_pin_pages(obj);
+	if (!err) {
+		err = i915_gem_object_set_to_gtt_domain(obj, false);
+		i915_gem_object_unpin_pages(obj);
+	}
+	if (err == -EDEADLK) {
+		err = i915_gem_ww_ctx_backoff(&ww);
+		if (!err)
+			goto retry;
+	}
+	i915_gem_ww_ctx_fini(&ww);
 	return err;
 }
 
-- 
2.26.2


* [RFC PATCH 047/162] drm/i915: Add missing ww lock in intel_dsb_prepare.
  2020-11-27 12:04 [RFC PATCH 000/162] DG1 + LMEM enabling Matthew Auld
                   ` (45 preceding siblings ...)
  2020-11-27 12:05 ` [RFC PATCH 046/162] drm/i915: Add ww locking to dma-buf ops Matthew Auld
@ 2020-11-27 12:05 ` Matthew Auld
  2020-11-27 12:05 ` [RFC PATCH 048/162] drm/i915: Fix ww locking in shmem_create_from_object Matthew Auld
                   ` (114 subsequent siblings)
  161 siblings, 0 replies; 208+ messages in thread
From: Matthew Auld @ 2020-11-27 12:05 UTC (permalink / raw)
  To: intel-gfx; +Cc: dri-devel, Thomas Hellström

From: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>

Because of the long lifetime of the mapping, we cannot wrap this in a
simple, short-lived ww lock scope. Just use the unlocked version of
pin_map, because the mapping will likely be released much later, from
a different thread.
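
For reference, the unlocked pin_map helper added earlier in the series
is simply the locked variant wrapped in lock/unlock:

	i915_gem_object_lock(obj, NULL);
	vaddr = i915_gem_object_pin_map(obj, type);
	i915_gem_object_unlock(obj);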

Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
---
 drivers/gpu/drm/i915/display/intel_dsb.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/display/intel_dsb.c b/drivers/gpu/drm/i915/display/intel_dsb.c
index 566fa72427b3..857126822a88 100644
--- a/drivers/gpu/drm/i915/display/intel_dsb.c
+++ b/drivers/gpu/drm/i915/display/intel_dsb.c
@@ -293,7 +293,7 @@ void intel_dsb_prepare(struct intel_crtc_state *crtc_state)
 		goto out;
 	}
 
-	buf = i915_gem_object_pin_map(vma->obj, I915_MAP_WC);
+	buf = i915_gem_object_pin_map_unlocked(vma->obj, I915_MAP_WC);
 	if (IS_ERR(buf)) {
 		drm_err(&i915->drm, "Command buffer creation failed\n");
 		i915_vma_unpin_and_release(&vma, I915_VMA_RELEASE_MAP);
-- 
2.26.2


* [RFC PATCH 048/162] drm/i915: Fix ww locking in shmem_create_from_object
  2020-11-27 12:04 [RFC PATCH 000/162] DG1 + LMEM enabling Matthew Auld
                   ` (46 preceding siblings ...)
  2020-11-27 12:05 ` [RFC PATCH 047/162] drm/i915: Add missing ww lock in intel_dsb_prepare Matthew Auld
@ 2020-11-27 12:05 ` Matthew Auld
  2020-11-27 12:05 ` [RFC PATCH 049/162] drm/i915: Use a single page table lock for each gtt Matthew Auld
                   ` (113 subsequent siblings)
  161 siblings, 0 replies; 208+ messages in thread
From: Matthew Auld @ 2020-11-27 12:05 UTC (permalink / raw)
  To: intel-gfx; +Cc: dri-devel, Thomas Hellström

From: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>

Quick fix: just use the unlocked version.

Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
---
 drivers/gpu/drm/i915/gt/shmem_utils.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/gt/shmem_utils.c b/drivers/gpu/drm/i915/gt/shmem_utils.c
index f011ea42487e..041e2a50160d 100644
--- a/drivers/gpu/drm/i915/gt/shmem_utils.c
+++ b/drivers/gpu/drm/i915/gt/shmem_utils.c
@@ -39,7 +39,7 @@ struct file *shmem_create_from_object(struct drm_i915_gem_object *obj)
 		return file;
 	}
 
-	ptr = i915_gem_object_pin_map(obj, I915_MAP_WB);
+	ptr = i915_gem_object_pin_map_unlocked(obj, I915_MAP_WB);
 	if (IS_ERR(ptr))
 		return ERR_CAST(ptr);
 
-- 
2.26.2


* [RFC PATCH 049/162] drm/i915: Use a single page table lock for each gtt.
  2020-11-27 12:04 [RFC PATCH 000/162] DG1 + LMEM enabling Matthew Auld
                   ` (47 preceding siblings ...)
  2020-11-27 12:05 ` [RFC PATCH 048/162] drm/i915: Fix ww locking in shmem_create_from_object Matthew Auld
@ 2020-11-27 12:05 ` Matthew Auld
  2020-11-27 12:05 ` [RFC PATCH 050/162] drm/i915/selftests: Prepare huge_pages testcases for obj->mm.lock removal Matthew Auld
                   ` (112 subsequent siblings)
  161 siblings, 0 replies; 208+ messages in thread
From: Matthew Auld @ 2020-11-27 12:05 UTC (permalink / raw)
  To: intel-gfx; +Cc: dri-devel, Thomas Hellström

From: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>

We may create page table objects on the fly, but we may need to
wait with the ww lock held. Instead of waiting on the lock of an
object that may already have been freed, ensure all page table
objects within a vm share the same lock so that -EDEADLK handling
keeps working. This ensures that i915_vma_pin_ww() can lock the
page tables when required.
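
Conceptually (simplified from the diff below): every page table object
allocated for a vm is pointed at the vm's embedded reservation object,
so locking any one of them locks them all:

	obj = i915_gem_object_create_internal(vm->i915, sz);
	/* ensure all dma objects have the same reservation class */
	if (!IS_ERR(obj))
		obj->base.resv = &vm->resv;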

Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
---
 drivers/gpu/drm/i915/gt/intel_ggtt.c  |  8 +++++-
 drivers/gpu/drm/i915/gt/intel_gtt.c   | 38 ++++++++++++++++++++++++++-
 drivers/gpu/drm/i915/gt/intel_gtt.h   |  5 ++++
 drivers/gpu/drm/i915/gt/intel_ppgtt.c |  3 ++-
 drivers/gpu/drm/i915/i915_vma.c       |  4 +++
 5 files changed, 55 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_ggtt.c b/drivers/gpu/drm/i915/gt/intel_ggtt.c
index 60bd2c8ed8b0..17ecaef1834d 100644
--- a/drivers/gpu/drm/i915/gt/intel_ggtt.c
+++ b/drivers/gpu/drm/i915/gt/intel_ggtt.c
@@ -615,7 +615,9 @@ static int init_aliasing_ppgtt(struct i915_ggtt *ggtt)
 	if (err)
 		goto err_ppgtt;
 
+	i915_gem_object_lock(ppgtt->vm.scratch[0], NULL);
 	err = i915_vm_pin_pt_stash(&ppgtt->vm, &stash);
+	i915_gem_object_unlock(ppgtt->vm.scratch[0]);
 	if (err)
 		goto err_stash;
 
@@ -702,6 +704,7 @@ static void ggtt_cleanup_hw(struct i915_ggtt *ggtt)
 
 	mutex_unlock(&ggtt->vm.mutex);
 	i915_address_space_fini(&ggtt->vm);
+	dma_resv_fini(&ggtt->vm.resv);
 
 	arch_phys_wc_del(ggtt->mtrr);
 
@@ -1078,6 +1081,7 @@ static int ggtt_probe_hw(struct i915_ggtt *ggtt, struct intel_gt *gt)
 	ggtt->vm.gt = gt;
 	ggtt->vm.i915 = i915;
 	ggtt->vm.dma = &i915->drm.pdev->dev;
+	dma_resv_init(&ggtt->vm.resv);
 
 	if (INTEL_GEN(i915) <= 5)
 		ret = i915_gmch_probe(ggtt);
@@ -1085,8 +1089,10 @@ static int ggtt_probe_hw(struct i915_ggtt *ggtt, struct intel_gt *gt)
 		ret = gen6_gmch_probe(ggtt);
 	else
 		ret = gen8_gmch_probe(ggtt);
-	if (ret)
+	if (ret) {
+		dma_resv_fini(&ggtt->vm.resv);
 		return ret;
+	}
 
 	if ((ggtt->vm.total - 1) >> 32) {
 		drm_err(&i915->drm,
diff --git a/drivers/gpu/drm/i915/gt/intel_gtt.c b/drivers/gpu/drm/i915/gt/intel_gtt.c
index 7bfe9072be9a..070d538cdc56 100644
--- a/drivers/gpu/drm/i915/gt/intel_gtt.c
+++ b/drivers/gpu/drm/i915/gt/intel_gtt.c
@@ -13,16 +13,36 @@
 
 struct drm_i915_gem_object *alloc_pt_dma(struct i915_address_space *vm, int sz)
 {
+	struct drm_i915_gem_object *obj;
+
 	if (I915_SELFTEST_ONLY(should_fail(&vm->fault_attr, 1)))
 		i915_gem_shrink_all(vm->i915);
 
-	return i915_gem_object_create_internal(vm->i915, sz);
+	obj = i915_gem_object_create_internal(vm->i915, sz);
+	/* ensure all dma objects have the same reservation class */
+	if (!IS_ERR(obj))
+		obj->base.resv = &vm->resv;
+	return obj;
 }
 
 int pin_pt_dma(struct i915_address_space *vm, struct drm_i915_gem_object *obj)
 {
 	int err;
 
+	i915_gem_object_lock(obj, NULL);
+	err = i915_gem_object_pin_pages(obj);
+	i915_gem_object_unlock(obj);
+	if (err)
+		return err;
+
+	i915_gem_object_make_unshrinkable(obj);
+	return 0;
+}
+
+int pin_pt_dma_locked(struct i915_address_space *vm, struct drm_i915_gem_object *obj)
+{
+	int err;
+
 	err = i915_gem_object_pin_pages(obj);
 	if (err)
 		return err;
@@ -56,6 +76,20 @@ void __i915_vm_close(struct i915_address_space *vm)
 	mutex_unlock(&vm->mutex);
 }
 
+/* lock the vm into the current ww, if we lock one, we lock all */
+int i915_vm_lock_objects(struct i915_address_space *vm,
+			 struct i915_gem_ww_ctx *ww)
+{
+	if (vm->scratch[0]->base.resv == &vm->resv) {
+		return i915_gem_object_lock(vm->scratch[0], ww);
+	} else {
+		struct i915_ppgtt *ppgtt = i915_vm_to_ppgtt(vm);
+
+		/* We borrowed the scratch page from ggtt, take the top level object */
+		return i915_gem_object_lock(ppgtt->pd->pt.base, ww);
+	}
+}
+
 void i915_address_space_fini(struct i915_address_space *vm)
 {
 	drm_mm_takedown(&vm->mm);
@@ -69,6 +103,7 @@ static void __i915_vm_release(struct work_struct *work)
 
 	vm->cleanup(vm);
 	i915_address_space_fini(vm);
+	dma_resv_fini(&vm->resv);
 
 	kfree(vm);
 }
@@ -98,6 +133,7 @@ void i915_address_space_init(struct i915_address_space *vm, int subclass)
 	mutex_init(&vm->mutex);
 	lockdep_set_subclass(&vm->mutex, subclass);
 	i915_gem_shrinker_taints_mutex(vm->i915, &vm->mutex);
+	dma_resv_init(&vm->resv);
 
 	GEM_BUG_ON(!vm->total);
 	drm_mm_init(&vm->mm, 0, vm->total);
diff --git a/drivers/gpu/drm/i915/gt/intel_gtt.h b/drivers/gpu/drm/i915/gt/intel_gtt.h
index 8a33940a71f3..16063b2f0119 100644
--- a/drivers/gpu/drm/i915/gt/intel_gtt.h
+++ b/drivers/gpu/drm/i915/gt/intel_gtt.h
@@ -238,6 +238,7 @@ struct i915_address_space {
 	atomic_t open;
 
 	struct mutex mutex; /* protects vma and our lists */
+	struct dma_resv resv; /* reservation lock for all pd objects, and buffer pool */
 #define VM_CLASS_GGTT 0
 #define VM_CLASS_PPGTT 1
 
@@ -346,6 +347,9 @@ struct i915_ppgtt {
 
 #define i915_is_ggtt(vm) ((vm)->is_ggtt)
 
+int __must_check
+i915_vm_lock_objects(struct i915_address_space *vm, struct i915_gem_ww_ctx *ww);
+
 static inline bool
 i915_vm_is_4lvl(const struct i915_address_space *vm)
 {
@@ -522,6 +526,7 @@ struct i915_page_directory *alloc_pd(struct i915_address_space *vm);
 struct i915_page_directory *__alloc_pd(int npde);
 
 int pin_pt_dma(struct i915_address_space *vm, struct drm_i915_gem_object *obj);
+int pin_pt_dma_locked(struct i915_address_space *vm, struct drm_i915_gem_object *obj);
 
 void free_px(struct i915_address_space *vm,
 	     struct i915_page_table *pt, int lvl);
diff --git a/drivers/gpu/drm/i915/gt/intel_ppgtt.c b/drivers/gpu/drm/i915/gt/intel_ppgtt.c
index 46d9aceda64c..f3ac47702aee 100644
--- a/drivers/gpu/drm/i915/gt/intel_ppgtt.c
+++ b/drivers/gpu/drm/i915/gt/intel_ppgtt.c
@@ -262,7 +262,7 @@ int i915_vm_pin_pt_stash(struct i915_address_space *vm,
 
 	for (n = 0; n < ARRAY_SIZE(stash->pt); n++) {
 		for (pt = stash->pt[n]; pt; pt = pt->stash) {
-			err = pin_pt_dma(vm, pt->base);
+			err = pin_pt_dma_locked(vm, pt->base);
 			if (err)
 				return err;
 		}
@@ -304,6 +304,7 @@ void ppgtt_init(struct i915_ppgtt *ppgtt, struct intel_gt *gt)
 	ppgtt->vm.dma = &i915->drm.pdev->dev;
 	ppgtt->vm.total = BIT_ULL(INTEL_INFO(i915)->ppgtt_size);
 
+	dma_resv_init(&ppgtt->vm.resv);
 	i915_address_space_init(&ppgtt->vm, VM_CLASS_PPGTT);
 
 	ppgtt->vm.vma_ops.bind_vma    = ppgtt_bind_vma;
diff --git a/drivers/gpu/drm/i915/i915_vma.c b/drivers/gpu/drm/i915/i915_vma.c
index 63bdb0cc981e..0c7e4191811a 100644
--- a/drivers/gpu/drm/i915/i915_vma.c
+++ b/drivers/gpu/drm/i915/i915_vma.c
@@ -908,6 +908,10 @@ int i915_vma_pin_ww(struct i915_vma *vma, struct i915_gem_ww_ctx *ww,
 			if (err)
 				goto err_fence;
 
+			err = i915_vm_lock_objects(vma->vm, ww);
+			if (err)
+				goto err_fence;
+
 			err = i915_vm_pin_pt_stash(vma->vm,
 						   &work->stash);
 			if (err)
-- 
2.26.2


* [RFC PATCH 050/162] drm/i915/selftests: Prepare huge_pages testcases for obj->mm.lock removal.
  2020-11-27 12:04 [RFC PATCH 000/162] DG1 + LMEM enabling Matthew Auld
                   ` (48 preceding siblings ...)
  2020-11-27 12:05 ` [RFC PATCH 049/162] drm/i915: Use a single page table lock for each gtt Matthew Auld
@ 2020-11-27 12:05 ` Matthew Auld
  2020-11-27 12:05 ` [RFC PATCH 051/162] drm/i915/selftests: Prepare client blit " Matthew Auld
                   ` (111 subsequent siblings)
  161 siblings, 0 replies; 208+ messages in thread
From: Matthew Auld @ 2020-11-27 12:05 UTC (permalink / raw)
  To: intel-gfx; +Cc: dri-devel, Thomas Hellström

From: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>

Straightforward conversion; just convert a bunch of calls to the
unlocked versions.

Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
---
 .../gpu/drm/i915/gem/selftests/huge_pages.c   | 28 ++++++++++++++-----
 1 file changed, 21 insertions(+), 7 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/selftests/huge_pages.c b/drivers/gpu/drm/i915/gem/selftests/huge_pages.c
index 709c63b9cfc4..586d8bafd7de 100644
--- a/drivers/gpu/drm/i915/gem/selftests/huge_pages.c
+++ b/drivers/gpu/drm/i915/gem/selftests/huge_pages.c
@@ -589,7 +589,7 @@ static int igt_mock_ppgtt_misaligned_dma(void *arg)
 			goto out_put;
 		}
 
-		err = i915_gem_object_pin_pages(obj);
+		err = i915_gem_object_pin_pages_unlocked(obj);
 		if (err)
 			goto out_put;
 
@@ -653,15 +653,19 @@ static int igt_mock_ppgtt_misaligned_dma(void *arg)
 				break;
 		}
 
+		i915_gem_object_lock(obj, NULL);
 		i915_gem_object_unpin_pages(obj);
 		__i915_gem_object_put_pages(obj);
+		i915_gem_object_unlock(obj);
 		i915_gem_object_put(obj);
 	}
 
 	return 0;
 
 out_unpin:
+	i915_gem_object_lock(obj, NULL);
 	i915_gem_object_unpin_pages(obj);
+	i915_gem_object_unlock(obj);
 out_put:
 	i915_gem_object_put(obj);
 
@@ -675,8 +679,10 @@ static void close_object_list(struct list_head *objects,
 
 	list_for_each_entry_safe(obj, on, objects, st_link) {
 		list_del(&obj->st_link);
+		i915_gem_object_lock(obj, NULL);
 		i915_gem_object_unpin_pages(obj);
 		__i915_gem_object_put_pages(obj);
+		i915_gem_object_unlock(obj);
 		i915_gem_object_put(obj);
 	}
 }
@@ -713,7 +719,7 @@ static int igt_mock_ppgtt_huge_fill(void *arg)
 			break;
 		}
 
-		err = i915_gem_object_pin_pages(obj);
+		err = i915_gem_object_pin_pages_unlocked(obj);
 		if (err) {
 			i915_gem_object_put(obj);
 			break;
@@ -889,7 +895,7 @@ static int igt_mock_ppgtt_64K(void *arg)
 			if (IS_ERR(obj))
 				return PTR_ERR(obj);
 
-			err = i915_gem_object_pin_pages(obj);
+			err = i915_gem_object_pin_pages_unlocked(obj);
 			if (err)
 				goto out_object_put;
 
@@ -943,8 +949,10 @@ static int igt_mock_ppgtt_64K(void *arg)
 			}
 
 			i915_vma_unpin(vma);
+			i915_gem_object_lock(obj, NULL);
 			i915_gem_object_unpin_pages(obj);
 			__i915_gem_object_put_pages(obj);
+			i915_gem_object_unlock(obj);
 			i915_gem_object_put(obj);
 		}
 	}
@@ -954,7 +962,9 @@ static int igt_mock_ppgtt_64K(void *arg)
 out_vma_unpin:
 	i915_vma_unpin(vma);
 out_object_unpin:
+	i915_gem_object_lock(obj, NULL);
 	i915_gem_object_unpin_pages(obj);
+	i915_gem_object_unlock(obj);
 out_object_put:
 	i915_gem_object_put(obj);
 
@@ -1024,7 +1034,7 @@ static int __cpu_check_vmap(struct drm_i915_gem_object *obj, u32 dword, u32 val)
 	if (err)
 		return err;
 
-	ptr = i915_gem_object_pin_map(obj, I915_MAP_WC);
+	ptr = i915_gem_object_pin_map_unlocked(obj, I915_MAP_WC);
 	if (IS_ERR(ptr))
 		return PTR_ERR(ptr);
 
@@ -1304,7 +1314,7 @@ static int igt_ppgtt_smoke_huge(void *arg)
 			return err;
 		}
 
-		err = i915_gem_object_pin_pages(obj);
+		err = i915_gem_object_pin_pages_unlocked(obj);
 		if (err) {
 			if (err == -ENXIO || err == -E2BIG) {
 				i915_gem_object_put(obj);
@@ -1327,8 +1337,10 @@ static int igt_ppgtt_smoke_huge(void *arg)
 			       __func__, size, i);
 		}
 out_unpin:
+		i915_gem_object_lock(obj, NULL);
 		i915_gem_object_unpin_pages(obj);
 		__i915_gem_object_put_pages(obj);
+		i915_gem_object_unlock(obj);
 out_put:
 		i915_gem_object_put(obj);
 
@@ -1402,7 +1414,7 @@ static int igt_ppgtt_sanity_check(void *arg)
 				return err;
 			}
 
-			err = i915_gem_object_pin_pages(obj);
+			err = i915_gem_object_pin_pages_unlocked(obj);
 			if (err) {
 				i915_gem_object_put(obj);
 				goto out;
@@ -1416,8 +1428,10 @@ static int igt_ppgtt_sanity_check(void *arg)
 
 			err = igt_write_huge(ctx, obj);
 
+			i915_gem_object_lock(obj, NULL);
 			i915_gem_object_unpin_pages(obj);
 			__i915_gem_object_put_pages(obj);
+			i915_gem_object_unlock(obj);
 			i915_gem_object_put(obj);
 
 			if (err) {
@@ -1462,7 +1476,7 @@ static int igt_tmpfs_fallback(void *arg)
 		goto out_restore;
 	}
 
-	vaddr = i915_gem_object_pin_map(obj, I915_MAP_WB);
+	vaddr = i915_gem_object_pin_map_unlocked(obj, I915_MAP_WB);
 	if (IS_ERR(vaddr)) {
 		err = PTR_ERR(vaddr);
 		goto out_put;
-- 
2.26.2


* [RFC PATCH 051/162] drm/i915/selftests: Prepare client blit for obj->mm.lock removal.
  2020-11-27 12:04 [RFC PATCH 000/162] DG1 + LMEM enabling Matthew Auld
                   ` (49 preceding siblings ...)
  2020-11-27 12:05 ` [RFC PATCH 050/162] drm/i915/selftests: Prepare huge_pages testcases for obj->mm.lock removal Matthew Auld
@ 2020-11-27 12:05 ` Matthew Auld
  2020-11-27 12:05 ` [RFC PATCH 052/162] drm/i915/selftests: Prepare coherency tests " Matthew Auld
                   ` (110 subsequent siblings)
  161 siblings, 0 replies; 208+ messages in thread
From: Matthew Auld @ 2020-11-27 12:05 UTC (permalink / raw)
  To: intel-gfx; +Cc: dri-devel, Thomas Hellström

From: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>

Straightforward conversion; just convert a bunch of calls to the
unlocked versions.

Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
---
 drivers/gpu/drm/i915/gem/selftests/i915_gem_client_blt.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/selftests/i915_gem_client_blt.c b/drivers/gpu/drm/i915/gem/selftests/i915_gem_client_blt.c
index 4e36d4897ea6..cc782569765f 100644
--- a/drivers/gpu/drm/i915/gem/selftests/i915_gem_client_blt.c
+++ b/drivers/gpu/drm/i915/gem/selftests/i915_gem_client_blt.c
@@ -47,7 +47,7 @@ static int __igt_client_fill(struct intel_engine_cs *engine)
 			goto err_flush;
 		}
 
-		vaddr = i915_gem_object_pin_map(obj, I915_MAP_WB);
+		vaddr = i915_gem_object_pin_map_unlocked(obj, I915_MAP_WB);
 		if (IS_ERR(vaddr)) {
 			err = PTR_ERR(vaddr);
 			goto err_put;
@@ -159,7 +159,7 @@ static int prepare_blit(const struct tiled_blits *t,
 	u32 src_pitch, dst_pitch;
 	u32 cmd, *cs;
 
-	cs = i915_gem_object_pin_map(batch, I915_MAP_WC);
+	cs = i915_gem_object_pin_map_unlocked(batch, I915_MAP_WC);
 	if (IS_ERR(cs))
 		return PTR_ERR(cs);
 
@@ -379,7 +379,7 @@ static int verify_buffer(const struct tiled_blits *t,
 	y = i915_prandom_u32_max_state(t->height, prng);
 	p = y * t->width + x;
 
-	vaddr = i915_gem_object_pin_map(buf->vma->obj, I915_MAP_WC);
+	vaddr = i915_gem_object_pin_map_unlocked(buf->vma->obj, I915_MAP_WC);
 	if (IS_ERR(vaddr))
 		return PTR_ERR(vaddr);
 
@@ -566,7 +566,7 @@ static int tiled_blits_prepare(struct tiled_blits *t,
 	int err;
 	int i;
 
-	map = i915_gem_object_pin_map(t->scratch.vma->obj, I915_MAP_WC);
+	map = i915_gem_object_pin_map_unlocked(t->scratch.vma->obj, I915_MAP_WC);
 	if (IS_ERR(map))
 		return PTR_ERR(map);
 
-- 
2.26.2


* [RFC PATCH 052/162] drm/i915/selftests: Prepare coherency tests for obj->mm.lock removal.
  2020-11-27 12:04 [RFC PATCH 000/162] DG1 + LMEM enabling Matthew Auld
                   ` (50 preceding siblings ...)
  2020-11-27 12:05 ` [RFC PATCH 051/162] drm/i915/selftests: Prepare client blit " Matthew Auld
@ 2020-11-27 12:05 ` Matthew Auld
  2020-11-27 12:05 ` [RFC PATCH 053/162] drm/i915/selftests: Prepare context " Matthew Auld
                   ` (109 subsequent siblings)
  161 siblings, 0 replies; 208+ messages in thread
From: Matthew Auld @ 2020-11-27 12:05 UTC (permalink / raw)
  To: intel-gfx; +Cc: dri-devel, Thomas Hellström

From: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>

Straightforward conversion; just convert a bunch of calls to the
unlocked versions.

Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
---
 drivers/gpu/drm/i915/gem/selftests/i915_gem_coherency.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/selftests/i915_gem_coherency.c b/drivers/gpu/drm/i915/gem/selftests/i915_gem_coherency.c
index 2e439bb269d6..42aa3c5e0621 100644
--- a/drivers/gpu/drm/i915/gem/selftests/i915_gem_coherency.c
+++ b/drivers/gpu/drm/i915/gem/selftests/i915_gem_coherency.c
@@ -159,7 +159,7 @@ static int wc_set(struct context *ctx, unsigned long offset, u32 v)
 	if (err)
 		return err;
 
-	map = i915_gem_object_pin_map(ctx->obj, I915_MAP_WC);
+	map = i915_gem_object_pin_map_unlocked(ctx->obj, I915_MAP_WC);
 	if (IS_ERR(map))
 		return PTR_ERR(map);
 
@@ -182,7 +182,7 @@ static int wc_get(struct context *ctx, unsigned long offset, u32 *v)
 	if (err)
 		return err;
 
-	map = i915_gem_object_pin_map(ctx->obj, I915_MAP_WC);
+	map = i915_gem_object_pin_map_unlocked(ctx->obj, I915_MAP_WC);
 	if (IS_ERR(map))
 		return PTR_ERR(map);
 
-- 
2.26.2


* [RFC PATCH 053/162] drm/i915/selftests: Prepare context tests for obj->mm.lock removal.
  2020-11-27 12:04 [RFC PATCH 000/162] DG1 + LMEM enabling Matthew Auld
                   ` (51 preceding siblings ...)
  2020-11-27 12:05 ` [RFC PATCH 052/162] drm/i915/selftests: Prepare coherency tests " Matthew Auld
@ 2020-11-27 12:05 ` Matthew Auld
  2020-11-27 12:05 ` [RFC PATCH 054/162] drm/i915/selftests: Prepare dma-buf " Matthew Auld
                   ` (108 subsequent siblings)
  161 siblings, 0 replies; 208+ messages in thread
From: Matthew Auld @ 2020-11-27 12:05 UTC (permalink / raw)
  To: intel-gfx; +Cc: dri-devel, Thomas Hellström

From: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>

Straightforward conversion; just convert a bunch of calls to the
unlocked versions.

Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
---
 drivers/gpu/drm/i915/gem/selftests/i915_gem_context.c | 10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/selftests/i915_gem_context.c b/drivers/gpu/drm/i915/gem/selftests/i915_gem_context.c
index d3f87dc4eda3..5fef592390cb 100644
--- a/drivers/gpu/drm/i915/gem/selftests/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/gem/selftests/i915_gem_context.c
@@ -1094,7 +1094,7 @@ __read_slice_count(struct intel_context *ce,
 	if (ret < 0)
 		return ret;
 
-	buf = i915_gem_object_pin_map(obj, I915_MAP_WB);
+	buf = i915_gem_object_pin_map_unlocked(obj, I915_MAP_WB);
 	if (IS_ERR(buf)) {
 		ret = PTR_ERR(buf);
 		return ret;
@@ -1511,7 +1511,7 @@ static int write_to_scratch(struct i915_gem_context *ctx,
 	if (IS_ERR(obj))
 		return PTR_ERR(obj);
 
-	cmd = i915_gem_object_pin_map(obj, I915_MAP_WB);
+	cmd = i915_gem_object_pin_map_unlocked(obj, I915_MAP_WB);
 	if (IS_ERR(cmd)) {
 		err = PTR_ERR(cmd);
 		goto out;
@@ -1622,7 +1622,7 @@ static int read_from_scratch(struct i915_gem_context *ctx,
 		if (err)
 			goto out_vm;
 
-		cmd = i915_gem_object_pin_map(obj, I915_MAP_WB);
+		cmd = i915_gem_object_pin_map_unlocked(obj, I915_MAP_WB);
 		if (IS_ERR(cmd)) {
 			err = PTR_ERR(cmd);
 			goto out;
@@ -1658,7 +1658,7 @@ static int read_from_scratch(struct i915_gem_context *ctx,
 		if (err)
 			goto out_vm;
 
-		cmd = i915_gem_object_pin_map(obj, I915_MAP_WB);
+		cmd = i915_gem_object_pin_map_unlocked(obj, I915_MAP_WB);
 		if (IS_ERR(cmd)) {
 			err = PTR_ERR(cmd);
 			goto out;
@@ -1715,7 +1715,7 @@ static int read_from_scratch(struct i915_gem_context *ctx,
 	if (err)
 		goto out_vm;
 
-	cmd = i915_gem_object_pin_map(obj, I915_MAP_WB);
+	cmd = i915_gem_object_pin_map_unlocked(obj, I915_MAP_WB);
 	if (IS_ERR(cmd)) {
 		err = PTR_ERR(cmd);
 		goto out_vm;
-- 
2.26.2


* [RFC PATCH 054/162] drm/i915/selftests: Prepare dma-buf tests for obj->mm.lock removal.
  2020-11-27 12:04 [RFC PATCH 000/162] DG1 + LMEM enabling Matthew Auld
                   ` (52 preceding siblings ...)
  2020-11-27 12:05 ` [RFC PATCH 053/162] drm/i915/selftests: Prepare context " Matthew Auld
@ 2020-11-27 12:05 ` Matthew Auld
  2020-11-27 12:05 ` [RFC PATCH 055/162] drm/i915/selftests: Prepare execbuf " Matthew Auld
                   ` (107 subsequent siblings)
  161 siblings, 0 replies; 208+ messages in thread
From: Matthew Auld @ 2020-11-27 12:05 UTC (permalink / raw)
  To: intel-gfx; +Cc: dri-devel, Thomas Hellström

From: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>

Use pin_pages_unlocked() where we don't have a lock.

Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
---
 drivers/gpu/drm/i915/gem/selftests/i915_gem_dmabuf.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/gem/selftests/i915_gem_dmabuf.c b/drivers/gpu/drm/i915/gem/selftests/i915_gem_dmabuf.c
index b6d43880b0c1..dd74bc09ec88 100644
--- a/drivers/gpu/drm/i915/gem/selftests/i915_gem_dmabuf.c
+++ b/drivers/gpu/drm/i915/gem/selftests/i915_gem_dmabuf.c
@@ -194,7 +194,7 @@ static int igt_dmabuf_import_ownership(void *arg)
 
 	dma_buf_put(dmabuf);
 
-	err = i915_gem_object_pin_pages(obj);
+	err = i915_gem_object_pin_pages_unlocked(obj);
 	if (err) {
 		pr_err("i915_gem_object_pin_pages failed with err=%d\n", err);
 		goto out_obj;
-- 
2.26.2

* [RFC PATCH 055/162] drm/i915/selftests: Prepare execbuf tests for obj->mm.lock removal.
  2020-11-27 12:04 [RFC PATCH 000/162] DG1 + LMEM enabling Matthew Auld
                   ` (53 preceding siblings ...)
  2020-11-27 12:05 ` [RFC PATCH 054/162] drm/i915/selftests: Prepare dma-buf " Matthew Auld
@ 2020-11-27 12:05 ` Matthew Auld
  2020-11-27 12:05 ` [RFC PATCH 056/162] drm/i915/selftests: Prepare mman testcases " Matthew Auld
                   ` (106 subsequent siblings)
  161 siblings, 0 replies; 208+ messages in thread
From: Matthew Auld @ 2020-11-27 12:05 UTC (permalink / raw)
  To: intel-gfx; +Cc: dri-devel, Thomas Hellström

From: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>

Also quite simple: a single call needs to use the unlocked version.

Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
---
 drivers/gpu/drm/i915/gem/selftests/i915_gem_execbuffer.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/gem/selftests/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/gem/selftests/i915_gem_execbuffer.c
index e1d50a5a1477..4df505e4c53a 100644
--- a/drivers/gpu/drm/i915/gem/selftests/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/gem/selftests/i915_gem_execbuffer.c
@@ -116,7 +116,7 @@ static int igt_gpu_reloc(void *arg)
 	if (IS_ERR(scratch))
 		return PTR_ERR(scratch);
 
-	map = i915_gem_object_pin_map(scratch, I915_MAP_WC);
+	map = i915_gem_object_pin_map_unlocked(scratch, I915_MAP_WC);
 	if (IS_ERR(map)) {
 		err = PTR_ERR(map);
 		goto err_scratch;
-- 
2.26.2

* [RFC PATCH 056/162] drm/i915/selftests: Prepare mman testcases for obj->mm.lock removal.
  2020-11-27 12:04 [RFC PATCH 000/162] DG1 + LMEM enabling Matthew Auld
                   ` (54 preceding siblings ...)
  2020-11-27 12:05 ` [RFC PATCH 055/162] drm/i915/selftests: Prepare execbuf " Matthew Auld
@ 2020-11-27 12:05 ` Matthew Auld
  2020-11-27 12:05 ` [RFC PATCH 057/162] drm/i915/selftests: Prepare object tests " Matthew Auld
                   ` (105 subsequent siblings)
  161 siblings, 0 replies; 208+ messages in thread
From: Matthew Auld @ 2020-11-27 12:05 UTC (permalink / raw)
  To: intel-gfx; +Cc: dri-devel, Thomas Hellström

From: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>

Ensure we hold the lock around put_pages, and use the unlocked wrappers
for pinning pages and mappings.

Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
---
 drivers/gpu/drm/i915/gem/selftests/i915_gem_mman.c | 10 ++++++----
 1 file changed, 6 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/selftests/i915_gem_mman.c b/drivers/gpu/drm/i915/gem/selftests/i915_gem_mman.c
index 3ac7628f3bc4..85fff8bed08c 100644
--- a/drivers/gpu/drm/i915/gem/selftests/i915_gem_mman.c
+++ b/drivers/gpu/drm/i915/gem/selftests/i915_gem_mman.c
@@ -321,7 +321,7 @@ static int igt_partial_tiling(void *arg)
 	if (IS_ERR(obj))
 		return PTR_ERR(obj);
 
-	err = i915_gem_object_pin_pages(obj);
+	err = i915_gem_object_pin_pages_unlocked(obj);
 	if (err) {
 		pr_err("Failed to allocate %u pages (%lu total), err=%d\n",
 		       nreal, obj->base.size / PAGE_SIZE, err);
@@ -458,7 +458,7 @@ static int igt_smoke_tiling(void *arg)
 	if (IS_ERR(obj))
 		return PTR_ERR(obj);
 
-	err = i915_gem_object_pin_pages(obj);
+	err = i915_gem_object_pin_pages_unlocked(obj);
 	if (err) {
 		pr_err("Failed to allocate %u pages (%lu total), err=%d\n",
 		       nreal, obj->base.size / PAGE_SIZE, err);
@@ -797,7 +797,7 @@ static int wc_set(struct drm_i915_gem_object *obj)
 {
 	void *vaddr;
 
-	vaddr = i915_gem_object_pin_map(obj, I915_MAP_WC);
+	vaddr = i915_gem_object_pin_map_unlocked(obj, I915_MAP_WC);
 	if (IS_ERR(vaddr))
 		return PTR_ERR(vaddr);
 
@@ -813,7 +813,7 @@ static int wc_check(struct drm_i915_gem_object *obj)
 	void *vaddr;
 	int err = 0;
 
-	vaddr = i915_gem_object_pin_map(obj, I915_MAP_WC);
+	vaddr = i915_gem_object_pin_map_unlocked(obj, I915_MAP_WC);
 	if (IS_ERR(vaddr))
 		return PTR_ERR(vaddr);
 
@@ -1315,7 +1315,9 @@ static int __igt_mmap_revoke(struct drm_i915_private *i915,
 	}
 
 	if (type != I915_MMAP_TYPE_GTT) {
+		i915_gem_object_lock(obj, NULL);
 		__i915_gem_object_put_pages(obj);
+		i915_gem_object_unlock(obj);
 		if (i915_gem_object_has_pages(obj)) {
 			pr_err("Failed to put-pages object!\n");
 			err = -EINVAL;
-- 
2.26.2

* [RFC PATCH 057/162] drm/i915/selftests: Prepare object tests for obj->mm.lock removal.
  2020-11-27 12:04 [RFC PATCH 000/162] DG1 + LMEM enabling Matthew Auld
                   ` (55 preceding siblings ...)
  2020-11-27 12:05 ` [RFC PATCH 056/162] drm/i915/selftests: Prepare mman testcases " Matthew Auld
@ 2020-11-27 12:05 ` Matthew Auld
  2020-11-27 12:05 ` [RFC PATCH 058/162] drm/i915/selftests: Prepare object blit " Matthew Auld
                   ` (104 subsequent siblings)
  161 siblings, 0 replies; 208+ messages in thread
From: Matthew Auld @ 2020-11-27 12:05 UTC (permalink / raw)
  To: intel-gfx; +Cc: dri-devel, Thomas Hellström

From: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>

Convert a single pin_pages call to use the unlocked version.

Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
---
 drivers/gpu/drm/i915/gem/selftests/i915_gem_object.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/gem/selftests/i915_gem_object.c b/drivers/gpu/drm/i915/gem/selftests/i915_gem_object.c
index bf853c40ec65..740ee8086a27 100644
--- a/drivers/gpu/drm/i915/gem/selftests/i915_gem_object.c
+++ b/drivers/gpu/drm/i915/gem/selftests/i915_gem_object.c
@@ -47,7 +47,7 @@ static int igt_gem_huge(void *arg)
 	if (IS_ERR(obj))
 		return PTR_ERR(obj);
 
-	err = i915_gem_object_pin_pages(obj);
+	err = i915_gem_object_pin_pages_unlocked(obj);
 	if (err) {
 		pr_err("Failed to allocate %u pages (%lu total), err=%d\n",
 		       nreal, obj->base.size / PAGE_SIZE, err);
-- 
2.26.2

* [RFC PATCH 058/162] drm/i915/selftests: Prepare object blit tests for obj->mm.lock removal.
  2020-11-27 12:04 [RFC PATCH 000/162] DG1 + LMEM enabling Matthew Auld
                   ` (56 preceding siblings ...)
  2020-11-27 12:05 ` [RFC PATCH 057/162] drm/i915/selftests: Prepare object tests " Matthew Auld
@ 2020-11-27 12:05 ` Matthew Auld
  2020-11-27 12:05 ` [RFC PATCH 059/162] drm/i915/selftests: Prepare igt_gem_utils " Matthew Auld
                   ` (103 subsequent siblings)
  161 siblings, 0 replies; 208+ messages in thread
From: Matthew Auld @ 2020-11-27 12:05 UTC (permalink / raw)
  To: intel-gfx; +Cc: dri-devel, Thomas Hellström

From: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>

Use the unlocked versions where we're not holding the ww lock.

Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
---
 drivers/gpu/drm/i915/gem/selftests/i915_gem_object_blt.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/selftests/i915_gem_object_blt.c b/drivers/gpu/drm/i915/gem/selftests/i915_gem_object_blt.c
index 23b6e11bbc3e..ee9496f3d11d 100644
--- a/drivers/gpu/drm/i915/gem/selftests/i915_gem_object_blt.c
+++ b/drivers/gpu/drm/i915/gem/selftests/i915_gem_object_blt.c
@@ -262,7 +262,7 @@ static int igt_fill_blt_thread(void *arg)
 			goto err_flush;
 		}
 
-		vaddr = i915_gem_object_pin_map(obj, I915_MAP_WB);
+		vaddr = i915_gem_object_pin_map_unlocked(obj, I915_MAP_WB);
 		if (IS_ERR(vaddr)) {
 			err = PTR_ERR(vaddr);
 			goto err_put;
@@ -380,7 +380,7 @@ static int igt_copy_blt_thread(void *arg)
 			goto err_flush;
 		}
 
-		vaddr = i915_gem_object_pin_map(src, I915_MAP_WB);
+		vaddr = i915_gem_object_pin_map_unlocked(src, I915_MAP_WB);
 		if (IS_ERR(vaddr)) {
 			err = PTR_ERR(vaddr);
 			goto err_put_src;
@@ -400,7 +400,7 @@ static int igt_copy_blt_thread(void *arg)
 			goto err_put_src;
 		}
 
-		vaddr = i915_gem_object_pin_map(dst, I915_MAP_WB);
+		vaddr = i915_gem_object_pin_map_unlocked(dst, I915_MAP_WB);
 		if (IS_ERR(vaddr)) {
 			err = PTR_ERR(vaddr);
 			goto err_put_dst;
-- 
2.26.2

* [RFC PATCH 059/162] drm/i915/selftests: Prepare igt_gem_utils for obj->mm.lock removal
  2020-11-27 12:04 [RFC PATCH 000/162] DG1 + LMEM enabling Matthew Auld
                   ` (57 preceding siblings ...)
  2020-11-27 12:05 ` [RFC PATCH 058/162] drm/i915/selftests: Prepare object blit " Matthew Auld
@ 2020-11-27 12:05 ` Matthew Auld
  2020-11-27 12:05 ` [RFC PATCH 060/162] drm/i915/selftests: Prepare context selftest " Matthew Auld
                   ` (102 subsequent siblings)
  161 siblings, 0 replies; 208+ messages in thread
From: Matthew Auld @ 2020-11-27 12:05 UTC (permalink / raw)
  To: intel-gfx; +Cc: dri-devel, Thomas Hellström

From: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>

igt_emit_store_dw needs to use the unlocked version, as it's not
holding a lock. This fixes igt_gpu_fill_dw(), which is used by
some other selftests.

Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
---
 drivers/gpu/drm/i915/gem/selftests/igt_gem_utils.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/gem/selftests/igt_gem_utils.c b/drivers/gpu/drm/i915/gem/selftests/igt_gem_utils.c
index e21b5023ca7d..f4e85b4a347d 100644
--- a/drivers/gpu/drm/i915/gem/selftests/igt_gem_utils.c
+++ b/drivers/gpu/drm/i915/gem/selftests/igt_gem_utils.c
@@ -54,7 +54,7 @@ igt_emit_store_dw(struct i915_vma *vma,
 	if (IS_ERR(obj))
 		return ERR_CAST(obj);
 
-	cmd = i915_gem_object_pin_map(obj, I915_MAP_WC);
+	cmd = i915_gem_object_pin_map_unlocked(obj, I915_MAP_WC);
 	if (IS_ERR(cmd)) {
 		err = PTR_ERR(cmd);
 		goto err;
-- 
2.26.2

* [RFC PATCH 060/162] drm/i915/selftests: Prepare context selftest for obj->mm.lock removal
  2020-11-27 12:04 [RFC PATCH 000/162] DG1 + LMEM enabling Matthew Auld
                   ` (58 preceding siblings ...)
  2020-11-27 12:05 ` [RFC PATCH 059/162] drm/i915/selftests: Prepare igt_gem_utils " Matthew Auld
@ 2020-11-27 12:05 ` Matthew Auld
  2020-11-27 12:05 ` [RFC PATCH 061/162] drm/i915/selftests: Prepare hangcheck " Matthew Auld
                   ` (101 subsequent siblings)
  161 siblings, 0 replies; 208+ messages in thread
From: Matthew Auld @ 2020-11-27 12:05 UTC (permalink / raw)
  To: intel-gfx; +Cc: dri-devel, Thomas Hellström

From: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>

Only a single call needs to be converted to the unlocked version.

Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
---
 drivers/gpu/drm/i915/gt/selftest_context.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/selftest_context.c b/drivers/gpu/drm/i915/gt/selftest_context.c
index 1f4020e906a8..d9b0ebc938f1 100644
--- a/drivers/gpu/drm/i915/gt/selftest_context.c
+++ b/drivers/gpu/drm/i915/gt/selftest_context.c
@@ -88,8 +88,8 @@ static int __live_context_size(struct intel_engine_cs *engine)
 	if (err)
 		goto err;
 
-	vaddr = i915_gem_object_pin_map(ce->state->obj,
-					i915_coherent_map_type(engine->i915));
+	vaddr = i915_gem_object_pin_map_unlocked(ce->state->obj,
+						 i915_coherent_map_type(engine->i915));
 	if (IS_ERR(vaddr)) {
 		err = PTR_ERR(vaddr);
 		intel_context_unpin(ce);
-- 
2.26.2

* [RFC PATCH 061/162] drm/i915/selftests: Prepare hangcheck for obj->mm.lock removal
  2020-11-27 12:04 [RFC PATCH 000/162] DG1 + LMEM enabling Matthew Auld
                   ` (59 preceding siblings ...)
  2020-11-27 12:05 ` [RFC PATCH 060/162] drm/i915/selftests: Prepare context selftest " Matthew Auld
@ 2020-11-27 12:05 ` Matthew Auld
  2020-11-27 12:05 ` [RFC PATCH 062/162] drm/i915/selftests: Prepare execlists " Matthew Auld
                   ` (100 subsequent siblings)
  161 siblings, 0 replies; 208+ messages in thread
From: Matthew Auld @ 2020-11-27 12:05 UTC (permalink / raw)
  To: intel-gfx; +Cc: dri-devel, Thomas Hellström

From: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>

Convert a few calls to use the unlocked versions.

Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
---
 drivers/gpu/drm/i915/gt/selftest_hangcheck.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/selftest_hangcheck.c b/drivers/gpu/drm/i915/gt/selftest_hangcheck.c
index fb5ebf930ab2..e3027cebab5b 100644
--- a/drivers/gpu/drm/i915/gt/selftest_hangcheck.c
+++ b/drivers/gpu/drm/i915/gt/selftest_hangcheck.c
@@ -80,15 +80,15 @@ static int hang_init(struct hang *h, struct intel_gt *gt)
 	}
 
 	i915_gem_object_set_cache_coherency(h->hws, I915_CACHE_LLC);
-	vaddr = i915_gem_object_pin_map(h->hws, I915_MAP_WB);
+	vaddr = i915_gem_object_pin_map_unlocked(h->hws, I915_MAP_WB);
 	if (IS_ERR(vaddr)) {
 		err = PTR_ERR(vaddr);
 		goto err_obj;
 	}
 	h->seqno = memset(vaddr, 0xff, PAGE_SIZE);
 
-	vaddr = i915_gem_object_pin_map(h->obj,
-					i915_coherent_map_type(gt->i915));
+	vaddr = i915_gem_object_pin_map_unlocked(h->obj,
+						 i915_coherent_map_type(gt->i915));
 	if (IS_ERR(vaddr)) {
 		err = PTR_ERR(vaddr);
 		goto err_unpin_hws;
@@ -149,7 +149,7 @@ hang_create_request(struct hang *h, struct intel_engine_cs *engine)
 		return ERR_CAST(obj);
 	}
 
-	vaddr = i915_gem_object_pin_map(obj, i915_coherent_map_type(gt->i915));
+	vaddr = i915_gem_object_pin_map_unlocked(obj, i915_coherent_map_type(gt->i915));
 	if (IS_ERR(vaddr)) {
 		i915_gem_object_put(obj);
 		i915_vm_put(vm);
-- 
2.26.2

* [RFC PATCH 062/162] drm/i915/selftests: Prepare execlists for obj->mm.lock removal
  2020-11-27 12:04 [RFC PATCH 000/162] DG1 + LMEM enabling Matthew Auld
                   ` (60 preceding siblings ...)
  2020-11-27 12:05 ` [RFC PATCH 061/162] drm/i915/selftests: Prepare hangcheck " Matthew Auld
@ 2020-11-27 12:05 ` Matthew Auld
  2020-11-27 12:05 ` [RFC PATCH 063/162] drm/i915/selftests: Prepare mocs tests " Matthew Auld
                   ` (99 subsequent siblings)
  161 siblings, 0 replies; 208+ messages in thread
From: Matthew Auld @ 2020-11-27 12:05 UTC (permalink / raw)
  To: intel-gfx; +Cc: dri-devel, Thomas Hellström

From: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>

Convert the normal function calls to their unlocked versions where needed.

Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
---
 drivers/gpu/drm/i915/gt/selftest_execlists.c | 34 ++++++++++----------
 1 file changed, 17 insertions(+), 17 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/selftest_execlists.c b/drivers/gpu/drm/i915/gt/selftest_execlists.c
index 95d41c01d0e0..124011f6fb51 100644
--- a/drivers/gpu/drm/i915/gt/selftest_execlists.c
+++ b/drivers/gpu/drm/i915/gt/selftest_execlists.c
@@ -1007,7 +1007,7 @@ static int live_timeslice_preempt(void *arg)
 		goto err_obj;
 	}
 
-	vaddr = i915_gem_object_pin_map(obj, I915_MAP_WC);
+	vaddr = i915_gem_object_pin_map_unlocked(obj, I915_MAP_WC);
 	if (IS_ERR(vaddr)) {
 		err = PTR_ERR(vaddr);
 		goto err_obj;
@@ -1315,7 +1315,7 @@ static int live_timeslice_queue(void *arg)
 		goto err_obj;
 	}
 
-	vaddr = i915_gem_object_pin_map(obj, I915_MAP_WC);
+	vaddr = i915_gem_object_pin_map_unlocked(obj, I915_MAP_WC);
 	if (IS_ERR(vaddr)) {
 		err = PTR_ERR(vaddr);
 		goto err_obj;
@@ -1562,7 +1562,7 @@ static int live_busywait_preempt(void *arg)
 		goto err_ctx_lo;
 	}
 
-	map = i915_gem_object_pin_map(obj, I915_MAP_WC);
+	map = i915_gem_object_pin_map_unlocked(obj, I915_MAP_WC);
 	if (IS_ERR(map)) {
 		err = PTR_ERR(map);
 		goto err_obj;
@@ -2678,7 +2678,7 @@ static int create_gang(struct intel_engine_cs *engine,
 	if (err)
 		goto err_obj;
 
-	cs = i915_gem_object_pin_map(obj, I915_MAP_WC);
+	cs = i915_gem_object_pin_map_unlocked(obj, I915_MAP_WC);
 	if (IS_ERR(cs))
 		goto err_obj;
 
@@ -2960,7 +2960,7 @@ static int live_preempt_gang(void *arg)
 		 * it will terminate the next lowest spinner until there
 		 * are no more spinners and the gang is complete.
 		 */
-		cs = i915_gem_object_pin_map(rq->batch->obj, I915_MAP_WC);
+		cs = i915_gem_object_pin_map_unlocked(rq->batch->obj, I915_MAP_WC);
 		if (!IS_ERR(cs)) {
 			*cs = 0;
 			i915_gem_object_unpin_map(rq->batch->obj);
@@ -3025,7 +3025,7 @@ create_gpr_user(struct intel_engine_cs *engine,
 		return ERR_PTR(err);
 	}
 
-	cs = i915_gem_object_pin_map(obj, I915_MAP_WC);
+	cs = i915_gem_object_pin_map_unlocked(obj, I915_MAP_WC);
 	if (IS_ERR(cs)) {
 		i915_vma_put(vma);
 		return ERR_CAST(cs);
@@ -3235,7 +3235,7 @@ static int live_preempt_user(void *arg)
 	if (IS_ERR(global))
 		return PTR_ERR(global);
 
-	result = i915_gem_object_pin_map(global->obj, I915_MAP_WC);
+	result = i915_gem_object_pin_map_unlocked(global->obj, I915_MAP_WC);
 	if (IS_ERR(result)) {
 		i915_vma_unpin_and_release(&global, 0);
 		return PTR_ERR(result);
@@ -3628,7 +3628,7 @@ static int live_preempt_smoke(void *arg)
 		goto err_free;
 	}
 
-	cs = i915_gem_object_pin_map(smoke.batch, I915_MAP_WB);
+	cs = i915_gem_object_pin_map_unlocked(smoke.batch, I915_MAP_WB);
 	if (IS_ERR(cs)) {
 		err = PTR_ERR(cs);
 		goto err_batch;
@@ -4231,7 +4231,7 @@ static int preserved_virtual_engine(struct intel_gt *gt,
 		goto out_end;
 	}
 
-	cs = i915_gem_object_pin_map(scratch->obj, I915_MAP_WB);
+	cs = i915_gem_object_pin_map_unlocked(scratch->obj, I915_MAP_WB);
 	if (IS_ERR(cs)) {
 		err = PTR_ERR(cs);
 		goto out_end;
@@ -5259,7 +5259,7 @@ static int __live_lrc_gpr(struct intel_engine_cs *engine,
 		goto err_rq;
 	}
 
-	cs = i915_gem_object_pin_map(scratch->obj, I915_MAP_WB);
+	cs = i915_gem_object_pin_map_unlocked(scratch->obj, I915_MAP_WB);
 	if (IS_ERR(cs)) {
 		err = PTR_ERR(cs);
 		goto err_rq;
@@ -5553,7 +5553,7 @@ store_context(struct intel_context *ce, struct i915_vma *scratch)
 	if (IS_ERR(batch))
 		return batch;
 
-	cs = i915_gem_object_pin_map(batch->obj, I915_MAP_WC);
+	cs = i915_gem_object_pin_map_unlocked(batch->obj, I915_MAP_WC);
 	if (IS_ERR(cs)) {
 		i915_vma_put(batch);
 		return ERR_CAST(cs);
@@ -5717,7 +5717,7 @@ static struct i915_vma *load_context(struct intel_context *ce, u32 poison)
 	if (IS_ERR(batch))
 		return batch;
 
-	cs = i915_gem_object_pin_map(batch->obj, I915_MAP_WC);
+	cs = i915_gem_object_pin_map_unlocked(batch->obj, I915_MAP_WC);
 	if (IS_ERR(cs)) {
 		i915_vma_put(batch);
 		return ERR_CAST(cs);
@@ -5831,29 +5831,29 @@ static int compare_isolation(struct intel_engine_cs *engine,
 	u32 *defaults;
 	int err = 0;
 
-	A[0] = i915_gem_object_pin_map(ref[0]->obj, I915_MAP_WC);
+	A[0] = i915_gem_object_pin_map_unlocked(ref[0]->obj, I915_MAP_WC);
 	if (IS_ERR(A[0]))
 		return PTR_ERR(A[0]);
 
-	A[1] = i915_gem_object_pin_map(ref[1]->obj, I915_MAP_WC);
+	A[1] = i915_gem_object_pin_map_unlocked(ref[1]->obj, I915_MAP_WC);
 	if (IS_ERR(A[1])) {
 		err = PTR_ERR(A[1]);
 		goto err_A0;
 	}
 
-	B[0] = i915_gem_object_pin_map(result[0]->obj, I915_MAP_WC);
+	B[0] = i915_gem_object_pin_map_unlocked(result[0]->obj, I915_MAP_WC);
 	if (IS_ERR(B[0])) {
 		err = PTR_ERR(B[0]);
 		goto err_A1;
 	}
 
-	B[1] = i915_gem_object_pin_map(result[1]->obj, I915_MAP_WC);
+	B[1] = i915_gem_object_pin_map_unlocked(result[1]->obj, I915_MAP_WC);
 	if (IS_ERR(B[1])) {
 		err = PTR_ERR(B[1]);
 		goto err_B0;
 	}
 
-	lrc = i915_gem_object_pin_map(ce->state->obj,
+	lrc = i915_gem_object_pin_map_unlocked(ce->state->obj,
 				      i915_coherent_map_type(engine->i915));
 	if (IS_ERR(lrc)) {
 		err = PTR_ERR(lrc);
-- 
2.26.2

* [RFC PATCH 063/162] drm/i915/selftests: Prepare mocs tests for obj->mm.lock removal
  2020-11-27 12:04 [RFC PATCH 000/162] DG1 + LMEM enabling Matthew Auld
                   ` (61 preceding siblings ...)
  2020-11-27 12:05 ` [RFC PATCH 062/162] drm/i915/selftests: Prepare execlists " Matthew Auld
@ 2020-11-27 12:05 ` Matthew Auld
  2020-11-27 12:05 ` [RFC PATCH 064/162] drm/i915/selftests: Prepare ring submission " Matthew Auld
                   ` (98 subsequent siblings)
  161 siblings, 0 replies; 208+ messages in thread
From: Matthew Auld @ 2020-11-27 12:05 UTC (permalink / raw)
  To: intel-gfx; +Cc: dri-devel, Thomas Hellström

From: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>

Use pin_map_unlocked when we're not holding locks.

Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
---
 drivers/gpu/drm/i915/gt/selftest_mocs.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/gt/selftest_mocs.c b/drivers/gpu/drm/i915/gt/selftest_mocs.c
index 21dcd91cbd62..eadb41b76d33 100644
--- a/drivers/gpu/drm/i915/gt/selftest_mocs.c
+++ b/drivers/gpu/drm/i915/gt/selftest_mocs.c
@@ -105,7 +105,7 @@ static int live_mocs_init(struct live_mocs *arg, struct intel_gt *gt)
 	if (IS_ERR(arg->scratch))
 		return PTR_ERR(arg->scratch);
 
-	arg->vaddr = i915_gem_object_pin_map(arg->scratch->obj, I915_MAP_WB);
+	arg->vaddr = i915_gem_object_pin_map_unlocked(arg->scratch->obj, I915_MAP_WB);
 	if (IS_ERR(arg->vaddr)) {
 		err = PTR_ERR(arg->vaddr);
 		goto err_scratch;
-- 
2.26.2

* [RFC PATCH 064/162] drm/i915/selftests: Prepare ring submission for obj->mm.lock removal
  2020-11-27 12:04 [RFC PATCH 000/162] DG1 + LMEM enabling Matthew Auld
                   ` (62 preceding siblings ...)
  2020-11-27 12:05 ` [RFC PATCH 063/162] drm/i915/selftests: Prepare mocs tests " Matthew Auld
@ 2020-11-27 12:05 ` Matthew Auld
  2020-11-27 12:05 ` [RFC PATCH 065/162] drm/i915/selftests: Prepare timeline tests " Matthew Auld
                   ` (97 subsequent siblings)
  161 siblings, 0 replies; 208+ messages in thread
From: Matthew Auld @ 2020-11-27 12:05 UTC (permalink / raw)
  To: intel-gfx; +Cc: dri-devel, Thomas Hellström

From: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>

Use unlocked versions when the ww lock is not held.

Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
---
 drivers/gpu/drm/i915/gt/selftest_ring_submission.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/selftest_ring_submission.c b/drivers/gpu/drm/i915/gt/selftest_ring_submission.c
index 3350e7c995bc..99609271c3a7 100644
--- a/drivers/gpu/drm/i915/gt/selftest_ring_submission.c
+++ b/drivers/gpu/drm/i915/gt/selftest_ring_submission.c
@@ -35,7 +35,7 @@ static struct i915_vma *create_wally(struct intel_engine_cs *engine)
 		return ERR_PTR(err);
 	}
 
-	cs = i915_gem_object_pin_map(obj, I915_MAP_WC);
+	cs = i915_gem_object_pin_map_unlocked(obj, I915_MAP_WC);
 	if (IS_ERR(cs)) {
 		i915_gem_object_put(obj);
 		return ERR_CAST(cs);
@@ -212,7 +212,7 @@ static int __live_ctx_switch_wa(struct intel_engine_cs *engine)
 	if (IS_ERR(bb))
 		return PTR_ERR(bb);
 
-	result = i915_gem_object_pin_map(bb->obj, I915_MAP_WC);
+	result = i915_gem_object_pin_map_unlocked(bb->obj, I915_MAP_WC);
 	if (IS_ERR(result)) {
 		intel_context_put(bb->private);
 		i915_vma_unpin_and_release(&bb, 0);
-- 
2.26.2

* [RFC PATCH 065/162] drm/i915/selftests: Prepare timeline tests for obj->mm.lock removal
  2020-11-27 12:04 [RFC PATCH 000/162] DG1 + LMEM enabling Matthew Auld
                   ` (63 preceding siblings ...)
  2020-11-27 12:05 ` [RFC PATCH 064/162] drm/i915/selftests: Prepare ring submission " Matthew Auld
@ 2020-11-27 12:05 ` Matthew Auld
  2020-11-27 12:05 ` [RFC PATCH 066/162] drm/i915/selftests: Prepare i915_request " Matthew Auld
                   ` (96 subsequent siblings)
  161 siblings, 0 replies; 208+ messages in thread
From: Matthew Auld @ 2020-11-27 12:05 UTC (permalink / raw)
  To: intel-gfx; +Cc: dri-devel, Thomas Hellström

From: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>

We can no longer call intel_timeline_pin() with a NULL ww context,
so add a ww loop that locks the backing object.

Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
---
 drivers/gpu/drm/i915/gt/selftest_timeline.c | 28 ++++++++++++++++++---
 1 file changed, 24 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/selftest_timeline.c b/drivers/gpu/drm/i915/gt/selftest_timeline.c
index 7435abf5a703..d468147a03de 100644
--- a/drivers/gpu/drm/i915/gt/selftest_timeline.c
+++ b/drivers/gpu/drm/i915/gt/selftest_timeline.c
@@ -37,6 +37,26 @@ static unsigned long hwsp_cacheline(struct intel_timeline *tl)
 	return (address + offset_in_page(tl->hwsp_offset)) / CACHELINE_BYTES;
 }
 
+static int selftest_tl_pin(struct intel_timeline *tl)
+{
+	struct i915_gem_ww_ctx ww;
+	int err;
+
+	i915_gem_ww_ctx_init(&ww, false);
+retry:
+	err = i915_gem_object_lock(tl->hwsp_ggtt->obj, &ww);
+	if (!err)
+		err = intel_timeline_pin(tl, &ww);
+
+	if (err == -EDEADLK) {
+		err = i915_gem_ww_ctx_backoff(&ww);
+		if (!err)
+			goto retry;
+	}
+	i915_gem_ww_ctx_fini(&ww);
+	return err;
+}
+
 #define CACHELINES_PER_PAGE (PAGE_SIZE / CACHELINE_BYTES)
 
 struct mock_hwsp_freelist {
@@ -78,7 +98,7 @@ static int __mock_hwsp_timeline(struct mock_hwsp_freelist *state,
 		if (IS_ERR(tl))
 			return PTR_ERR(tl);
 
-		err = intel_timeline_pin(tl, NULL);
+		err = selftest_tl_pin(tl);
 		if (err) {
 			intel_timeline_put(tl);
 			return err;
@@ -464,7 +484,7 @@ checked_tl_write(struct intel_timeline *tl, struct intel_engine_cs *engine, u32
 	struct i915_request *rq;
 	int err;
 
-	err = intel_timeline_pin(tl, NULL);
+	err = selftest_tl_pin(tl);
 	if (err) {
 		rq = ERR_PTR(err);
 		goto out;
@@ -664,7 +684,7 @@ static int live_hwsp_wrap(void *arg)
 	if (!tl->has_initial_breadcrumb || !tl->hwsp_cacheline)
 		goto out_free;
 
-	err = intel_timeline_pin(tl, NULL);
+	err = selftest_tl_pin(tl);
 	if (err)
 		goto out_free;
 
@@ -811,7 +831,7 @@ static int setup_watcher(struct hwsp_watcher *w, struct intel_gt *gt)
 	if (IS_ERR(obj))
 		return PTR_ERR(obj);
 
-	w->map = i915_gem_object_pin_map(obj, I915_MAP_WB);
+	w->map = i915_gem_object_pin_map_unlocked(obj, I915_MAP_WB);
 	if (IS_ERR(w->map)) {
 		i915_gem_object_put(obj);
 		return PTR_ERR(w->map);
-- 
2.26.2

* [RFC PATCH 066/162] drm/i915/selftests: Prepare i915_request tests for obj->mm.lock removal
  2020-11-27 12:04 [RFC PATCH 000/162] DG1 + LMEM enabling Matthew Auld
                   ` (64 preceding siblings ...)
  2020-11-27 12:05 ` [RFC PATCH 065/162] drm/i915/selftests: Prepare timeline tests " Matthew Auld
@ 2020-11-27 12:05 ` Matthew Auld
  2020-11-27 12:05 ` [RFC PATCH 067/162] drm/i915/selftests: Prepare memory region " Matthew Auld
                   ` (95 subsequent siblings)
  161 siblings, 0 replies; 208+ messages in thread
From: Matthew Auld @ 2020-11-27 12:05 UTC (permalink / raw)
  To: intel-gfx; +Cc: dri-devel, Thomas Hellström

From: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>

A straightforward conversion, using the unlocked versions.

Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
---
 drivers/gpu/drm/i915/selftests/i915_request.c | 10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/i915/selftests/i915_request.c b/drivers/gpu/drm/i915/selftests/i915_request.c
index e424a6d1a68c..514fa109e40f 100644
--- a/drivers/gpu/drm/i915/selftests/i915_request.c
+++ b/drivers/gpu/drm/i915/selftests/i915_request.c
@@ -619,7 +619,7 @@ static struct i915_vma *empty_batch(struct drm_i915_private *i915)
 	if (IS_ERR(obj))
 		return ERR_CAST(obj);
 
-	cmd = i915_gem_object_pin_map(obj, I915_MAP_WB);
+	cmd = i915_gem_object_pin_map_unlocked(obj, I915_MAP_WB);
 	if (IS_ERR(cmd)) {
 		err = PTR_ERR(cmd);
 		goto err;
@@ -781,7 +781,7 @@ static struct i915_vma *recursive_batch(struct drm_i915_private *i915)
 	if (err)
 		goto err;
 
-	cmd = i915_gem_object_pin_map(obj, I915_MAP_WC);
+	cmd = i915_gem_object_pin_map_unlocked(obj, I915_MAP_WC);
 	if (IS_ERR(cmd)) {
 		err = PTR_ERR(cmd);
 		goto err;
@@ -816,7 +816,7 @@ static int recursive_batch_resolve(struct i915_vma *batch)
 {
 	u32 *cmd;
 
-	cmd = i915_gem_object_pin_map(batch->obj, I915_MAP_WC);
+	cmd = i915_gem_object_pin_map_unlocked(batch->obj, I915_MAP_WC);
 	if (IS_ERR(cmd))
 		return PTR_ERR(cmd);
 
@@ -1069,8 +1069,8 @@ static int live_sequential_engines(void *arg)
 		if (!request[idx])
 			break;
 
-		cmd = i915_gem_object_pin_map(request[idx]->batch->obj,
-					      I915_MAP_WC);
+		cmd = i915_gem_object_pin_map_unlocked(request[idx]->batch->obj,
+						       I915_MAP_WC);
 		if (!IS_ERR(cmd)) {
 			*cmd = MI_BATCH_BUFFER_END;
 
-- 
2.26.2

* [RFC PATCH 067/162] drm/i915/selftests: Prepare memory region tests for obj->mm.lock removal
  2020-11-27 12:04 [RFC PATCH 000/162] DG1 + LMEM enabling Matthew Auld
                   ` (65 preceding siblings ...)
  2020-11-27 12:05 ` [RFC PATCH 066/162] drm/i915/selftests: Prepare i915_request " Matthew Auld
@ 2020-11-27 12:05 ` Matthew Auld
  2020-11-27 12:05 ` [RFC PATCH 068/162] drm/i915/selftests: Prepare cs engine " Matthew Auld
                   ` (94 subsequent siblings)
  161 siblings, 0 replies; 208+ messages in thread
From: Matthew Auld @ 2020-11-27 12:05 UTC (permalink / raw)
  To: intel-gfx; +Cc: dri-devel, Thomas Hellström

From: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>

Use the unlocked variants for pin_map and pin_pages, and add locking
around unpinning/putting pages.

Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
---
 .../drm/i915/selftests/intel_memory_region.c   | 18 +++++++++++-------
 1 file changed, 11 insertions(+), 7 deletions(-)

diff --git a/drivers/gpu/drm/i915/selftests/intel_memory_region.c b/drivers/gpu/drm/i915/selftests/intel_memory_region.c
index 27389fb19951..9c20b7065fc5 100644
--- a/drivers/gpu/drm/i915/selftests/intel_memory_region.c
+++ b/drivers/gpu/drm/i915/selftests/intel_memory_region.c
@@ -31,10 +31,12 @@ static void close_objects(struct intel_memory_region *mem,
 	struct drm_i915_gem_object *obj, *on;
 
 	list_for_each_entry_safe(obj, on, objects, st_link) {
+		i915_gem_object_lock(obj, NULL);
 		if (i915_gem_object_has_pinned_pages(obj))
 			i915_gem_object_unpin_pages(obj);
 		/* No polluting the memory region between tests */
 		__i915_gem_object_put_pages(obj);
+		i915_gem_object_unlock(obj);
 		list_del(&obj->st_link);
 		i915_gem_object_put(obj);
 	}
@@ -69,7 +71,7 @@ static int igt_mock_fill(void *arg)
 			break;
 		}
 
-		err = i915_gem_object_pin_pages(obj);
+		err = i915_gem_object_pin_pages_unlocked(obj);
 		if (err) {
 			i915_gem_object_put(obj);
 			break;
@@ -109,7 +111,7 @@ igt_object_create(struct intel_memory_region *mem,
 	if (IS_ERR(obj))
 		return obj;
 
-	err = i915_gem_object_pin_pages(obj);
+	err = i915_gem_object_pin_pages_unlocked(obj);
 	if (err)
 		goto put;
 
@@ -123,8 +125,10 @@ igt_object_create(struct intel_memory_region *mem,
 
 static void igt_object_release(struct drm_i915_gem_object *obj)
 {
+	i915_gem_object_lock(obj, NULL);
 	i915_gem_object_unpin_pages(obj);
 	__i915_gem_object_put_pages(obj);
+	i915_gem_object_unlock(obj);
 	list_del(&obj->st_link);
 	i915_gem_object_put(obj);
 }
@@ -356,7 +360,7 @@ static int igt_cpu_check(struct drm_i915_gem_object *obj, u32 dword, u32 val)
 	if (err)
 		return err;
 
-	ptr = i915_gem_object_pin_map(obj, I915_MAP_WC);
+	ptr = i915_gem_object_pin_map_unlocked(obj, I915_MAP_WC);
 	if (IS_ERR(ptr))
 		return PTR_ERR(ptr);
 
@@ -461,7 +465,7 @@ static int igt_lmem_create(void *arg)
 	if (IS_ERR(obj))
 		return PTR_ERR(obj);
 
-	err = i915_gem_object_pin_pages(obj);
+	err = i915_gem_object_pin_pages_unlocked(obj);
 	if (err)
 		goto out_put;
 
@@ -500,7 +504,7 @@ static int igt_lmem_write_gpu(void *arg)
 		goto out_file;
 	}
 
-	err = i915_gem_object_pin_pages(obj);
+	err = i915_gem_object_pin_pages_unlocked(obj);
 	if (err)
 		goto out_put;
 
@@ -572,7 +576,7 @@ static int igt_lmem_write_cpu(void *arg)
 	if (IS_ERR(obj))
 		return PTR_ERR(obj);
 
-	vaddr = i915_gem_object_pin_map(obj, I915_MAP_WC);
+	vaddr = i915_gem_object_pin_map_unlocked(obj, I915_MAP_WC);
 	if (IS_ERR(vaddr)) {
 		err = PTR_ERR(vaddr);
 		goto out_put;
@@ -676,7 +680,7 @@ create_region_for_mapping(struct intel_memory_region *mr, u64 size, u32 type,
 		return obj;
 	}
 
-	addr = i915_gem_object_pin_map(obj, type);
+	addr = i915_gem_object_pin_map_unlocked(obj, type);
 	if (IS_ERR(addr)) {
 		i915_gem_object_put(obj);
 		if (PTR_ERR(addr) == -ENXIO)
-- 
2.26.2

* [RFC PATCH 068/162] drm/i915/selftests: Prepare cs engine tests for obj->mm.lock removal
  2020-11-27 12:04 [RFC PATCH 000/162] DG1 + LMEM enabling Matthew Auld
                   ` (66 preceding siblings ...)
  2020-11-27 12:05 ` [RFC PATCH 067/162] drm/i915/selftests: Prepare memory region " Matthew Auld
@ 2020-11-27 12:05 ` Matthew Auld
  2020-11-27 12:05 ` [RFC PATCH 069/162] drm/i915/selftests: Prepare gtt " Matthew Auld
                   ` (93 subsequent siblings)
  161 siblings, 0 replies; 208+ messages in thread
From: Matthew Auld @ 2020-11-27 12:05 UTC (permalink / raw)
  To: intel-gfx; +Cc: dri-devel, Thomas Hellström

From: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>

Same as the other tests: use pin_map_unlocked.

Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
---
 drivers/gpu/drm/i915/gt/selftest_engine_cs.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/selftest_engine_cs.c b/drivers/gpu/drm/i915/gt/selftest_engine_cs.c
index 729c3c7b11e2..853d1f02131a 100644
--- a/drivers/gpu/drm/i915/gt/selftest_engine_cs.c
+++ b/drivers/gpu/drm/i915/gt/selftest_engine_cs.c
@@ -72,7 +72,7 @@ static struct i915_vma *create_empty_batch(struct intel_context *ce)
 	if (IS_ERR(obj))
 		return ERR_CAST(obj);
 
-	cs = i915_gem_object_pin_map(obj, I915_MAP_WB);
+	cs = i915_gem_object_pin_map_unlocked(obj, I915_MAP_WB);
 	if (IS_ERR(cs)) {
 		err = PTR_ERR(cs);
 		goto err_put;
@@ -208,7 +208,7 @@ static struct i915_vma *create_nop_batch(struct intel_context *ce)
 	if (IS_ERR(obj))
 		return ERR_CAST(obj);
 
-	cs = i915_gem_object_pin_map(obj, I915_MAP_WB);
+	cs = i915_gem_object_pin_map_unlocked(obj, I915_MAP_WB);
 	if (IS_ERR(cs)) {
 		err = PTR_ERR(cs);
 		goto err_put;
-- 
2.26.2

* [RFC PATCH 069/162] drm/i915/selftests: Prepare gtt tests for obj->mm.lock removal
  2020-11-27 12:04 [RFC PATCH 000/162] DG1 + LMEM enabling Matthew Auld
                   ` (67 preceding siblings ...)
  2020-11-27 12:05 ` [RFC PATCH 068/162] drm/i915/selftests: Prepare cs engine " Matthew Auld
@ 2020-11-27 12:05 ` Matthew Auld
  2020-11-27 12:05 ` [RFC PATCH 070/162] drm/i915: Finally remove obj->mm.lock Matthew Auld
                   ` (92 subsequent siblings)
  161 siblings, 0 replies; 208+ messages in thread
From: Matthew Auld @ 2020-11-27 12:05 UTC (permalink / raw)
  To: intel-gfx; +Cc: dri-devel, Thomas Hellström

From: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>

We need to lock the global GTT dma_resv; use i915_vm_lock_objects
to handle this correctly. Add ww handling for this where required.

Add the object lock around unpin/put pages, and use the unlocked
versions of pin_pages and pin_map where required.

Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
---
 drivers/gpu/drm/i915/selftests/i915_gem_gtt.c | 92 ++++++++++++++-----
 1 file changed, 67 insertions(+), 25 deletions(-)

diff --git a/drivers/gpu/drm/i915/selftests/i915_gem_gtt.c b/drivers/gpu/drm/i915/selftests/i915_gem_gtt.c
index 2cfe99c79034..d07dd6780005 100644
--- a/drivers/gpu/drm/i915/selftests/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/selftests/i915_gem_gtt.c
@@ -129,7 +129,7 @@ fake_dma_object(struct drm_i915_private *i915, u64 size)
 	obj->cache_level = I915_CACHE_NONE;
 
 	/* Preallocate the "backing storage" */
-	if (i915_gem_object_pin_pages(obj))
+	if (i915_gem_object_pin_pages_unlocked(obj))
 		goto err_obj;
 
 	i915_gem_object_unpin_pages(obj);
@@ -145,6 +145,7 @@ static int igt_ppgtt_alloc(void *arg)
 {
 	struct drm_i915_private *dev_priv = arg;
 	struct i915_ppgtt *ppgtt;
+	struct i915_gem_ww_ctx ww;
 	u64 size, last, limit;
 	int err = 0;
 
@@ -170,6 +171,12 @@ static int igt_ppgtt_alloc(void *arg)
 	limit = totalram_pages() << PAGE_SHIFT;
 	limit = min(ppgtt->vm.total, limit);
 
+	i915_gem_ww_ctx_init(&ww, false);
+retry:
+	err = i915_vm_lock_objects(&ppgtt->vm, &ww);
+	if (err)
+		goto err_ppgtt_cleanup;
+
 	/* Check we can allocate the entire range */
 	for (size = 4096; size <= limit; size <<= 2) {
 		struct i915_vm_pt_stash stash = {};
@@ -214,6 +221,13 @@ static int igt_ppgtt_alloc(void *arg)
 	}
 
 err_ppgtt_cleanup:
+	if (err == -EDEADLK) {
+		err = i915_gem_ww_ctx_backoff(&ww);
+		if (!err)
+			goto retry;
+	}
+	i915_gem_ww_ctx_fini(&ww);
+
 	i915_vm_put(&ppgtt->vm);
 	return err;
 }
@@ -275,7 +289,7 @@ static int lowlevel_hole(struct i915_address_space *vm,
 
 		GEM_BUG_ON(obj->base.size != BIT_ULL(size));
 
-		if (i915_gem_object_pin_pages(obj)) {
+		if (i915_gem_object_pin_pages_unlocked(obj)) {
 			i915_gem_object_put(obj);
 			kfree(order);
 			break;
@@ -296,20 +310,36 @@ static int lowlevel_hole(struct i915_address_space *vm,
 
 			if (vm->allocate_va_range) {
 				struct i915_vm_pt_stash stash = {};
+				struct i915_gem_ww_ctx ww;
+				int err;
+
+				i915_gem_ww_ctx_init(&ww, false);
+retry:
+				err = i915_vm_lock_objects(vm, &ww);
+				if (err)
+					goto alloc_vm_end;
 
+				err = -ENOMEM;
 				if (i915_vm_alloc_pt_stash(vm, &stash,
 							   BIT_ULL(size)))
-					break;
-
-				if (i915_vm_pin_pt_stash(vm, &stash)) {
-					i915_vm_free_pt_stash(vm, &stash);
-					break;
-				}
+					goto alloc_vm_end;
 
-				vm->allocate_va_range(vm, &stash,
-						      addr, BIT_ULL(size));
+				err = i915_vm_pin_pt_stash(vm, &stash);
+				if (!err)
+					vm->allocate_va_range(vm, &stash,
+							      addr, BIT_ULL(size));
 
 				i915_vm_free_pt_stash(vm, &stash);
+alloc_vm_end:
+				if (err == -EDEADLK) {
+					err = i915_gem_ww_ctx_backoff(&ww);
+					if (!err)
+						goto retry;
+				}
+				i915_gem_ww_ctx_fini(&ww);
+
+				if (err)
+					break;
 			}
 
 			mock_vma->pages = obj->mm.pages;
@@ -1165,7 +1195,7 @@ static int igt_ggtt_page(void *arg)
 	if (IS_ERR(obj))
 		return PTR_ERR(obj);
 
-	err = i915_gem_object_pin_pages(obj);
+	err = i915_gem_object_pin_pages_unlocked(obj);
 	if (err)
 		goto out_free;
 
@@ -1332,7 +1362,7 @@ static int igt_gtt_reserve(void *arg)
 			goto out;
 		}
 
-		err = i915_gem_object_pin_pages(obj);
+		err = i915_gem_object_pin_pages_unlocked(obj);
 		if (err) {
 			i915_gem_object_put(obj);
 			goto out;
@@ -1384,7 +1414,7 @@ static int igt_gtt_reserve(void *arg)
 			goto out;
 		}
 
-		err = i915_gem_object_pin_pages(obj);
+		err = i915_gem_object_pin_pages_unlocked(obj);
 		if (err) {
 			i915_gem_object_put(obj);
 			goto out;
@@ -1548,7 +1578,7 @@ static int igt_gtt_insert(void *arg)
 			goto out;
 		}
 
-		err = i915_gem_object_pin_pages(obj);
+		err = i915_gem_object_pin_pages_unlocked(obj);
 		if (err) {
 			i915_gem_object_put(obj);
 			goto out;
@@ -1657,7 +1687,7 @@ static int igt_gtt_insert(void *arg)
 			goto out;
 		}
 
-		err = i915_gem_object_pin_pages(obj);
+		err = i915_gem_object_pin_pages_unlocked(obj);
 		if (err) {
 			i915_gem_object_put(obj);
 			goto out;
@@ -1828,7 +1858,7 @@ static int igt_cs_tlb(void *arg)
 		goto out_vm;
 	}
 
-	batch = i915_gem_object_pin_map(bbe, I915_MAP_WC);
+	batch = i915_gem_object_pin_map_unlocked(bbe, I915_MAP_WC);
 	if (IS_ERR(batch)) {
 		err = PTR_ERR(batch);
 		goto out_put_bbe;
@@ -1844,7 +1874,7 @@ static int igt_cs_tlb(void *arg)
 	}
 
 	/* Track the execution of each request by writing into different slot */
-	batch = i915_gem_object_pin_map(act, I915_MAP_WC);
+	batch = i915_gem_object_pin_map_unlocked(act, I915_MAP_WC);
 	if (IS_ERR(batch)) {
 		err = PTR_ERR(batch);
 		goto out_put_act;
@@ -1891,7 +1921,7 @@ static int igt_cs_tlb(void *arg)
 		goto out_put_out;
 	GEM_BUG_ON(vma->node.start != vm->total - PAGE_SIZE);
 
-	result = i915_gem_object_pin_map(out, I915_MAP_WB);
+	result = i915_gem_object_pin_map_unlocked(out, I915_MAP_WB);
 	if (IS_ERR(result)) {
 		err = PTR_ERR(result);
 		goto out_put_out;
@@ -1907,6 +1937,7 @@ static int igt_cs_tlb(void *arg)
 		while (!__igt_timeout(end_time, NULL)) {
 			struct i915_vm_pt_stash stash = {};
 			struct i915_request *rq;
+			struct i915_gem_ww_ctx ww;
 			u64 offset;
 
 			offset = igt_random_offset(&prng,
@@ -1925,19 +1956,30 @@ static int igt_cs_tlb(void *arg)
 			if (err)
 				goto end;
 
+			i915_gem_ww_ctx_init(&ww, false);
+retry:
+			err = i915_vm_lock_objects(vm, &ww);
+			if (err)
+				goto end_ww;
+
 			err = i915_vm_alloc_pt_stash(vm, &stash, chunk_size);
 			if (err)
-				goto end;
+				goto end_ww;
 
 			err = i915_vm_pin_pt_stash(vm, &stash);
-			if (err) {
-				i915_vm_free_pt_stash(vm, &stash);
-				goto end;
-			}
-
-			vm->allocate_va_range(vm, &stash, offset, chunk_size);
+			if (!err)
+				vm->allocate_va_range(vm, &stash, offset, chunk_size);
 
 			i915_vm_free_pt_stash(vm, &stash);
+end_ww:
+			if (err == -EDEADLK) {
+				err = i915_gem_ww_ctx_backoff(&ww);
+				if (!err)
+					goto retry;
+			}
+			i915_gem_ww_ctx_fini(&ww);
+			if (err)
+				goto end;
 
 			/* Prime the TLB with the dummy pages */
 			for (i = 0; i < count; i++) {
-- 
2.26.2

* [RFC PATCH 070/162] drm/i915: Finally remove obj->mm.lock.
  2020-11-27 12:04 [RFC PATCH 000/162] DG1 + LMEM enabling Matthew Auld
                   ` (68 preceding siblings ...)
  2020-11-27 12:05 ` [RFC PATCH 069/162] drm/i915/selftests: Prepare gtt " Matthew Auld
@ 2020-11-27 12:05 ` Matthew Auld
  2020-11-27 12:05 ` [RFC PATCH 071/162] drm/i915: Keep userpointer bindings if seqcount is unchanged, v2 Matthew Auld
                   ` (91 subsequent siblings)
  161 siblings, 0 replies; 208+ messages in thread
From: Matthew Auld @ 2020-11-27 12:05 UTC (permalink / raw)
  To: intel-gfx; +Cc: dri-devel, Thomas Hellström

From: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>

With all callers and selftests fixed to use ww locking, we can now
finally remove this lock.
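
With obj->mm.lock gone, pinning pages or a mapping requires holding the
object's dma-resv lock, normally inside a ww transaction. A minimal
sketch of the retry loop the converted callers and selftests use, where
obj stands in for whichever object is being pinned:

	struct i915_gem_ww_ctx ww;
	int err;

	i915_gem_ww_ctx_init(&ww, false);
retry:
	err = i915_gem_object_lock(obj, &ww);
	if (!err)
		err = i915_gem_object_pin_pages(obj);

	if (err == -EDEADLK) {
		/* Drop all held locks and reacquire in the right order. */
		err = i915_gem_ww_ctx_backoff(&ww);
		if (!err)
			goto retry;
	}
	i915_gem_ww_ctx_fini(&ww);

The shrinker is the odd one out: i915_gem_shrink() now takes an optional
ww context, so callers already inside a transaction can pass theirs down,
while NULL callers fall back to a per-object trylock, as the diff below
shows.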

Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
---
 drivers/gpu/drm/i915/gem/i915_gem_object.c    |  2 -
 drivers/gpu/drm/i915/gem/i915_gem_object.h    |  7 ++--
 .../gpu/drm/i915/gem/i915_gem_object_types.h  |  1 -
 drivers/gpu/drm/i915/gem/i915_gem_pages.c     | 38 ++++---------------
 drivers/gpu/drm/i915/gem/i915_gem_phys.c      | 34 ++++-------------
 drivers/gpu/drm/i915/gem/i915_gem_shmem.c     |  2 +-
 drivers/gpu/drm/i915/gem/i915_gem_shrinker.c  | 37 +++++++++++++-----
 drivers/gpu/drm/i915/gem/i915_gem_shrinker.h  |  4 +-
 drivers/gpu/drm/i915/gem/i915_gem_tiling.c    |  2 -
 drivers/gpu/drm/i915/gem/i915_gem_userptr.c   |  3 +-
 drivers/gpu/drm/i915/i915_debugfs.c           |  4 +-
 drivers/gpu/drm/i915/i915_gem.c               |  8 +---
 drivers/gpu/drm/i915/i915_gem_gtt.c           |  2 +-
 13 files changed, 54 insertions(+), 90 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object.c b/drivers/gpu/drm/i915/gem/i915_gem_object.c
index 028a556ab1a5..08d806bbf48e 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_object.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_object.c
@@ -62,8 +62,6 @@ void i915_gem_object_init(struct drm_i915_gem_object *obj,
 			  const struct drm_i915_gem_object_ops *ops,
 			  struct lock_class_key *key, unsigned flags)
 {
-	mutex_init(&obj->mm.lock);
-
 	spin_lock_init(&obj->vma.lock);
 	INIT_LIST_HEAD(&obj->vma.list);
 
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object.h b/drivers/gpu/drm/i915/gem/i915_gem_object.h
index 1d4b44151e0c..d0cc62d1c65e 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_object.h
+++ b/drivers/gpu/drm/i915/gem/i915_gem_object.h
@@ -136,7 +136,7 @@ static inline void assert_object_held_shared(struct drm_i915_gem_object *obj)
 	 */
 	if (IS_ENABLED(CONFIG_LOCKDEP) &&
 	    kref_read(&obj->base.refcount) > 0)
-		lockdep_assert_held(&obj->mm.lock);
+		assert_object_held(obj);
 }
 
 static inline int __i915_gem_object_lock(struct drm_i915_gem_object *obj,
@@ -350,11 +350,11 @@ int __i915_gem_object_get_pages(struct drm_i915_gem_object *obj);
 static inline int __must_check
 i915_gem_object_pin_pages(struct drm_i915_gem_object *obj)
 {
-	might_lock(&obj->mm.lock);
-
 	if (atomic_inc_not_zero(&obj->mm.pages_pin_count))
 		return 0;
 
+	assert_object_held(obj);
+
 	return __i915_gem_object_get_pages(obj);
 }
 
@@ -396,7 +396,6 @@ i915_gem_object_unpin_pages(struct drm_i915_gem_object *obj)
 }
 
 int __i915_gem_object_put_pages(struct drm_i915_gem_object *obj);
-int __i915_gem_object_put_pages_locked(struct drm_i915_gem_object *obj);
 void i915_gem_object_truncate(struct drm_i915_gem_object *obj);
 void i915_gem_object_writeback(struct drm_i915_gem_object *obj);
 
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object_types.h b/drivers/gpu/drm/i915/gem/i915_gem_object_types.h
index 5234c1ed62d4..b172e8cc53ab 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_object_types.h
+++ b/drivers/gpu/drm/i915/gem/i915_gem_object_types.h
@@ -209,7 +209,6 @@ struct drm_i915_gem_object {
 		 * Protects the pages and their use. Do not use directly, but
 		 * instead go through the pin/unpin interfaces.
 		 */
-		struct mutex lock;
 		atomic_t pages_pin_count;
 		atomic_t shrink_pin;
 
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_pages.c b/drivers/gpu/drm/i915/gem/i915_gem_pages.c
index 79336735a6e4..4a8be759832b 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_pages.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_pages.c
@@ -67,7 +67,7 @@ void __i915_gem_object_set_pages(struct drm_i915_gem_object *obj,
 		struct list_head *list;
 		unsigned long flags;
 
-		lockdep_assert_held(&obj->mm.lock);
+		assert_object_held(obj);
 		spin_lock_irqsave(&i915->mm.obj_lock, flags);
 
 		i915->mm.shrink_count++;
@@ -114,9 +114,7 @@ int __i915_gem_object_get_pages(struct drm_i915_gem_object *obj)
 {
 	int err;
 
-	err = mutex_lock_interruptible(&obj->mm.lock);
-	if (err)
-		return err;
+	assert_object_held(obj);
 
 	assert_object_held_shared(obj);
 
@@ -125,15 +123,13 @@ int __i915_gem_object_get_pages(struct drm_i915_gem_object *obj)
 
 		err = ____i915_gem_object_get_pages(obj);
 		if (err)
-			goto unlock;
+			return err;
 
 		smp_mb__before_atomic();
 	}
 	atomic_inc(&obj->mm.pages_pin_count);
 
-unlock:
-	mutex_unlock(&obj->mm.lock);
-	return err;
+	return 0;
 }
 
 int i915_gem_object_pin_pages_unlocked(struct drm_i915_gem_object *obj)
@@ -220,7 +216,7 @@ __i915_gem_object_unset_pages(struct drm_i915_gem_object *obj)
 	return pages;
 }
 
-int __i915_gem_object_put_pages_locked(struct drm_i915_gem_object *obj)
+int __i915_gem_object_put_pages(struct drm_i915_gem_object *obj)
 {
 	struct sg_table *pages;
 
@@ -251,21 +247,6 @@ int __i915_gem_object_put_pages_locked(struct drm_i915_gem_object *obj)
 	return 0;
 }
 
-int __i915_gem_object_put_pages(struct drm_i915_gem_object *obj)
-{
-	int err;
-
-	if (i915_gem_object_has_pinned_pages(obj))
-		return -EBUSY;
-
-	/* May be called by shrinker from within get_pages() (on another bo) */
-	mutex_lock(&obj->mm.lock);
-	err = __i915_gem_object_put_pages_locked(obj);
-	mutex_unlock(&obj->mm.lock);
-
-	return err;
-}
-
 /* The 'mapping' part of i915_gem_object_pin_map() below */
 static void *i915_gem_object_map_page(struct drm_i915_gem_object *obj,
 		enum i915_map_type type)
@@ -366,9 +347,7 @@ void *i915_gem_object_pin_map(struct drm_i915_gem_object *obj,
 	    !i915_gem_object_type_has(obj, I915_GEM_OBJECT_HAS_IOMEM))
 		return ERR_PTR(-ENXIO);
 
-	err = mutex_lock_interruptible(&obj->mm.lock);
-	if (err)
-		return ERR_PTR(err);
+	assert_object_held(obj);
 
 	pinned = !(type & I915_MAP_OVERRIDE);
 	type &= ~I915_MAP_OVERRIDE;
@@ -416,15 +395,12 @@ void *i915_gem_object_pin_map(struct drm_i915_gem_object *obj,
 		obj->mm.mapping = page_pack_bits(ptr, type);
 	}
 
-out_unlock:
-	mutex_unlock(&obj->mm.lock);
 	return ptr;
 
 err_unpin:
 	atomic_dec(&obj->mm.pages_pin_count);
 err_unlock:
-	ptr = ERR_PTR(err);
-	goto out_unlock;
+	return ERR_PTR(err);
 }
 
 void *i915_gem_object_pin_map_unlocked(struct drm_i915_gem_object *obj,
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_phys.c b/drivers/gpu/drm/i915/gem/i915_gem_phys.c
index f317be5f5e34..435c3b54cf14 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_phys.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_phys.c
@@ -234,40 +234,22 @@ int i915_gem_object_attach_phys(struct drm_i915_gem_object *obj, int align)
 	if (err)
 		return err;
 
-	err = mutex_lock_interruptible(&obj->mm.lock);
-	if (err)
-		return err;
-
-	if (unlikely(!i915_gem_object_has_struct_page(obj)))
-		goto out;
-
-	if (obj->mm.madv != I915_MADV_WILLNEED) {
-		err = -EFAULT;
-		goto out;
-	}
+	if (obj->mm.madv != I915_MADV_WILLNEED)
+		return -EFAULT;
 
-	if (obj->mm.quirked) {
-		err = -EFAULT;
-		goto out;
-	}
+	if (obj->mm.quirked)
+		return -EFAULT;
 
-	if (obj->mm.mapping || i915_gem_object_has_pinned_pages(obj)) {
-		err = -EBUSY;
-		goto out;
-	}
+	if (obj->mm.mapping || i915_gem_object_has_pinned_pages(obj))
+		return -EBUSY;
 
 	if (unlikely(obj->mm.madv != I915_MADV_WILLNEED)) {
 		drm_dbg(obj->base.dev,
 			"Attempting to obtain a purgeable object\n");
-		err = -EFAULT;
-		goto out;
+		return -EFAULT;
 	}
 
-	err = i915_gem_object_shmem_to_phys(obj);
-
-out:
-	mutex_unlock(&obj->mm.lock);
-	return err;
+	return i915_gem_object_shmem_to_phys(obj);
 }
 
 #if IS_ENABLED(CONFIG_DRM_I915_SELFTEST)
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_shmem.c b/drivers/gpu/drm/i915/gem/i915_gem_shmem.c
index 7a59fd1ea4e5..b4dd7a709800 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_shmem.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_shmem.c
@@ -99,7 +99,7 @@ static int shmem_get_pages(struct drm_i915_gem_object *obj)
 				goto err_sg;
 			}
 
-			i915_gem_shrink(i915, 2 * page_count, NULL, *s++);
+			i915_gem_shrink(NULL, i915, 2 * page_count, NULL, *s++);
 
 			/*
 			 * We've tried hard to allocate the memory by reaping
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_shrinker.c b/drivers/gpu/drm/i915/gem/i915_gem_shrinker.c
index afc6e5b4dcf1..e42192834c88 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_shrinker.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_shrinker.c
@@ -93,7 +93,8 @@ static void try_to_writeback(struct drm_i915_gem_object *obj,
  * The number of pages of backing storage actually released.
  */
 unsigned long
-i915_gem_shrink(struct drm_i915_private *i915,
+i915_gem_shrink(struct i915_gem_ww_ctx *ww,
+		struct drm_i915_private *i915,
 		unsigned long target,
 		unsigned long *nr_scanned,
 		unsigned int shrink)
@@ -112,6 +113,7 @@ i915_gem_shrink(struct drm_i915_private *i915,
 	intel_wakeref_t wakeref = 0;
 	unsigned long count = 0;
 	unsigned long scanned = 0;
+	int err;
 
 	trace_i915_gem_shrink(i915, target, shrink);
 
@@ -199,23 +201,38 @@ i915_gem_shrink(struct drm_i915_private *i915,
 
 			spin_unlock_irqrestore(&i915->mm.obj_lock, flags);
 
-			if (unsafe_drop_pages(obj, shrink) &&
-			    mutex_trylock(&obj->mm.lock)) {
+			err = 0;
+			if (unsafe_drop_pages(obj, shrink)) {
 				/* May arrive from get_pages on another bo */
-				if (!__i915_gem_object_put_pages_locked(obj)) {
+				if (!ww) {
+					if (!i915_gem_object_trylock(obj))
+						goto skip;
+				} else {
+					err = i915_gem_object_lock(obj, ww);
+					if (err)
+						goto skip;
+				}
+
+				if (!__i915_gem_object_put_pages(obj)) {
 					try_to_writeback(obj, shrink);
 					count += obj->base.size >> PAGE_SHIFT;
 				}
-				mutex_unlock(&obj->mm.lock);
+				if (!ww)
+					i915_gem_object_unlock(obj);
 			}
 
 			scanned += obj->base.size >> PAGE_SHIFT;
+skip:
 			i915_gem_object_put(obj);
 
 			spin_lock_irqsave(&i915->mm.obj_lock, flags);
+			if (err)
+				break;
 		}
 		list_splice_tail(&still_in_list, phase->list);
 		spin_unlock_irqrestore(&i915->mm.obj_lock, flags);
+		if (err)
+			return err;
 	}
 
 	if (shrink & I915_SHRINK_BOUND)
@@ -246,7 +263,7 @@ unsigned long i915_gem_shrink_all(struct drm_i915_private *i915)
 	unsigned long freed = 0;
 
 	with_intel_runtime_pm(&i915->runtime_pm, wakeref) {
-		freed = i915_gem_shrink(i915, -1UL, NULL,
+		freed = i915_gem_shrink(NULL, i915, -1UL, NULL,
 					I915_SHRINK_BOUND |
 					I915_SHRINK_UNBOUND);
 	}
@@ -292,7 +309,7 @@ i915_gem_shrinker_scan(struct shrinker *shrinker, struct shrink_control *sc)
 
 	sc->nr_scanned = 0;
 
-	freed = i915_gem_shrink(i915,
+	freed = i915_gem_shrink(NULL, i915,
 				sc->nr_to_scan,
 				&sc->nr_scanned,
 				I915_SHRINK_BOUND |
@@ -301,7 +318,7 @@ i915_gem_shrinker_scan(struct shrinker *shrinker, struct shrink_control *sc)
 		intel_wakeref_t wakeref;
 
 		with_intel_runtime_pm(&i915->runtime_pm, wakeref) {
-			freed += i915_gem_shrink(i915,
+			freed += i915_gem_shrink(NULL, i915,
 						 sc->nr_to_scan - sc->nr_scanned,
 						 &sc->nr_scanned,
 						 I915_SHRINK_ACTIVE |
@@ -326,7 +343,7 @@ i915_gem_shrinker_oom(struct notifier_block *nb, unsigned long event, void *ptr)
 
 	freed_pages = 0;
 	with_intel_runtime_pm(&i915->runtime_pm, wakeref)
-		freed_pages += i915_gem_shrink(i915, -1UL, NULL,
+		freed_pages += i915_gem_shrink(NULL, i915, -1UL, NULL,
 					       I915_SHRINK_BOUND |
 					       I915_SHRINK_UNBOUND |
 					       I915_SHRINK_WRITEBACK);
@@ -364,7 +381,7 @@ i915_gem_shrinker_vmap(struct notifier_block *nb, unsigned long event, void *ptr
 	intel_wakeref_t wakeref;
 
 	with_intel_runtime_pm(&i915->runtime_pm, wakeref)
-		freed_pages += i915_gem_shrink(i915, -1UL, NULL,
+		freed_pages += i915_gem_shrink(NULL, i915, -1UL, NULL,
 					       I915_SHRINK_BOUND |
 					       I915_SHRINK_UNBOUND |
 					       I915_SHRINK_VMAPS);
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_shrinker.h b/drivers/gpu/drm/i915/gem/i915_gem_shrinker.h
index b397d7785789..8512470f6fd6 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_shrinker.h
+++ b/drivers/gpu/drm/i915/gem/i915_gem_shrinker.h
@@ -9,10 +9,12 @@
 #include <linux/bits.h>
 
 struct drm_i915_private;
+struct i915_gem_ww_ctx;
 struct mutex;
 
 /* i915_gem_shrinker.c */
-unsigned long i915_gem_shrink(struct drm_i915_private *i915,
+unsigned long i915_gem_shrink(struct i915_gem_ww_ctx *ww,
+			      struct drm_i915_private *i915,
 			      unsigned long target,
 			      unsigned long *nr_scanned,
 			      unsigned flags);
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_tiling.c b/drivers/gpu/drm/i915/gem/i915_gem_tiling.c
index ffcaee74a249..4523a14db86e 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_tiling.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_tiling.c
@@ -265,7 +265,6 @@ i915_gem_object_set_tiling(struct drm_i915_gem_object *obj,
 	 * pages to prevent them being swapped out and causing corruption
 	 * due to the change in swizzling.
 	 */
-	mutex_lock(&obj->mm.lock);
 	if (i915_gem_object_has_pages(obj) &&
 	    obj->mm.madv == I915_MADV_WILLNEED &&
 	    i915->quirks & QUIRK_PIN_SWIZZLED_PAGES) {
@@ -280,7 +279,6 @@ i915_gem_object_set_tiling(struct drm_i915_gem_object *obj,
 			obj->mm.quirked = true;
 		}
 	}
-	mutex_unlock(&obj->mm.lock);
 
 	spin_lock(&obj->vma.lock);
 	for_each_ggtt_vma(vma, obj) {
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_userptr.c b/drivers/gpu/drm/i915/gem/i915_gem_userptr.c
index 0cab9da6669e..fb4bc30fbd9a 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_userptr.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_userptr.c
@@ -247,7 +247,7 @@ static int i915_gem_object_userptr_unbind(struct drm_i915_gem_object *obj, bool
 	if (GEM_WARN_ON(i915_gem_object_has_pinned_pages(obj)))
 		return -EBUSY;
 
-	mutex_lock(&obj->mm.lock);
+	assert_object_held(obj);
 
 	pages = __i915_gem_object_unset_pages(obj);
 	if (!IS_ERR_OR_NULL(pages))
@@ -255,7 +255,6 @@ static int i915_gem_object_userptr_unbind(struct drm_i915_gem_object *obj, bool
 
 	if (get_pages)
 		err = ____i915_gem_object_get_pages(obj);
-	mutex_unlock(&obj->mm.lock);
 
 	return err;
 }
diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
index 263074c2c097..6d1482c82694 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -1510,10 +1510,10 @@ i915_drop_caches_set(void *data, u64 val)
 
 	fs_reclaim_acquire(GFP_KERNEL);
 	if (val & DROP_BOUND)
-		i915_gem_shrink(i915, LONG_MAX, NULL, I915_SHRINK_BOUND);
+		i915_gem_shrink(NULL, i915, LONG_MAX, NULL, I915_SHRINK_BOUND);
 
 	if (val & DROP_UNBOUND)
-		i915_gem_shrink(i915, LONG_MAX, NULL, I915_SHRINK_UNBOUND);
+		i915_gem_shrink(NULL, i915, LONG_MAX, NULL, I915_SHRINK_UNBOUND);
 
 	if (val & DROP_SHRINK_ALL)
 		i915_gem_shrink_all(i915);
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index b81fbd907775..ef66c0926af6 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -1063,10 +1063,6 @@ i915_gem_madvise_ioctl(struct drm_device *dev, void *data,
 	if (err)
 		goto out;
 
-	err = mutex_lock_interruptible(&obj->mm.lock);
-	if (err)
-		goto out_ww;
-
 	if (i915_gem_object_has_pages(obj) &&
 	    i915_gem_object_is_tiled(obj) &&
 	    i915->quirks & QUIRK_PIN_SWIZZLED_PAGES) {
@@ -1109,9 +1105,7 @@ i915_gem_madvise_ioctl(struct drm_device *dev, void *data,
 		i915_gem_object_truncate(obj);
 
 	args->retained = obj->mm.madv != __I915_MADV_PURGED;
-	mutex_unlock(&obj->mm.lock);
 
-out_ww:
 	i915_gem_object_unlock(obj);
 out:
 	i915_gem_object_put(obj);
@@ -1292,7 +1286,7 @@ int i915_gem_freeze_late(struct drm_i915_private *i915)
 
 	wakeref = intel_runtime_pm_get(&i915->runtime_pm);
 
-	i915_gem_shrink(i915, -1UL, NULL, ~0);
+	i915_gem_shrink(NULL, i915, -1UL, NULL, ~0);
 	i915_gem_drain_freed_objects(i915);
 
 	list_for_each_entry(obj, &i915->mm.shrink_list, mm.link) {
diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
index c5ee1567f3d1..729074ee33d4 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -44,7 +44,7 @@ int i915_gem_gtt_prepare_pages(struct drm_i915_gem_object *obj,
 		 * the DMA remapper, i915_gem_shrink will return 0.
 		 */
 		GEM_BUG_ON(obj->mm.pages == pages);
-	} while (i915_gem_shrink(to_i915(obj->base.dev),
+	} while (i915_gem_shrink(NULL, to_i915(obj->base.dev),
 				 obj->base.size >> PAGE_SHIFT, NULL,
 				 I915_SHRINK_BOUND |
 				 I915_SHRINK_UNBOUND));
-- 
2.26.2

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply related	[flat|nested] 208+ messages in thread

* [RFC PATCH 071/162] drm/i915: Keep userpointer bindings if seqcount is unchanged, v2.
  2020-11-27 12:04 [RFC PATCH 000/162] DG1 + LMEM enabling Matthew Auld
                   ` (69 preceding siblings ...)
  2020-11-27 12:05 ` [RFC PATCH 070/162] drm/i915: Finally remove obj->mm.lock Matthew Auld
@ 2020-11-27 12:05 ` Matthew Auld
  2020-11-27 12:05 ` [RFC PATCH 072/162] drm/i915: Avoid some false positives in assert_object_held() Matthew Auld
                   ` (90 subsequent siblings)
  161 siblings, 0 replies; 208+ messages in thread
From: Matthew Auld @ 2020-11-27 12:05 UTC (permalink / raw)
  To: intel-gfx
  Cc: Thomas Hellström, kernel test robot, dri-devel, Dan Carpenter

From: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>

Instead of force-unbinding and rebinding every time, check whether
our notifier seqcount is still valid while the pages are bound. This
way we only rebind the userptr when we need to, and avoid stalls.
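
For context, this follows the standard mmu_interval notifier seqcount
pattern: mmu_interval_check_retry() is the cheap unlocked pre-check,
and mmu_interval_read_retry() gives the authoritative answer under the
notifier lock. A minimal sketch of the full pattern, with field names
as in the diff and the page-gathering step elided:

	unsigned long seq;

again:
	seq = mmu_interval_read_begin(&obj->userptr.notifier);

	/* ... gather the pages, outside any spinlock ... */

	spin_lock(&i915->mm.notifier_lock);
	if (mmu_interval_read_retry(&obj->userptr.notifier, seq)) {
		/* An invalidation raced with us; drop and start over. */
		spin_unlock(&i915->mm.notifier_lock);
		goto again;
	}
	/* seq is still current, so the gathered pages may be kept. */
	obj->userptr.notifier_seq = seq;
	spin_unlock(&i915->mm.notifier_lock);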

Reported-by: kernel test robot <lkp@intel.com>
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
---
 drivers/gpu/drm/i915/gem/i915_gem_userptr.c | 27 ++++++++++++++++++---
 1 file changed, 24 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_userptr.c b/drivers/gpu/drm/i915/gem/i915_gem_userptr.c
index fb4bc30fbd9a..d1ecc31b5e90 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_userptr.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_userptr.c
@@ -275,12 +275,33 @@ int i915_gem_object_userptr_submit_init(struct drm_i915_gem_object *obj)
 	if (ret)
 		return ret;
 
-	/* Make sure userptr is unbound for next attempt, so we don't use stale pages. */
-	ret = i915_gem_object_userptr_unbind(obj, false);
+	/* optimistically try to preserve current pages while unlocked */
+	if (i915_gem_object_has_pages(obj) &&
+	    !mmu_interval_check_retry(&obj->userptr.notifier,
+				      obj->userptr.notifier_seq)) {
+		spin_lock(&i915->mm.notifier_lock);
+		if (obj->userptr.pvec &&
+		    !mmu_interval_read_retry(&obj->userptr.notifier,
+					     obj->userptr.notifier_seq)) {
+			obj->userptr.page_ref++;
+
+			/* We can keep using the current binding, this is the fastpath */
+			ret = 1;
+		}
+		spin_unlock(&i915->mm.notifier_lock);
+	}
+
+	if (!ret) {
+		/* Make sure userptr is unbound for next attempt, so we don't use stale pages. */
+		ret = i915_gem_object_userptr_unbind(obj, false);
+	}
 	i915_gem_object_unlock(obj);
-	if (ret)
+	if (ret < 0)
 		return ret;
 
+	if (ret > 0)
+		return 0;
+
 	notifier_seq = mmu_interval_read_begin(&obj->userptr.notifier);
 
 	pvec = kvmalloc_array(num_pages, sizeof(struct page *), GFP_KERNEL);
-- 
2.26.2

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply related	[flat|nested] 208+ messages in thread

* [RFC PATCH 072/162] drm/i915: Avoid some false positives in assert_object_held()
  2020-11-27 12:04 [RFC PATCH 000/162] DG1 + LMEM enabling Matthew Auld
                   ` (70 preceding siblings ...)
  2020-11-27 12:05 ` [RFC PATCH 071/162] drm/i915: Keep userpointer bindings if seqcount is unchanged, v2 Matthew Auld
@ 2020-11-27 12:05 ` Matthew Auld
  2020-11-27 12:05 ` [RFC PATCH 073/162] drm/i915: Reference contending lock objects Matthew Auld
                   ` (89 subsequent siblings)
  161 siblings, 0 replies; 208+ messages in thread
From: Matthew Auld @ 2020-11-27 12:05 UTC (permalink / raw)
  To: intel-gfx; +Cc: Thomas Hellström, dri-devel

From: Thomas Hellström <thomas.hellstrom@intel.com>

In a ww transaction where we've already locked a reservation
object, assert_object_held() might not throw a splat even if a
different, unlocked object is passed in: lockdep's nest-lock
tracking for ww mutexes matches held locks by class rather than by
instance. Improve on that situation by additionally asserting that
the reservation object's ww mutex is indeed locked.
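
With the stricter assert, a leaf helper like the hypothetical sketch
below now splats under lockdep when called on an unlocked object, even
from within a transaction that holds locks on other objects:

	static void update_obj_state(struct drm_i915_gem_object *obj)
	{
		/* Checks the resv and that its ww mutex is held. */
		assert_object_held(obj);

		/* ... safe to touch state protected by the resv ... */
	}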

Signed-off-by: Thomas Hellström <thomas.hellstrom@intel.com>
Cc: Matthew Auld <matthew.auld@intel.com>
---
 drivers/gpu/drm/i915/gem/i915_gem_object.h | 9 ++++++++-
 1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object.h b/drivers/gpu/drm/i915/gem/i915_gem_object.h
index d0cc62d1c65e..d56643b3b518 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_object.h
+++ b/drivers/gpu/drm/i915/gem/i915_gem_object.h
@@ -117,7 +117,14 @@ i915_gem_object_put(struct drm_i915_gem_object *obj)
 	__drm_gem_object_put(&obj->base);
 }
 
-#define assert_object_held(obj) dma_resv_assert_held((obj)->base.resv)
+#ifdef CONFIG_LOCKDEP
+#define assert_object_held(obj) do {					\
+		dma_resv_assert_held((obj)->base.resv);			\
+		WARN_ON(!ww_mutex_is_locked(&(obj)->base.resv->lock)); \
+	} while (0)
+#else
+#define assert_object_held(obj) do { } while (0)
+#endif
 
 #define object_is_isolated(obj)					\
 	(!IS_ENABLED(CONFIG_LOCKDEP) ||				\
-- 
2.26.2

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply related	[flat|nested] 208+ messages in thread

* [RFC PATCH 073/162] drm/i915: Reference contending lock objects
  2020-11-27 12:04 [RFC PATCH 000/162] DG1 + LMEM enabling Matthew Auld
                   ` (71 preceding siblings ...)
  2020-11-27 12:05 ` [RFC PATCH 072/162] drm/i915: Avoid some false positives in assert_object_held() Matthew Auld
@ 2020-11-27 12:05 ` Matthew Auld
  2020-11-27 12:05 ` [RFC PATCH 074/162] drm/i915: Break out dma_resv ww locking utilities to separate files Matthew Auld
                   ` (88 subsequent siblings)
  161 siblings, 0 replies; 208+ messages in thread
From: Matthew Auld @ 2020-11-27 12:05 UTC (permalink / raw)
  To: intel-gfx; +Cc: Thomas Hellström, dri-devel

From: Thomas Hellström <thomas.hellstrom@intel.com>

When we lock objects in leaf functions, for example during eviction,
they may disappear as soon as we unreference them, and the locking
context's contended pointer then points at a freed object.
Fix this by taking a reference on that object, and also unlock the
contended object as soon as we've done the ww transaction relaxation:
the restarted transaction may not even need the contended object,
and keeping the lock is not needed to prevent starvation.
Keeping that lock would unnecessarily require us to reference count
all locks on the list, and would also create locking confusion around
-EALREADY.
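
For reference, the retry loop this feeds into looks roughly like the
sketch below (do_work() is a hypothetical stand-in); the -EDEADLK
branch is where ww->contended is recorded and later consumed:

	struct i915_gem_ww_ctx ww;
	int err;

	i915_gem_ww_ctx_init(&ww, true);
retry:
	err = i915_gem_object_lock(obj, &ww);
	if (!err)
		err = do_work(obj); /* may lock more objects, hit -EDEADLK */
	if (err == -EDEADLK) {
		/*
		 * Relaxes on the contended lock; with this patch it is
		 * unlocked again and the extra reference is dropped.
		 */
		err = i915_gem_ww_ctx_backoff(&ww);
		if (!err)
			goto retry;
	}
	i915_gem_ww_ctx_fini(&ww);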

Signed-off-by: Thomas Hellström <thomas.hellstrom@intel.com>
Cc: Matthew Auld <matthew.auld@intel.com>
---
 drivers/gpu/drm/i915/gem/i915_gem_object.h | 2 +-
 drivers/gpu/drm/i915/i915_gem.c            | 9 ++++++++-
 2 files changed, 9 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object.h b/drivers/gpu/drm/i915/gem/i915_gem_object.h
index d56643b3b518..60e27738c39d 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_object.h
+++ b/drivers/gpu/drm/i915/gem/i915_gem_object.h
@@ -163,7 +163,7 @@ static inline int __i915_gem_object_lock(struct drm_i915_gem_object *obj,
 		ret = 0;
 
 	if (ret == -EDEADLK)
-		ww->contended = obj;
+		ww->contended = i915_gem_object_get(obj);
 
 	return ret;
 }
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index ef66c0926af6..2248e65cf5f9 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -1370,9 +1370,16 @@ int __must_check i915_gem_ww_ctx_backoff(struct i915_gem_ww_ctx *ww)
 	else
 		dma_resv_lock_slow(ww->contended->base.resv, &ww->ctx);
 
+	/*
+	 * Unlock the contended lock again, as we might not need it in
+	 * the retried transaction. This does not increase starvation,
+	 * but it opens up for a wakeup flood if there are many
+	 * transactions relaxing on this object.
+	 */
 	if (!ret)
-		list_add_tail(&ww->contended->obj_link, &ww->obj_list);
+		dma_resv_unlock(ww->contended->base.resv);
 
+	i915_gem_object_put(ww->contended);
 	ww->contended = NULL;
 
 	return ret;
-- 
2.26.2

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply related	[flat|nested] 208+ messages in thread

* [RFC PATCH 074/162] drm/i915: Break out dma_resv ww locking utilities to separate files
  2020-11-27 12:04 [RFC PATCH 000/162] DG1 + LMEM enabling Matthew Auld
                   ` (72 preceding siblings ...)
  2020-11-27 12:05 ` [RFC PATCH 073/162] drm/i915: Reference contending lock objects Matthew Auld
@ 2020-11-27 12:05 ` Matthew Auld
  2020-11-27 12:05 ` [RFC PATCH 075/162] drm/i915: Introduce a for_i915_gem_ww(){} Matthew Auld
                   ` (87 subsequent siblings)
  161 siblings, 0 replies; 208+ messages in thread
From: Matthew Auld @ 2020-11-27 12:05 UTC (permalink / raw)
  To: intel-gfx; +Cc: Thomas Hellström, dri-devel

From: Thomas Hellström <thomas.hellstrom@intel.com>

As we're about to add more ww-related functionality,
break out the dma_resv ww locking utilities into their own files.

Signed-off-by: Thomas Hellström <thomas.hellstrom@intel.com>
Cc: Matthew Auld <matthew.auld@intel.com>
---
 drivers/gpu/drm/i915/Makefile               |  1 +
 drivers/gpu/drm/i915/gem/i915_gem_object.h  |  1 +
 drivers/gpu/drm/i915/gt/intel_renderstate.h |  1 +
 drivers/gpu/drm/i915/i915_gem.c             | 59 ------------------
 drivers/gpu/drm/i915/i915_gem.h             | 12 ----
 drivers/gpu/drm/i915/i915_gem_ww.c          | 66 +++++++++++++++++++++
 drivers/gpu/drm/i915/i915_gem_ww.h          | 21 +++++++
 7 files changed, 90 insertions(+), 71 deletions(-)
 create mode 100644 drivers/gpu/drm/i915/i915_gem_ww.c
 create mode 100644 drivers/gpu/drm/i915/i915_gem_ww.h

diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile
index 5112e5d79316..ec361d61230b 100644
--- a/drivers/gpu/drm/i915/Makefile
+++ b/drivers/gpu/drm/i915/Makefile
@@ -45,6 +45,7 @@ i915-y += i915_drv.o \
 	  i915_switcheroo.o \
 	  i915_sysfs.o \
 	  i915_utils.o \
+	  i915_gem_ww.o \
 	  intel_device_info.o \
 	  intel_dram.o \
 	  intel_memory_region.o \
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object.h b/drivers/gpu/drm/i915/gem/i915_gem_object.h
index 60e27738c39d..c6c7ab181a65 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_object.h
+++ b/drivers/gpu/drm/i915/gem/i915_gem_object.h
@@ -15,6 +15,7 @@
 #include "i915_gem_object_types.h"
 #include "i915_gem_gtt.h"
 #include "i915_vma_types.h"
+#include "i915_gem_ww.h"
 
 void i915_gem_init__objects(struct drm_i915_private *i915);
 
diff --git a/drivers/gpu/drm/i915/gt/intel_renderstate.h b/drivers/gpu/drm/i915/gt/intel_renderstate.h
index 713aa1e86c80..d9db833b873b 100644
--- a/drivers/gpu/drm/i915/gt/intel_renderstate.h
+++ b/drivers/gpu/drm/i915/gt/intel_renderstate.h
@@ -26,6 +26,7 @@
 
 #include <linux/types.h>
 #include "i915_gem.h"
+#include "i915_gem_ww.h"
 
 struct i915_request;
 struct intel_context;
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 2248e65cf5f9..2662d679db6e 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -1326,65 +1326,6 @@ int i915_gem_open(struct drm_i915_private *i915, struct drm_file *file)
 	return ret;
 }
 
-void i915_gem_ww_ctx_init(struct i915_gem_ww_ctx *ww, bool intr)
-{
-	ww_acquire_init(&ww->ctx, &reservation_ww_class);
-	INIT_LIST_HEAD(&ww->obj_list);
-	ww->intr = intr;
-	ww->contended = NULL;
-}
-
-static void i915_gem_ww_ctx_unlock_all(struct i915_gem_ww_ctx *ww)
-{
-	struct drm_i915_gem_object *obj;
-
-	while ((obj = list_first_entry_or_null(&ww->obj_list, struct drm_i915_gem_object, obj_link))) {
-		list_del(&obj->obj_link);
-		i915_gem_object_unlock(obj);
-	}
-}
-
-void i915_gem_ww_unlock_single(struct drm_i915_gem_object *obj)
-{
-	list_del(&obj->obj_link);
-	i915_gem_object_unlock(obj);
-}
-
-void i915_gem_ww_ctx_fini(struct i915_gem_ww_ctx *ww)
-{
-	i915_gem_ww_ctx_unlock_all(ww);
-	WARN_ON(ww->contended);
-	ww_acquire_fini(&ww->ctx);
-}
-
-int __must_check i915_gem_ww_ctx_backoff(struct i915_gem_ww_ctx *ww)
-{
-	int ret = 0;
-
-	if (WARN_ON(!ww->contended))
-		return -EINVAL;
-
-	i915_gem_ww_ctx_unlock_all(ww);
-	if (ww->intr)
-		ret = dma_resv_lock_slow_interruptible(ww->contended->base.resv, &ww->ctx);
-	else
-		dma_resv_lock_slow(ww->contended->base.resv, &ww->ctx);
-
-	/*
-	 * Unlock the contended lock again, as we might not need it in
-	 * the retried transaction. This does not increase starvation,
-	 * but it opens up for a wakeup flood if there are many
-	 * transactions relaxing on this object.
-	 */
-	if (!ret)
-		dma_resv_unlock(ww->contended->base.resv);
-
-	i915_gem_object_put(ww->contended);
-	ww->contended = NULL;
-
-	return ret;
-}
-
 #if IS_ENABLED(CONFIG_DRM_I915_SELFTEST)
 #include "selftests/mock_gem_device.c"
 #include "selftests/i915_gem.c"
diff --git a/drivers/gpu/drm/i915/i915_gem.h b/drivers/gpu/drm/i915/i915_gem.h
index a4cad3f154ca..f333e88a2b6e 100644
--- a/drivers/gpu/drm/i915/i915_gem.h
+++ b/drivers/gpu/drm/i915/i915_gem.h
@@ -116,16 +116,4 @@ static inline bool __tasklet_is_scheduled(struct tasklet_struct *t)
 	return test_bit(TASKLET_STATE_SCHED, &t->state);
 }
 
-struct i915_gem_ww_ctx {
-	struct ww_acquire_ctx ctx;
-	struct list_head obj_list;
-	bool intr;
-	struct drm_i915_gem_object *contended;
-};
-
-void i915_gem_ww_ctx_init(struct i915_gem_ww_ctx *ctx, bool intr);
-void i915_gem_ww_ctx_fini(struct i915_gem_ww_ctx *ctx);
-int __must_check i915_gem_ww_ctx_backoff(struct i915_gem_ww_ctx *ctx);
-void i915_gem_ww_unlock_single(struct drm_i915_gem_object *obj);
-
 #endif /* __I915_GEM_H__ */
diff --git a/drivers/gpu/drm/i915/i915_gem_ww.c b/drivers/gpu/drm/i915/i915_gem_ww.c
new file mode 100644
index 000000000000..43960d8595eb
--- /dev/null
+++ b/drivers/gpu/drm/i915/i915_gem_ww.c
@@ -0,0 +1,66 @@
+// SPDX-License-Identifier: MIT
+/*
+ * Copyright © 2020 Intel Corporation
+ */
+#include <linux/dma-resv.h>
+#include "i915_gem_ww.h"
+#include "gem/i915_gem_object.h"
+
+void i915_gem_ww_ctx_init(struct i915_gem_ww_ctx *ww, bool intr)
+{
+	ww_acquire_init(&ww->ctx, &reservation_ww_class);
+	INIT_LIST_HEAD(&ww->obj_list);
+	ww->intr = intr;
+	ww->contended = NULL;
+}
+
+static void i915_gem_ww_ctx_unlock_all(struct i915_gem_ww_ctx *ww)
+{
+	struct drm_i915_gem_object *obj;
+
+	while ((obj = list_first_entry_or_null(&ww->obj_list, struct drm_i915_gem_object, obj_link))) {
+		list_del(&obj->obj_link);
+		i915_gem_object_unlock(obj);
+	}
+}
+
+void i915_gem_ww_unlock_single(struct drm_i915_gem_object *obj)
+{
+	list_del(&obj->obj_link);
+	i915_gem_object_unlock(obj);
+}
+
+void i915_gem_ww_ctx_fini(struct i915_gem_ww_ctx *ww)
+{
+	i915_gem_ww_ctx_unlock_all(ww);
+	WARN_ON(ww->contended);
+	ww_acquire_fini(&ww->ctx);
+}
+
+int __must_check i915_gem_ww_ctx_backoff(struct i915_gem_ww_ctx *ww)
+{
+	int ret = 0;
+
+	if (WARN_ON(!ww->contended))
+		return -EINVAL;
+
+	i915_gem_ww_ctx_unlock_all(ww);
+	if (ww->intr)
+		ret = dma_resv_lock_slow_interruptible(ww->contended->base.resv, &ww->ctx);
+	else
+		dma_resv_lock_slow(ww->contended->base.resv, &ww->ctx);
+
+	/*
+	 * Unlock the contended lock again, as we might not need it in
+	 * the retried transaction. This does not increase starvation,
+	 * but it opens up for a wakeup flood if there are many
+	 * transactions relaxing on this object.
+	 */
+	if (!ret)
+		dma_resv_unlock(ww->contended->base.resv);
+
+	i915_gem_object_put(ww->contended);
+	ww->contended = NULL;
+
+	return ret;
+}
diff --git a/drivers/gpu/drm/i915/i915_gem_ww.h b/drivers/gpu/drm/i915/i915_gem_ww.h
new file mode 100644
index 000000000000..f2d8769e4118
--- /dev/null
+++ b/drivers/gpu/drm/i915/i915_gem_ww.h
@@ -0,0 +1,21 @@
+/* SPDX-License-Identifier: MIT */
+/*
+ * Copyright © 2020 Intel Corporation
+ */
+#ifndef __I915_GEM_WW_H__
+#define __I915_GEM_WW_H__
+
+#include <drm/drm_drv.h>
+
+struct i915_gem_ww_ctx {
+	struct ww_acquire_ctx ctx;
+	struct list_head obj_list;
+	struct drm_i915_gem_object *contended;
+	bool intr;
+};
+
+void i915_gem_ww_ctx_init(struct i915_gem_ww_ctx *ctx, bool intr);
+void i915_gem_ww_ctx_fini(struct i915_gem_ww_ctx *ctx);
+int __must_check i915_gem_ww_ctx_backoff(struct i915_gem_ww_ctx *ctx);
+void i915_gem_ww_unlock_single(struct drm_i915_gem_object *obj);
+#endif
-- 
2.26.2

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply related	[flat|nested] 208+ messages in thread

* [RFC PATCH 075/162] drm/i915: Introduce a for_i915_gem_ww(){}
  2020-11-27 12:04 [RFC PATCH 000/162] DG1 + LMEM enabling Matthew Auld
                   ` (73 preceding siblings ...)
  2020-11-27 12:05 ` [RFC PATCH 074/162] drm/i915: Break out dma_resv ww locking utilities to separate files Matthew Auld
@ 2020-11-27 12:05 ` Matthew Auld
  2020-11-27 12:05 ` [RFC PATCH 076/162] drm/i915: Untangle the vma pages_mutex Matthew Auld
                   ` (86 subsequent siblings)
  161 siblings, 0 replies; 208+ messages in thread
From: Matthew Auld @ 2020-11-27 12:05 UTC (permalink / raw)
  To: intel-gfx; +Cc: Thomas Hellström, dri-devel

From: Thomas Hellström <thomas.hellstrom@intel.com>

Introduce a for_i915_gem_ww(){} utility to help make the code
around a ww transaction more readable.
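
A transaction then reads as a simple loop, with backoff and fini
handled by the macro; a minimal usage sketch (do_work() hypothetical):

	struct i915_gem_ww_ctx ww;
	int err;

	for_i915_gem_ww(&ww, err, true) {
		err = i915_gem_object_lock(obj, &ww);
		if (err)
			continue; /* -EDEADLK backs off and retries */

		err = do_work(obj);
	}
	/* All locks are dropped here; err holds the final result. */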

Signed-off-by: Thomas Hellström <thomas.hellstrom@intel.com>
Cc: Matthew Auld <matthew.auld@intel.com>
---
 drivers/gpu/drm/i915/i915_gem_ww.h | 31 +++++++++++++++++++++++++++++-
 1 file changed, 30 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/i915_gem_ww.h b/drivers/gpu/drm/i915/i915_gem_ww.h
index f2d8769e4118..f6b1a796667b 100644
--- a/drivers/gpu/drm/i915/i915_gem_ww.h
+++ b/drivers/gpu/drm/i915/i915_gem_ww.h
@@ -11,11 +11,40 @@ struct i915_gem_ww_ctx {
 	struct ww_acquire_ctx ctx;
 	struct list_head obj_list;
 	struct drm_i915_gem_object *contended;
-	bool intr;
+	unsigned short intr;
+	unsigned short loop;
 };
 
 void i915_gem_ww_ctx_init(struct i915_gem_ww_ctx *ctx, bool intr);
 void i915_gem_ww_ctx_fini(struct i915_gem_ww_ctx *ctx);
 int __must_check i915_gem_ww_ctx_backoff(struct i915_gem_ww_ctx *ctx);
 void i915_gem_ww_unlock_single(struct drm_i915_gem_object *obj);
+
+/* Internal functions used by the inlines! Don't use. */
+static inline int __i915_gem_ww_fini(struct i915_gem_ww_ctx *ww, int err)
+{
+	ww->loop = 0;
+	if (err == -EDEADLK) {
+		err = i915_gem_ww_ctx_backoff(ww);
+		if (!err)
+			ww->loop = 1;
+	}
+
+	if (!ww->loop)
+		i915_gem_ww_ctx_fini(ww);
+
+	return err;
+}
+
+static inline void
+__i915_gem_ww_init(struct i915_gem_ww_ctx *ww, bool intr)
+{
+	i915_gem_ww_ctx_init(ww, intr);
+	ww->loop = 1;
+}
+
+#define for_i915_gem_ww(_ww, _err, _intr)			\
+	for (__i915_gem_ww_init(_ww, _intr); (_ww)->loop;	\
+	     _err = __i915_gem_ww_fini(_ww, _err))
+
 #endif
-- 
2.26.2

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply related	[flat|nested] 208+ messages in thread

* [RFC PATCH 076/162] drm/i915: Untangle the vma pages_mutex
  2020-11-27 12:04 [RFC PATCH 000/162] DG1 + LMEM enabling Matthew Auld
                   ` (74 preceding siblings ...)
  2020-11-27 12:05 ` [RFC PATCH 075/162] drm/i915: Introduce a for_i915_gem_ww(){} Matthew Auld
@ 2020-11-27 12:05 ` Matthew Auld
  2020-11-27 12:05 ` [RFC PATCH 077/162] drm/i915/fbdev: Use lmem physical addresses for fb_mmap() on discrete Matthew Auld
                   ` (85 subsequent siblings)
  161 siblings, 0 replies; 208+ messages in thread
From: Matthew Auld @ 2020-11-27 12:05 UTC (permalink / raw)
  To: intel-gfx; +Cc: Thomas Hellström, dri-devel

From: Thomas Hellström <thomas.hellstrom@intel.com>

Move the vma pages_mutex out from under the object ww locks.

Signed-off-by: Thomas Hellström <thomas.hellstrom@intel.com>
Cc: Matthew Auld <matthew.auld@intel.com>
---
 drivers/gpu/drm/i915/i915_vma.c | 30 ++++++++++++++++--------------
 1 file changed, 16 insertions(+), 14 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_vma.c b/drivers/gpu/drm/i915/i915_vma.c
index 0c7e4191811a..7243ab593aec 100644
--- a/drivers/gpu/drm/i915/i915_vma.c
+++ b/drivers/gpu/drm/i915/i915_vma.c
@@ -792,28 +792,30 @@ static int vma_get_pages(struct i915_vma *vma)
 	if (atomic_add_unless(&vma->pages_count, 1, 0))
 		return 0;
 
+	if (vma->obj) {
+		err = i915_gem_object_pin_pages(vma->obj);
+		if (err)
+			return err;
+	}
+
 	/* Allocations ahoy! */
-	if (mutex_lock_interruptible(&vma->pages_mutex))
-		return -EINTR;
+	if (mutex_lock_interruptible(&vma->pages_mutex)) {
+		err = -EINTR;
+		goto unpin;
+	}
 
 	if (!atomic_read(&vma->pages_count)) {
-		if (vma->obj) {
-			err = i915_gem_object_pin_pages(vma->obj);
-			if (err)
-				goto unlock;
-		}
-
 		err = vma->ops->set_pages(vma);
-		if (err) {
-			if (vma->obj)
-				i915_gem_object_unpin_pages(vma->obj);
+		if (err)
 			goto unlock;
-		}
 	}
 	atomic_inc(&vma->pages_count);
 
 unlock:
 	mutex_unlock(&vma->pages_mutex);
+unpin:
+	if (err && vma->obj)
+		__i915_gem_object_unpin_pages(vma->obj);
 
 	return err;
 }
@@ -826,10 +828,10 @@ static void __vma_put_pages(struct i915_vma *vma, unsigned int count)
 	if (atomic_sub_return(count, &vma->pages_count) == 0) {
 		vma->ops->clear_pages(vma);
 		GEM_BUG_ON(vma->pages);
-		if (vma->obj)
-			i915_gem_object_unpin_pages(vma->obj);
 	}
 	mutex_unlock(&vma->pages_mutex);
+	if (vma->obj)
+		i915_gem_object_unpin_pages(vma->obj);
 }
 
 static void vma_put_pages(struct i915_vma *vma)
-- 
2.26.2

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply related	[flat|nested] 208+ messages in thread

* [RFC PATCH 077/162] drm/i915/fbdev: Use lmem physical addresses for fb_mmap() on discrete
  2020-11-27 12:04 [RFC PATCH 000/162] DG1 + LMEM enabling Matthew Auld
                   ` (75 preceding siblings ...)
  2020-11-27 12:05 ` [RFC PATCH 076/162] drm/i915: Untangle the vma pages_mutex Matthew Auld
@ 2020-11-27 12:05 ` Matthew Auld
  2020-11-27 12:05 ` [RFC PATCH 078/162] drm/i915: Return error value when bo not in LMEM for discrete Matthew Auld
                   ` (84 subsequent siblings)
  161 siblings, 0 replies; 208+ messages in thread
From: Matthew Auld @ 2020-11-27 12:05 UTC (permalink / raw)
  To: intel-gfx; +Cc: Mohammed Khajapasha, dri-devel

From: Mohammed Khajapasha <mohammed.khajapasha@intel.com>

Use the local memory I/O BAR address for fbdev's fb_mmap() operation
on discrete, since fbdev uses the physical address of our framebuffer
for its fb_mmap() implementation.

Signed-off-by: Mohammed Khajapasha <mohammed.khajapasha@intel.com>
Cc: Ramalingam C <ramalingam.c@intel.com>
---
 drivers/gpu/drm/i915/display/intel_fbdev.c | 27 +++++++++++++++++-----
 1 file changed, 21 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/i915/display/intel_fbdev.c b/drivers/gpu/drm/i915/display/intel_fbdev.c
index bdf44e923cc0..831e99e0785c 100644
--- a/drivers/gpu/drm/i915/display/intel_fbdev.c
+++ b/drivers/gpu/drm/i915/display/intel_fbdev.c
@@ -178,6 +178,7 @@ static int intelfb_create(struct drm_fb_helper *helper,
 	unsigned long flags = 0;
 	bool prealloc = false;
 	void __iomem *vaddr;
+	struct drm_i915_gem_object *obj;
 	int ret;
 
 	if (intel_fb &&
@@ -232,13 +233,27 @@ static int intelfb_create(struct drm_fb_helper *helper,
 	info->fbops = &intelfb_ops;
 
 	/* setup aperture base/size for vesafb takeover */
-	info->apertures->ranges[0].base = ggtt->gmadr.start;
-	info->apertures->ranges[0].size = ggtt->mappable_end;
+	obj = intel_fb_obj(&intel_fb->base);
+	if (HAS_LMEM(dev_priv) && i915_gem_object_is_lmem(obj)) {
+		struct intel_memory_region *mem = obj->mm.region;
+
+		info->apertures->ranges[0].base = mem->io_start;
+		info->apertures->ranges[0].size = mem->total;
+
+		/* Use fbdev's framebuffer from lmem for discrete */
+		info->fix.smem_start =
+			(unsigned long)(mem->io_start +
+					i915_gem_object_get_dma_address(obj, 0));
+		info->fix.smem_len = obj->base.size;
+	} else {
+		info->apertures->ranges[0].base = ggtt->gmadr.start;
+		info->apertures->ranges[0].size = ggtt->mappable_end;
 
-	/* Our framebuffer is the entirety of fbdev's system memory */
-	info->fix.smem_start =
-		(unsigned long)(ggtt->gmadr.start + vma->node.start);
-	info->fix.smem_len = vma->node.size;
+		/* Our framebuffer is the entirety of fbdev's system memory */
+		info->fix.smem_start =
+			(unsigned long)(ggtt->gmadr.start + vma->node.start);
+		info->fix.smem_len = vma->node.size;
+	}
 
 	vaddr = i915_vma_pin_iomap(vma);
 	if (IS_ERR(vaddr)) {
-- 
2.26.2

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply related	[flat|nested] 208+ messages in thread

* [RFC PATCH 078/162] drm/i915: Return error value when bo not in LMEM for discrete
  2020-11-27 12:04 [RFC PATCH 000/162] DG1 + LMEM enabling Matthew Auld
                   ` (76 preceding siblings ...)
  2020-11-27 12:05 ` [RFC PATCH 077/162] drm/i915/fbdev: Use lmem physical addresses for fb_mmap() on discrete Matthew Auld
@ 2020-11-27 12:05 ` Matthew Auld
  2020-11-27 12:05 ` [RFC PATCH 079/162] drm/i915/dmabuf: Disallow LMEM objects from dma-buf Matthew Auld
                   ` (83 subsequent siblings)
  161 siblings, 0 replies; 208+ messages in thread
From: Matthew Auld @ 2020-11-27 12:05 UTC (permalink / raw)
  To: intel-gfx; +Cc: Mohammed Khajapasha, Michael J . Ruhl, Animesh Manna, dri-devel

From: Mohammed Khajapasha <mohammed.khajapasha@intel.com>

Return -EREMOTE when the framebuffer object is not backed by LMEM on
discrete. If local memory is supported by the hardware, the GEM
objects backing a framebuffer must come from local memory.

Signed-off-by: Mohammed Khajapasha <mohammed.khajapasha@intel.com>
Cc: Michael J. Ruhl <michael.j.ruhl@intel.com>
Cc: Animesh Manna <animesh.manna@intel.com>
---
 drivers/gpu/drm/i915/display/intel_display.c | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/drivers/gpu/drm/i915/display/intel_display.c b/drivers/gpu/drm/i915/display/intel_display.c
index 8a7945f55278..95ed1e06ea55 100644
--- a/drivers/gpu/drm/i915/display/intel_display.c
+++ b/drivers/gpu/drm/i915/display/intel_display.c
@@ -18054,11 +18054,20 @@ intel_user_framebuffer_create(struct drm_device *dev,
 	struct drm_framebuffer *fb;
 	struct drm_i915_gem_object *obj;
 	struct drm_mode_fb_cmd2 mode_cmd = *user_mode_cmd;
+	struct drm_i915_private *i915;
 
 	obj = i915_gem_object_lookup(filp, mode_cmd.handles[0]);
 	if (!obj)
 		return ERR_PTR(-ENOENT);
 
+	/* object is backed with LMEM for discrete */
+	i915 = to_i915(obj->base.dev);
+	if (HAS_LMEM(i915) && !i915_gem_object_is_lmem(obj)) {
+		/* object is "remote", not in local memory */
+		i915_gem_object_put(obj);
+		return ERR_PTR(-EREMOTE);
+	}
+
 	fb = intel_framebuffer_create(obj, &mode_cmd);
 	i915_gem_object_put(obj);
 
-- 
2.26.2

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply related	[flat|nested] 208+ messages in thread

* [RFC PATCH 079/162] drm/i915/dmabuf: Disallow LMEM objects from dma-buf
  2020-11-27 12:04 [RFC PATCH 000/162] DG1 + LMEM enabling Matthew Auld
                   ` (77 preceding siblings ...)
  2020-11-27 12:05 ` [RFC PATCH 078/162] drm/i915: Return error value when bo not in LMEM for discrete Matthew Auld
@ 2020-11-27 12:05 ` Matthew Auld
  2020-11-27 12:05 ` [RFC PATCH 080/162] drm/i915/lmem: Fail driver init if LMEM training failed Matthew Auld
                   ` (82 subsequent siblings)
  161 siblings, 0 replies; 208+ messages in thread
From: Matthew Auld @ 2020-11-27 12:05 UTC (permalink / raw)
  To: intel-gfx; +Cc: Michael J. Ruhl, dri-devel

From: "Michael J. Ruhl" <michael.j.ruhl@intel.com>

The dma-buf interface for i915 does not currently support
LMEM-backed objects.

Check imported objects to see if they are from i915 and if they
are LMEM.  If they are, reject the import.

This check is needed in two places: once on import, and then a
recheck in the mapping path, on the off chance that an object
was migrated to LMEM after import.

Signed-off-by: Michael J. Ruhl <michael.j.ruhl@intel.com>
---
 drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c b/drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c
index c4b01e819786..018d02cc4af5 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c
@@ -9,6 +9,7 @@
 #include <linux/dma-resv.h>
 
 #include "i915_drv.h"
+#include "i915_gem_lmem.h"
 #include "i915_gem_object.h"
 #include "i915_scatterlist.h"
 
@@ -25,6 +26,11 @@ static struct sg_table *i915_gem_map_dma_buf(struct dma_buf_attachment *attachme
 	struct scatterlist *src, *dst;
 	int ret, i;
 
+	if (i915_gem_object_is_lmem(obj)) {
+		ret = -ENOTSUPP;
+		goto err;
+	}
+
 	ret = i915_gem_object_pin_pages(obj);
 	if (ret)
 		goto err;
@@ -248,6 +254,10 @@ struct drm_gem_object *i915_gem_prime_import(struct drm_device *dev,
 			 */
 			return &i915_gem_object_get(obj)->base;
 		}
+
+		/* not our device, but still a i915 object? */
+		if (i915_gem_object_is_lmem(obj))
+			return ERR_PTR(-ENOTSUPP);
 	}
 
 	/* need to attach */
-- 
2.26.2

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply related	[flat|nested] 208+ messages in thread

* [RFC PATCH 080/162] drm/i915/lmem: Fail driver init if LMEM training failed
  2020-11-27 12:04 [RFC PATCH 000/162] DG1 + LMEM enabling Matthew Auld
                   ` (78 preceding siblings ...)
  2020-11-27 12:05 ` [RFC PATCH 079/162] drm/i915/dmabuf: Disallow LMEM objects from dma-buf Matthew Auld
@ 2020-11-27 12:05 ` Matthew Auld
  2020-11-27 12:05 ` [RFC PATCH 081/162] HAX drm/i915/lmem: support CPU relocations Matthew Auld
                   ` (81 subsequent siblings)
  161 siblings, 0 replies; 208+ messages in thread
From: Matthew Auld @ 2020-11-27 12:05 UTC (permalink / raw)
  To: intel-gfx; +Cc: Caz Yokoyama, dri-devel

From: Matt Roper <matthew.d.roper@intel.com>

Boot firmware performs memory training and health assessment during
startup.  If the memory training fails, the firmware will consider the
GPU unusable and will instruct the punit to keep the GT powered down.
If this happens, our driver will be unable to communicate with the GT
(all GT registers will read back as 0, forcewake requests will time out,
etc.), so we should abort driver initialization.  We can
confirm that LMEM was initialized successfully via sgunit register
GU_CNTL.

Bspec: 53111
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Cc: Caz Yokoyama <Caz.Yokoyama@intel.com>
---
 drivers/gpu/drm/i915/i915_reg.h     |  3 +++
 drivers/gpu/drm/i915/intel_uncore.c | 12 ++++++++++++
 2 files changed, 15 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
index 5375b219cc3b..bf9ba1e361bb 100644
--- a/drivers/gpu/drm/i915/i915_reg.h
+++ b/drivers/gpu/drm/i915/i915_reg.h
@@ -487,6 +487,9 @@ static inline bool i915_mmio_reg_valid(i915_reg_t reg)
 #define GAB_CTL				_MMIO(0x24000)
 #define   GAB_CTL_CONT_AFTER_PAGEFAULT	(1 << 8)
 
+#define GU_CNTL				_MMIO(0x101010)
+#define   LMEM_INIT			REG_BIT(7)
+
 #define GEN6_STOLEN_RESERVED		_MMIO(0x1082C0)
 #define GEN6_STOLEN_RESERVED_ADDR_MASK	(0xFFF << 20)
 #define GEN7_STOLEN_RESERVED_ADDR_MASK	(0x3FFF << 18)
diff --git a/drivers/gpu/drm/i915/intel_uncore.c b/drivers/gpu/drm/i915/intel_uncore.c
index 1c14a07eba7d..1630452e82b8 100644
--- a/drivers/gpu/drm/i915/intel_uncore.c
+++ b/drivers/gpu/drm/i915/intel_uncore.c
@@ -1901,6 +1901,18 @@ int intel_uncore_init_mmio(struct intel_uncore *uncore)
 	if (ret)
 		return ret;
 
+	/*
+	 * The boot firmware initializes local memory and assesses its health.
+	 * If memory training fails, the punit will have been instructed to
+	 * keep the GT powered down; we won't be able to communicate with it
+	 * and we should not continue with driver initialization.
+	 */
+	if (IS_DGFX(i915) &&
+	    !(__raw_uncore_read32(uncore, GU_CNTL) & LMEM_INIT)) {
+		drm_err(&i915->drm, "LMEM not initialized by firmware\n");
+		return -ENODEV;
+	}
+
 	if (INTEL_GEN(i915) > 5 && !intel_vgpu_active(i915))
 		uncore->flags |= UNCORE_HAS_FORCEWAKE;
 
-- 
2.26.2

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply related	[flat|nested] 208+ messages in thread

* [RFC PATCH 081/162] HAX drm/i915/lmem: support CPU relocations
  2020-11-27 12:04 [RFC PATCH 000/162] DG1 + LMEM enabling Matthew Auld
                   ` (79 preceding siblings ...)
  2020-11-27 12:05 ` [RFC PATCH 080/162] drm/i915/lmem: Fail driver init if LMEM training failed Matthew Auld
@ 2020-11-27 12:05 ` Matthew Auld
  2020-11-27 12:05 ` [RFC PATCH 082/162] HAX drm/i915/lmem: support pread and pwrite Matthew Auld
                   ` (80 subsequent siblings)
  161 siblings, 0 replies; 208+ messages in thread
From: Matthew Auld @ 2020-11-27 12:05 UTC (permalink / raw)
  To: intel-gfx; +Cc: Abdiel Janulgue, dri-devel, Thomas Hellström, Rodrigo Vivi

** DO NOT MERGE. RELOCATION SUPPORT WILL BE DROPPED FROM DG1+ **

Add LMEM support for the CPU reloc path. When doing relocations we have
both a GPU and CPU reloc path, as well as some debugging options to force a
particular path. The GPU reloc path is preferred when the object
is not currently idle; otherwise we use the CPU reloc path. Since we
can't kmap the object, and the mappable aperture might not be available,
add support for mapping it through the LMEMBAR.

Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Signed-off-by: Thomas Hellström <thomas.hellstrom@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Abdiel Janulgue <abdiel.janulgue@linux.intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
---
 .../gpu/drm/i915/gem/i915_gem_execbuffer.c    | 53 +++++++++++++++++--
 drivers/gpu/drm/i915/gem/i915_gem_lmem.c      | 12 +++++
 drivers/gpu/drm/i915/gem/i915_gem_lmem.h      |  4 ++
 3 files changed, 65 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
index 91f0c3fd9a4b..e73a761a7d1f 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
@@ -14,6 +14,7 @@
 #include "display/intel_frontbuffer.h"
 
 #include "gem/i915_gem_ioctls.h"
+#include "gem/i915_gem_lmem.h"
 #include "gt/intel_context.h"
 #include "gt/intel_gt.h"
 #include "gt/intel_gt_buffer_pool.h"
@@ -278,6 +279,7 @@ struct i915_execbuffer {
 		bool has_llc : 1;
 		bool has_fence : 1;
 		bool needs_unfenced : 1;
+		bool is_lmem : 1;
 
 		struct i915_request *rq;
 		u32 *rq_cmd;
@@ -1049,6 +1051,7 @@ static void reloc_cache_init(struct reloc_cache *cache,
 	cache->has_fence = cache->gen < 4;
 	cache->needs_unfenced = INTEL_INFO(i915)->unfenced_needs_alignment;
 	cache->node.flags = 0;
+	cache->is_lmem = false;
 	reloc_cache_clear(cache);
 }
 
@@ -1128,10 +1131,14 @@ static void reloc_cache_reset(struct reloc_cache *cache, struct i915_execbuffer
 	} else {
 		struct i915_ggtt *ggtt = cache_to_ggtt(cache);
 
-		intel_gt_flush_ggtt_writes(ggtt->vm.gt);
+		if (!cache->is_lmem)
+			intel_gt_flush_ggtt_writes(ggtt->vm.gt);
 		io_mapping_unmap_atomic((void __iomem *)vaddr);
 
-		if (drm_mm_node_allocated(&cache->node)) {
+		if (cache->is_lmem) {
+			i915_gem_object_unpin_pages((struct drm_i915_gem_object *)cache->node.mm);
+			cache->is_lmem = false;
+		} else if (drm_mm_node_allocated(&cache->node)) {
 			ggtt->vm.clear_range(&ggtt->vm,
 					     cache->node.start,
 					     cache->node.size);
@@ -1184,6 +1191,40 @@ static void *reloc_kmap(struct drm_i915_gem_object *obj,
 	return vaddr;
 }
 
+static void *reloc_lmem(struct drm_i915_gem_object *obj,
+			struct reloc_cache *cache,
+			unsigned long page)
+{
+	void *vaddr;
+	int err;
+
+	GEM_BUG_ON(use_cpu_reloc(cache, obj));
+
+	if (cache->vaddr) {
+		io_mapping_unmap_atomic((void __force __iomem *) unmask_page(cache->vaddr));
+	} else {
+		err = i915_gem_object_pin_pages(obj);
+		if (err)
+			return ERR_PTR(err);
+
+		err = i915_gem_object_set_to_wc_domain(obj, true);
+		if (err) {
+			i915_gem_object_unpin_pages(obj);
+			return ERR_PTR(err);
+		}
+
+		cache->node.mm = (void *)obj;
+		cache->is_lmem = true;
+	}
+
+	vaddr = i915_gem_object_lmem_io_map_page_atomic(obj, page);
+
+	cache->vaddr = (unsigned long)vaddr;
+	cache->page = page;
+
+	return vaddr;
+}
+
 static void *reloc_iomap(struct drm_i915_gem_object *obj,
 			 struct i915_execbuffer *eb,
 			 unsigned long page)
@@ -1262,8 +1303,12 @@ static void *reloc_vaddr(struct drm_i915_gem_object *obj,
 		vaddr = unmask_page(cache->vaddr);
 	} else {
 		vaddr = NULL;
-		if ((cache->vaddr & KMAP) == 0)
-			vaddr = reloc_iomap(obj, eb, page);
+		if ((cache->vaddr & KMAP) == 0) {
+			if (i915_gem_object_is_lmem(obj))
+				vaddr = reloc_lmem(obj, cache, page);
+			else
+				vaddr = reloc_iomap(obj, eb, page);
+		}
 		if (!vaddr)
 			vaddr = reloc_kmap(obj, cache, page);
 	}
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_lmem.c b/drivers/gpu/drm/i915/gem/i915_gem_lmem.c
index e953965f8263..f6c4d5998ff9 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_lmem.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_lmem.c
@@ -17,6 +17,18 @@ const struct drm_i915_gem_object_ops i915_gem_lmem_obj_ops = {
 	.release = i915_gem_object_release_memory_region,
 };
 
+void __iomem *
+i915_gem_object_lmem_io_map_page_atomic(struct drm_i915_gem_object *obj,
+					unsigned long n)
+{
+	resource_size_t offset;
+
+	offset = i915_gem_object_get_dma_address(obj, n);
+	offset -= obj->mm.region->region.start;
+
+	return io_mapping_map_atomic_wc(&obj->mm.region->iomap, offset);
+}
+
 bool i915_gem_object_is_lmem(struct drm_i915_gem_object *obj)
 {
 	return obj->ops == &i915_gem_lmem_obj_ops;
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_lmem.h b/drivers/gpu/drm/i915/gem/i915_gem_lmem.h
index fc3f15580fe3..bf7e11fad17b 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_lmem.h
+++ b/drivers/gpu/drm/i915/gem/i915_gem_lmem.h
@@ -14,6 +14,10 @@ struct intel_memory_region;
 
 extern const struct drm_i915_gem_object_ops i915_gem_lmem_obj_ops;
 
+void __iomem *
+i915_gem_object_lmem_io_map_page_atomic(struct drm_i915_gem_object *obj,
+					unsigned long n);
+
 bool i915_gem_object_is_lmem(struct drm_i915_gem_object *obj);
 
 struct drm_i915_gem_object *
-- 
2.26.2

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply related	[flat|nested] 208+ messages in thread

* [RFC PATCH 082/162] HAX drm/i915/lmem: support pread and pwrite
  2020-11-27 12:04 [RFC PATCH 000/162] DG1 + LMEM enabling Matthew Auld
                   ` (80 preceding siblings ...)
  2020-11-27 12:05 ` [RFC PATCH 081/162] HAX drm/i915/lmem: support CPU relocations Matthew Auld
@ 2020-11-27 12:05 ` Matthew Auld
  2020-11-27 12:05 ` [RFC PATCH 083/162] drm/i915: Update the helper to set correct mapping Matthew Auld
                   ` (79 subsequent siblings)
  161 siblings, 0 replies; 208+ messages in thread
From: Matthew Auld @ 2020-11-27 12:05 UTC (permalink / raw)
  To: intel-gfx
  Cc: Abdiel Janulgue, Steve Hampson, dri-devel, Thomas Hellström

** DO NOT MERGE. PREAD/WRITE SUPPORT WILL BE DROPPED FROM DG1+ **

We need to add support for pread'ing and pwriting an LMEM object.

Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Abdiel Janulgue <abdiel.janulgue@linux.intel.com>
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Signed-off-by: Steve Hampson <steven.t.hampson@intel.com>
Signed-off-by: Thomas Hellström <thomas.hellstrom@intel.com>
---
 drivers/gpu/drm/i915/gem/i915_gem_lmem.c | 186 +++++++++++++++++++++++
 drivers/gpu/drm/i915/gem/i915_gem_lmem.h |   2 +
 2 files changed, 188 insertions(+)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_lmem.c b/drivers/gpu/drm/i915/gem/i915_gem_lmem.c
index f6c4d5998ff9..840b68eb10d3 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_lmem.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_lmem.c
@@ -8,6 +8,177 @@
 #include "gem/i915_gem_lmem.h"
 #include "i915_drv.h"
 
+static int
+i915_ww_pin_lock_interruptible(struct drm_i915_gem_object *obj)
+{
+	struct i915_gem_ww_ctx ww;
+	int ret;
+
+	for_i915_gem_ww(&ww, ret, true) {
+		ret = i915_gem_object_lock(obj, &ww);
+		if (ret)
+			continue;
+
+		ret = i915_gem_object_pin_pages(obj);
+		if (ret)
+			continue;
+
+		ret = i915_gem_object_set_to_wc_domain(obj, false);
+		if (ret)
+			goto out_unpin;
+
+		ret = i915_gem_object_wait(obj,
+					   I915_WAIT_INTERRUPTIBLE,
+					   MAX_SCHEDULE_TIMEOUT);
+		if (!ret)
+			continue;
+
+out_unpin:
+		i915_gem_object_unpin_pages(obj);
+
+		/* Unlocking is done implicitly */
+	}
+
+	return ret;
+}
+
+int i915_gem_object_lmem_pread(struct drm_i915_gem_object *obj,
+			       const struct drm_i915_gem_pread *arg)
+{
+	struct drm_i915_private *i915 = to_i915(obj->base.dev);
+	struct intel_runtime_pm *rpm = &i915->runtime_pm;
+	intel_wakeref_t wakeref;
+	char __user *user_data;
+	unsigned int offset;
+	unsigned long idx;
+	u64 remain;
+	int ret;
+
+	ret = i915_gem_object_wait(obj,
+				   I915_WAIT_INTERRUPTIBLE,
+				   MAX_SCHEDULE_TIMEOUT);
+	if (ret)
+		return ret;
+
+	ret = i915_ww_pin_lock_interruptible(obj);
+	if (ret)
+		return ret;
+
+	wakeref = intel_runtime_pm_get(rpm);
+
+	remain = arg->size;
+	user_data = u64_to_user_ptr(arg->data_ptr);
+	offset = offset_in_page(arg->offset);
+	for (idx = arg->offset >> PAGE_SHIFT; remain; idx++) {
+		unsigned long unwritten;
+		void __iomem *vaddr;
+		int length;
+
+		length = remain;
+		if (offset + length > PAGE_SIZE)
+			length = PAGE_SIZE - offset;
+
+		vaddr = i915_gem_object_lmem_io_map_page_atomic(obj, idx);
+		if (!vaddr) {
+			ret = -ENOMEM;
+			goto out_put;
+		}
+		unwritten = __copy_to_user_inatomic(user_data,
+						    (void __force *)vaddr + offset,
+						    length);
+		io_mapping_unmap_atomic(vaddr);
+		if (unwritten) {
+			vaddr = i915_gem_object_lmem_io_map_page(obj, idx);
+			unwritten = copy_to_user(user_data,
+						 (void __force *)vaddr + offset,
+						 length);
+			io_mapping_unmap(vaddr);
+		}
+		if (unwritten) {
+			ret = -EFAULT;
+			goto out_put;
+		}
+
+		remain -= length;
+		user_data += length;
+		offset = 0;
+	}
+
+out_put:
+	intel_runtime_pm_put(rpm, wakeref);
+	i915_gem_object_unpin_pages(obj);
+
+	return ret;
+}
+
+static int i915_gem_object_lmem_pwrite(struct drm_i915_gem_object *obj,
+				       const struct drm_i915_gem_pwrite *arg)
+{
+	struct drm_i915_private *i915 = to_i915(obj->base.dev);
+	struct intel_runtime_pm *rpm = &i915->runtime_pm;
+	intel_wakeref_t wakeref;
+	char __user *user_data;
+	unsigned int offset;
+	unsigned long idx;
+	u64 remain;
+	int ret;
+
+	ret = i915_gem_object_wait(obj,
+				   I915_WAIT_INTERRUPTIBLE,
+				   MAX_SCHEDULE_TIMEOUT);
+	if (ret)
+		return ret;
+
+	ret = i915_ww_pin_lock_interruptible(obj);
+	if (ret)
+		return ret;
+
+	wakeref = intel_runtime_pm_get(rpm);
+
+	remain = arg->size;
+	user_data = u64_to_user_ptr(arg->data_ptr);
+	offset = offset_in_page(arg->offset);
+	for (idx = arg->offset >> PAGE_SHIFT; remain; idx++) {
+		unsigned long unwritten;
+		void __iomem *vaddr;
+		int length;
+
+		length = remain;
+		if (offset + length > PAGE_SIZE)
+			length = PAGE_SIZE - offset;
+
+		vaddr = i915_gem_object_lmem_io_map_page_atomic(obj, idx);
+		if (!vaddr) {
+			ret = -ENOMEM;
+			goto out_put;
+		}
+
+		unwritten = __copy_from_user_inatomic_nocache((void __force *)vaddr + offset,
+							      user_data, length);
+		io_mapping_unmap_atomic(vaddr);
+		if (unwritten) {
+			vaddr = i915_gem_object_lmem_io_map_page(obj, idx);
+			unwritten = copy_from_user((void __force *)vaddr + offset,
+						   user_data, length);
+			io_mapping_unmap(vaddr);
+		}
+		if (unwritten) {
+			ret = -EFAULT;
+			goto out_put;
+		}
+
+		remain -= length;
+		user_data += length;
+		offset = 0;
+	}
+
+out_put:
+	intel_runtime_pm_put(rpm, wakeref);
+	i915_gem_object_unpin_pages(obj);
+
+	return ret;
+}
+
 const struct drm_i915_gem_object_ops i915_gem_lmem_obj_ops = {
 	.name = "i915_gem_object_lmem",
 	.flags = I915_GEM_OBJECT_HAS_IOMEM,
@@ -15,8 +186,23 @@ const struct drm_i915_gem_object_ops i915_gem_lmem_obj_ops = {
 	.get_pages = i915_gem_object_get_pages_buddy,
 	.put_pages = i915_gem_object_put_pages_buddy,
 	.release = i915_gem_object_release_memory_region,
+
+	.pread = i915_gem_object_lmem_pread,
+	.pwrite = i915_gem_object_lmem_pwrite,
 };
 
+void __iomem *
+i915_gem_object_lmem_io_map_page(struct drm_i915_gem_object *obj,
+				 unsigned long n)
+{
+	resource_size_t offset;
+
+	offset = i915_gem_object_get_dma_address(obj, n);
+	offset -= obj->mm.region->region.start;
+
+	return io_mapping_map_wc(&obj->mm.region->iomap, offset, PAGE_SIZE);
+}
+
 void __iomem *
 i915_gem_object_lmem_io_map_page_atomic(struct drm_i915_gem_object *obj,
 					unsigned long n)
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_lmem.h b/drivers/gpu/drm/i915/gem/i915_gem_lmem.h
index bf7e11fad17b..a24d94bc380f 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_lmem.h
+++ b/drivers/gpu/drm/i915/gem/i915_gem_lmem.h
@@ -14,6 +14,8 @@ struct intel_memory_region;
 
 extern const struct drm_i915_gem_object_ops i915_gem_lmem_obj_ops;
 
+void __iomem *i915_gem_object_lmem_io_map_page(struct drm_i915_gem_object *obj,
+					       unsigned long n);
 void __iomem *
 i915_gem_object_lmem_io_map_page_atomic(struct drm_i915_gem_object *obj,
 					unsigned long n);
-- 
2.26.2

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply related	[flat|nested] 208+ messages in thread

* [RFC PATCH 083/162] drm/i915: Update the helper to set correct mapping
  2020-11-27 12:04 [RFC PATCH 000/162] DG1 + LMEM enabling Matthew Auld
                   ` (81 preceding siblings ...)
  2020-11-27 12:05 ` [RFC PATCH 082/162] HAX drm/i915/lmem: support pread and pwrite Matthew Auld
@ 2020-11-27 12:05 ` Matthew Auld
  2020-11-27 12:06 ` [RFC PATCH 084/162] drm/i915: introduce kernel blitter_context Matthew Auld
                   ` (78 subsequent siblings)
  161 siblings, 0 replies; 208+ messages in thread
From: Matthew Auld @ 2020-11-27 12:05 UTC (permalink / raw)
  To: intel-gfx
  Cc: CQ Tang, Venkata Sandeep Dhanalakota, dri-devel, Michal Wajdeczko

From: Venkata Sandeep Dhanalakota <venkata.s.dhanalakota@intel.com>

Determine the possible coherent map type based on object location,
on whether the target has LLC, and on whether the user requires an
always-coherent mapping.
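
A plausible shape for the updated helper, inferred from the call sites
below (sketch only; the real i915_drv.h hunk is part of this patch):

	static inline enum i915_map_type
	i915_coherent_map_type(struct drm_i915_private *i915,
			       struct drm_i915_gem_object *obj,
			       bool always_coherent)
	{
		/* Local memory goes through the BAR: map write-combined. */
		if (i915_gem_object_is_lmem(obj))
			return I915_MAP_WC;

		/* WB for LLC platforms, or when the caller insists on a
		 * coherent mapping; WC otherwise. */
		if (HAS_LLC(i915) || always_coherent)
			return I915_MAP_WB;

		return I915_MAP_WC;
	}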

Cc: Matthew Auld <matthew.auld@intel.com>
Cc: CQ Tang <cq.tang@intel.com>
Suggested-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Signed-off-by: Venkata Sandeep Dhanalakota <venkata.s.dhanalakota@intel.com>
---
 drivers/gpu/drm/i915/gt/intel_engine_cs.c            |  3 ++-
 drivers/gpu/drm/i915/gt/intel_engine_pm.c            |  2 +-
 drivers/gpu/drm/i915/gt/intel_execlists_submission.c |  4 ++--
 drivers/gpu/drm/i915/gt/intel_ring.c                 |  9 ++++++---
 drivers/gpu/drm/i915/gt/intel_timeline.c             |  8 ++++++--
 drivers/gpu/drm/i915/gt/selftest_context.c           |  3 ++-
 drivers/gpu/drm/i915/gt/selftest_execlists.c         |  3 ++-
 drivers/gpu/drm/i915/gt/selftest_hangcheck.c         |  4 ++--
 drivers/gpu/drm/i915/gt/uc/intel_guc.c               |  4 +++-
 drivers/gpu/drm/i915/gt/uc/intel_huc.c               |  4 +++-
 drivers/gpu/drm/i915/i915_drv.h                      | 11 +++++++++--
 drivers/gpu/drm/i915/selftests/igt_spinner.c         |  4 ++--
 12 files changed, 40 insertions(+), 19 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_engine_cs.c b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
index 420c6a35f3ed..677c97ded81d 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine_cs.c
+++ b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
@@ -680,7 +680,8 @@ static int init_status_page(struct intel_engine_cs *engine)
 	if (ret)
 		goto err;
 
-	vaddr = i915_gem_object_pin_map(obj, I915_MAP_WB);
+	vaddr = i915_gem_object_pin_map(obj,
+					i915_coherent_map_type(engine->i915, obj, true));
 	if (IS_ERR(vaddr)) {
 		ret = PTR_ERR(vaddr);
 		goto err_unpin;
diff --git a/drivers/gpu/drm/i915/gt/intel_engine_pm.c b/drivers/gpu/drm/i915/gt/intel_engine_pm.c
index 5d51144ef074..1b2009b4dcb7 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine_pm.c
+++ b/drivers/gpu/drm/i915/gt/intel_engine_pm.c
@@ -24,7 +24,7 @@ static void dbg_poison_ce(struct intel_context *ce)
 
 	if (ce->state) {
 		struct drm_i915_gem_object *obj = ce->state->obj;
-		int type = i915_coherent_map_type(ce->engine->i915);
+		int type = i915_coherent_map_type(ce->engine->i915, obj, true);
 		void *map;
 
 		if (!i915_gem_object_trylock(ce->state->obj))
diff --git a/drivers/gpu/drm/i915/gt/intel_execlists_submission.c b/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
index 7eec42b27bc1..582a9044727e 100644
--- a/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
+++ b/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
@@ -3535,8 +3535,8 @@ __execlists_context_pre_pin(struct intel_context *ce,
 	GEM_BUG_ON(!i915_vma_is_pinned(ce->state));
 
 	*vaddr = i915_gem_object_pin_map(ce->state->obj,
-					i915_coherent_map_type(ce->engine->i915) |
-					I915_MAP_OVERRIDE);
+					 i915_coherent_map_type(ce->engine->i915, ce->state->obj, false) |
+					 I915_MAP_OVERRIDE);
 	if (IS_ERR(*vaddr))
 		return PTR_ERR(*vaddr);
 
diff --git a/drivers/gpu/drm/i915/gt/intel_ring.c b/drivers/gpu/drm/i915/gt/intel_ring.c
index 4034a4bac7f0..d636c6ed88b7 100644
--- a/drivers/gpu/drm/i915/gt/intel_ring.c
+++ b/drivers/gpu/drm/i915/gt/intel_ring.c
@@ -51,9 +51,12 @@ int intel_ring_pin(struct intel_ring *ring, struct i915_gem_ww_ctx *ww)
 
 	if (i915_vma_is_map_and_fenceable(vma))
 		addr = (void __force *)i915_vma_pin_iomap(vma);
-	else
-		addr = i915_gem_object_pin_map(vma->obj,
-					       i915_coherent_map_type(vma->vm->i915));
+	else {
+		int type = i915_coherent_map_type(vma->vm->i915, vma->obj, false);
+
+		addr = i915_gem_object_pin_map(vma->obj, type);
+	}
+
 	if (IS_ERR(addr)) {
 		ret = PTR_ERR(addr);
 		goto err_ring;
diff --git a/drivers/gpu/drm/i915/gt/intel_timeline.c b/drivers/gpu/drm/i915/gt/intel_timeline.c
index b2d04717db20..065943781586 100644
--- a/drivers/gpu/drm/i915/gt/intel_timeline.c
+++ b/drivers/gpu/drm/i915/gt/intel_timeline.c
@@ -31,6 +31,7 @@ static int __hwsp_alloc(struct intel_gt *gt, struct intel_timeline_hwsp *hwsp)
 {
 	struct drm_i915_private *i915 = gt->i915;
 	struct drm_i915_gem_object *obj;
+	int type;
 	int ret;
 
 	obj = i915_gem_object_create_internal(i915, PAGE_SIZE);
@@ -47,7 +48,8 @@ static int __hwsp_alloc(struct intel_gt *gt, struct intel_timeline_hwsp *hwsp)
 	}
 
 	/* Pin early so we can call i915_ggtt_pin_unlocked(). */
-	hwsp->vaddr = i915_gem_object_pin_map(obj, I915_MAP_WB);
+	type = i915_coherent_map_type(i915, obj, true);
+	hwsp->vaddr = i915_gem_object_pin_map(obj, type);
 	if (IS_ERR(hwsp->vaddr)) {
 		ret = PTR_ERR(hwsp->vaddr);
 		goto out_unlock;
@@ -235,9 +237,11 @@ intel_timeline_pin_map(struct intel_timeline *timeline)
 	if (!timeline->hwsp_cacheline) {
 		struct drm_i915_gem_object *obj = timeline->hwsp_ggtt->obj;
 		u32 ofs = offset_in_page(timeline->hwsp_offset);
+		int type;
 		void *vaddr;
 
-		vaddr = i915_gem_object_pin_map(obj, I915_MAP_WB);
+		type = i915_coherent_map_type(timeline->gt->i915, obj, true);
+		vaddr = i915_gem_object_pin_map(obj, type);
 		if (IS_ERR(vaddr))
 			return PTR_ERR(vaddr);
 
diff --git a/drivers/gpu/drm/i915/gt/selftest_context.c b/drivers/gpu/drm/i915/gt/selftest_context.c
index d9b0ebc938f1..86b6795dc4f3 100644
--- a/drivers/gpu/drm/i915/gt/selftest_context.c
+++ b/drivers/gpu/drm/i915/gt/selftest_context.c
@@ -89,7 +89,8 @@ static int __live_context_size(struct intel_engine_cs *engine)
 		goto err;
 
 	vaddr = i915_gem_object_pin_map_unlocked(ce->state->obj,
-						 i915_coherent_map_type(engine->i915));
+						 i915_coherent_map_type(engine->i915,
+									ce->state->obj, false));
 	if (IS_ERR(vaddr)) {
 		err = PTR_ERR(vaddr);
 		intel_context_unpin(ce);
diff --git a/drivers/gpu/drm/i915/gt/selftest_execlists.c b/drivers/gpu/drm/i915/gt/selftest_execlists.c
index 124011f6fb51..cb17da6a616f 100644
--- a/drivers/gpu/drm/i915/gt/selftest_execlists.c
+++ b/drivers/gpu/drm/i915/gt/selftest_execlists.c
@@ -5854,7 +5854,8 @@ static int compare_isolation(struct intel_engine_cs *engine,
 	}
 
 	lrc = i915_gem_object_pin_map_unlocked(ce->state->obj,
-				      i915_coherent_map_type(engine->i915));
+					       i915_coherent_map_type(engine->i915,
+								      ce->state->obj, true));
 	if (IS_ERR(lrc)) {
 		err = PTR_ERR(lrc);
 		goto err_B1;
diff --git a/drivers/gpu/drm/i915/gt/selftest_hangcheck.c b/drivers/gpu/drm/i915/gt/selftest_hangcheck.c
index e3027cebab5b..bc93dba3c8df 100644
--- a/drivers/gpu/drm/i915/gt/selftest_hangcheck.c
+++ b/drivers/gpu/drm/i915/gt/selftest_hangcheck.c
@@ -88,7 +88,7 @@ static int hang_init(struct hang *h, struct intel_gt *gt)
 	h->seqno = memset(vaddr, 0xff, PAGE_SIZE);
 
 	vaddr = i915_gem_object_pin_map_unlocked(h->obj,
-						 i915_coherent_map_type(gt->i915));
+						 i915_coherent_map_type(gt->i915, h->obj, false));
 	if (IS_ERR(vaddr)) {
 		err = PTR_ERR(vaddr);
 		goto err_unpin_hws;
@@ -149,7 +149,7 @@ hang_create_request(struct hang *h, struct intel_engine_cs *engine)
 		return ERR_CAST(obj);
 	}
 
-	vaddr = i915_gem_object_pin_map_unlocked(obj, i915_coherent_map_type(gt->i915));
+	vaddr = i915_gem_object_pin_map_unlocked(obj, i915_coherent_map_type(gt->i915, obj, false));
 	if (IS_ERR(vaddr)) {
 		i915_gem_object_put(obj);
 		i915_vm_put(vm);
diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc.c b/drivers/gpu/drm/i915/gt/uc/intel_guc.c
index a65661eb5d5d..b54b9de31c3e 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc.c
@@ -694,7 +694,9 @@ int intel_guc_allocate_and_map_vma(struct intel_guc *guc, u32 size,
 	if (IS_ERR(vma))
 		return PTR_ERR(vma);
 
-	vaddr = i915_gem_object_pin_map_unlocked(vma->obj, I915_MAP_WB);
+	vaddr = i915_gem_object_pin_map_unlocked(vma->obj,
+						 i915_coherent_map_type(guc_to_gt(guc)->i915,
+									vma->obj, true));
 	if (IS_ERR(vaddr)) {
 		i915_vma_unpin_and_release(&vma, 0);
 		return PTR_ERR(vaddr);
diff --git a/drivers/gpu/drm/i915/gt/uc/intel_huc.c b/drivers/gpu/drm/i915/gt/uc/intel_huc.c
index 2126dd81ac38..56d2144dc6a0 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_huc.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_huc.c
@@ -82,7 +82,9 @@ static int intel_huc_rsa_data_create(struct intel_huc *huc)
 	if (IS_ERR(vma))
 		return PTR_ERR(vma);
 
-	vaddr = i915_gem_object_pin_map_unlocked(vma->obj, I915_MAP_WB);
+	vaddr = i915_gem_object_pin_map_unlocked(vma->obj,
+						 i915_coherent_map_type(gt->i915,
+									vma->obj, true));
 	if (IS_ERR(vaddr)) {
 		i915_vma_unpin_and_release(&vma, 0);
 		return PTR_ERR(vaddr);
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index ce8d5ff8b9f4..13cb4936f15c 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -78,6 +78,7 @@
 #include "gem/i915_gem_context_types.h"
 #include "gem/i915_gem_shrinker.h"
 #include "gem/i915_gem_stolen.h"
+#include "gem/i915_gem_lmem.h"
 
 #include "gt/intel_engine.h"
 #include "gt/intel_gt_types.h"
@@ -2027,9 +2028,15 @@ static inline int intel_hws_csb_write_index(struct drm_i915_private *i915)
 }
 
 static inline enum i915_map_type
-i915_coherent_map_type(struct drm_i915_private *i915)
+i915_coherent_map_type(struct drm_i915_private *i915,
+		       struct drm_i915_gem_object *obj, bool always_coherent)
 {
-	return HAS_LLC(i915) ? I915_MAP_WB : I915_MAP_WC;
+	if (i915_gem_object_is_lmem(obj))
+		return I915_MAP_WC;
+	if (HAS_LLC(i915) || always_coherent)
+		return I915_MAP_WB;
+	else
+		return I915_MAP_WC;
 }
 
 static inline u64 i915_cs_timestamp_ns_to_ticks(struct drm_i915_private *i915, u64 val)
diff --git a/drivers/gpu/drm/i915/selftests/igt_spinner.c b/drivers/gpu/drm/i915/selftests/igt_spinner.c
index 9c461edb0b73..b2a1f98c97f5 100644
--- a/drivers/gpu/drm/i915/selftests/igt_spinner.c
+++ b/drivers/gpu/drm/i915/selftests/igt_spinner.c
@@ -93,9 +93,9 @@ int igt_spinner_pin(struct igt_spinner *spin,
 	}
 
 	if (!spin->batch) {
-		unsigned int mode =
-			i915_coherent_map_type(spin->gt->i915);
+		unsigned int mode;
 
+		mode = i915_coherent_map_type(spin->gt->i915, spin->obj, false);
 		vaddr = igt_spinner_pin_obj(ce, ww, spin->obj, mode, &spin->batch_vma);
 		if (IS_ERR(vaddr))
 			return PTR_ERR(vaddr);
-- 
2.26.2

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply related	[flat|nested] 208+ messages in thread

* [RFC PATCH 084/162] drm/i915: introduce kernel blitter_context
  2020-11-27 12:04 [RFC PATCH 000/162] DG1 + LMEM enabling Matthew Auld
                   ` (82 preceding siblings ...)
  2020-11-27 12:05 ` [RFC PATCH 083/162] drm/i915: Update the helper to set correct mapping Matthew Auld
@ 2020-11-27 12:06 ` Matthew Auld
  2020-11-27 12:06 ` [RFC PATCH 085/162] drm/i915/region: support basic eviction Matthew Auld
                   ` (77 subsequent siblings)
  161 siblings, 0 replies; 208+ messages in thread
From: Matthew Auld @ 2020-11-27 12:06 UTC (permalink / raw)
  To: intel-gfx; +Cc: Abdiel Janulgue, dri-devel

We may be left without a context with which to perform various internal
blitter operations, for example when migrating objects. Piggybacking off
the kernel_context is probably a bad idea, since it has other uses.

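A minimal usage sketch (hypothetical caller; i915_gem_object_fill_blt()
already exists in i915_gem_object_blt.c, and obj/gt stand in for whatever
the caller has at hand):

	struct intel_engine_cs *engine = gt->engine[BCS0];
	int err;

	/* Clear a freshly allocated lmem object without touching the
	 * general-purpose kernel_context.
	 */
	err = i915_gem_object_fill_blt(obj, engine->blitter_context, 0);
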
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Abdiel Janulgue <abdiel.janulgue@linux.intel.com>
---
 drivers/gpu/drm/i915/gt/intel_engine.h       |  2 +
 drivers/gpu/drm/i915/gt/intel_engine_cs.c    | 40 +++++++++++++++++++-
 drivers/gpu/drm/i915/gt/intel_engine_types.h |  1 +
 3 files changed, 41 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_engine.h b/drivers/gpu/drm/i915/gt/intel_engine.h
index 760fefdfe392..188c5ff6dc64 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine.h
+++ b/drivers/gpu/drm/i915/gt/intel_engine.h
@@ -186,6 +186,8 @@ intel_write_status_page(struct intel_engine_cs *engine, int reg, u32 value)
 #define I915_GEM_HWS_PREEMPT_ADDR	(I915_GEM_HWS_PREEMPT * sizeof(u32))
 #define I915_GEM_HWS_SEQNO		0x40
 #define I915_GEM_HWS_SEQNO_ADDR		(I915_GEM_HWS_SEQNO * sizeof(u32))
+#define I915_GEM_HWS_BLITTER		0x42
+#define I915_GEM_HWS_BLITTER_ADDR	(I915_GEM_HWS_BLITTER * sizeof(u32))
 #define I915_GEM_HWS_SCRATCH		0x80
 
 #define I915_HWS_CSB_BUF0_INDEX		0x10
diff --git a/drivers/gpu/drm/i915/gt/intel_engine_cs.c b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
index 677c97ded81d..0ba020346566 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine_cs.c
+++ b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
@@ -819,6 +819,7 @@ create_pinned_context(struct intel_engine_cs *engine,
 	int err;
 
 	ce = intel_context_create(engine);
+
 	if (IS_ERR(ce))
 		return ce;
 
@@ -851,6 +852,20 @@ create_kernel_context(struct intel_engine_cs *engine)
 				     &kernel, "kernel_context");
 }
 
+static struct intel_context *
+create_blitter_context(struct intel_engine_cs *engine)
+{
+	static struct lock_class_key blitter;
+	struct intel_context *ce;
+
+	ce = create_pinned_context(engine, I915_GEM_HWS_BLITTER_ADDR, &blitter,
+				   "blitter_context");
+	if (IS_ERR(ce))
+		return ce;
+
+	return ce;
+}
+
 /**
  * intel_engines_init_common - initialize engine state which might require hw access
  * @engine: Engine to initialize.
@@ -881,17 +896,33 @@ static int engine_init_common(struct intel_engine_cs *engine)
 	if (IS_ERR(ce))
 		return PTR_ERR(ce);
 
+	engine->kernel_context = ce;
 	ret = measure_breadcrumb_dw(ce);
 	if (ret < 0)
 		goto err_context;
 
 	engine->emit_fini_breadcrumb_dw = ret;
-	engine->kernel_context = ce;
+
+	/*
+	 * The blitter context is used to quickly memset or migrate objects
+	 * in local memory, so it has to always be available.
+	 */
+	if (engine->class == COPY_ENGINE_CLASS) {
+		ce = create_blitter_context(engine);
+		if (IS_ERR(ce)) {
+			ret = PTR_ERR(ce);
+			goto err_unpin;
+		}
+
+		engine->blitter_context = ce;
+	}
 
 	return 0;
 
+err_unpin:
+	intel_context_unpin(engine->kernel_context);
 err_context:
-	intel_context_put(ce);
+	intel_context_put(engine->kernel_context);
 	return ret;
 }
 
@@ -947,6 +978,11 @@ void intel_engine_cleanup_common(struct intel_engine_cs *engine)
 	if (engine->default_state)
 		fput(engine->default_state);
 
+	if (engine->blitter_context) {
+		intel_context_unpin(engine->blitter_context);
+		intel_context_put(engine->blitter_context);
+	}
+
 	if (engine->kernel_context) {
 		intel_context_unpin(engine->kernel_context);
 		intel_context_put(engine->kernel_context);
diff --git a/drivers/gpu/drm/i915/gt/intel_engine_types.h b/drivers/gpu/drm/i915/gt/intel_engine_types.h
index ee6312601c56..cb2de4bf86ba 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine_types.h
+++ b/drivers/gpu/drm/i915/gt/intel_engine_types.h
@@ -347,6 +347,7 @@ struct intel_engine_cs {
 	struct llist_head barrier_tasks;
 
 	struct intel_context *kernel_context; /* pinned */
+	struct intel_context *blitter_context; /* pinned; exists for BCS only */
 
 	intel_engine_mask_t saturated; /* submitting semaphores too late? */
 
-- 
2.26.2

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply related	[flat|nested] 208+ messages in thread

* [RFC PATCH 085/162] drm/i915/region: support basic eviction
  2020-11-27 12:04 [RFC PATCH 000/162] DG1 + LMEM enabling Matthew Auld
                   ` (83 preceding siblings ...)
  2020-11-27 12:06 ` [RFC PATCH 084/162] drm/i915: introduce kernel blitter_context Matthew Auld
@ 2020-11-27 12:06 ` Matthew Auld
  2020-11-27 12:06 ` [RFC PATCH 086/162] drm/i915: Add blit functions that can be called from within a WW transaction Matthew Auld
                   ` (76 subsequent siblings)
  161 siblings, 0 replies; 208+ messages in thread
From: Matthew Auld @ 2020-11-27 12:06 UTC (permalink / raw)
  To: intel-gfx; +Cc: Abdiel Janulgue, dri-devel

Support basic eviction for regions: if the buddy allocator can't satisfy
an allocation, purge enough of the region's DONTNEED objects to cover the
request and retry once.

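The allocation path now behaves roughly as follows (a condensed sketch of
the intel_memory_region.c hunk below, not new code):

	block = i915_buddy_alloc(&mem->mm, order);
	if (IS_ERR(block) && order-- == min_order) {
		/* Out of space: drop mm_lock, purge enough DONTNEED
		 * objects from this region to cover the request, then
		 * retry the allocation once.
		 */
		mutex_unlock(&mem->mm_lock);
		err = i915_gem_shrink_memory_region(mem,
						    n_pages * mem->mm.chunk_size);
		mutex_lock(&mem->mm_lock);
	}
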
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Abdiel Janulgue <abdiel.janulgue@linux.intel.com>
---
 .../gpu/drm/i915/gem/i915_gem_object_types.h  |  1 +
 drivers/gpu/drm/i915/gem/i915_gem_shrinker.c  | 59 ++++++++++++++
 drivers/gpu/drm/i915/gem/i915_gem_shrinker.h  |  4 +
 drivers/gpu/drm/i915/i915_gem.c               | 17 +++++
 drivers/gpu/drm/i915/intel_memory_region.c    | 24 +++++-
 .../drm/i915/selftests/intel_memory_region.c  | 76 +++++++++++++++++++
 6 files changed, 178 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object_types.h b/drivers/gpu/drm/i915/gem/i915_gem_object_types.h
index b172e8cc53ab..6d101275bc9d 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_object_types.h
+++ b/drivers/gpu/drm/i915/gem/i915_gem_object_types.h
@@ -226,6 +226,7 @@ struct drm_i915_gem_object {
 		 * region->obj_lock.
 		 */
 		struct list_head region_link;
+		struct list_head tmp_link;
 
 		struct sg_table *pages;
 		void *mapping;
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_shrinker.c b/drivers/gpu/drm/i915/gem/i915_gem_shrinker.c
index e42192834c88..4d346df8fd5b 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_shrinker.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_shrinker.c
@@ -16,6 +16,7 @@
 #include "gt/intel_gt_requests.h"
 
 #include "i915_trace.h"
+#include "gt/intel_gt_requests.h"
 
 static bool swap_available(void)
 {
@@ -271,6 +272,64 @@ unsigned long i915_gem_shrink_all(struct drm_i915_private *i915)
 	return freed;
 }
 
+int i915_gem_shrink_memory_region(struct intel_memory_region *mem,
+				  resource_size_t target)
+{
+	struct drm_i915_private *i915 = mem->i915;
+	struct drm_i915_gem_object *obj;
+	resource_size_t purged;
+	LIST_HEAD(purgeable);
+	int err = -ENOSPC;
+
+	intel_gt_retire_requests(&i915->gt);
+
+	purged = 0;
+
+	mutex_lock(&mem->objects.lock);
+
+	while ((obj = list_first_entry_or_null(&mem->objects.purgeable,
+					       typeof(*obj),
+					       mm.region_link))) {
+		list_move_tail(&obj->mm.region_link, &purgeable);
+
+		if (!i915_gem_object_has_pages(obj))
+			continue;
+
+		if (i915_gem_object_is_framebuffer(obj))
+			continue;
+
+		if (!kref_get_unless_zero(&obj->base.refcount))
+			continue;
+
+		mutex_unlock(&mem->objects.lock);
+
+		if (!i915_gem_object_unbind(obj, I915_GEM_OBJECT_UNBIND_ACTIVE)) {
+			if (i915_gem_object_trylock(obj)) {
+				__i915_gem_object_put_pages(obj);
+				if (!i915_gem_object_has_pages(obj)) {
+					purged += obj->base.size;
+					if (!i915_gem_object_is_volatile(obj))
+						obj->mm.madv = __I915_MADV_PURGED;
+				}
+				i915_gem_object_unlock(obj);
+			}
+		}
+
+		i915_gem_object_put(obj);
+
+		mutex_lock(&mem->objects.lock);
+
+		if (purged >= target) {
+			err = 0;
+			break;
+		}
+	}
+
+	list_splice_tail(&purgeable, &mem->objects.purgeable);
+	mutex_unlock(&mem->objects.lock);
+	return err;
+}
+
 static unsigned long
 i915_gem_shrinker_count(struct shrinker *shrinker, struct shrink_control *sc)
 {
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_shrinker.h b/drivers/gpu/drm/i915/gem/i915_gem_shrinker.h
index 8512470f6fd6..c945f3b587d6 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_shrinker.h
+++ b/drivers/gpu/drm/i915/gem/i915_gem_shrinker.h
@@ -7,10 +7,12 @@
 #define __I915_GEM_SHRINKER_H__
 
 #include <linux/bits.h>
+#include <linux/types.h>
 
 struct drm_i915_private;
 struct i915_gem_ww_ctx;
 struct mutex;
+struct intel_memory_region;
 
 /* i915_gem_shrinker.c */
 unsigned long i915_gem_shrink(struct i915_gem_ww_ctx *ww,
@@ -29,5 +31,7 @@ void i915_gem_driver_register__shrinker(struct drm_i915_private *i915);
 void i915_gem_driver_unregister__shrinker(struct drm_i915_private *i915);
 void i915_gem_shrinker_taints_mutex(struct drm_i915_private *i915,
 				    struct mutex *mutex);
+int i915_gem_shrink_memory_region(struct intel_memory_region *mem,
+				  resource_size_t target);
 
 #endif /* __I915_GEM_SHRINKER_H__ */
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 2662d679db6e..ef2124c17a7f 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -1104,6 +1104,23 @@ i915_gem_madvise_ioctl(struct drm_device *dev, void *data,
 	    !i915_gem_object_has_pages(obj))
 		i915_gem_object_truncate(obj);
 
+	if (obj->mm.region && i915_gem_object_has_pages(obj)) {
+		mutex_lock(&obj->mm.region->objects.lock);
+
+		switch (obj->mm.madv) {
+		case I915_MADV_WILLNEED:
+			list_move(&obj->mm.region_link,
+				  &obj->mm.region->objects.list);
+			break;
+		default:
+			list_move(&obj->mm.region_link,
+				  &obj->mm.region->objects.purgeable);
+			break;
+		}
+
+		mutex_unlock(&obj->mm.region->objects.lock);
+	}
+
 	args->retained = obj->mm.madv != __I915_MADV_PURGED;
 
 	i915_gem_object_unlock(obj);
diff --git a/drivers/gpu/drm/i915/intel_memory_region.c b/drivers/gpu/drm/i915/intel_memory_region.c
index b326993a1026..308f89b87834 100644
--- a/drivers/gpu/drm/i915/intel_memory_region.c
+++ b/drivers/gpu/drm/i915/intel_memory_region.c
@@ -97,7 +97,8 @@ __intel_memory_region_get_pages_buddy(struct intel_memory_region *mem,
 	do {
 		struct i915_buddy_block *block;
 		unsigned int order;
-
+		bool retry = true;
+retry:
 		order = fls(n_pages) - 1;
 		GEM_BUG_ON(order > mem->mm.max_order);
 		GEM_BUG_ON(order < min_order);
@@ -107,8 +108,25 @@ __intel_memory_region_get_pages_buddy(struct intel_memory_region *mem,
 			if (!IS_ERR(block))
 				break;
 
-			if (order-- == min_order)
-				goto err_free_blocks;
+			if (order-- == min_order) {
+				resource_size_t target;
+				int err;
+
+				if (!retry)
+					goto err_free_blocks;
+
+				target = n_pages * mem->mm.chunk_size;
+
+				mutex_unlock(&mem->mm_lock);
+				err = i915_gem_shrink_memory_region(mem,
+								    target);
+				mutex_lock(&mem->mm_lock);
+				if (err)
+					goto err_free_blocks;
+
+				retry = false;
+				goto retry;
+			}
 		} while (1);
 
 		n_pages -= BIT(order);
diff --git a/drivers/gpu/drm/i915/selftests/intel_memory_region.c b/drivers/gpu/drm/i915/selftests/intel_memory_region.c
index 9c20b7065fc5..84525ddba321 100644
--- a/drivers/gpu/drm/i915/selftests/intel_memory_region.c
+++ b/drivers/gpu/drm/i915/selftests/intel_memory_region.c
@@ -848,12 +848,88 @@ static int perf_memcpy(void *arg)
 	return 0;
 }
 
+static void igt_mark_evictable(struct drm_i915_gem_object *obj)
+{
+	i915_gem_object_unpin_pages(obj);
+	obj->mm.madv = I915_MADV_DONTNEED;
+	list_move(&obj->mm.region_link, &obj->mm.region->objects.purgeable);
+}
+
+static int igt_mock_shrink(void *arg)
+{
+	struct intel_memory_region *mem = arg;
+	struct drm_i915_gem_object *obj;
+	unsigned long n_objects;
+	LIST_HEAD(objects);
+	resource_size_t target;
+	resource_size_t total;
+	int err = 0;
+
+	target = mem->mm.chunk_size;
+	total = resource_size(&mem->region);
+	n_objects = total / target;
+
+	while (n_objects--) {
+		obj = i915_gem_object_create_region(mem,
+						    target,
+						    0);
+		if (IS_ERR(obj)) {
+			err = PTR_ERR(obj);
+			goto err_close_objects;
+		}
+
+		list_add(&obj->st_link, &objects);
+
+		err = i915_gem_object_pin_pages(obj);
+		if (err)
+			goto err_close_objects;
+
+		/*
+		 * Make half of the region evictable, though do so in a
+		 * horribly fragmented fashion.
+		 */
+		if (n_objects % 2)
+			igt_mark_evictable(obj);
+	}
+
+	while (target <= total / 2) {
+		obj = i915_gem_object_create_region(mem, target, 0);
+		if (IS_ERR(obj)) {
+			err = PTR_ERR(obj);
+			goto err_close_objects;
+		}
+
+		list_add(&obj->st_link, &objects);
+
+		/* Provoke the shrinker to start violently swinging its axe! */
+		err = i915_gem_object_pin_pages(obj);
+		if (err) {
+			pr_err("failed to shrink for target=%pa", &target);
+			goto err_close_objects;
+		}
+
+		/* Again, half of the region should remain evictable */
+		igt_mark_evictable(obj);
+
+		target <<= 1;
+	}
+
+err_close_objects:
+	close_objects(mem, &objects);
+
+	if (err == -ENOMEM)
+		err = 0;
+
+	return err;
+}
+
 int intel_memory_region_mock_selftests(void)
 {
 	static const struct i915_subtest tests[] = {
 		SUBTEST(igt_mock_fill),
 		SUBTEST(igt_mock_contiguous),
 		SUBTEST(igt_mock_splintered_region),
+		SUBTEST(igt_mock_shrink),
 	};
 	struct intel_memory_region *mem;
 	struct drm_i915_private *i915;
-- 
2.26.2

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply related	[flat|nested] 208+ messages in thread

* [RFC PATCH 086/162] drm/i915: Add blit functions that can be called from within a WW transaction
  2020-11-27 12:04 [RFC PATCH 000/162] DG1 + LMEM enabling Matthew Auld
                   ` (84 preceding siblings ...)
  2020-11-27 12:06 ` [RFC PATCH 085/162] drm/i915/region: support basic eviction Matthew Auld
@ 2020-11-27 12:06 ` Matthew Auld
  2020-11-27 12:06 ` [RFC PATCH 087/162] drm/i915: Delay publishing objects on the eviction lists Matthew Auld
                   ` (75 subsequent siblings)
  161 siblings, 0 replies; 208+ messages in thread
From: Matthew Auld @ 2020-11-27 12:06 UTC (permalink / raw)
  To: intel-gfx; +Cc: Thomas Hellström, dri-devel

From: Thomas Hellström <thomas.hellstrom@intel.com>

We want to be able to blit from within a ww transaction, so add
blit functions that are able to do that. Also take care to unlock the
blit batch-buffer after use so it isn't recycled locked.

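Callers already inside a WW transaction can now blit directly; a minimal
sketch using the for_i915_gem_ww() helper (equivalent to the open-coded
retry/backoff loop in the reworked wrappers below):

	for_i915_gem_ww(&ww, err, true) {
		err = i915_gem_object_lock(obj, &ww);
		if (err)
			continue;

		err = i915_gem_object_ww_fill_blt(obj, &ww, ce, value);
	}
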
Signed-off-by: Thomas Hellström <thomas.hellstrom@intel.com>
Cc: Matthew Auld <matthew.auld@intel.com>
---
 .../gpu/drm/i915/gem/i915_gem_object_blt.c    | 91 +++++++++++++------
 .../gpu/drm/i915/gem/i915_gem_object_blt.h    | 10 ++
 2 files changed, 72 insertions(+), 29 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object_blt.c b/drivers/gpu/drm/i915/gem/i915_gem_object_blt.c
index e0b873c3f46a..b41b076f6864 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_object_blt.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_object_blt.c
@@ -145,11 +145,11 @@ move_obj_to_gpu(struct drm_i915_gem_object *obj,
 	return i915_request_await_object(rq, obj, write);
 }
 
-int i915_gem_object_fill_blt(struct drm_i915_gem_object *obj,
-			     struct intel_context *ce,
-			     u32 value)
+int i915_gem_object_ww_fill_blt(struct drm_i915_gem_object *obj,
+				struct i915_gem_ww_ctx *ww,
+				struct intel_context *ce,
+				u32 value)
 {
-	struct i915_gem_ww_ctx ww;
 	struct i915_request *rq;
 	struct i915_vma *batch;
 	struct i915_vma *vma;
@@ -159,22 +159,16 @@ int i915_gem_object_fill_blt(struct drm_i915_gem_object *obj,
 	if (IS_ERR(vma))
 		return PTR_ERR(vma);
 
-	i915_gem_ww_ctx_init(&ww, true);
 	intel_engine_pm_get(ce->engine);
-retry:
-	err = i915_gem_object_lock(obj, &ww);
+	err = intel_context_pin_ww(ce, ww);
 	if (err)
 		goto out;
 
-	err = intel_context_pin_ww(ce, &ww);
-	if (err)
-		goto out;
-
-	err = i915_vma_pin_ww(vma, &ww, 0, 0, PIN_USER);
+	err = i915_vma_pin_ww(vma, ww, 0, 0, PIN_USER);
 	if (err)
 		goto out_ctx;
 
-	batch = intel_emit_vma_fill_blt(ce, vma, &ww, value);
+	batch = intel_emit_vma_fill_blt(ce, vma, ww, value);
 	if (IS_ERR(batch)) {
 		err = PTR_ERR(batch);
 		goto out_vma;
@@ -210,22 +204,43 @@ int i915_gem_object_fill_blt(struct drm_i915_gem_object *obj,
 
 	i915_request_add(rq);
 out_batch:
+	i915_gem_ww_unlock_single(batch->obj);
 	intel_emit_vma_release(ce, batch);
 out_vma:
 	i915_vma_unpin(vma);
 out_ctx:
 	intel_context_unpin(ce);
 out:
+	intel_engine_pm_put(ce->engine);
+	return err;
+}
+
+int i915_gem_object_fill_blt(struct drm_i915_gem_object *obj,
+				struct intel_context *ce,
+				u32 value)
+{
+	struct i915_gem_ww_ctx ww;
+	int err;
+
+	i915_gem_ww_ctx_init(&ww, true);
+retry:
+	err = i915_gem_object_lock(obj, &ww);
+	if (err)
+		goto out_err;
+
+	err = i915_gem_object_ww_fill_blt(obj, &ww, ce, value);
+out_err:
 	if (err == -EDEADLK) {
 		err = i915_gem_ww_ctx_backoff(&ww);
 		if (!err)
 			goto retry;
 	}
 	i915_gem_ww_ctx_fini(&ww);
-	intel_engine_pm_put(ce->engine);
+
 	return err;
 }
 
+
 /* Wa_1209644611:icl,ehl */
 static bool wa_1209644611_applies(struct drm_i915_private *i915, u32 size)
 {
@@ -354,13 +369,13 @@ struct i915_vma *intel_emit_vma_copy_blt(struct intel_context *ce,
 	return ERR_PTR(err);
 }
 
-int i915_gem_object_copy_blt(struct drm_i915_gem_object *src,
-			     struct drm_i915_gem_object *dst,
-			     struct intel_context *ce)
+int i915_gem_object_ww_copy_blt(struct drm_i915_gem_object *src,
+				struct drm_i915_gem_object *dst,
+				struct i915_gem_ww_ctx *ww,
+				struct intel_context *ce)
 {
 	struct i915_address_space *vm = ce->vm;
 	struct i915_vma *vma[2], *batch;
-	struct i915_gem_ww_ctx ww;
 	struct i915_request *rq;
 	int err, i;
 
@@ -372,26 +387,20 @@ int i915_gem_object_copy_blt(struct drm_i915_gem_object *src,
 	if (IS_ERR(vma[1]))
 		return PTR_ERR(vma[1]);
 
-	i915_gem_ww_ctx_init(&ww, true);
 	intel_engine_pm_get(ce->engine);
-retry:
-	err = i915_gem_object_lock(src, &ww);
-	if (!err)
-		err = i915_gem_object_lock(dst, &ww);
-	if (!err)
-		err = intel_context_pin_ww(ce, &ww);
+	err = intel_context_pin_ww(ce, ww);
 	if (err)
 		goto out;
 
-	err = i915_vma_pin_ww(vma[0], &ww, 0, 0, PIN_USER);
+	err = i915_vma_pin_ww(vma[0], ww, 0, 0, PIN_USER);
 	if (err)
 		goto out_ctx;
 
-	err = i915_vma_pin_ww(vma[1], &ww, 0, 0, PIN_USER);
+	err = i915_vma_pin_ww(vma[1], ww, 0, 0, PIN_USER);
 	if (unlikely(err))
 		goto out_unpin_src;
 
-	batch = intel_emit_vma_copy_blt(ce, &ww, vma[0], vma[1]);
+	batch = intel_emit_vma_copy_blt(ce, ww, vma[0], vma[1]);
 	if (IS_ERR(batch)) {
 		err = PTR_ERR(batch);
 		goto out_unpin_dst;
@@ -437,6 +446,7 @@ int i915_gem_object_copy_blt(struct drm_i915_gem_object *src,
 
 	i915_request_add(rq);
 out_batch:
+	i915_gem_ww_unlock_single(batch->obj);
 	intel_emit_vma_release(ce, batch);
 out_unpin_dst:
 	i915_vma_unpin(vma[1]);
@@ -445,13 +455,36 @@ int i915_gem_object_copy_blt(struct drm_i915_gem_object *src,
 out_ctx:
 	intel_context_unpin(ce);
 out:
+	intel_engine_pm_put(ce->engine);
+	return err;
+}
+
+int i915_gem_object_copy_blt(struct drm_i915_gem_object *src,
+			     struct drm_i915_gem_object *dst,
+			     struct intel_context *ce)
+{
+	struct i915_gem_ww_ctx ww;
+	int err;
+
+	i915_gem_ww_ctx_init(&ww, true);
+retry:
+	err = i915_gem_object_lock(src, &ww);
+	if (err)
+		goto out_err;
+
+	err = i915_gem_object_lock(dst, &ww);
+	if (err)
+		goto out_err;
+
+	err = i915_gem_object_ww_copy_blt(src, dst, &ww, ce);
+out_err:
 	if (err == -EDEADLK) {
 		err = i915_gem_ww_ctx_backoff(&ww);
 		if (!err)
 			goto retry;
 	}
 	i915_gem_ww_ctx_fini(&ww);
-	intel_engine_pm_put(ce->engine);
+
 	return err;
 }
 
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object_blt.h b/drivers/gpu/drm/i915/gem/i915_gem_object_blt.h
index 2409fdcccf0e..da3d66abde64 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_object_blt.h
+++ b/drivers/gpu/drm/i915/gem/i915_gem_object_blt.h
@@ -36,4 +36,14 @@ int i915_gem_object_copy_blt(struct drm_i915_gem_object *src,
 			     struct drm_i915_gem_object *dst,
 			     struct intel_context *ce);
 
+int i915_gem_object_ww_fill_blt(struct drm_i915_gem_object *obj,
+				struct i915_gem_ww_ctx *ww,
+				struct intel_context *ce,
+				u32 value);
+
+int i915_gem_object_ww_copy_blt(struct drm_i915_gem_object *src,
+				struct drm_i915_gem_object *dst,
+				struct i915_gem_ww_ctx *ww,
+				struct intel_context *ce);
+
 #endif
-- 
2.26.2

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply related	[flat|nested] 208+ messages in thread

* [RFC PATCH 087/162] drm/i915: Delay publishing objects on the eviction lists
  2020-11-27 12:04 [RFC PATCH 000/162] DG1 + LMEM enabling Matthew Auld
                   ` (85 preceding siblings ...)
  2020-11-27 12:06 ` [RFC PATCH 086/162] drm/i915: Add blit functions that can be called from within a WW transaction Matthew Auld
@ 2020-11-27 12:06 ` Matthew Auld
  2020-11-27 12:06 ` [RFC PATCH 088/162] drm/i915: support basic object migration Matthew Auld
                   ` (74 subsequent siblings)
  161 siblings, 0 replies; 208+ messages in thread
From: Matthew Auld @ 2020-11-27 12:06 UTC (permalink / raw)
  To: intel-gfx; +Cc: Thomas Hellström, dri-devel

From: Thomas Hellström <thomas.hellstrom@intel.com>

When an object is published on an eviction list, it's considered for
eviction and can be locked by other threads. This is not strictly
necessary until the object has pages. To limit eviction lookups that
need to discard the object and facilitate a longer period during
which we can lock the object isolated (trylock or ww lock without
chance of deadlock or interruption), delay eviction list publishing
until pages are set. Also take the object off the eviction lists when
pages are unset. Finally make sure that an object is either locked or
isolated when eviction list manipulation happens.

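In short (a summary of the hunks below, not new code):

  __i915_gem_object_set_pages():   object gains pages -> publish it on
	mem->objects.list or mem->objects.purgeable
  __i915_gem_object_unset_pages(): object loses pages -> list_del_init()
	it from the region list

Only in between can eviction find and lock the object; outside that
window it is isolated and can be (try)locked without risk of deadlock.
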
Signed-off-by: Thomas Hellström <thomas.hellstrom@intel.com>
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Matthew Auld <matthew.auld@intel.com>
---
 drivers/gpu/drm/i915/gem/i915_gem_object.c |  2 ++
 drivers/gpu/drm/i915/gem/i915_gem_pages.c  | 22 +++++++++++++++++++++-
 drivers/gpu/drm/i915/gem/i915_gem_region.c | 18 ++----------------
 3 files changed, 25 insertions(+), 17 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object.c b/drivers/gpu/drm/i915/gem/i915_gem_object.c
index 08d806bbf48e..5326b4b5a9f7 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_object.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_object.c
@@ -66,6 +66,7 @@ void i915_gem_object_init(struct drm_i915_gem_object *obj,
 	INIT_LIST_HEAD(&obj->vma.list);
 
 	INIT_LIST_HEAD(&obj->mm.link);
+	INIT_LIST_HEAD(&obj->mm.region_link);
 
 	INIT_LIST_HEAD(&obj->lut_list);
 	spin_lock_init(&obj->lut_lock);
@@ -79,6 +80,7 @@ void i915_gem_object_init(struct drm_i915_gem_object *obj,
 	GEM_BUG_ON(flags & ~I915_BO_ALLOC_FLAGS);
 	obj->flags = flags;
 
+	obj->mm.region = NULL;
 	obj->mm.madv = I915_MADV_WILLNEED;
 	INIT_RADIX_TREE(&obj->mm.get_page.radix, GFP_KERNEL | __GFP_NOWARN);
 	mutex_init(&obj->mm.get_page.lock);
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_pages.c b/drivers/gpu/drm/i915/gem/i915_gem_pages.c
index 4a8be759832b..eacad971b955 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_pages.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_pages.c
@@ -16,6 +16,8 @@ void __i915_gem_object_set_pages(struct drm_i915_gem_object *obj,
 {
 	struct drm_i915_private *i915 = to_i915(obj->base.dev);
 	unsigned long supported = INTEL_INFO(i915)->page_sizes;
+	struct intel_memory_region *mem;
+	struct list_head *list;
 	int i;
 
 	assert_object_held_shared(obj);
@@ -64,7 +66,6 @@ void __i915_gem_object_set_pages(struct drm_i915_gem_object *obj,
 	GEM_BUG_ON(!HAS_PAGE_SIZES(i915, obj->mm.page_sizes.sg));
 
 	if (i915_gem_object_is_shrinkable(obj)) {
-		struct list_head *list;
 		unsigned long flags;
 
 		assert_object_held(obj);
@@ -82,6 +83,18 @@ void __i915_gem_object_set_pages(struct drm_i915_gem_object *obj,
 		atomic_set(&obj->mm.shrink_pin, 0);
 		spin_unlock_irqrestore(&i915->mm.obj_lock, flags);
 	}
+
+	mem = obj->mm.region;
+	if (mem) {
+		mutex_lock(&mem->objects.lock);
+		GEM_WARN_ON(!list_empty(&obj->mm.region_link));
+		if (obj->mm.madv != I915_MADV_WILLNEED)
+			list = &mem->objects.purgeable;
+		else
+			list = &mem->objects.list;
+		list_move_tail(&obj->mm.region_link, list);
+		mutex_unlock(&mem->objects.lock);
+	}
 }
 
 int ____i915_gem_object_get_pages(struct drm_i915_gem_object *obj)
@@ -192,6 +205,7 @@ static void unmap_object(struct drm_i915_gem_object *obj, void *ptr)
 struct sg_table *
 __i915_gem_object_unset_pages(struct drm_i915_gem_object *obj)
 {
+	struct intel_memory_region *mem = obj->mm.region;
 	struct sg_table *pages;
 
 	assert_object_held_shared(obj);
@@ -205,6 +219,12 @@ __i915_gem_object_unset_pages(struct drm_i915_gem_object *obj)
 
 	i915_gem_object_make_unshrinkable(obj);
 
+	if (mem) {
+		mutex_lock(&mem->objects.lock);
+		list_del_init(&obj->mm.region_link);
+		mutex_unlock(&mem->objects.lock);
+	}
+
 	if (obj->mm.mapping) {
 		unmap_object(obj, page_mask_bits(obj->mm.mapping));
 		obj->mm.mapping = NULL;
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_region.c b/drivers/gpu/drm/i915/gem/i915_gem_region.c
index 6a96741253b3..58bf5f9e3199 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_region.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_region.c
@@ -105,30 +105,16 @@ void i915_gem_object_init_memory_region(struct drm_i915_gem_object *obj,
 					struct intel_memory_region *mem)
 {
 	INIT_LIST_HEAD(&obj->mm.blocks);
+	WARN_ON(i915_gem_object_has_pages(obj));
 	obj->mm.region = intel_memory_region_get(mem);
 
 	if (obj->base.size <= mem->min_page_size)
 		obj->flags |= I915_BO_ALLOC_CONTIGUOUS;
-
-	mutex_lock(&mem->objects.lock);
-
-	if (obj->flags & I915_BO_ALLOC_VOLATILE)
-		list_add(&obj->mm.region_link, &mem->objects.purgeable);
-	else
-		list_add(&obj->mm.region_link, &mem->objects.list);
-
-	mutex_unlock(&mem->objects.lock);
 }
 
 void i915_gem_object_release_memory_region(struct drm_i915_gem_object *obj)
 {
-	struct intel_memory_region *mem = obj->mm.region;
-
-	mutex_lock(&mem->objects.lock);
-	list_del(&obj->mm.region_link);
-	mutex_unlock(&mem->objects.lock);
-
-	intel_memory_region_put(mem);
+	intel_memory_region_put(obj->mm.region);
 }
 
 struct drm_i915_gem_object *
-- 
2.26.2

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply related	[flat|nested] 208+ messages in thread

* [RFC PATCH 088/162] drm/i915: support basic object migration
  2020-11-27 12:04 [RFC PATCH 000/162] DG1 + LMEM enabling Matthew Auld
                   ` (86 preceding siblings ...)
  2020-11-27 12:06 ` [RFC PATCH 087/162] drm/i915: Delay publishing objects on the eviction lists Matthew Auld
@ 2020-11-27 12:06 ` Matthew Auld
  2020-11-27 12:06 ` [RFC PATCH 089/162] drm/i915/dg1: Fix occasional migration error Matthew Auld
                   ` (73 subsequent siblings)
  161 siblings, 0 replies; 208+ messages in thread
From: Matthew Auld @ 2020-11-27 12:06 UTC (permalink / raw)
  To: intel-gfx
  Cc: Abdiel Janulgue, Tvrtko Ursulin, Sudeep Dutt, dri-devel, CQ Tang,
	Daniele Ceraolo Spurio, Prathap Kumar Valsan

We are going to want to be able to move objects between different
regions, like system memory and local memory. In the future everything
should be just another region.

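A minimal migration sketch (mirroring the selftests added below; obj and
ce stand in for the caller's object and blitter context):

	struct i915_gem_ww_ctx ww;
	int err;

	for_i915_gem_ww(&ww, err, true) {
		err = i915_gem_object_lock(obj, &ww);
		if (err)
			continue;

		/* Move the backing store over to system memory. */
		err = i915_gem_object_migrate(obj, &ww, ce, INTEL_REGION_SMEM);
	}
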
Cc: Abdiel Janulgue <abdiel.janulgue@linux.intel.com>
Cc: Sudeep Dutt <sudeep.dutt@intel.com>
Cc: Matthew Auld <matthew.auld@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Prathap Kumar Valsan <prathap.kumar.valsan@intel.com>
Cc: CQ Tang <cq.tang@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
---
 drivers/gpu/drm/i915/gem/i915_gem_mman.c      |  13 ++
 drivers/gpu/drm/i915/gem/i915_gem_mman.h      |   2 +
 drivers/gpu/drm/i915/gem/i915_gem_object.c    | 125 +++++++++++++
 drivers/gpu/drm/i915/gem/i915_gem_object.h    |   9 +
 drivers/gpu/drm/i915/gem/i915_gem_pages.c     |   2 +-
 .../drm/i915/selftests/intel_memory_region.c  | 174 +++++++++++++++++-
 6 files changed, 322 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_mman.c b/drivers/gpu/drm/i915/gem/i915_gem_mman.c
index 2561a2f1e54f..4e8a05c35252 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_mman.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_mman.c
@@ -546,6 +546,19 @@ void i915_gem_object_release_mmap_offset(struct drm_i915_gem_object *obj)
 	spin_unlock(&obj->mmo.lock);
 }
 
+/**
+ * i915_gem_object_release_mmap - remove physical page mappings
+ * @obj: obj in question
+ *
+ * Preserve the reservation of the mmapping with the DRM core code, but
+ * relinquish ownership of the pages back to the system.
+ */
+void i915_gem_object_release_mmap(struct drm_i915_gem_object *obj)
+{
+	i915_gem_object_release_mmap_gtt(obj);
+	i915_gem_object_release_mmap_offset(obj);
+}
+
 static struct i915_mmap_offset *
 lookup_mmo(struct drm_i915_gem_object *obj,
 	   enum i915_mmap_type mmap_type)
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_mman.h b/drivers/gpu/drm/i915/gem/i915_gem_mman.h
index efee9e0d2508..7c5ccdf59359 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_mman.h
+++ b/drivers/gpu/drm/i915/gem/i915_gem_mman.h
@@ -24,6 +24,8 @@ int i915_gem_dumb_mmap_offset(struct drm_file *file_priv,
 			      struct drm_device *dev,
 			      u32 handle, u64 *offset);
 
+void i915_gem_object_release_mmap(struct drm_i915_gem_object *obj);
+
 void __i915_gem_object_release_mmap_gtt(struct drm_i915_gem_object *obj);
 void i915_gem_object_release_mmap_gtt(struct drm_i915_gem_object *obj);
 
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object.c b/drivers/gpu/drm/i915/gem/i915_gem_object.c
index 5326b4b5a9f7..7ff430503497 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_object.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_object.c
@@ -26,11 +26,14 @@
 
 #include "display/intel_frontbuffer.h"
 #include "gt/intel_gt.h"
+#include "gt/intel_gt_requests.h"
 #include "i915_drv.h"
 #include "i915_gem_clflush.h"
 #include "i915_gem_context.h"
 #include "i915_gem_mman.h"
 #include "i915_gem_object.h"
+#include "i915_gem_object_blt.h"
+#include "i915_gem_region.h"
 #include "i915_globals.h"
 #include "i915_trace.h"
 
@@ -311,6 +314,128 @@ static void i915_gem_free_object(struct drm_gem_object *gem_obj)
 		queue_work(i915->wq, &i915->mm.free_work);
 }
 
+int i915_gem_object_prepare_move(struct drm_i915_gem_object *obj)
+{
+	int err;
+
+	assert_object_held(obj);
+
+	if (obj->mm.madv != I915_MADV_WILLNEED)
+		return -EINVAL;
+
+	if (i915_gem_object_needs_bit17_swizzle(obj))
+		return -EINVAL;
+
+	if (i915_gem_object_is_framebuffer(obj))
+		return -EBUSY;
+
+	i915_gem_object_release_mmap(obj);
+
+	GEM_BUG_ON(obj->mm.mapping);
+	GEM_BUG_ON(obj->base.filp && mapping_mapped(obj->base.filp->f_mapping));
+
+	err = i915_gem_object_wait(obj,
+				   I915_WAIT_INTERRUPTIBLE |
+				   I915_WAIT_ALL,
+				   MAX_SCHEDULE_TIMEOUT);
+	if (err)
+		return err;
+
+	return i915_gem_object_unbind(obj,
+				      I915_GEM_OBJECT_UNBIND_ACTIVE);
+}
+
+int i915_gem_object_migrate(struct drm_i915_gem_object *obj,
+			    struct i915_gem_ww_ctx *ww,
+			    struct intel_context *ce,
+			    enum intel_region_id id)
+{
+	struct drm_i915_private *i915 = to_i915(obj->base.dev);
+	struct drm_i915_gem_object *donor;
+	struct intel_memory_region *mem;
+	struct sg_table *pages = NULL;
+	unsigned int page_sizes;
+	int err = 0;
+
+	assert_object_held(obj);
+	GEM_BUG_ON(id >= INTEL_REGION_UNKNOWN);
+	GEM_BUG_ON(obj->mm.madv != I915_MADV_WILLNEED);
+	if (obj->mm.region->id == id)
+		return 0;
+
+	mem = i915->mm.regions[id];
+
+	donor = i915_gem_object_create_region(mem, obj->base.size, 0);
+	if (IS_ERR(donor)) {
+		err = PTR_ERR(donor);
+		return err;
+	}
+
+	err = i915_gem_object_lock(donor, ww);
+	if (err)
+		goto err_put_donor;
+
+	/* Copy backing-pages if we have to */
+	if (i915_gem_object_has_pages(obj) ||
+	    obj->base.filp) {
+		err = i915_gem_object_ww_copy_blt(obj, donor, ww, ce);
+		if (err)
+			goto unlock_donor;
+	}
+
+	err = i915_gem_object_set_to_cpu_domain(donor, false);
+	if (err)
+		goto unlock_donor;
+
+	intel_gt_retire_requests(&i915->gt);
+
+	i915_gem_object_unbind(donor, 0);
+	err = i915_gem_object_unbind(obj, 0);
+	if (err)
+		goto unlock_donor;
+
+	pages = __i915_gem_object_unset_pages(obj);
+	if (pages)
+		obj->ops->put_pages(obj, pages);
+
+	page_sizes = donor->mm.page_sizes.phys;
+	pages = __i915_gem_object_unset_pages(donor);
+
+	if (obj->ops->release)
+		obj->ops->release(obj);
+
+	/* We still need a little special casing for shmem */
+	if (obj->base.filp)
+		fput(fetch_and_zero(&obj->base.filp));
+	else if (donor->base.filp) {
+		atomic_long_inc(&donor->base.filp->f_count);
+		obj->base.filp = donor->base.filp;
+	}
+
+	obj->base.size = donor->base.size;
+	obj->mm.region = intel_memory_region_get(mem);
+	obj->flags = donor->flags;
+	obj->ops = donor->ops;
+	obj->cache_level = donor->cache_level;
+	obj->cache_coherent = donor->cache_coherent;
+	obj->cache_dirty = donor->cache_dirty;
+
+	list_replace_init(&donor->mm.blocks, &obj->mm.blocks);
+
+	/* set pages after migrated */
+	if (pages)
+		__i915_gem_object_set_pages(obj, pages, page_sizes);
+
+	GEM_BUG_ON(i915_gem_object_has_pages(donor));
+	GEM_BUG_ON(i915_gem_object_has_pinned_pages(donor));
+unlock_donor:
+	i915_gem_ww_unlock_single(donor);
+err_put_donor:
+	i915_gem_object_put(donor);
+
+	return err;
+}
+
 static bool gpu_write_needs_clflush(struct drm_i915_gem_object *obj)
 {
 	return !(obj->cache_level == I915_CACHE_NONE ||
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object.h b/drivers/gpu/drm/i915/gem/i915_gem_object.h
index c6c7ab181a65..1a1aa71a4494 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_object.h
+++ b/drivers/gpu/drm/i915/gem/i915_gem_object.h
@@ -51,8 +51,17 @@ void i915_gem_object_put_pages_phys(struct drm_i915_gem_object *obj,
 				    struct sg_table *pages);
 
 
+enum intel_region_id;
+int i915_gem_object_prepare_move(struct drm_i915_gem_object *obj);
+int i915_gem_object_migrate(struct drm_i915_gem_object *obj,
+			    struct i915_gem_ww_ctx *ww,
+			    struct intel_context *ce,
+			    enum intel_region_id id);
+
 void i915_gem_flush_free_objects(struct drm_i915_private *i915);
 
+void __i915_gem_object_reset_page_iter(struct drm_i915_gem_object *obj);
+
 struct sg_table *
 __i915_gem_object_unset_pages(struct drm_i915_gem_object *obj);
 void i915_gem_object_truncate(struct drm_i915_gem_object *obj);
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_pages.c b/drivers/gpu/drm/i915/gem/i915_gem_pages.c
index eacad971b955..2cdb7cf63383 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_pages.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_pages.c
@@ -183,7 +183,7 @@ void i915_gem_object_writeback(struct drm_i915_gem_object *obj)
 		obj->ops->writeback(obj);
 }
 
-static void __i915_gem_object_reset_page_iter(struct drm_i915_gem_object *obj)
+void __i915_gem_object_reset_page_iter(struct drm_i915_gem_object *obj)
 {
 	struct radix_tree_iter iter;
 	void __rcu **slot;
diff --git a/drivers/gpu/drm/i915/selftests/intel_memory_region.c b/drivers/gpu/drm/i915/selftests/intel_memory_region.c
index 84525ddba321..7acb94e0e5fe 100644
--- a/drivers/gpu/drm/i915/selftests/intel_memory_region.c
+++ b/drivers/gpu/drm/i915/selftests/intel_memory_region.c
@@ -14,6 +14,7 @@
 
 #include "gem/i915_gem_context.h"
 #include "gem/i915_gem_lmem.h"
+#include "gem/i915_gem_object_blt.h"
 #include "gem/i915_gem_region.h"
 #include "gem/i915_gem_object_blt.h"
 #include "gem/selftests/igt_gem_utils.h"
@@ -476,6 +477,71 @@ static int igt_lmem_create(void *arg)
 	return err;
 }
 
+static int igt_smem_create_migrate(void *arg)
+{
+	struct drm_i915_private *i915 = arg;
+	struct intel_context *ce = i915->gt.engine[BCS0]->kernel_context;
+	struct drm_i915_gem_object *obj;
+	struct i915_gem_ww_ctx ww;
+	int err = 0;
+
+	/* Switch object backing-store on create */
+	obj = i915_gem_object_create_lmem(i915, PAGE_SIZE, 0);
+	if (IS_ERR(obj))
+		return PTR_ERR(obj);
+
+	for_i915_gem_ww(&ww, err, true) {
+		err = i915_gem_object_lock(obj, &ww);
+		if (err)
+			continue;
+
+		err = i915_gem_object_migrate(obj, &ww, ce, INTEL_REGION_SMEM);
+		if (err)
+			continue;
+
+		err = i915_gem_object_pin_pages(obj);
+		if (err)
+			continue;
+
+		i915_gem_object_unpin_pages(obj);
+	}
+	i915_gem_object_put(obj);
+
+	return err;
+}
+
+static int igt_lmem_create_migrate(void *arg)
+{
+	struct drm_i915_private *i915 = arg;
+	struct intel_context *ce = i915->gt.engine[BCS0]->kernel_context;
+	struct drm_i915_gem_object *obj;
+	struct i915_gem_ww_ctx ww;
+	int err = 0;
+
+	/* Switch object backing-store on create */
+	obj = i915_gem_object_create_shmem(i915, PAGE_SIZE);
+	if (IS_ERR(obj))
+		return PTR_ERR(obj);
+
+	for_i915_gem_ww(&ww, err, true) {
+		err = i915_gem_object_lock(obj, &ww);
+		if (err)
+			continue;
+
+		err = i915_gem_object_migrate(obj, &ww, ce, INTEL_REGION_LMEM);
+		if (err)
+			continue;
+
+		err = i915_gem_object_pin_pages(obj);
+		if (err)
+			continue;
+
+		i915_gem_object_unpin_pages(obj);
+	}
+	i915_gem_object_put(obj);
+
+	return err;
+}
 static int igt_lmem_write_gpu(void *arg)
 {
 	struct drm_i915_private *i915 = arg;
@@ -880,7 +946,7 @@ static int igt_mock_shrink(void *arg)
 
 		list_add(&obj->st_link, &objects);
 
-		err = i915_gem_object_pin_pages(obj);
+		err = i915_gem_object_pin_pages_unlocked(obj);
 		if (err)
 			goto err_close_objects;
 
@@ -902,7 +968,7 @@ static int igt_mock_shrink(void *arg)
 		list_add(&obj->st_link, &objects);
 
 		/* Provoke the shrinker to start violently swinging its axe! */
-		err = i915_gem_object_pin_pages(obj);
+		err = i915_gem_object_pin_pages_unlocked(obj);
 		if (err) {
 			pr_err("failed to shrink for target=%pa", &target);
 			goto err_close_objects;
@@ -923,6 +989,107 @@ static int igt_mock_shrink(void *arg)
 	return err;
 }
 
+static int lmem_pages_migrate_one(struct i915_gem_ww_ctx *ww,
+				  struct intel_context *ce,
+				  struct drm_i915_gem_object *obj)
+{
+	int err;
+
+	err = i915_gem_object_lock(obj, ww);
+	if (err)
+		return err;
+
+	err = i915_gem_object_wait(obj,
+				   I915_WAIT_INTERRUPTIBLE |
+				   I915_WAIT_PRIORITY |
+				   I915_WAIT_ALL,
+				   MAX_SCHEDULE_TIMEOUT);
+	if (err)
+		return err;
+
+	err = i915_gem_object_prepare_move(obj);
+	if (err)
+		return err;
+
+	if (i915_gem_object_is_lmem(obj)) {
+		err = i915_gem_object_migrate(obj, ww, ce, INTEL_REGION_SMEM);
+		if (err)
+			return err;
+
+		if (i915_gem_object_is_lmem(obj)) {
+			pr_err("object still backed by lmem\n");
+			err = -EINVAL;
+		}
+
+		if (!list_empty(&obj->mm.blocks)) {
+			pr_err("object leaking memory region\n");
+			err = -EINVAL;
+		}
+
+		if (!i915_gem_object_has_struct_page(obj)) {
+			pr_err("object not backed by struct page\n");
+			err = -EINVAL;
+		}
+
+	} else {
+		err = i915_gem_object_migrate(obj, ww, ce, INTEL_REGION_LMEM);
+		if (err)
+			return err;
+
+		if (i915_gem_object_has_struct_page(obj)) {
+			pr_err("object still backed by struct page\n");
+			err = -EINVAL;
+		}
+
+		if (!i915_gem_object_is_lmem(obj)) {
+			pr_err("object not backed by lmem\n");
+			err = -EINVAL;
+		}
+	}
+
+	return err;
+}
+
+static int igt_lmem_pages_migrate(void *arg)
+{
+	struct drm_i915_private *i915 = arg;
+	struct drm_i915_gem_object *obj;
+	struct intel_context *ce;
+	struct i915_gem_ww_ctx ww;
+	int err;
+	int i;
+
+	if (!HAS_ENGINE(&i915->gt, BCS0))
+		return 0;
+
+	ce = i915->gt.engine[BCS0]->kernel_context;
+
+	/* From LMEM to shmem and back again */
+
+	obj = i915_gem_object_create_lmem(i915, SZ_2M, 0);
+	if (IS_ERR(obj))
+		return PTR_ERR(obj);
+
+	err = i915_gem_object_fill_blt(obj, ce, 0);
+	if (err)
+		goto out_put;
+
+	for (i = 1; i <= 4; ++i) {
+		for_i915_gem_ww(&ww, err, true)
+			err = lmem_pages_migrate_one(&ww, ce, obj);
+		if (err)
+			break;
+
+		err = i915_gem_object_fill_blt(obj, ce, 0xdeadbeaf);
+		if (err)
+			break;
+	}
+out_put:
+	i915_gem_object_put(obj);
+
+	return err;
+}
+
 int intel_memory_region_mock_selftests(void)
 {
 	static const struct i915_subtest tests[] = {
@@ -960,6 +1127,9 @@ int intel_memory_region_live_selftests(struct drm_i915_private *i915)
 		SUBTEST(igt_lmem_create),
 		SUBTEST(igt_lmem_write_cpu),
 		SUBTEST(igt_lmem_write_gpu),
+		SUBTEST(igt_smem_create_migrate),
+		SUBTEST(igt_lmem_create_migrate),
+		SUBTEST(igt_lmem_pages_migrate),
 	};
 
 	if (!HAS_LMEM(i915)) {
-- 
2.26.2

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply related	[flat|nested] 208+ messages in thread

* [RFC PATCH 089/162] drm/i915/dg1: Fix occasional migration error
  2020-11-27 12:04 [RFC PATCH 000/162] DG1 + LMEM enabling Matthew Auld
                   ` (87 preceding siblings ...)
  2020-11-27 12:06 ` [RFC PATCH 088/162] drm/i915: support basic object migration Matthew Auld
@ 2020-11-27 12:06 ` Matthew Auld
  2020-11-27 12:06 ` [RFC PATCH 090/162] drm/i915/query: Expose memory regions through the query uAPI Matthew Auld
                   ` (72 subsequent siblings)
  161 siblings, 0 replies; 208+ messages in thread
From: Matthew Auld @ 2020-11-27 12:06 UTC (permalink / raw)
  To: intel-gfx; +Cc: CQ Tang, Sudeep Dutt, dri-devel

From: CQ Tang <cq.tang@intel.com>

We post a blitter copy operation and then call
i915_gem_object_set_to_cpu_domain(), which internally calls
i915_gem_object_wait() with the interruptible flag. Sometimes this
wait gets interrupted by the blitter-copy-complete interrupt, which
makes the migration operation fail. So before calling
i915_gem_object_set_to_cpu_domain(), call i915_gem_object_wait() with
the non-interruptible flag to wait for the blitter operation to finish.

Signed-off-by: CQ Tang <cq.tang@intel.com>
Cc: Matthew Auld <matthew.auld@intel.com>
Cc: Sudeep Dutt <sudeep.dutt@intel.com>
Cc: Ramalingam C <ramalingam.c@intel.com>
---
 drivers/gpu/drm/i915/gem/i915_gem_object.c | 11 +++++++++++
 1 file changed, 11 insertions(+)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object.c b/drivers/gpu/drm/i915/gem/i915_gem_object.c
index 7ff430503497..49935245a4a8 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_object.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_object.c
@@ -381,6 +381,17 @@ int i915_gem_object_migrate(struct drm_i915_gem_object *obj,
 		err = i915_gem_object_ww_copy_blt(obj, donor, ww, ce);
 		if (err)
 			goto unlock_donor;
+
+		/*
+		 * Occasionally the i915_gem_object_wait() called inside
+		 * i915_gem_object_set_to_cpu_domain() gets interrupted and
+		 * returns -ERESTARTSYS, which makes the migration operation
+		 * fail. So add a non-interruptible wait before changing
+		 * the object domain.
+		 */
+		err = i915_gem_object_wait(donor, 0, MAX_SCHEDULE_TIMEOUT);
+		if (err)
+			goto unlock_donor;
 	}
 
 	err = i915_gem_object_set_to_cpu_domain(donor, false);
-- 
2.26.2

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply related	[flat|nested] 208+ messages in thread

* [RFC PATCH 090/162] drm/i915/query: Expose memory regions through the query uAPI
  2020-11-27 12:04 [RFC PATCH 000/162] DG1 + LMEM enabling Matthew Auld
                   ` (88 preceding siblings ...)
  2020-11-27 12:06 ` [RFC PATCH 089/162] drm/i915/dg1: Fix occasional migration error Matthew Auld
@ 2020-11-27 12:06 ` Matthew Auld
  2020-11-27 12:06 ` [RFC PATCH 091/162] drm/i915: Store gt in memory region Matthew Auld
                   ` (71 subsequent siblings)
  161 siblings, 0 replies; 208+ messages in thread
From: Matthew Auld @ 2020-11-27 12:06 UTC (permalink / raw)
  To: intel-gfx; +Cc: Abdiel Janulgue, dri-devel

From: Abdiel Janulgue <abdiel.janulgue@linux.intel.com>

Return the available memory regions supported by the HW.

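Userspace consumes this with the usual two-pass query; a sketch (assuming
the query id added in the i915_drm.h hunk below is named
DRM_I915_QUERY_MEMORY_REGIONS):

	struct drm_i915_query_item item = {
		.query_id = DRM_I915_QUERY_MEMORY_REGIONS,
	};
	struct drm_i915_query q = {
		.num_items = 1,
		.items_ptr = (uintptr_t)&item,
	};
	struct drm_i915_query_memory_regions *regions;

	ioctl(fd, DRM_IOCTL_I915_QUERY, &q);	/* 1st pass: fills item.length */
	regions = calloc(1, item.length);
	item.data_ptr = (uintptr_t)regions;
	ioctl(fd, DRM_IOCTL_I915_QUERY, &q);	/* 2nd pass: fills the data */
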
Signed-off-by: Abdiel Janulgue <abdiel.janulgue@linux.intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
---
 drivers/gpu/drm/i915/gem/i915_gem_stolen.c | 12 ++++-
 drivers/gpu/drm/i915/gem/i915_gem_stolen.h |  3 ++
 drivers/gpu/drm/i915/i915_drv.c            |  2 +-
 drivers/gpu/drm/i915/i915_pci.c            |  2 +-
 drivers/gpu/drm/i915/i915_query.c          | 62 ++++++++++++++++++++++
 drivers/gpu/drm/i915/intel_memory_region.c | 32 ++++++-----
 drivers/gpu/drm/i915/intel_memory_region.h | 38 +++++++------
 include/uapi/drm/i915_drm.h                | 58 ++++++++++++++++++++
 8 files changed, 172 insertions(+), 37 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_stolen.c b/drivers/gpu/drm/i915/gem/i915_gem_stolen.c
index ce9086d3a647..25e3cc53316e 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_stolen.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_stolen.c
@@ -704,11 +704,19 @@ _i915_gem_object_create_stolen(struct intel_memory_region *mem,
 	return obj;
 }
 
+struct intel_memory_region *i915_stolen_region(struct drm_i915_private *i915)
+{
+	if (HAS_LMEM(i915))
+		return i915->mm.regions[INTEL_REGION_STOLEN_LMEM];
+
+	return i915->mm.regions[INTEL_REGION_STOLEN_SMEM];
+}
+
 struct drm_i915_gem_object *
 i915_gem_object_create_stolen(struct drm_i915_private *i915,
 			      resource_size_t size)
 {
-	return i915_gem_object_create_region(i915->mm.regions[INTEL_REGION_STOLEN],
+	return i915_gem_object_create_region(i915_stolen_region(i915),
 					     size, I915_BO_ALLOC_CONTIGUOUS);
 }
 
@@ -748,7 +756,7 @@ i915_gem_object_create_stolen_for_preallocated(struct drm_i915_private *i915,
 					       resource_size_t stolen_offset,
 					       resource_size_t size)
 {
-	struct intel_memory_region *mem = i915->mm.regions[INTEL_REGION_STOLEN];
+	struct intel_memory_region *mem = i915_stolen_region(i915);
 	struct drm_i915_gem_object *obj;
 	struct drm_mm_node *stolen;
 	int ret;
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_stolen.h b/drivers/gpu/drm/i915/gem/i915_gem_stolen.h
index 61e028063f9f..67f6264f3ff9 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_stolen.h
+++ b/drivers/gpu/drm/i915/gem/i915_gem_stolen.h
@@ -22,6 +22,9 @@ int i915_gem_stolen_insert_node_in_range(struct drm_i915_private *dev_priv,
 void i915_gem_stolen_remove_node(struct drm_i915_private *dev_priv,
 				 struct drm_mm_node *node);
 struct intel_memory_region *i915_gem_stolen_setup(struct drm_i915_private *i915);
+
+struct intel_memory_region *i915_stolen_region(struct drm_i915_private *i915);
+
 struct drm_i915_gem_object *
 i915_gem_object_create_stolen(struct drm_i915_private *dev_priv,
 			      resource_size_t size);
diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
index 320856b665a1..07b3a89ec09e 100644
--- a/drivers/gpu/drm/i915/i915_drv.c
+++ b/drivers/gpu/drm/i915/i915_drv.c
@@ -843,7 +843,7 @@ int i915_driver_probe(struct pci_dev *pdev, const struct pci_device_id *ent)
 		if (INTEL_GEN(i915) >= 9 && i915_selftest.live < 0 &&
 		    i915->params.fake_lmem_start) {
 			mkwrite_device_info(i915)->memory_regions =
-				REGION_SMEM | REGION_LMEM | REGION_STOLEN;
+				REGION_SMEM | REGION_LMEM | REGION_STOLEN_SMEM;
 			GEM_BUG_ON(!HAS_LMEM(i915));
 		}
 	}
diff --git a/drivers/gpu/drm/i915/i915_pci.c b/drivers/gpu/drm/i915/i915_pci.c
index 11fe790b1969..8243178a56f9 100644
--- a/drivers/gpu/drm/i915/i915_pci.c
+++ b/drivers/gpu/drm/i915/i915_pci.c
@@ -154,7 +154,7 @@
 	.page_sizes = I915_GTT_PAGE_SIZE_4K
 
 #define GEN_DEFAULT_REGIONS \
-	.memory_regions = REGION_SMEM | REGION_STOLEN
+	.memory_regions = REGION_SMEM | REGION_STOLEN_SMEM
 
 #define I830_FEATURES \
 	GEN(2), \
diff --git a/drivers/gpu/drm/i915/i915_query.c b/drivers/gpu/drm/i915/i915_query.c
index fed337ad7b68..d4ca040c528b 100644
--- a/drivers/gpu/drm/i915/i915_query.c
+++ b/drivers/gpu/drm/i915/i915_query.c
@@ -419,11 +419,73 @@ static int query_perf_config(struct drm_i915_private *i915,
 	}
 }
 
+static int query_memregion_info(struct drm_i915_private *dev_priv,
+				struct drm_i915_query_item *query_item)
+{
+	struct drm_i915_query_memory_regions __user *query_ptr =
+		u64_to_user_ptr(query_item->data_ptr);
+	struct drm_i915_memory_region_info __user *info_ptr =
+		&query_ptr->regions[0];
+	struct drm_i915_memory_region_info info = { };
+	struct drm_i915_query_memory_regions query;
+	u32 total_length;
+	int ret, i;
+
+	if (query_item->flags != 0)
+		return -EINVAL;
+
+	total_length = sizeof(query);
+	for (i = 0; i < ARRAY_SIZE(dev_priv->mm.regions); ++i) {
+		struct intel_memory_region *region = dev_priv->mm.regions[i];
+
+		if (!region)
+			continue;
+
+		total_length += sizeof(info);
+	}
+
+	ret = copy_query_item(&query, sizeof(query), total_length, query_item);
+	if (ret != 0)
+		return ret;
+
+	if (query.num_regions)
+		return -EINVAL;
+
+	for (i = 0; i < ARRAY_SIZE(query.rsvd); ++i) {
+		if (query.rsvd[i])
+			return -EINVAL;
+	}
+
+	for (i = 0; i < ARRAY_SIZE(dev_priv->mm.regions); ++i) {
+		struct intel_memory_region *region = dev_priv->mm.regions[i];
+
+		if (!region)
+			continue;
+
+		info.region.memory_class = region->type;
+		info.region.memory_instance = region->instance;
+		info.probed_size = region->total;
+		info.unallocated_size = region->avail;
+
+		if (__copy_to_user(info_ptr, &info, sizeof(info)))
+			return -EFAULT;
+
+		query.num_regions++;
+		info_ptr++;
+	}
+
+	if (__copy_to_user(query_ptr, &query, sizeof(query)))
+		return -EFAULT;
+
+	return total_length;
+}
+
 static int (* const i915_query_funcs[])(struct drm_i915_private *dev_priv,
 					struct drm_i915_query_item *query_item) = {
 	query_topology_info,
 	query_engine_info,
 	query_perf_config,
+	query_memregion_info,
 };
 
 int i915_query_ioctl(struct drm_device *dev, void *data, struct drm_file *file)
diff --git a/drivers/gpu/drm/i915/intel_memory_region.c b/drivers/gpu/drm/i915/intel_memory_region.c
index 308f89b87834..dca1e367ab98 100644
--- a/drivers/gpu/drm/i915/intel_memory_region.c
+++ b/drivers/gpu/drm/i915/intel_memory_region.c
@@ -6,14 +6,19 @@
 #include "intel_memory_region.h"
 #include "i915_drv.h"
 
-/* XXX: Hysterical raisins. BIT(inst) needs to just be (inst) at some point. */
-#define REGION_MAP(type, inst) \
-	BIT((type) + INTEL_MEMORY_TYPE_SHIFT) | BIT(inst)
-
-const u32 intel_region_map[] = {
-	[INTEL_REGION_SMEM] = REGION_MAP(INTEL_MEMORY_SYSTEM, 0),
-	[INTEL_REGION_LMEM] = REGION_MAP(INTEL_MEMORY_LOCAL, 0),
-	[INTEL_REGION_STOLEN] = REGION_MAP(INTEL_MEMORY_STOLEN, 0),
+const struct intel_memory_region_info intel_region_map[] = {
+	[INTEL_REGION_SMEM] = {
+		.class = INTEL_MEMORY_SYSTEM,
+		.instance = 0,
+	},
+	[INTEL_REGION_LMEM] = {
+		.class = INTEL_MEMORY_LOCAL,
+		.instance = 0,
+	},
+	[INTEL_REGION_STOLEN_SMEM] = {
+		.class = INTEL_MEMORY_STOLEN_SYSTEM,
+		.instance = 0,
+	},
 };
 
 struct intel_memory_region *
@@ -263,17 +268,18 @@ int intel_memory_regions_hw_probe(struct drm_i915_private *i915)
 
 	for (i = 0; i < ARRAY_SIZE(i915->mm.regions); i++) {
 		struct intel_memory_region *mem = ERR_PTR(-ENODEV);
-		u32 type;
+		u16 type, instance;
 
 		if (!HAS_REGION(i915, BIT(i)))
 			continue;
 
-		type = MEMORY_TYPE_FROM_REGION(intel_region_map[i]);
+		type = intel_region_map[i].class;
+		instance = intel_region_map[i].instance;
 		switch (type) {
 		case INTEL_MEMORY_SYSTEM:
 			mem = i915_gem_shmem_setup(i915);
 			break;
-		case INTEL_MEMORY_STOLEN:
+		case INTEL_MEMORY_STOLEN_SYSTEM:
 			mem = i915_gem_stolen_setup(i915);
 			break;
 		case INTEL_MEMORY_LOCAL:
@@ -289,9 +295,9 @@ int intel_memory_regions_hw_probe(struct drm_i915_private *i915)
 			goto out_cleanup;
 		}
 
-		mem->id = intel_region_map[i];
+		mem->id = i;
 		mem->type = type;
-		mem->instance = MEMORY_INSTANCE_FROM_REGION(intel_region_map[i]);
+		mem->instance = instance;
 
 		i915->mm.regions[i] = mem;
 	}
diff --git a/drivers/gpu/drm/i915/intel_memory_region.h b/drivers/gpu/drm/i915/intel_memory_region.h
index 232490d89a83..c047cf7c5e7c 100644
--- a/drivers/gpu/drm/i915/intel_memory_region.h
+++ b/drivers/gpu/drm/i915/intel_memory_region.h
@@ -11,6 +11,7 @@
 #include <linux/mutex.h>
 #include <linux/io-mapping.h>
 #include <drm/drm_mm.h>
+#include <drm/i915_drm.h>
 
 #include "i915_buddy.h"
 
@@ -19,30 +20,25 @@ struct drm_i915_gem_object;
 struct intel_memory_region;
 struct sg_table;
 
-/**
- *  Base memory type
- */
 enum intel_memory_type {
-	INTEL_MEMORY_SYSTEM = 0,
-	INTEL_MEMORY_LOCAL,
-	INTEL_MEMORY_STOLEN,
+	INTEL_MEMORY_SYSTEM = I915_MEMORY_CLASS_SYSTEM,
+	INTEL_MEMORY_LOCAL = I915_MEMORY_CLASS_DEVICE,
+	INTEL_MEMORY_STOLEN_SYSTEM = I915_MEMORY_CLASS_STOLEN_SYSTEM,
+	INTEL_MEMORY_STOLEN_LOCAL = I915_MEMORY_CLASS_STOLEN_DEVICE,
 };
 
 enum intel_region_id {
 	INTEL_REGION_SMEM = 0,
 	INTEL_REGION_LMEM,
-	INTEL_REGION_STOLEN,
+	INTEL_REGION_STOLEN_SMEM,
+	INTEL_REGION_STOLEN_LMEM,
 	INTEL_REGION_UNKNOWN, /* Should be last */
 };
 
 #define REGION_SMEM     BIT(INTEL_REGION_SMEM)
 #define REGION_LMEM     BIT(INTEL_REGION_LMEM)
-#define REGION_STOLEN   BIT(INTEL_REGION_STOLEN)
-
-#define INTEL_MEMORY_TYPE_SHIFT 16
-
-#define MEMORY_TYPE_FROM_REGION(r) (ilog2((r) >> INTEL_MEMORY_TYPE_SHIFT))
-#define MEMORY_INSTANCE_FROM_REGION(r) (ilog2((r) & 0xffff))
+#define REGION_STOLEN_SMEM   BIT(INTEL_REGION_STOLEN_SMEM)
+#define REGION_STOLEN_LMEM   BIT(INTEL_REGION_STOLEN_LMEM)
 
 #define I915_ALLOC_MIN_PAGE_SIZE  BIT(0)
 #define I915_ALLOC_CONTIGUOUS     BIT(1)
@@ -51,10 +47,12 @@ enum intel_region_id {
 	for (id = 0; id < ARRAY_SIZE((i915)->mm.regions); id++) \
 		for_each_if((mr) = (i915)->mm.regions[id])
 
-/**
- * Memory regions encoded as type | instance
- */
-extern const u32 intel_region_map[];
+struct intel_memory_region_info {
+	u16 class;
+	u16 instance;
+};
+
+extern const struct intel_memory_region_info intel_region_map[];
 
 struct intel_memory_region_ops {
 	unsigned int flags;
@@ -89,9 +87,9 @@ struct intel_memory_region {
 	resource_size_t total;
 	resource_size_t avail;
 
-	unsigned int type;
-	unsigned int instance;
-	unsigned int id;
+	u16 type;
+	u16 instance;
+	enum intel_region_id id;
 	char name[8];
 
 	dma_addr_t remap_addr;
diff --git a/include/uapi/drm/i915_drm.h b/include/uapi/drm/i915_drm.h
index fa1f3d62f9a6..41845203250d 100644
--- a/include/uapi/drm/i915_drm.h
+++ b/include/uapi/drm/i915_drm.h
@@ -2175,6 +2175,7 @@ struct drm_i915_query_item {
 #define DRM_I915_QUERY_TOPOLOGY_INFO    1
 #define DRM_I915_QUERY_ENGINE_INFO	2
 #define DRM_I915_QUERY_PERF_CONFIG      3
+#define DRM_I915_QUERY_MEMORY_REGIONS   4
 /* Must be kept compact -- no holes and well documented */
 
 	/*
@@ -2375,6 +2376,63 @@ struct drm_i915_query_perf_config {
 	__u8 data[];
 };
 
+enum drm_i915_gem_memory_class {
+	I915_MEMORY_CLASS_SYSTEM = 0,
+	I915_MEMORY_CLASS_DEVICE,
+	I915_MEMORY_CLASS_STOLEN_SYSTEM,
+	I915_MEMORY_CLASS_STOLEN_DEVICE,
+};
+
+struct drm_i915_gem_memory_class_instance {
+	__u16 memory_class; /* see enum drm_i915_gem_memory_class */
+	__u16 memory_instance;
+};
+
+/**
+ * struct drm_i915_memory_region_info
+ *
+ * Describes one region as known to the driver.
+ */
+struct drm_i915_memory_region_info {
+	/** class:instance pair encoding */
+	struct drm_i915_gem_memory_class_instance region;
+
+	/** MBZ */
+	__u32 rsvd0;
+
+	/** MBZ */
+	__u64 caps;
+
+	/** MBZ */
+	__u64 flags;
+
+	/** Memory probed by the driver (-1 = unknown) */
+	__u64 probed_size;
+
+	/** Estimate of memory remaining (-1 = unknown) */
+	__u64 unallocated_size;
+
+	/** MBZ */
+	__u64 rsvd1[8];
+};
+
+/**
+ * struct drm_i915_query_memory_regions
+ *
+ * Region info query enumerates all regions known to the driver by filling in
+ * an array of struct drm_i915_memory_region_info structures.
+ */
+struct drm_i915_query_memory_regions {
+	/** Number of supported regions */
+	__u32 num_regions;
+
+	/** MBZ */
+	__u32 rsvd[3];
+
+	/* Info about each supported region */
+	struct drm_i915_memory_region_info regions[];
+};
+
 #if defined(__cplusplus)
 }
 #endif
-- 
2.26.2

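For context, a minimal userspace sketch of consuming this query (illustrative only; the helper name and error handling are not part of the series). It follows the usual two-pass i915 query pattern: the first DRM_IOCTL_I915_QUERY call with item.length == 0 asks the kernel for the required buffer size (the total_length returned above), the second call fills in the region info:

#include <stdint.h>
#include <stdlib.h>
#include <sys/ioctl.h>
#include <drm/i915_drm.h>

static struct drm_i915_query_memory_regions *query_mem_regions(int fd)
{
	struct drm_i915_query_item item = {
		.query_id = DRM_I915_QUERY_MEMORY_REGIONS,
	};
	struct drm_i915_query q = {
		.num_items = 1,
		.items_ptr = (uintptr_t)&item,
	};
	struct drm_i915_query_memory_regions *regions;

	/* First pass: item.length == 0, the kernel reports the size needed. */
	if (ioctl(fd, DRM_IOCTL_I915_QUERY, &q) || item.length <= 0)
		return NULL;

	regions = calloc(1, item.length);
	if (!regions)
		return NULL;

	/* Second pass: the kernel fills num_regions and the regions[] array. */
	item.data_ptr = (uintptr_t)regions;
	if (ioctl(fd, DRM_IOCTL_I915_QUERY, &q)) {
		free(regions);
		return NULL;
	}

	return regions;
}

Each returned entry can then be matched on region.memory_class / region.memory_instance, with probed_size and unallocated_size giving the totals (or -1 where unknown).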

* [RFC PATCH 091/162] drm/i915: Store gt in memory region
  2020-11-27 12:04 [RFC PATCH 000/162] DG1 + LMEM enabling Matthew Auld
                   ` (89 preceding siblings ...)
  2020-11-27 12:06 ` [RFC PATCH 090/162] drm/i915/query: Expose memory regions through the query uAPI Matthew Auld
@ 2020-11-27 12:06 ` Matthew Auld
  2020-11-27 12:06 ` [RFC PATCH 092/162] drm/i915/uapi: introduce drm_i915_gem_create_ext Matthew Auld
                   ` (70 subsequent siblings)
  161 siblings, 0 replies; 208+ messages in thread
From: Matthew Auld @ 2020-11-27 12:06 UTC (permalink / raw)
  To: intel-gfx; +Cc: Prathap Kumar Valsan, dri-devel, Tvrtko Ursulin

From: Prathap Kumar Valsan <prathap.kumar.valsan@intel.com>

Store a pointer to the gt closest to its memory region so that we can
access the engines corresponding to that gt via the memory region.

Signed-off-by: Prathap Kumar Valsan <prathap.kumar.valsan@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
---
 drivers/gpu/drm/i915/intel_memory_region.c | 1 +
 drivers/gpu/drm/i915/intel_memory_region.h | 1 +
 2 files changed, 2 insertions(+)

diff --git a/drivers/gpu/drm/i915/intel_memory_region.c b/drivers/gpu/drm/i915/intel_memory_region.c
index dca1e367ab98..6f40748901da 100644
--- a/drivers/gpu/drm/i915/intel_memory_region.c
+++ b/drivers/gpu/drm/i915/intel_memory_region.c
@@ -298,6 +298,7 @@ int intel_memory_regions_hw_probe(struct drm_i915_private *i915)
 		mem->id = i;
 		mem->type = type;
 		mem->instance = instance;
+		mem->gt = &i915->gt;
 
 		i915->mm.regions[i] = mem;
 	}
diff --git a/drivers/gpu/drm/i915/intel_memory_region.h b/drivers/gpu/drm/i915/intel_memory_region.h
index c047cf7c5e7c..15dcb57b4b5a 100644
--- a/drivers/gpu/drm/i915/intel_memory_region.h
+++ b/drivers/gpu/drm/i915/intel_memory_region.h
@@ -91,6 +91,7 @@ struct intel_memory_region {
 	u16 instance;
 	enum intel_region_id id;
 	char name[8];
+	struct intel_gt *gt; /* GT closest to this region. */
 
 	dma_addr_t remap_addr;
 
-- 
2.26.2

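As context for how this is used, the very next patch reaches the blitter through the region; a snippet from that patch, shown here for illustration:

	struct intel_gt *gt = obj->mm.region->gt;
	struct intel_context *ce = gt->engine[BCS0]->blitter_context;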

* [RFC PATCH 092/162] drm/i915/uapi: introduce drm_i915_gem_create_ext
  2020-11-27 12:04 [RFC PATCH 000/162] DG1 + LMEM enabling Matthew Auld
                   ` (90 preceding siblings ...)
  2020-11-27 12:06 ` [RFC PATCH 091/162] drm/i915: Store gt in memory region Matthew Auld
@ 2020-11-27 12:06 ` Matthew Auld
  2020-11-27 13:25   ` [Intel-gfx] " Chris Wilson
                     ` (2 more replies)
  2020-11-27 12:06 ` [RFC PATCH 093/162] drm/i915/lmem: allocate cmd ring in lmem Matthew Auld
                   ` (69 subsequent siblings)
  161 siblings, 3 replies; 208+ messages in thread
From: Matthew Auld @ 2020-11-27 12:06 UTC (permalink / raw)
  To: intel-gfx; +Cc: CQ Tang, dri-devel

Same old gem_create, but now with extensions support. This is needed
to support various upcoming use cases. For now we use the extensions
mechanism to support setting an immutable priority list of potential
placements at creation time.

If we wish to set the placements/regions we can simply do:

struct drm_i915_gem_object_param region_param = { … }; /* Unchanged */
struct drm_i915_gem_create_ext_setparam setparam_region = {
	.base = { .name = I915_GEM_CREATE_EXT_SETPARAM },
	.param = region_param,
};

struct drm_i915_gem_create_ext create_ext = {
	.size = 16 * PAGE_SIZE,
	.extensions = (uintptr_t)&setparam_region,
};
int err = ioctl(fd, DRM_IOCTL_I915_GEM_CREATE_EXT, &create_ext);
if (err) ...

If we use the normal gem_create or gem_create_ext without the
extensions/placements, then we still get the old behaviour, with the
object only being placed in system memory.

One important change here is that the returned size will now be rounded
up to the correct size, depending on the list of placements, since we
might have minimum page-size restrictions on some platforms when dealing
with device local memory. For example, if some placement in the list has
a 64K minimum page size, a 4K request will come back rounded up to 64K.

Also, we still keep around the i915_gem_object_setparam ioctl, although
it is now restricted by the placement list (i.e. we are not allowed to
add new placements). Longer term, setting placements through it will be
going away, since it was deemed that the kernel doesn't need to support
a dynamic list of placements, which is now solidified by this uAPI
change.

Testcase: igt/gem_create/create-ext-placement-sanity-check
Testcase: igt/gem_create/create-ext-placement-each
Testcase: igt/gem_create/create-ext-placement-all
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Signed-off-by: CQ Tang <cq.tang@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
---
 drivers/gpu/drm/i915/Makefile                 |   1 +
 drivers/gpu/drm/i915/gem/i915_gem_create.c    | 398 ++++++++++++++++++
 drivers/gpu/drm/i915/gem/i915_gem_object.c    |   2 +
 .../gpu/drm/i915/gem/i915_gem_object_types.h  |   9 +
 drivers/gpu/drm/i915/gem/i915_gem_region.c    |   4 +
 drivers/gpu/drm/i915/i915_drv.c               |   2 +-
 drivers/gpu/drm/i915/i915_gem.c               | 103 +----
 drivers/gpu/drm/i915/intel_memory_region.c    |  20 +
 drivers/gpu/drm/i915/intel_memory_region.h    |   4 +
 include/uapi/drm/i915_drm.h                   |  60 +++
 10 files changed, 500 insertions(+), 103 deletions(-)
 create mode 100644 drivers/gpu/drm/i915/gem/i915_gem_create.c

diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile
index ec361d61230b..3955134feca7 100644
--- a/drivers/gpu/drm/i915/Makefile
+++ b/drivers/gpu/drm/i915/Makefile
@@ -134,6 +134,7 @@ gem-y += \
 	gem/i915_gem_clflush.o \
 	gem/i915_gem_client_blt.o \
 	gem/i915_gem_context.o \
+	gem/i915_gem_create.o \
 	gem/i915_gem_dmabuf.o \
 	gem/i915_gem_domain.o \
 	gem/i915_gem_execbuffer.o \
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_create.c b/drivers/gpu/drm/i915/gem/i915_gem_create.c
new file mode 100644
index 000000000000..6f6dd4f1ce7e
--- /dev/null
+++ b/drivers/gpu/drm/i915/gem/i915_gem_create.c
@@ -0,0 +1,398 @@
+// SPDX-License-Identifier: MIT
+/*
+ * Copyright © 2020 Intel Corporation
+ */
+
+#include "gem/i915_gem_ioctls.h"
+#include "gem/i915_gem_lmem.h"
+#include "gem/i915_gem_object_blt.h"
+#include "gem/i915_gem_region.h"
+
+#include "i915_drv.h"
+#include "i915_user_extensions.h"
+
+static u32 max_page_size(struct intel_memory_region **placements,
+			 int n_placements)
+{
+	u32 max_page_size = 0;
+	int i;
+
+	for (i = 0; i < n_placements; ++i) {
+		max_page_size = max_t(u32, max_page_size,
+				      placements[i]->min_page_size);
+	}
+
+	GEM_BUG_ON(!max_page_size);
+	return max_page_size;
+}
+
+static int
+i915_gem_create(struct drm_file *file,
+		struct intel_memory_region **placements,
+		int n_placements,
+		u64 *size_p,
+		u32 *handle_p)
+{
+	struct drm_i915_gem_object *obj;
+	u32 handle;
+	u64 size;
+	int ret;
+
+	size = round_up(*size_p, max_page_size(placements, n_placements));
+	if (size == 0)
+		return -EINVAL;
+
+	/* For most of the ABI (e.g. mmap) we think in system pages */
+	GEM_BUG_ON(!IS_ALIGNED(size, PAGE_SIZE));
+
+	/* Allocate the new object */
+	obj = i915_gem_object_create_region(placements[0], size, 0);
+	if (IS_ERR(obj))
+		return PTR_ERR(obj);
+
+	if (i915_gem_object_is_lmem(obj)) {
+		struct intel_gt *gt = obj->mm.region->gt;
+		struct intel_context *ce = gt->engine[BCS0]->blitter_context;
+
+		/*
+		 * XXX: We really want to move this to get_pages(), but we
+		 * require grabbing the BKL for the blitting operation, which is
+		 * annoying. In the pipeline is support for async get_pages(),
+		 * which should fit nicely here. Also note that the actual
+		 * clear should be done async (we currently do an object_wait,
+		 * which is pure garbage); we just need to take care if
+		 * userspace opts out of implicit sync for the execbuf, to avoid
+		 * any potential info leak.
+		 */
+
+retry:
+		ret = i915_gem_object_fill_blt(obj, ce, 0);
+		if (ret == -EINTR)
+			goto retry;
+		if (ret) {
+			/*
+			 * XXX: Post the error to where we would normally gather
+			 * and clear the pages. This better reflects the final
+			 * uapi behaviour, once we are at the point where we can
+			 * move the clear worker to get_pages().
+			 */
+			i915_gem_object_unbind(obj, I915_GEM_OBJECT_UNBIND_ACTIVE);
+			i915_gem_object_lock(obj, NULL);
+			__i915_gem_object_put_pages(obj);
+			i915_gem_object_unlock(obj);
+			obj->mm.gem_create_posted_err = ret;
+			goto handle_create;
+		}
+
+		/*
+		 * XXX: Occasionally i915_gem_object_wait() called inside
+		 * i915_gem_object_set_to_cpu_domain() gets interrupted
+		 * and returns -ERESTARTSYS, which sends us to the clearing
+		 * code below and also sets gem_create_posted_err.
+		 * Moreover, the clearing sometimes fails because the
+		 * object is still pinned by the blitter clearing code.
+		 * This leaves us with an object with or without lmem
+		 * pages, and with gem_create_posted_err = -ERESTARTSYS.
+		 * Under lmem pressure, if the object has pages, we might
+		 * swap out this object to smem. Later, when userspace
+		 * uses this object in a gem_execbuf() call, the get_pages()
+		 * operation will return the -ERESTARTSYS error code, which
+		 * causes the userspace code to fail.
+		 *
+		 * To avoid this problem, we add a non-interruptible
+		 * wait before setting the object to the cpu domain.
+		 */
+		i915_gem_object_lock(obj, NULL);
+		ret = i915_gem_object_wait(obj, 0, MAX_SCHEDULE_TIMEOUT);
+		if (!ret)
+			ret = i915_gem_object_set_to_cpu_domain(obj, false);
+		if (ret) {
+			i915_gem_object_unbind(obj, I915_GEM_OBJECT_UNBIND_ACTIVE);
+			__i915_gem_object_put_pages(obj);
+			obj->mm.gem_create_posted_err = ret;
+			i915_gem_object_unlock(obj);
+			goto handle_create;
+		}
+		i915_gem_object_unlock(obj);
+	}
+
+handle_create:
+	ret = drm_gem_handle_create(file, &obj->base, &handle);
+	/* drop reference from allocate - handle holds it now */
+	i915_gem_object_put(obj);
+	if (ret)
+		return ret;
+
+	obj->mm.placements = placements;
+	obj->mm.n_placements = n_placements;
+
+	*handle_p = handle;
+	*size_p = size;
+	return 0;
+}
+
+int
+i915_gem_dumb_create(struct drm_file *file,
+		     struct drm_device *dev,
+		     struct drm_mode_create_dumb *args)
+{
+	struct intel_memory_region **placements;
+	enum intel_memory_type mem_type;
+	int cpp = DIV_ROUND_UP(args->bpp, 8);
+	u32 format;
+	int ret;
+
+	switch (cpp) {
+	case 1:
+		format = DRM_FORMAT_C8;
+		break;
+	case 2:
+		format = DRM_FORMAT_RGB565;
+		break;
+	case 4:
+		format = DRM_FORMAT_XRGB8888;
+		break;
+	default:
+		return -EINVAL;
+	}
+
+	/* have to work out size/pitch and return them */
+	args->pitch = ALIGN(args->width * cpp, 64);
+
+	/* align stride to page size so that we can remap */
+	if (args->pitch > intel_plane_fb_max_stride(to_i915(dev), format,
+						    DRM_FORMAT_MOD_LINEAR))
+		args->pitch = ALIGN(args->pitch, 4096);
+
+	if (args->pitch < args->width)
+		return -EINVAL;
+
+	args->size = mul_u32_u32(args->pitch, args->height);
+
+	mem_type = INTEL_MEMORY_SYSTEM;
+	if (HAS_LMEM(to_i915(dev)))
+		mem_type = INTEL_MEMORY_LOCAL;
+
+	placements = kmalloc(sizeof(struct intel_memory_region *), GFP_KERNEL);
+	if (!placements)
+		return -ENOMEM;
+
+	placements[0] = intel_memory_region_by_type(to_i915(dev), mem_type);
+
+	ret = i915_gem_create(file,
+			      placements, 1,
+			      &args->size, &args->handle);
+	if (ret)
+		kfree(placements);
+
+	return ret;
+}
+
+struct create_ext {
+	struct drm_i915_private *i915;
+	struct intel_memory_region **placements;
+	int n_placements;
+};
+
+static void repr_placements(char *buf, size_t size,
+			    struct intel_memory_region **placements,
+			    int n_placements)
+{
+	int i;
+
+	buf[0] = '\0';
+
+	for (i = 0; i < n_placements; i++) {
+		struct intel_memory_region *mr = placements[i];
+		int r;
+
+		r = snprintf(buf, size, "\n  %s -> { class: %d, inst: %d }",
+			     mr->name, mr->type, mr->instance);
+		if (r >= size)
+			return;
+
+		buf += r;
+		size -= r;
+	}
+}
+
+static int set_placements(struct drm_i915_gem_object_param *args,
+			  struct create_ext *ext_data)
+{
+	struct drm_i915_private *i915 = ext_data->i915;
+	struct drm_i915_gem_memory_class_instance __user *uregions =
+		u64_to_user_ptr(args->data);
+	struct intel_memory_region **placements;
+	u32 mask;
+	int i, ret = 0;
+
+	if (args->handle) {
+		DRM_DEBUG("Handle should be zero\n");
+		ret = -EINVAL;
+	}
+
+	if (!args->size) {
+		DRM_DEBUG("Size is zero\n");
+		ret = -EINVAL;
+	}
+
+	if (args->size > ARRAY_SIZE(i915->mm.regions)) {
+		DRM_DEBUG("Too many placements\n");
+		ret = -EINVAL;
+	}
+
+	if (ret)
+		return ret;
+
+	placements = kmalloc_array(args->size,
+				   sizeof(struct intel_memory_region *),
+				   GFP_KERNEL);
+	if (!placements)
+		return -ENOMEM;
+
+	mask = 0;
+	for (i = 0; i < args->size; i++) {
+		struct drm_i915_gem_memory_class_instance region;
+		struct intel_memory_region *mr;
+
+		if (copy_from_user(&region, uregions, sizeof(region))) {
+			ret = -EFAULT;
+			goto out_free;
+		}
+
+		mr = intel_memory_region_lookup(i915,
+						region.memory_class,
+						region.memory_instance);
+		if (!mr) {
+			DRM_DEBUG("Device is missing region { class: %d, inst: %d } at index = %d\n",
+				  region.memory_class, region.memory_instance, i);
+			ret = -EINVAL;
+			goto out_dump;
+		}
+
+		if (mask & BIT(mr->id)) {
+			DRM_DEBUG("Found duplicate placement %s -> { class: %d, inst: %d } at index = %d\n",
+				  mr->name, region.memory_class,
+				  region.memory_instance, i);
+			ret = -EINVAL;
+			goto out_dump;
+		}
+
+		placements[i] = mr;
+		mask |= BIT(mr->id);
+
+		++uregions;
+	}
+
+	if (ext_data->placements) {
+		ret = -EINVAL;
+		goto out_dump;
+	}
+
+	ext_data->placements = placements;
+	ext_data->n_placements = args->size;
+
+	return 0;
+
+out_dump:
+	if (1) {
+		char buf[256];
+
+		if (ext_data->placements) {
+			repr_placements(buf,
+					sizeof(buf),
+					ext_data->placements,
+					ext_data->n_placements);
+			DRM_DEBUG("Placements were already set in previous SETPARAM. Existing placements: %s\n",
+				  buf);
+		}
+
+		repr_placements(buf, sizeof(buf), placements, i);
+		DRM_DEBUG("New placements(so far validated): %s\n", buf);
+	}
+
+out_free:
+	kfree(placements);
+	return ret;
+}
+
+static int __create_setparam(struct drm_i915_gem_object_param *args,
+			     struct create_ext *ext_data)
+{
+	if (!(args->param & I915_OBJECT_PARAM)) {
+		DRM_DEBUG("Missing I915_OBJECT_PARAM namespace\n");
+		return -EINVAL;
+	}
+
+	switch (lower_32_bits(args->param)) {
+	case I915_PARAM_MEMORY_REGIONS:
+		return set_placements(args, ext_data);
+	}
+
+	return -EINVAL;
+}
+
+static int create_setparam(struct i915_user_extension __user *base, void *data)
+{
+	struct drm_i915_gem_create_ext_setparam ext;
+
+	if (copy_from_user(&ext, base, sizeof(ext)))
+		return -EFAULT;
+
+	return __create_setparam(&ext.param, data);
+}
+
+static const i915_user_extension_fn create_extensions[] = {
+	[I915_GEM_CREATE_EXT_SETPARAM] = create_setparam,
+};
+
+/**
+ * Creates a new mm object and returns a handle to it.
+ * @dev: drm device pointer
+ * @data: ioctl data blob
+ * @file: drm file pointer
+ */
+int
+i915_gem_create_ioctl(struct drm_device *dev, void *data,
+		      struct drm_file *file)
+{
+	struct drm_i915_private *i915 = to_i915(dev);
+	struct create_ext ext_data = { .i915 = i915 };
+	struct drm_i915_gem_create_ext *args = data;
+	int ret;
+
+	i915_gem_flush_free_objects(i915);
+
+	ret = i915_user_extensions(u64_to_user_ptr(args->extensions),
+				   create_extensions,
+				   ARRAY_SIZE(create_extensions),
+				   &ext_data);
+	if (ret)
+		goto err_free;
+
+	if (!ext_data.placements) {
+		struct intel_memory_region **placements;
+		enum intel_memory_type mem_type = INTEL_MEMORY_SYSTEM;
+
+		placements = kmalloc(sizeof(struct intel_memory_region *),
+				     GFP_KERNEL);
+		if (!placements)
+			return -ENOMEM;
+
+		placements[0] = intel_memory_region_by_type(i915, mem_type);
+
+		ext_data.placements = placements;
+		ext_data.n_placements = 1;
+	}
+
+	ret = i915_gem_create(file,
+			      ext_data.placements,
+			      ext_data.n_placements,
+			      &args->size, &args->handle);
+	if (!ret)
+		return 0;
+
+err_free:
+	kfree(ext_data.placements);
+	return ret;
+}
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object.c b/drivers/gpu/drm/i915/gem/i915_gem_object.c
index 49935245a4a8..89b530841126 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_object.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_object.c
@@ -254,6 +254,8 @@ static void __i915_gem_free_objects(struct drm_i915_private *i915,
 		if (obj->ops->release)
 			obj->ops->release(obj);
 
+		kfree(obj->mm.placements);
+
 		/* But keep the pointer alive for RCU-protected lookups */
 		call_rcu(&obj->rcu, __i915_gem_free_object_rcu);
 		cond_resched();
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object_types.h b/drivers/gpu/drm/i915/gem/i915_gem_object_types.h
index 6d101275bc9d..115ad32c303f 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_object_types.h
+++ b/drivers/gpu/drm/i915/gem/i915_gem_object_types.h
@@ -212,6 +212,15 @@ struct drm_i915_gem_object {
 		atomic_t pages_pin_count;
 		atomic_t shrink_pin;
 
+		/**
+		 * Priority list of potential placements for this object.
+		 */
+		struct intel_memory_region **placements;
+		int n_placements;
+
+		/* XXX: Nasty hack, see gem_create */
+		int gem_create_posted_err;
+
 		/**
 		 * Memory region for this object.
 		 */
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_region.c b/drivers/gpu/drm/i915/gem/i915_gem_region.c
index 58bf5f9e3199..8f352ba6202d 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_region.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_region.c
@@ -33,6 +33,10 @@ i915_gem_object_get_pages_buddy(struct drm_i915_gem_object *obj)
 	unsigned int sg_page_sizes;
 	int ret;
 
+	/* XXX: Check if we have a posted error. This is a nasty hack, see gem_create */
+	if (obj->mm.gem_create_posted_err)
+		return obj->mm.gem_create_posted_err;
+
 	st = kmalloc(sizeof(*st), GFP_KERNEL);
 	if (!st)
 		return -ENOMEM;
diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
index 07b3a89ec09e..f4540c048cd9 100644
--- a/drivers/gpu/drm/i915/i915_drv.c
+++ b/drivers/gpu/drm/i915/i915_drv.c
@@ -1729,7 +1729,7 @@ static const struct drm_ioctl_desc i915_ioctls[] = {
 	DRM_IOCTL_DEF_DRV(I915_GEM_THROTTLE, i915_gem_throttle_ioctl, DRM_RENDER_ALLOW),
 	DRM_IOCTL_DEF_DRV(I915_GEM_ENTERVT, drm_noop, DRM_AUTH|DRM_MASTER|DRM_ROOT_ONLY),
 	DRM_IOCTL_DEF_DRV(I915_GEM_LEAVEVT, drm_noop, DRM_AUTH|DRM_MASTER|DRM_ROOT_ONLY),
-	DRM_IOCTL_DEF_DRV(I915_GEM_CREATE, i915_gem_create_ioctl, DRM_RENDER_ALLOW),
+	DRM_IOCTL_DEF_DRV(I915_GEM_CREATE_EXT, i915_gem_create_ioctl, DRM_RENDER_ALLOW),
 	DRM_IOCTL_DEF_DRV(I915_GEM_PREAD, i915_gem_pread_ioctl, DRM_RENDER_ALLOW),
 	DRM_IOCTL_DEF_DRV(I915_GEM_PWRITE, i915_gem_pwrite_ioctl, DRM_RENDER_ALLOW),
 	DRM_IOCTL_DEF_DRV(I915_GEM_MMAP, i915_gem_mmap_ioctl, DRM_RENDER_ALLOW),
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index ef2124c17a7f..bf67f323a1ae 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -43,6 +43,7 @@
 #include "gem/i915_gem_clflush.h"
 #include "gem/i915_gem_context.h"
 #include "gem/i915_gem_ioctls.h"
+#include "gem/i915_gem_lmem.h"
 #include "gem/i915_gem_mman.h"
 #include "gem/i915_gem_region.h"
 #include "gt/intel_engine_user.h"
@@ -179,108 +180,6 @@ int i915_gem_object_unbind(struct drm_i915_gem_object *obj,
 	return ret;
 }
 
-static int
-i915_gem_create(struct drm_file *file,
-		struct intel_memory_region *mr,
-		u64 *size_p,
-		u32 *handle_p)
-{
-	struct drm_i915_gem_object *obj;
-	u32 handle;
-	u64 size;
-	int ret;
-
-	GEM_BUG_ON(!is_power_of_2(mr->min_page_size));
-	size = round_up(*size_p, mr->min_page_size);
-	if (size == 0)
-		return -EINVAL;
-
-	/* For most of the ABI (e.g. mmap) we think in system pages */
-	GEM_BUG_ON(!IS_ALIGNED(size, PAGE_SIZE));
-
-	/* Allocate the new object */
-	obj = i915_gem_object_create_region(mr, size, 0);
-	if (IS_ERR(obj))
-		return PTR_ERR(obj);
-
-	ret = drm_gem_handle_create(file, &obj->base, &handle);
-	/* drop reference from allocate - handle holds it now */
-	i915_gem_object_put(obj);
-	if (ret)
-		return ret;
-
-	*handle_p = handle;
-	*size_p = size;
-	return 0;
-}
-
-int
-i915_gem_dumb_create(struct drm_file *file,
-		     struct drm_device *dev,
-		     struct drm_mode_create_dumb *args)
-{
-	enum intel_memory_type mem_type;
-	int cpp = DIV_ROUND_UP(args->bpp, 8);
-	u32 format;
-
-	switch (cpp) {
-	case 1:
-		format = DRM_FORMAT_C8;
-		break;
-	case 2:
-		format = DRM_FORMAT_RGB565;
-		break;
-	case 4:
-		format = DRM_FORMAT_XRGB8888;
-		break;
-	default:
-		return -EINVAL;
-	}
-
-	/* have to work out size/pitch and return them */
-	args->pitch = ALIGN(args->width * cpp, 64);
-
-	/* align stride to page size so that we can remap */
-	if (args->pitch > intel_plane_fb_max_stride(to_i915(dev), format,
-						    DRM_FORMAT_MOD_LINEAR))
-		args->pitch = ALIGN(args->pitch, 4096);
-
-	if (args->pitch < args->width)
-		return -EINVAL;
-
-	args->size = mul_u32_u32(args->pitch, args->height);
-
-	mem_type = INTEL_MEMORY_SYSTEM;
-	if (HAS_LMEM(to_i915(dev)))
-		mem_type = INTEL_MEMORY_LOCAL;
-
-	return i915_gem_create(file,
-			       intel_memory_region_by_type(to_i915(dev),
-							   mem_type),
-			       &args->size, &args->handle);
-}
-
-/**
- * Creates a new mm object and returns a handle to it.
- * @dev: drm device pointer
- * @data: ioctl data blob
- * @file: drm file pointer
- */
-int
-i915_gem_create_ioctl(struct drm_device *dev, void *data,
-		      struct drm_file *file)
-{
-	struct drm_i915_private *i915 = to_i915(dev);
-	struct drm_i915_gem_create *args = data;
-
-	i915_gem_flush_free_objects(i915);
-
-	return i915_gem_create(file,
-			       intel_memory_region_by_type(i915,
-							   INTEL_MEMORY_SYSTEM),
-			       &args->size, &args->handle);
-}
-
 static int
 shmem_pread(struct page *page, int offset, int len, char __user *user_data,
 	    bool needs_clflush)
diff --git a/drivers/gpu/drm/i915/intel_memory_region.c b/drivers/gpu/drm/i915/intel_memory_region.c
index 6f40748901da..67240bddf2ca 100644
--- a/drivers/gpu/drm/i915/intel_memory_region.c
+++ b/drivers/gpu/drm/i915/intel_memory_region.c
@@ -21,6 +21,26 @@ const struct intel_memory_region_info intel_region_map[] = {
 	},
 };
 
+struct intel_memory_region *
+intel_memory_region_lookup(struct drm_i915_private *i915,
+			   u16 class, u16 instance)
+{
+	int i;
+
+	/* XXX: consider maybe converting to an rb tree at some point */
+	for (i = 0; i < ARRAY_SIZE(i915->mm.regions); ++i) {
+		struct intel_memory_region *region = i915->mm.regions[i];
+
+		if (!region)
+			continue;
+
+		if (region->type == class && region->instance == instance)
+			return region;
+	}
+
+	return NULL;
+}
+
 struct intel_memory_region *
 intel_memory_region_by_type(struct drm_i915_private *i915,
 			    enum intel_memory_type mem_type)
diff --git a/drivers/gpu/drm/i915/intel_memory_region.h b/drivers/gpu/drm/i915/intel_memory_region.h
index 15dcb57b4b5a..20431d3ce490 100644
--- a/drivers/gpu/drm/i915/intel_memory_region.h
+++ b/drivers/gpu/drm/i915/intel_memory_region.h
@@ -102,6 +102,10 @@ struct intel_memory_region {
 	} objects;
 };
 
+struct intel_memory_region *
+intel_memory_region_lookup(struct drm_i915_private *i915,
+			   u16 class, u16 instance);
+
 int intel_memory_region_init_buddy(struct intel_memory_region *mem);
 void intel_memory_region_release_buddy(struct intel_memory_region *mem);
 
diff --git a/include/uapi/drm/i915_drm.h b/include/uapi/drm/i915_drm.h
index 41845203250d..f6e3a0462414 100644
--- a/include/uapi/drm/i915_drm.h
+++ b/include/uapi/drm/i915_drm.h
@@ -391,6 +391,7 @@ typedef struct _drm_i915_sarea {
 #define DRM_IOCTL_I915_GEM_ENTERVT	DRM_IO(DRM_COMMAND_BASE + DRM_I915_GEM_ENTERVT)
 #define DRM_IOCTL_I915_GEM_LEAVEVT	DRM_IO(DRM_COMMAND_BASE + DRM_I915_GEM_LEAVEVT)
 #define DRM_IOCTL_I915_GEM_CREATE	DRM_IOWR(DRM_COMMAND_BASE + DRM_I915_GEM_CREATE, struct drm_i915_gem_create)
+#define DRM_IOCTL_I915_GEM_CREATE_EXT	DRM_IOWR(DRM_COMMAND_BASE + DRM_I915_GEM_CREATE, struct drm_i915_gem_create_ext)
 #define DRM_IOCTL_I915_GEM_PREAD	DRM_IOW (DRM_COMMAND_BASE + DRM_I915_GEM_PREAD, struct drm_i915_gem_pread)
 #define DRM_IOCTL_I915_GEM_PWRITE	DRM_IOW (DRM_COMMAND_BASE + DRM_I915_GEM_PWRITE, struct drm_i915_gem_pwrite)
 #define DRM_IOCTL_I915_GEM_MMAP		DRM_IOWR(DRM_COMMAND_BASE + DRM_I915_GEM_MMAP, struct drm_i915_gem_mmap)
@@ -728,6 +729,27 @@ struct drm_i915_gem_create {
 	__u32 pad;
 };
 
+struct drm_i915_gem_create_ext {
+
+	/**
+	 * Requested size for the object.
+	 *
+	 * The (page-aligned) allocated size for the object will be returned.
+	 */
+	__u64 size;
+	/**
+	 * Returned handle for the object.
+	 *
+	 * Object handles are nonzero.
+	 */
+	__u32 handle;
+	__u32 pad;
+#define I915_GEM_CREATE_EXT_SETPARAM (1u << 0)
+#define I915_GEM_CREATE_EXT_FLAGS_UNKNOWN \
+	(-(I915_GEM_CREATE_EXT_SETPARAM << 1))
+	__u64 extensions;
+};
+
 struct drm_i915_gem_pread {
 	/** Handle for the object being read. */
 	__u32 handle;
@@ -1698,6 +1720,44 @@ struct drm_i915_gem_context_param {
 	__u64 value;
 };
 
+struct drm_i915_gem_object_param {
+	/* Object handle (0 for I915_GEM_CREATE_EXT_SETPARAM) */
+	__u32 handle;
+
+	/* Data pointer size */
+	__u32 size;
+
+/*
+ * I915_OBJECT_PARAM:
+ *
+ * Select object namespace for the param.
+ */
+#define I915_OBJECT_PARAM  (1ull<<32)
+
+/*
+ * I915_PARAM_MEMORY_REGIONS:
+ *
+ * Set the data pointer with the desired set of placements in priority
+ * order (each entry must be unique and supported by the device), as an array of
+ * drm_i915_gem_memory_class_instance, or an equivalent layout of class:instance
+ * pair encodings. See DRM_I915_QUERY_MEMORY_REGIONS for how to query the
+ * supported regions.
+ *
+ * Note that this requires the I915_OBJECT_PARAM namespace:
+ *	.param = I915_OBJECT_PARAM | I915_PARAM_MEMORY_REGIONS
+ */
+#define I915_PARAM_MEMORY_REGIONS 0x1
+	__u64 param;
+
+	/* Data value or pointer */
+	__u64 data;
+};
+
+struct drm_i915_gem_create_ext_setparam {
+	struct i915_user_extension base;
+	struct drm_i915_gem_object_param param;
+};
+
 /**
  * Context SSEU programming
  *
-- 
2.26.2

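Reading set_placements() above, a sketch of what a full placement setup might look like in userspace (illustrative only, not from the series): args->size carries the number of entries and args->data points to an array of drm_i915_gem_memory_class_instance, in priority order:

struct drm_i915_gem_memory_class_instance place[] = {
	/* try device local memory first, fall back to system memory */
	{ .memory_class = I915_MEMORY_CLASS_DEVICE, .memory_instance = 0 },
	{ .memory_class = I915_MEMORY_CLASS_SYSTEM, .memory_instance = 0 },
};
struct drm_i915_gem_object_param region_param = {
	.handle = 0, /* must be zero for I915_GEM_CREATE_EXT_SETPARAM */
	.size = 2,   /* number of placement entries */
	.param = I915_OBJECT_PARAM | I915_PARAM_MEMORY_REGIONS,
	.data = (uintptr_t)place,
};

This then slots into the setparam_region/create_ext chain shown in the commit message.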

* [RFC PATCH 093/162] drm/i915/lmem: allocate cmd ring in lmem
  2020-11-27 12:04 [RFC PATCH 000/162] DG1 + LMEM enabling Matthew Auld
                   ` (91 preceding siblings ...)
  2020-11-27 12:06 ` [RFC PATCH 092/162] drm/i915/uapi: introduce drm_i915_gem_create_ext Matthew Auld
@ 2020-11-27 12:06 ` Matthew Auld
  2020-11-27 13:27   ` Chris Wilson
  2020-11-27 12:06 ` [RFC PATCH 094/162] drm/i915/dg1: Do not check r->sgt.pfn for NULL Matthew Auld
                   ` (68 subsequent siblings)
  161 siblings, 1 reply; 208+ messages in thread
From: Matthew Auld @ 2020-11-27 12:06 UTC (permalink / raw)
  To: intel-gfx; +Cc: Michel Thierry, Abdiel Janulgue, dri-devel

From: Michel Thierry <michel.thierry@intel.com>

Signed-off-by: Michel Thierry <michel.thierry@intel.com>
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Abdiel Janulgue <abdiel.janulgue@linux.intel.com>
---
 drivers/gpu/drm/i915/gt/intel_ring.c | 15 +++++++++++----
 1 file changed, 11 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_ring.c b/drivers/gpu/drm/i915/gt/intel_ring.c
index d636c6ed88b7..aa75e644f3f2 100644
--- a/drivers/gpu/drm/i915/gt/intel_ring.c
+++ b/drivers/gpu/drm/i915/gt/intel_ring.c
@@ -4,6 +4,7 @@
  * Copyright © 2019 Intel Corporation
  */
 
+#include "gem/i915_gem_lmem.h"
 #include "gem/i915_gem_object.h"
 #include "i915_drv.h"
 #include "i915_vma.h"
@@ -111,10 +112,16 @@ static struct i915_vma *create_ring_vma(struct i915_ggtt *ggtt, int size)
 	struct i915_vma *vma;
 
 	obj = ERR_PTR(-ENODEV);
-	if (i915_ggtt_has_aperture(ggtt))
-		obj = i915_gem_object_create_stolen(i915, size);
-	if (IS_ERR(obj))
-		obj = i915_gem_object_create_internal(i915, size);
+	if (HAS_LMEM(i915)) {
+		obj = i915_gem_object_create_lmem(i915, size,
+						  I915_BO_ALLOC_CONTIGUOUS |
+						  I915_BO_ALLOC_VOLATILE);
+	} else {
+		if (i915_ggtt_has_aperture(ggtt))
+			obj = i915_gem_object_create_stolen(i915, size);
+		if (IS_ERR(obj))
+			obj = i915_gem_object_create_internal(i915, size);
+	}
 	if (IS_ERR(obj))
 		return ERR_CAST(obj);
 
-- 
2.26.2


* [RFC PATCH 094/162] drm/i915/dg1: Do not check r->sgt.pfn for NULL
  2020-11-27 12:04 [RFC PATCH 000/162] DG1 + LMEM enabling Matthew Auld
                   ` (92 preceding siblings ...)
  2020-11-27 12:06 ` [RFC PATCH 093/162] drm/i915/lmem: allocate cmd ring in lmem Matthew Auld
@ 2020-11-27 12:06 ` Matthew Auld
  2020-11-27 12:06 ` [RFC PATCH 095/162] drm/i915/dg1: Introduce dmabuf mmap to LMEM Matthew Auld
                   ` (67 subsequent siblings)
  161 siblings, 0 replies; 208+ messages in thread
From: Matthew Auld @ 2020-11-27 12:06 UTC (permalink / raw)
  To: intel-gfx; +Cc: Kui Wen, dri-devel

From: Kui Wen <kui.wen@intel.com>

When userspace calls mmap, the kernel maps physical pages of local memory
to a virtual memory address. r->sgt.pfn is a page address allocated from
local memory, and the local memory region spans from 0 to the LMEM size.
Hence r->sgt.pfn can legitimately be 0; this is a normal case.

Signed-off-by: Kui Wen <kui.wen@intel.com>
---
 drivers/gpu/drm/i915/i915_mm.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/i915_mm.c b/drivers/gpu/drm/i915/i915_mm.c
index 43039dc8c607..dcf6b3e5bfdf 100644
--- a/drivers/gpu/drm/i915/i915_mm.c
+++ b/drivers/gpu/drm/i915/i915_mm.c
@@ -62,7 +62,7 @@ static int remap_sg(pte_t *pte, unsigned long addr, void *data)
 {
 	struct remap_pfn *r = data;
 
-	if (GEM_WARN_ON(!r->sgt.pfn))
+	if (GEM_WARN_ON(!use_dma(r->iobase) && !r->sgt.pfn))
 		return -EINVAL;
 
 	/* Special PTE are not associated with any struct page */
-- 
2.26.2


* [RFC PATCH 095/162] drm/i915/dg1: Introduce dmabuf mmap to LMEM
  2020-11-27 12:04 [RFC PATCH 000/162] DG1 + LMEM enabling Matthew Auld
                   ` (93 preceding siblings ...)
  2020-11-27 12:06 ` [RFC PATCH 094/162] drm/i915/dg1: Do not check r->sgt.pfn for NULL Matthew Auld
@ 2020-11-27 12:06 ` Matthew Auld
  2020-11-27 12:06 ` [RFC PATCH 096/162] drm/i915: setup the LMEM region Matthew Auld
                   ` (66 subsequent siblings)
  161 siblings, 0 replies; 208+ messages in thread
From: Matthew Auld @ 2020-11-27 12:06 UTC (permalink / raw)
  To: intel-gfx; +Cc: Michael J. Ruhl, Brian Welty, dri-devel

From: "Michael J. Ruhl" <michael.j.ruhl@intel.com>

The i915 GEM dmabuf mmap interface assumes all BOs are shmem-backed. When
the BO is backed by LMEM, this assumption doesn't work so well.

Introduce the dmabuf mmap interface to LMEM by adding the appropriate
VMA faulting mechanism, and update dmabuf to allow for LMEM-backed BOs by
leveraging the gem_mman path.

Cc: Brian Welty <brian.welty@intel.com>
Signed-off-by: Michael J. Ruhl <michael.j.ruhl@intel.com>
---
 drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c    |  59 +++++++---
 drivers/gpu/drm/i915/gem/i915_gem_mman.c      | 102 ++++++++++--------
 drivers/gpu/drm/i915/gem/i915_gem_mman.h      |   9 ++
 .../drm/i915/gem/selftests/i915_gem_mman.c    |  12 +--
 4 files changed, 118 insertions(+), 64 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c b/drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c
index 018d02cc4af5..85528eeaacbc 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c
@@ -10,6 +10,7 @@
 
 #include "i915_drv.h"
 #include "i915_gem_lmem.h"
+#include "i915_gem_mman.h"
 #include "i915_gem_object.h"
 #include "i915_scatterlist.h"
 
@@ -105,7 +106,41 @@ static void i915_gem_dmabuf_vunmap(struct dma_buf *dma_buf, struct dma_buf_map *
 	i915_gem_object_unpin_map(obj);
 }
 
-static int i915_gem_dmabuf_mmap(struct dma_buf *dma_buf, struct vm_area_struct *vma)
+/**
+ * i915_gem_dmabuf_update_vma - Setup VMA information for exported LMEM
+ * objects
+ * @obj: valid LMEM object
+ * @vma: valid vma
+ *
+ * NOTE: on success, the final _object_put() will be done by the VMA
+ * vm_close() callback.
+ */
+static int i915_gem_dmabuf_update_vma(struct drm_i915_gem_object *obj,
+				      struct vm_area_struct *vma)
+{
+	struct i915_mmap_offset *mmo;
+	int err;
+
+	i915_gem_object_get(obj);
+	mmo = i915_gem_mmap_offset_attach(obj, I915_MMAP_TYPE_WC, NULL);
+	if (IS_ERR(mmo)) {
+		err = PTR_ERR(mmo);
+		goto out;
+	}
+
+	err = i915_gem_update_vma_info(obj, mmo, vma);
+	if (err)
+		goto out;
+
+	return 0;
+
+out:
+	i915_gem_object_put(obj);
+	return err;
+}
+
+static int i915_gem_dmabuf_mmap(struct dma_buf *dma_buf,
+				struct vm_area_struct *vma)
 {
 	struct drm_i915_gem_object *obj = dma_buf_to_obj(dma_buf);
 	int ret;
@@ -113,16 +148,20 @@ static int i915_gem_dmabuf_mmap(struct dma_buf *dma_buf, struct vm_area_struct *
 	if (obj->base.size < vma->vm_end - vma->vm_start)
 		return -EINVAL;
 
-	if (!obj->base.filp)
-		return -ENODEV;
+	/* shmem */
+	if (obj->base.filp) {
+		ret = call_mmap(obj->base.filp, vma);
+		if (ret)
+			return ret;
 
-	ret = call_mmap(obj->base.filp, vma);
-	if (ret)
-		return ret;
+		vma_set_file(vma, obj->base.filp);
+		return 0;
+	}
 
-	vma_set_file(vma, obj->base.filp);
+	if (i915_gem_object_is_lmem(obj))
+		return i915_gem_dmabuf_update_vma(obj, vma);
 
-	return 0;
+	return -ENODEV;
 }
 
 static int i915_gem_begin_cpu_access(struct dma_buf *dma_buf, enum dma_data_direction direction)
@@ -254,10 +293,6 @@ struct drm_gem_object *i915_gem_prime_import(struct drm_device *dev,
 			 */
 			return &i915_gem_object_get(obj)->base;
 		}
-
-		/* not our device, but still a i915 object? */
-		if (i915_gem_object_is_lmem(obj))
-			return ERR_PTR(-ENOTSUPP);
 	}
 
 	/* need to attach */
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_mman.c b/drivers/gpu/drm/i915/gem/i915_gem_mman.c
index 4e8a05c35252..33ccd4d665d4 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_mman.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_mman.c
@@ -620,10 +620,10 @@ insert_mmo(struct drm_i915_gem_object *obj, struct i915_mmap_offset *mmo)
 	return mmo;
 }
 
-static struct i915_mmap_offset *
-mmap_offset_attach(struct drm_i915_gem_object *obj,
-		   enum i915_mmap_type mmap_type,
-		   struct drm_file *file)
+struct i915_mmap_offset *
+i915_gem_mmap_offset_attach(struct drm_i915_gem_object *obj,
+			    enum i915_mmap_type mmap_type,
+			    struct drm_file *file)
 {
 	struct drm_i915_private *i915 = to_i915(obj->base.dev);
 	struct i915_mmap_offset *mmo;
@@ -696,7 +696,7 @@ __assign_mmap_offset(struct drm_file *file,
 		goto out;
 	}
 
-	mmo = mmap_offset_attach(obj, mmap_type, file);
+	mmo = i915_gem_mmap_offset_attach(obj, mmap_type, file);
 	if (IS_ERR(mmo)) {
 		err = PTR_ERR(mmo);
 		goto out;
@@ -867,56 +867,22 @@ static struct file *mmap_singleton(struct drm_i915_private *i915)
 	return file;
 }
 
-/*
- * This overcomes the limitation in drm_gem_mmap's assignment of a
- * drm_gem_object as the vma->vm_private_data. Since we need to
- * be able to resolve multiple mmap offsets which could be tied
- * to a single gem object.
- */
-int i915_gem_mmap(struct file *filp, struct vm_area_struct *vma)
+int i915_gem_update_vma_info(struct drm_i915_gem_object *obj,
+			     struct i915_mmap_offset *mmo,
+			     struct vm_area_struct *vma)
 {
-	struct drm_vma_offset_node *node;
-	struct drm_file *priv = filp->private_data;
-	struct drm_device *dev = priv->minor->dev;
-	struct drm_i915_gem_object *obj = NULL;
-	struct i915_mmap_offset *mmo = NULL;
 	struct file *anon;
 
-	if (drm_dev_is_unplugged(dev))
-		return -ENODEV;
-
-	rcu_read_lock();
-	drm_vma_offset_lock_lookup(dev->vma_offset_manager);
-	node = drm_vma_offset_exact_lookup_locked(dev->vma_offset_manager,
-						  vma->vm_pgoff,
-						  vma_pages(vma));
-	if (node && drm_vma_node_is_allowed(node, priv)) {
-		/*
-		 * Skip 0-refcnted objects as it is in the process of being
-		 * destroyed and will be invalid when the vma manager lock
-		 * is released.
-		 */
-		mmo = container_of(node, struct i915_mmap_offset, vma_node);
-		obj = i915_gem_object_get_rcu(mmo->obj);
-	}
-	drm_vma_offset_unlock_lookup(dev->vma_offset_manager);
-	rcu_read_unlock();
-	if (!obj)
-		return node ? -EACCES : -EINVAL;
-
 	if (i915_gem_object_is_readonly(obj)) {
-		if (vma->vm_flags & VM_WRITE) {
-			i915_gem_object_put(obj);
+		if (vma->vm_flags & VM_WRITE)
 			return -EINVAL;
-		}
+
 		vma->vm_flags &= ~VM_MAYWRITE;
 	}
 
-	anon = mmap_singleton(to_i915(dev));
-	if (IS_ERR(anon)) {
-		i915_gem_object_put(obj);
+	anon = mmap_singleton(to_i915(obj->base.dev));
+	if (IS_ERR(anon))
 		return PTR_ERR(anon);
-	}
 
 	vma->vm_flags |= VM_PFNMAP | VM_DONTEXPAND | VM_DONTDUMP;
 	vma->vm_private_data = mmo;
@@ -962,6 +928,50 @@ int i915_gem_mmap(struct file *filp, struct vm_area_struct *vma)
 	return 0;
 }
 
+/*
+ * This overcomes the limitation in drm_gem_mmap's assignment of a
+ * drm_gem_object as the vma->vm_private_data. Since we need to
+ * be able to resolve multiple mmap offsets which could be tied
+ * to a single gem object.
+ */
+int i915_gem_mmap(struct file *filp, struct vm_area_struct *vma)
+{
+	struct drm_vma_offset_node *node;
+	struct drm_file *priv = filp->private_data;
+	struct drm_device *dev = priv->minor->dev;
+	struct drm_i915_gem_object *obj = NULL;
+	struct i915_mmap_offset *mmo = NULL;
+	int err;
+
+	if (drm_dev_is_unplugged(dev))
+		return -ENODEV;
+
+	rcu_read_lock();
+	drm_vma_offset_lock_lookup(dev->vma_offset_manager);
+	node = drm_vma_offset_exact_lookup_locked(dev->vma_offset_manager,
+						  vma->vm_pgoff,
+						  vma_pages(vma));
+	if (node && drm_vma_node_is_allowed(node, priv)) {
+		/*
+		 * Skip 0-refcnted objects as it is in the process of being
+		 * destroyed and will be invalid when the vma manager lock
+		 * is released.
+		 */
+		mmo = container_of(node, struct i915_mmap_offset, vma_node);
+		obj = i915_gem_object_get_rcu(mmo->obj);
+	}
+	drm_vma_offset_unlock_lookup(dev->vma_offset_manager);
+	rcu_read_unlock();
+	if (!obj)
+		return node ? -EACCES : -EINVAL;
+
+	err = i915_gem_update_vma_info(obj, mmo, vma);
+	if (err)
+		i915_gem_object_put(obj);
+
+	return err;
+}
+
 #if IS_ENABLED(CONFIG_DRM_I915_SELFTEST)
 #include "selftests/i915_gem_mman.c"
 #endif
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_mman.h b/drivers/gpu/drm/i915/gem/i915_gem_mman.h
index 7c5ccdf59359..dfd19da0b3e7 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_mman.h
+++ b/drivers/gpu/drm/i915/gem/i915_gem_mman.h
@@ -10,6 +10,8 @@
 #include <linux/mm_types.h>
 #include <linux/types.h>
 
+#include "gem/i915_gem_object_types.h"
+
 struct drm_device;
 struct drm_file;
 struct drm_i915_gem_object;
@@ -31,4 +33,11 @@ void i915_gem_object_release_mmap_gtt(struct drm_i915_gem_object *obj);
 
 void i915_gem_object_release_mmap_offset(struct drm_i915_gem_object *obj);
 
+struct i915_mmap_offset *
+i915_gem_mmap_offset_attach(struct drm_i915_gem_object *obj,
+			    enum i915_mmap_type mmap_type,
+			    struct drm_file *file);
+int i915_gem_update_vma_info(struct drm_i915_gem_object *obj,
+			     struct i915_mmap_offset *mmo,
+			     struct vm_area_struct *vma);
 #endif
diff --git a/drivers/gpu/drm/i915/gem/selftests/i915_gem_mman.c b/drivers/gpu/drm/i915/gem/selftests/i915_gem_mman.c
index 85fff8bed08c..5701549b5d13 100644
--- a/drivers/gpu/drm/i915/gem/selftests/i915_gem_mman.c
+++ b/drivers/gpu/drm/i915/gem/selftests/i915_gem_mman.c
@@ -583,7 +583,7 @@ static bool assert_mmap_offset(struct drm_i915_private *i915,
 	if (IS_ERR(obj))
 		return false;
 
-	mmo = mmap_offset_attach(obj, I915_MMAP_OFFSET_GTT, NULL);
+	mmo = i915_gem_mmap_offset_attach(obj, I915_MMAP_OFFSET_GTT, NULL);
 	i915_gem_object_put(obj);
 
 	return PTR_ERR_OR_ZERO(mmo) == expected;
@@ -686,7 +686,7 @@ static int igt_mmap_offset_exhaustion(void *arg)
 		goto out;
 	}
 
-	mmo = mmap_offset_attach(obj, I915_MMAP_OFFSET_GTT, NULL);
+	mmo = i915_gem_mmap_offset_attach(obj, I915_MMAP_OFFSET_GTT, NULL);
 	if (IS_ERR(mmo)) {
 		pr_err("Unable to insert object into reclaimed hole\n");
 		err = PTR_ERR(mmo);
@@ -860,7 +860,7 @@ static int __igt_mmap(struct drm_i915_private *i915,
 	if (err)
 		return err;
 
-	mmo = mmap_offset_attach(obj, type, NULL);
+	mmo = i915_gem_mmap_offset_attach(obj, type, NULL);
 	if (IS_ERR(mmo))
 		return PTR_ERR(mmo);
 
@@ -996,7 +996,7 @@ static int __igt_mmap_access(struct drm_i915_private *i915,
 	if (!can_mmap(obj, type) || !can_access(obj))
 		return 0;
 
-	mmo = mmap_offset_attach(obj, type, NULL);
+	mmo = i915_gem_mmap_offset_attach(obj, type, NULL);
 	if (IS_ERR(mmo))
 		return PTR_ERR(mmo);
 
@@ -1109,7 +1109,7 @@ static int __igt_mmap_gpu(struct drm_i915_private *i915,
 	if (err)
 		return err;
 
-	mmo = mmap_offset_attach(obj, type, NULL);
+	mmo = i915_gem_mmap_offset_attach(obj, type, NULL);
 	if (IS_ERR(mmo))
 		return PTR_ERR(mmo);
 
@@ -1285,7 +1285,7 @@ static int __igt_mmap_revoke(struct drm_i915_private *i915,
 	if (!can_mmap(obj, type))
 		return 0;
 
-	mmo = mmap_offset_attach(obj, type, NULL);
+	mmo = i915_gem_mmap_offset_attach(obj, type, NULL);
 	if (IS_ERR(mmo))
 		return PTR_ERR(mmo);
 
-- 
2.26.2

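To illustrate what this enables, a rough userspace sequence (illustrative only; names and error handling are my own): export an LMEM-backed BO as a dma-buf and mmap the resulting fd directly, which before this patch failed with -ENODEV for any object without a shmem backing store:

struct drm_prime_handle prime = {
	.handle = handle,	/* handle to an LMEM-backed BO */
	.flags = DRM_CLOEXEC | DRM_RDWR,
};
if (ioctl(fd, DRM_IOCTL_PRIME_HANDLE_TO_FD, &prime))
	return -1;

/* Serviced by the new WC fault path rather than the shmem file mmap. */
void *ptr = mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_SHARED,
		 prime.fd, 0);
if (ptr == MAP_FAILED)
	return -1;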

* [RFC PATCH 096/162] drm/i915: setup the LMEM region
  2020-11-27 12:04 [RFC PATCH 000/162] DG1 + LMEM enabling Matthew Auld
                   ` (94 preceding siblings ...)
  2020-11-27 12:06 ` [RFC PATCH 095/162] drm/i915/dg1: Introduce dmabuf mmap to LMEM Matthew Auld
@ 2020-11-27 12:06 ` Matthew Auld
  2020-11-30 10:14   ` Jani Nikula
  2020-11-27 12:06 ` [RFC PATCH 097/162] drm/i915: Distinction of memory regions Matthew Auld
                   ` (65 subsequent siblings)
  161 siblings, 1 reply; 208+ messages in thread
From: Matthew Auld @ 2020-11-27 12:06 UTC (permalink / raw)
  To: intel-gfx; +Cc: Abdiel Janulgue, Lucas De Marchi, dri-devel, Rodrigo Vivi

Hook up the LMEM region. Addresses will start from zero, and for CPU
access we get LMEM_BAR (PCI BAR 2), which is just a 1:1 mapping of said
region.

Based on a patch from Michel Thierry.

Cc: Abdiel Janulgue <abdiel.janulgue@linux.intel.com>
Cc: Lucas De Marchi <lucas.demarchi@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
---
 drivers/gpu/drm/i915/i915_reg.h            |  3 ++
 drivers/gpu/drm/i915/intel_memory_region.c | 11 ++++++-
 drivers/gpu/drm/i915/intel_region_lmem.c   | 38 ++++++++++++++++++++++
 drivers/gpu/drm/i915/intel_region_lmem.h   |  2 ++
 4 files changed, 53 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
index bf9ba1e361bb..1af1966ac461 100644
--- a/drivers/gpu/drm/i915/i915_reg.h
+++ b/drivers/gpu/drm/i915/i915_reg.h
@@ -12063,6 +12063,9 @@ enum skl_power_gate {
 
 #define GEN12_GLOBAL_MOCS(i)	_MMIO(0x4000 + (i) * 4) /* Global MOCS regs */
 
+#define GEN12_LMEM_CFG_ADDR		_MMIO(0xcf58)
+#define   LMEM_ENABLE			(1 << 31)
+
 /* gamt regs */
 #define GEN8_L3_LRA_1_GPGPU _MMIO(0x4dd4)
 #define   GEN8_L3_LRA_1_GPGPU_DEFAULT_VALUE_BDW  0x67F1427F /* max/min for LRA1/2 */
diff --git a/drivers/gpu/drm/i915/intel_memory_region.c b/drivers/gpu/drm/i915/intel_memory_region.c
index 67240bddf2ca..1f26bc06ec20 100644
--- a/drivers/gpu/drm/i915/intel_memory_region.c
+++ b/drivers/gpu/drm/i915/intel_memory_region.c
@@ -303,7 +303,16 @@ int intel_memory_regions_hw_probe(struct drm_i915_private *i915)
 			mem = i915_gem_stolen_setup(i915);
 			break;
 		case INTEL_MEMORY_LOCAL:
-			mem = intel_setup_fake_lmem(i915);
+#if IS_ENABLED(CONFIG_DRM_I915_SELFTEST)
+			if (IS_ENABLED(CONFIG_DRM_I915_UNSTABLE_FAKE_LMEM)) {
+				if (INTEL_GEN(i915) >= 9 && i915_selftest.live < 0 &&
+				    i915->params.fake_lmem_start)
+					mem = intel_setup_fake_lmem(i915);
+			}
+#endif
+
+			if (IS_ERR(mem))
+				mem = i915_gem_setup_lmem(i915);
 			break;
 		}
 
diff --git a/drivers/gpu/drm/i915/intel_region_lmem.c b/drivers/gpu/drm/i915/intel_region_lmem.c
index 40d8f1a95df6..e98582c76de1 100644
--- a/drivers/gpu/drm/i915/intel_region_lmem.c
+++ b/drivers/gpu/drm/i915/intel_region_lmem.c
@@ -136,3 +136,41 @@ intel_setup_fake_lmem(struct drm_i915_private *i915)
 
 	return mem;
 }
+
+static struct intel_memory_region *
+setup_lmem(struct drm_i915_private *dev_priv)
+{
+	struct pci_dev *pdev = dev_priv->drm.pdev;
+	struct intel_memory_region *mem;
+	resource_size_t io_start;
+	resource_size_t size;
+
+	/* Enables Local Memory functionality in GAM */
+	I915_WRITE(GEN12_LMEM_CFG_ADDR, I915_READ(GEN12_LMEM_CFG_ADDR) | LMEM_ENABLE);
+
+	io_start = pci_resource_start(pdev, 2);
+	size = pci_resource_len(pdev, 2);
+
+	mem = intel_memory_region_create(dev_priv,
+					 0,
+					 size,
+					 I915_GTT_PAGE_SIZE_4K,
+					 io_start,
+					 &intel_region_lmem_ops);
+	if (!IS_ERR(mem)) {
+		DRM_INFO("Intel graphics LMEM: %pR\n", &mem->region);
+		DRM_INFO("Intel graphics LMEM IO start: %llx\n",
+			 (u64)mem->io_start);
+		DRM_INFO("Intel graphics LMEM size: %llx\n",
+			 (u64)size);
+	}
+
+	return mem;
+}
+
+struct intel_memory_region *
+i915_gem_setup_lmem(struct drm_i915_private *i915)
+{
+	return setup_lmem(i915);
+}
+
diff --git a/drivers/gpu/drm/i915/intel_region_lmem.h b/drivers/gpu/drm/i915/intel_region_lmem.h
index 213def7c7b8a..054e729035c1 100644
--- a/drivers/gpu/drm/i915/intel_region_lmem.h
+++ b/drivers/gpu/drm/i915/intel_region_lmem.h
@@ -10,6 +10,8 @@ struct drm_i915_private;
 
 extern const struct intel_memory_region_ops intel_region_lmem_ops;
 
+struct intel_memory_region *i915_gem_setup_lmem(struct drm_i915_private *i915);
+
 struct intel_memory_region *
 intel_setup_fake_lmem(struct drm_i915_private *i915);
 
-- 
2.26.2

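For illustration, the "1:1 mapping" means CPU access reduces to simple offset arithmetic on the BAR; a minimal sketch (not part of the patch, and only my reading of the setup above) of resolving the CPU-side address for an LMEM offset:

static resource_size_t lmem_io_address(struct intel_memory_region *mem,
				       resource_size_t offset)
{
	/* Region addresses start from zero; io_start is LMEM_BAR (BAR 2). */
	return mem->io_start + offset;
}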

* [RFC PATCH 097/162] drm/i915: Distinction of memory regions
  2020-11-27 12:04 [RFC PATCH 000/162] DG1 + LMEM enabling Matthew Auld
                   ` (95 preceding siblings ...)
  2020-11-27 12:06 ` [RFC PATCH 096/162] drm/i915: setup the LMEM region Matthew Auld
@ 2020-11-27 12:06 ` Matthew Auld
  2020-11-27 13:30   ` [Intel-gfx] " Chris Wilson
  2020-11-27 12:06 ` [RFC PATCH 098/162] drm/i915/gtt: map the PD up front Matthew Auld
                   ` (64 subsequent siblings)
  161 siblings, 1 reply; 208+ messages in thread
From: Matthew Auld @ 2020-11-27 12:06 UTC (permalink / raw)
  To: intel-gfx
  Cc: Adam Miszczak, Tvrtko Ursulin, dri-devel, Zbigniew Kempczyński

From: Zbigniew Kempczyński <zbigniew.kempczynski@intel.com>

IGTs should be able to choose a testing strategy depending on the memory
regions and their sizes. Add the region instance number to the region
name to make this easier and more descriptive.

Cc: Matthew Auld <matthew.auld@intel.com>
Cc: Ramalingam C <ramalingam.c@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Adam Miszczak <adam.miszczak@intel.com>
Signed-off-by: Zbigniew Kempczyński <zbigniew.kempczynski@intel.com>
---
 drivers/gpu/drm/i915/intel_memory_region.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/drivers/gpu/drm/i915/intel_memory_region.c b/drivers/gpu/drm/i915/intel_memory_region.c
index 1f26bc06ec20..cea44ddebe46 100644
--- a/drivers/gpu/drm/i915/intel_memory_region.c
+++ b/drivers/gpu/drm/i915/intel_memory_region.c
@@ -329,6 +329,10 @@ int intel_memory_regions_hw_probe(struct drm_i915_private *i915)
 		mem->instance = instance;
 		mem->gt = &i915->gt;
 
+		if (HAS_LMEM(mem->i915) && type != INTEL_MEMORY_SYSTEM)
+			intel_memory_region_set_name(mem, "%s%u",
+						     mem->name, mem->instance);
+
 		i915->mm.regions[i] = mem;
 	}
 
-- 
2.26.2


* [RFC PATCH 098/162] drm/i915/gtt: map the PD up front
  2020-11-27 12:04 [RFC PATCH 000/162] DG1 + LMEM enabling Matthew Auld
                   ` (96 preceding siblings ...)
  2020-11-27 12:06 ` [RFC PATCH 097/162] drm/i915: Distinction of memory regions Matthew Auld
@ 2020-11-27 12:06 ` Matthew Auld
  2020-11-27 13:31   ` Chris Wilson
  2020-11-27 12:06 ` [RFC PATCH 099/162] drm/i915/gtt/dgfx: place the PD in LMEM Matthew Auld
                   ` (63 subsequent siblings)
  161 siblings, 1 reply; 208+ messages in thread
From: Matthew Auld @ 2020-11-27 12:06 UTC (permalink / raw)
  To: intel-gfx; +Cc: dri-devel, Chris Wilson

We need to generalise our accessor for the page directories and tables away
from the simple kmap_atomic so that we can support local memory, and this
setup must be done on acquisition of the backing storage, prior to entering
fence execution contexts. Here we replace the kmap with the object mapping
code, which for a simple single-page shmemfs object will return a plain
kmap that is then kept for the lifetime of the page directory.
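
A condensed before/after sketch of the access pattern, taken from the
gen6 clear_range hunk below:

  /* before: transient mapping around every access */
  vaddr = kmap_atomic_px(pt);
  memset32(vaddr + pte, scratch_pte, count);
  kunmap_atomic(vaddr);

  /* after: mapping set up once via map_pt_dma(), kept for the PD lifetime */
  vaddr = px_vaddr(pt);
  memset32(vaddr + pte, scratch_pte, count);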

Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 .../drm/i915/gem/selftests/i915_gem_context.c | 11 +----
 drivers/gpu/drm/i915/gt/gen6_ppgtt.c          | 11 ++---
 drivers/gpu/drm/i915/gt/gen8_ppgtt.c          | 26 ++++------
 drivers/gpu/drm/i915/gt/intel_ggtt.c          |  2 +-
 drivers/gpu/drm/i915/gt/intel_gtt.c           | 48 +++++++++----------
 drivers/gpu/drm/i915/gt/intel_gtt.h           | 11 +++--
 drivers/gpu/drm/i915/gt/intel_ppgtt.c         |  7 ++-
 drivers/gpu/drm/i915/i915_vma.c               |  3 +-
 drivers/gpu/drm/i915/selftests/i915_gem_gtt.c | 10 ++--
 drivers/gpu/drm/i915/selftests/i915_perf.c    |  3 +-
 10 files changed, 54 insertions(+), 78 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/selftests/i915_gem_context.c b/drivers/gpu/drm/i915/gem/selftests/i915_gem_context.c
index 5fef592390cb..ce70d0a3afb2 100644
--- a/drivers/gpu/drm/i915/gem/selftests/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/gem/selftests/i915_gem_context.c
@@ -1740,7 +1740,6 @@ static int read_from_scratch(struct i915_gem_context *ctx,
 static int check_scratch_page(struct i915_gem_context *ctx, u32 *out)
 {
 	struct i915_address_space *vm;
-	struct page *page;
 	u32 *vaddr;
 	int err = 0;
 
@@ -1748,24 +1747,18 @@ static int check_scratch_page(struct i915_gem_context *ctx, u32 *out)
 	if (!vm)
 		return -ENODEV;
 
-	page = __px_page(vm->scratch[0]);
-	if (!page) {
+	if (!vm->scratch[0]) {
 		pr_err("No scratch page!\n");
 		return -EINVAL;
 	}
 
-	vaddr = kmap(page);
-	if (!vaddr) {
-		pr_err("No (mappable) scratch page!\n");
-		return -EINVAL;
-	}
+	vaddr = __px_vaddr(vm->scratch[0]);
 
 	memcpy(out, vaddr, sizeof(*out));
 	if (memchr_inv(vaddr, *out, PAGE_SIZE)) {
 		pr_err("Inconsistent initial state of scratch page!\n");
 		err = -EINVAL;
 	}
-	kunmap(page);
 
 	return err;
 }
diff --git a/drivers/gpu/drm/i915/gt/gen6_ppgtt.c b/drivers/gpu/drm/i915/gt/gen6_ppgtt.c
index 680bd9442eb0..78ad7d8a8bcc 100644
--- a/drivers/gpu/drm/i915/gt/gen6_ppgtt.c
+++ b/drivers/gpu/drm/i915/gt/gen6_ppgtt.c
@@ -105,9 +105,8 @@ static void gen6_ppgtt_clear_range(struct i915_address_space *vm,
 		 * entries back to scratch.
 		 */
 
-		vaddr = kmap_atomic_px(pt);
+		vaddr = px_vaddr(pt);
 		memset32(vaddr + pte, scratch_pte, count);
-		kunmap_atomic(vaddr);
 
 		pte = 0;
 	}
@@ -129,7 +128,7 @@ static void gen6_ppgtt_insert_entries(struct i915_address_space *vm,
 
 	GEM_BUG_ON(!pd->entry[act_pt]);
 
-	vaddr = kmap_atomic_px(i915_pt_entry(pd, act_pt));
+	vaddr = px_vaddr(i915_pt_entry(pd, act_pt));
 	do {
 		GEM_BUG_ON(sg_dma_len(iter.sg) < I915_GTT_PAGE_SIZE);
 		vaddr[act_pte] = pte_encode | GEN6_PTE_ADDR_ENCODE(iter.dma);
@@ -145,12 +144,10 @@ static void gen6_ppgtt_insert_entries(struct i915_address_space *vm,
 		}
 
 		if (++act_pte == GEN6_PTES) {
-			kunmap_atomic(vaddr);
-			vaddr = kmap_atomic_px(i915_pt_entry(pd, ++act_pt));
+			vaddr = px_vaddr(i915_pt_entry(pd, ++act_pt));
 			act_pte = 0;
 		}
 	} while (1);
-	kunmap_atomic(vaddr);
 
 	vma->page_sizes.gtt = I915_GTT_PAGE_SIZE;
 }
@@ -244,7 +241,7 @@ static int gen6_ppgtt_init_scratch(struct gen6_ppgtt *ppgtt)
 		goto err_scratch0;
 	}
 
-	ret = pin_pt_dma(vm, vm->scratch[1]);
+	ret = map_pt_dma(vm, vm->scratch[1]);
 	if (ret)
 		goto err_scratch1;
 
diff --git a/drivers/gpu/drm/i915/gt/gen8_ppgtt.c b/drivers/gpu/drm/i915/gt/gen8_ppgtt.c
index a37c968ef8f7..a3093dd4b86d 100644
--- a/drivers/gpu/drm/i915/gt/gen8_ppgtt.c
+++ b/drivers/gpu/drm/i915/gt/gen8_ppgtt.c
@@ -237,11 +237,10 @@ static u64 __gen8_ppgtt_clear(struct i915_address_space * const vm,
 			    atomic_read(&pt->used));
 			GEM_BUG_ON(!count || count >= atomic_read(&pt->used));
 
-			vaddr = kmap_atomic_px(pt);
+			vaddr = px_vaddr(pt);
 			memset64(vaddr + gen8_pd_index(start, 0),
 				 vm->scratch[0]->encode,
 				 count);
-			kunmap_atomic(vaddr);
 
 			atomic_sub(count, &pt->used);
 			start += count;
@@ -370,7 +369,7 @@ gen8_ppgtt_insert_pte(struct i915_ppgtt *ppgtt,
 	gen8_pte_t *vaddr;
 
 	pd = i915_pd_entry(pdp, gen8_pd_index(idx, 2));
-	vaddr = kmap_atomic_px(i915_pt_entry(pd, gen8_pd_index(idx, 1)));
+	vaddr = px_vaddr(i915_pt_entry(pd, gen8_pd_index(idx, 1)));
 	do {
 		GEM_BUG_ON(sg_dma_len(iter->sg) < I915_GTT_PAGE_SIZE);
 		vaddr[gen8_pd_index(idx, 0)] = pte_encode | iter->dma;
@@ -397,12 +396,10 @@ gen8_ppgtt_insert_pte(struct i915_ppgtt *ppgtt,
 			}
 
 			clflush_cache_range(vaddr, PAGE_SIZE);
-			kunmap_atomic(vaddr);
-			vaddr = kmap_atomic_px(i915_pt_entry(pd, gen8_pd_index(idx, 1)));
+			vaddr = px_vaddr(i915_pt_entry(pd, gen8_pd_index(idx, 1)));
 		}
 	} while (1);
 	clflush_cache_range(vaddr, PAGE_SIZE);
-	kunmap_atomic(vaddr);
 
 	return idx;
 }
@@ -437,7 +434,7 @@ static void gen8_ppgtt_insert_huge(struct i915_vma *vma,
 			encode |= GEN8_PDE_PS_2M;
 			page_size = I915_GTT_PAGE_SIZE_2M;
 
-			vaddr = kmap_atomic_px(pd);
+			vaddr = px_vaddr(pd);
 		} else {
 			struct i915_page_table *pt =
 				i915_pt_entry(pd, __gen8_pte_index(start, 1));
@@ -452,7 +449,7 @@ static void gen8_ppgtt_insert_huge(struct i915_vma *vma,
 			     rem >= (I915_PDES - index) * I915_GTT_PAGE_SIZE))
 				maybe_64K = __gen8_pte_index(start, 1);
 
-			vaddr = kmap_atomic_px(pt);
+			vaddr = px_vaddr(pt);
 		}
 
 		do {
@@ -486,7 +483,6 @@ static void gen8_ppgtt_insert_huge(struct i915_vma *vma,
 		} while (rem >= page_size && index < I915_PDES);
 
 		clflush_cache_range(vaddr, PAGE_SIZE);
-		kunmap_atomic(vaddr);
 
 		/*
 		 * Is it safe to mark the 2M block as 64K? -- Either we have
@@ -500,9 +496,8 @@ static void gen8_ppgtt_insert_huge(struct i915_vma *vma,
 		      !iter->sg && IS_ALIGNED(vma->node.start +
 					      vma->node.size,
 					      I915_GTT_PAGE_SIZE_2M)))) {
-			vaddr = kmap_atomic_px(pd);
+			vaddr = px_vaddr(pd);
 			vaddr[maybe_64K] |= GEN8_PDE_IPS_64K;
-			kunmap_atomic(vaddr);
 			page_size = I915_GTT_PAGE_SIZE_64K;
 
 			/*
@@ -518,12 +513,11 @@ static void gen8_ppgtt_insert_huge(struct i915_vma *vma,
 				u16 i;
 
 				encode = vma->vm->scratch[0]->encode;
-				vaddr = kmap_atomic_px(i915_pt_entry(pd, maybe_64K));
+				vaddr = px_vaddr(i915_pt_entry(pd, maybe_64K));
 
 				for (i = 1; i < index; i += 16)
 					memset64(vaddr + i, encode, 15);
 
-				kunmap_atomic(vaddr);
 			}
 		}
 
@@ -592,7 +586,7 @@ static int gen8_init_scratch(struct i915_address_space *vm)
 		if (IS_ERR(obj))
 			goto free_scratch;
 
-		ret = pin_pt_dma(vm, obj);
+		ret = map_pt_dma(vm, obj);
 		if (ret) {
 			i915_gem_object_put(obj);
 			goto free_scratch;
@@ -629,7 +623,7 @@ static int gen8_preallocate_top_level_pdp(struct i915_ppgtt *ppgtt)
 		if (IS_ERR(pde))
 			return PTR_ERR(pde);
 
-		err = pin_pt_dma(vm, pde->pt.base);
+		err = map_pt_dma(vm, pde->pt.base);
 		if (err) {
 			i915_gem_object_put(pde->pt.base);
 			free_pd(vm, pde);
@@ -665,7 +659,7 @@ gen8_alloc_top_pd(struct i915_address_space *vm)
 		goto err_pd;
 	}
 
-	err = pin_pt_dma(vm, pd->pt.base);
+	err = map_pt_dma(vm, pd->pt.base);
 	if (err)
 		goto err_pd;
 
diff --git a/drivers/gpu/drm/i915/gt/intel_ggtt.c b/drivers/gpu/drm/i915/gt/intel_ggtt.c
index 17ecaef1834d..4560e03067a7 100644
--- a/drivers/gpu/drm/i915/gt/intel_ggtt.c
+++ b/drivers/gpu/drm/i915/gt/intel_ggtt.c
@@ -616,7 +616,7 @@ static int init_aliasing_ppgtt(struct i915_ggtt *ggtt)
 		goto err_ppgtt;
 
 	i915_gem_object_lock(ppgtt->vm.scratch[0], NULL);
-	err = i915_vm_pin_pt_stash(&ppgtt->vm, &stash);
+	err = i915_vm_map_pt_stash(&ppgtt->vm, &stash);
 	i915_gem_object_unlock(ppgtt->vm.scratch[0]);
 	if (err)
 		goto err_stash;
diff --git a/drivers/gpu/drm/i915/gt/intel_gtt.c b/drivers/gpu/drm/i915/gt/intel_gtt.c
index 070d538cdc56..f3a263f09368 100644
--- a/drivers/gpu/drm/i915/gt/intel_gtt.c
+++ b/drivers/gpu/drm/i915/gt/intel_gtt.c
@@ -25,27 +25,25 @@ struct drm_i915_gem_object *alloc_pt_dma(struct i915_address_space *vm, int sz)
 	return obj;
 }
 
-int pin_pt_dma(struct i915_address_space *vm, struct drm_i915_gem_object *obj)
+int map_pt_dma(struct i915_address_space *vm, struct drm_i915_gem_object *obj)
 {
-	int err;
+	void *vaddr;
 
-	i915_gem_object_lock(obj, NULL);
-	err = i915_gem_object_pin_pages(obj);
-	i915_gem_object_unlock(obj);
-	if (err)
-		return err;
+	vaddr = i915_gem_object_pin_map_unlocked(obj, I915_MAP_WB);
+	if (IS_ERR(vaddr))
+		return PTR_ERR(vaddr);
 
 	i915_gem_object_make_unshrinkable(obj);
 	return 0;
 }
 
-int pin_pt_dma_locked(struct i915_address_space *vm, struct drm_i915_gem_object *obj)
+int map_pt_dma_locked(struct i915_address_space *vm, struct drm_i915_gem_object *obj)
 {
-	int err;
+	void *vaddr;
 
-	err = i915_gem_object_pin_pages(obj);
-	if (err)
-		return err;
+	vaddr = i915_gem_object_pin_map(obj, I915_MAP_WB);
+	if (IS_ERR(vaddr))
+		return PTR_ERR(vaddr);
 
 	i915_gem_object_make_unshrinkable(obj);
 	return 0;
@@ -155,6 +153,14 @@ void clear_pages(struct i915_vma *vma)
 	memset(&vma->page_sizes, 0, sizeof(vma->page_sizes));
 }
 
+void *__px_vaddr(struct drm_i915_gem_object *p)
+{
+	enum i915_map_type type;
+
+	GEM_BUG_ON(!i915_gem_object_has_pages(p));
+	return page_unpack_bits(p->mm.mapping, &type);
+}
+
 dma_addr_t __px_dma(struct drm_i915_gem_object *p)
 {
 	GEM_BUG_ON(!i915_gem_object_has_pages(p));
@@ -170,32 +176,22 @@ struct page *__px_page(struct drm_i915_gem_object *p)
 void
 fill_page_dma(struct drm_i915_gem_object *p, const u64 val, unsigned int count)
 {
-	struct page *page = __px_page(p);
-	void *vaddr;
+	void *vaddr = __px_vaddr(p);
 
-	vaddr = kmap(page);
 	memset64(vaddr, val, count);
 	clflush_cache_range(vaddr, PAGE_SIZE);
-	kunmap(page);
 }
 
 static void poison_scratch_page(struct drm_i915_gem_object *scratch)
 {
-	struct sgt_iter sgt;
-	struct page *page;
+	void *vaddr = __px_vaddr(scratch);
 	u8 val;
 
 	val = 0;
 	if (IS_ENABLED(CONFIG_DRM_I915_DEBUG_GEM))
 		val = POISON_FREE;
 
-	for_each_sgt_page(page, sgt, scratch->mm.pages) {
-		void *vaddr;
-
-		vaddr = kmap(page);
-		memset(vaddr, val, PAGE_SIZE);
-		kunmap(page);
-	}
+	memset(vaddr, val, scratch->base.size);
 }
 
 int setup_scratch_page(struct i915_address_space *vm)
@@ -225,7 +221,7 @@ int setup_scratch_page(struct i915_address_space *vm)
 		if (IS_ERR(obj))
 			goto skip;
 
-		if (pin_pt_dma(vm, obj))
+		if (map_pt_dma(vm, obj))
 			goto skip_obj;
 
 		/* We need a single contiguous page for our scratch */
diff --git a/drivers/gpu/drm/i915/gt/intel_gtt.h b/drivers/gpu/drm/i915/gt/intel_gtt.h
index 16063b2f0119..5b8ea9c8c654 100644
--- a/drivers/gpu/drm/i915/gt/intel_gtt.h
+++ b/drivers/gpu/drm/i915/gt/intel_gtt.h
@@ -176,6 +176,9 @@ struct page *__px_page(struct drm_i915_gem_object *p);
 dma_addr_t __px_dma(struct drm_i915_gem_object *p);
 #define px_dma(px) (__px_dma(px_base(px)))
 
+void *__px_vaddr(struct drm_i915_gem_object *p);
+#define px_vaddr(px) (__px_vaddr(px_base(px)))
+
 #define px_pt(px) \
 	__px_choose_expr(px, struct i915_page_table *, __x, \
 	__px_choose_expr(px, struct i915_page_directory *, &__x->pt, \
@@ -506,8 +509,6 @@ struct i915_ppgtt *i915_ppgtt_create(struct intel_gt *gt);
 void i915_ggtt_suspend(struct i915_ggtt *gtt);
 void i915_ggtt_resume(struct i915_ggtt *ggtt);
 
-#define kmap_atomic_px(px) kmap_atomic(__px_page(px_base(px)))
-
 void
 fill_page_dma(struct drm_i915_gem_object *p, const u64 val, unsigned int count);
 
@@ -525,8 +526,8 @@ struct i915_page_table *alloc_pt(struct i915_address_space *vm);
 struct i915_page_directory *alloc_pd(struct i915_address_space *vm);
 struct i915_page_directory *__alloc_pd(int npde);
 
-int pin_pt_dma(struct i915_address_space *vm, struct drm_i915_gem_object *obj);
-int pin_pt_dma_locked(struct i915_address_space *vm, struct drm_i915_gem_object *obj);
+int map_pt_dma(struct i915_address_space *vm, struct drm_i915_gem_object *obj);
+int map_pt_dma_locked(struct i915_address_space *vm, struct drm_i915_gem_object *obj);
 
 void free_px(struct i915_address_space *vm,
 	     struct i915_page_table *pt, int lvl);
@@ -573,7 +574,7 @@ void setup_private_pat(struct intel_uncore *uncore);
 int i915_vm_alloc_pt_stash(struct i915_address_space *vm,
 			   struct i915_vm_pt_stash *stash,
 			   u64 size);
-int i915_vm_pin_pt_stash(struct i915_address_space *vm,
+int i915_vm_map_pt_stash(struct i915_address_space *vm,
 			 struct i915_vm_pt_stash *stash);
 void i915_vm_free_pt_stash(struct i915_address_space *vm,
 			   struct i915_vm_pt_stash *stash);
diff --git a/drivers/gpu/drm/i915/gt/intel_ppgtt.c b/drivers/gpu/drm/i915/gt/intel_ppgtt.c
index f3ac47702aee..8e7b77cc4594 100644
--- a/drivers/gpu/drm/i915/gt/intel_ppgtt.c
+++ b/drivers/gpu/drm/i915/gt/intel_ppgtt.c
@@ -85,11 +85,10 @@ write_dma_entry(struct drm_i915_gem_object * const pdma,
 		const unsigned short idx,
 		const u64 encoded_entry)
 {
-	u64 * const vaddr = kmap_atomic(__px_page(pdma));
+	u64 * const vaddr = __px_vaddr(pdma);
 
 	vaddr[idx] = encoded_entry;
 	clflush_cache_range(&vaddr[idx], sizeof(u64));
-	kunmap_atomic(vaddr);
 }
 
 void
@@ -254,7 +253,7 @@ int i915_vm_alloc_pt_stash(struct i915_address_space *vm,
 	return 0;
 }
 
-int i915_vm_pin_pt_stash(struct i915_address_space *vm,
+int i915_vm_map_pt_stash(struct i915_address_space *vm,
 			 struct i915_vm_pt_stash *stash)
 {
 	struct i915_page_table *pt;
@@ -262,7 +261,7 @@ int i915_vm_pin_pt_stash(struct i915_address_space *vm,
 
 	for (n = 0; n < ARRAY_SIZE(stash->pt); n++) {
 		for (pt = stash->pt[n]; pt; pt = pt->stash) {
-			err = pin_pt_dma_locked(vm, pt->base);
+			err = map_pt_dma_locked(vm, pt->base);
 			if (err)
 				return err;
 		}
diff --git a/drivers/gpu/drm/i915/i915_vma.c b/drivers/gpu/drm/i915/i915_vma.c
index 7243ab593aec..82f60cc43a90 100644
--- a/drivers/gpu/drm/i915/i915_vma.c
+++ b/drivers/gpu/drm/i915/i915_vma.c
@@ -914,8 +914,7 @@ int i915_vma_pin_ww(struct i915_vma *vma, struct i915_gem_ww_ctx *ww,
 			if (err)
 				goto err_fence;
 
-			err = i915_vm_pin_pt_stash(vma->vm,
-						   &work->stash);
+			err = i915_vm_map_pt_stash(vma->vm, &work->stash);
 			if (err)
 				goto err_fence;
 		}
diff --git a/drivers/gpu/drm/i915/selftests/i915_gem_gtt.c b/drivers/gpu/drm/i915/selftests/i915_gem_gtt.c
index d07dd6780005..9653d7c259a5 100644
--- a/drivers/gpu/drm/i915/selftests/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/selftests/i915_gem_gtt.c
@@ -185,7 +185,7 @@ static int igt_ppgtt_alloc(void *arg)
 		if (err)
 			goto err_ppgtt_cleanup;
 
-		err = i915_vm_pin_pt_stash(&ppgtt->vm, &stash);
+		err = i915_vm_map_pt_stash(&ppgtt->vm, &stash);
 		if (err) {
 			i915_vm_free_pt_stash(&ppgtt->vm, &stash);
 			goto err_ppgtt_cleanup;
@@ -207,7 +207,7 @@ static int igt_ppgtt_alloc(void *arg)
 		if (err)
 			goto err_ppgtt_cleanup;
 
-		err = i915_vm_pin_pt_stash(&ppgtt->vm, &stash);
+		err = i915_vm_map_pt_stash(&ppgtt->vm, &stash);
 		if (err) {
 			i915_vm_free_pt_stash(&ppgtt->vm, &stash);
 			goto err_ppgtt_cleanup;
@@ -324,11 +324,10 @@ static int lowlevel_hole(struct i915_address_space *vm,
 							   BIT_ULL(size)))
 					goto alloc_vm_end;
 
-				err = i915_vm_pin_pt_stash(vm, &stash);
+				err = i915_vm_map_pt_stash(vm, &stash);
 				if (!err)
 					vm->allocate_va_range(vm, &stash,
 							      addr, BIT_ULL(size));
-
 				i915_vm_free_pt_stash(vm, &stash);
 alloc_vm_end:
 				if (err == -EDEADLK) {
@@ -1966,10 +1965,9 @@ static int igt_cs_tlb(void *arg)
 			if (err)
 				goto end_ww;
 
-			err = i915_vm_pin_pt_stash(vm, &stash);
+			err = i915_vm_map_pt_stash(vm, &stash);
 			if (!err)
 				vm->allocate_va_range(vm, &stash, offset, chunk_size);
-
 			i915_vm_free_pt_stash(vm, &stash);
 end_ww:
 			if (err == -EDEADLK) {
diff --git a/drivers/gpu/drm/i915/selftests/i915_perf.c b/drivers/gpu/drm/i915/selftests/i915_perf.c
index debbac660519..6a7abb3e2bb5 100644
--- a/drivers/gpu/drm/i915/selftests/i915_perf.c
+++ b/drivers/gpu/drm/i915/selftests/i915_perf.c
@@ -307,7 +307,7 @@ static int live_noa_gpr(void *arg)
 	}
 
 	/* Poison the ce->vm so we detect writes not to the GGTT gt->scratch */
-	scratch = kmap(__px_page(ce->vm->scratch[0]));
+	scratch = __px_vaddr(ce->vm->scratch[0]);
 	memset(scratch, POISON_FREE, PAGE_SIZE);
 
 	rq = intel_context_create_request(ce);
@@ -405,7 +405,6 @@ static int live_noa_gpr(void *arg)
 out_rq:
 	i915_request_put(rq);
 out_ce:
-	kunmap(__px_page(ce->vm->scratch[0]));
 	intel_context_put(ce);
 out:
 	stream_destroy(stream);
-- 
2.26.2

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply related	[flat|nested] 208+ messages in thread

* [RFC PATCH 099/162] drm/i915/gtt/dgfx: place the PD in LMEM
  2020-11-27 12:04 [RFC PATCH 000/162] DG1 + LMEM enabling Matthew Auld
                   ` (97 preceding siblings ...)
  2020-11-27 12:06 ` [RFC PATCH 098/162] drm/i915/gtt: map the PD up front Matthew Auld
@ 2020-11-27 12:06 ` Matthew Auld
  2020-11-27 12:06 ` [RFC PATCH 100/162] drm/i915/gtt: make flushing conditional Matthew Auld
                   ` (62 subsequent siblings)
  161 siblings, 0 replies; 208+ messages in thread
From: Matthew Auld @ 2020-11-27 12:06 UTC (permalink / raw)
  To: intel-gfx; +Cc: dri-devel

It's a requirement that on dgfx we place all the paging structures in
device local memory.
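
In short, condensed from the gen8_ppgtt_create() hunk below:

  if (IS_DGFX(gt->i915))
          ppgtt->vm.alloc_pt_dma = alloc_pt_lmem;
  else
          ppgtt->vm.alloc_pt_dma = alloc_pt_dma;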

Signed-off-by: Matthew Auld <matthew.auld@intel.com>
---
 drivers/gpu/drm/i915/gt/gen8_ppgtt.c |  5 ++++-
 drivers/gpu/drm/i915/gt/intel_gtt.c  | 27 +++++++++++++++++++++++++--
 drivers/gpu/drm/i915/gt/intel_gtt.h  |  1 +
 3 files changed, 30 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/gen8_ppgtt.c b/drivers/gpu/drm/i915/gt/gen8_ppgtt.c
index a3093dd4b86d..f67e0332ccbc 100644
--- a/drivers/gpu/drm/i915/gt/gen8_ppgtt.c
+++ b/drivers/gpu/drm/i915/gt/gen8_ppgtt.c
@@ -702,7 +702,10 @@ struct i915_ppgtt *gen8_ppgtt_create(struct intel_gt *gt)
 	 */
 	ppgtt->vm.has_read_only = !IS_GEN_RANGE(gt->i915, 11, 12);
 
-	ppgtt->vm.alloc_pt_dma = alloc_pt_dma;
+	if (IS_DGFX(gt->i915))
+		ppgtt->vm.alloc_pt_dma = alloc_pt_lmem;
+	else
+		ppgtt->vm.alloc_pt_dma = alloc_pt_dma;
 
 	err = gen8_init_scratch(&ppgtt->vm);
 	if (err)
diff --git a/drivers/gpu/drm/i915/gt/intel_gtt.c b/drivers/gpu/drm/i915/gt/intel_gtt.c
index f3a263f09368..2605bfd39a15 100644
--- a/drivers/gpu/drm/i915/gt/intel_gtt.c
+++ b/drivers/gpu/drm/i915/gt/intel_gtt.c
@@ -7,10 +7,23 @@
 
 #include <linux/fault-inject.h>
 
+#include "gem/i915_gem_lmem.h"
 #include "i915_trace.h"
 #include "intel_gt.h"
 #include "intel_gtt.h"
 
+struct drm_i915_gem_object *alloc_pt_lmem(struct i915_address_space *vm, int sz)
+{
+	struct drm_i915_gem_object *obj;
+
+	obj = i915_gem_object_create_lmem(vm->i915, sz, 0);
+
+	/* ensure all dma objects have the same reservation class */
+	if (!IS_ERR(obj))
+		obj->base.resv = &vm->resv;
+	return obj;
+}
+
 struct drm_i915_gem_object *alloc_pt_dma(struct i915_address_space *vm, int sz)
 {
 	struct drm_i915_gem_object *obj;
@@ -27,9 +40,14 @@ struct drm_i915_gem_object *alloc_pt_dma(struct i915_address_space *vm, int sz)
 
 int map_pt_dma(struct i915_address_space *vm, struct drm_i915_gem_object *obj)
 {
+	enum i915_map_type type;
 	void *vaddr;
 
-	vaddr = i915_gem_object_pin_map_unlocked(obj, I915_MAP_WB);
+	type = I915_MAP_WB;
+	if (i915_gem_object_is_lmem(obj))
+		type = I915_MAP_WC;
+
+	vaddr = i915_gem_object_pin_map_unlocked(obj, type);
 	if (IS_ERR(vaddr))
 		return PTR_ERR(vaddr);
 
@@ -39,9 +57,14 @@ int map_pt_dma(struct i915_address_space *vm, struct drm_i915_gem_object *obj)
 
 int map_pt_dma_locked(struct i915_address_space *vm, struct drm_i915_gem_object *obj)
 {
+	enum i915_map_type type;
 	void *vaddr;
 
-	vaddr = i915_gem_object_pin_map(obj, I915_MAP_WB);
+	type = I915_MAP_WB;
+	if (i915_gem_object_is_lmem(obj))
+		type = I915_MAP_WC;
+
+	vaddr = i915_gem_object_pin_map(obj, type);
 	if (IS_ERR(vaddr))
 		return PTR_ERR(vaddr);
 
diff --git a/drivers/gpu/drm/i915/gt/intel_gtt.h b/drivers/gpu/drm/i915/gt/intel_gtt.h
index 5b8ea9c8c654..bdbdfded60cc 100644
--- a/drivers/gpu/drm/i915/gt/intel_gtt.h
+++ b/drivers/gpu/drm/i915/gt/intel_gtt.h
@@ -522,6 +522,7 @@ int setup_scratch_page(struct i915_address_space *vm);
 void free_scratch(struct i915_address_space *vm);
 
 struct drm_i915_gem_object *alloc_pt_dma(struct i915_address_space *vm, int sz);
+struct drm_i915_gem_object *alloc_pt_lmem(struct i915_address_space *vm, int sz);
 struct i915_page_table *alloc_pt(struct i915_address_space *vm);
 struct i915_page_directory *alloc_pd(struct i915_address_space *vm);
 struct i915_page_directory *__alloc_pd(int npde);
-- 
2.26.2

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply related	[flat|nested] 208+ messages in thread

* [RFC PATCH 100/162] drm/i915/gtt: make flushing conditional
  2020-11-27 12:04 [RFC PATCH 000/162] DG1 + LMEM enabling Matthew Auld
                   ` (98 preceding siblings ...)
  2020-11-27 12:06 ` [RFC PATCH 099/162] drm/i915/gtt/dgfx: place the PD in LMEM Matthew Auld
@ 2020-11-27 12:06 ` Matthew Auld
  2020-11-27 12:06 ` [RFC PATCH 101/162] drm/i915/gtt/dg1: add PTE_LM plumbing for PPGTT Matthew Auld
                   ` (61 subsequent siblings)
  161 siblings, 0 replies; 208+ messages in thread
From: Matthew Auld @ 2020-11-27 12:06 UTC (permalink / raw)
  To: intel-gfx; +Cc: dri-devel

Now that PDs can also be mapped as WC, we can forgo all the flushing for
such mappings.
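
The resulting pattern, condensed from the write_dma_entry() hunk below:

  bool needs_flush;
  u64 * const vaddr = __px_vaddr(pdma, &needs_flush);

  vaddr[idx] = encoded_entry;
  if (needs_flush) /* only WB (kmap) mappings need the clflush */
          clflush_cache_range(&vaddr[idx], sizeof(u64));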

Signed-off-by: Matthew Auld <matthew.auld@intel.com>
---
 .../drm/i915/gem/selftests/i915_gem_context.c |  2 +-
 drivers/gpu/drm/i915/gt/gen6_ppgtt.c          |  6 ++---
 drivers/gpu/drm/i915/gt/gen8_ppgtt.c          | 26 ++++++++++++-------
 drivers/gpu/drm/i915/gt/intel_gtt.c           | 20 ++++++++++----
 drivers/gpu/drm/i915/gt/intel_gtt.h           |  4 +--
 drivers/gpu/drm/i915/gt/intel_ppgtt.c         |  6 +++--
 drivers/gpu/drm/i915/selftests/i915_perf.c    |  2 +-
 7 files changed, 42 insertions(+), 24 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/selftests/i915_gem_context.c b/drivers/gpu/drm/i915/gem/selftests/i915_gem_context.c
index ce70d0a3afb2..e52cc74db2b1 100644
--- a/drivers/gpu/drm/i915/gem/selftests/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/gem/selftests/i915_gem_context.c
@@ -1752,7 +1752,7 @@ static int check_scratch_page(struct i915_gem_context *ctx, u32 *out)
 		return -EINVAL;
 	}
 
-	vaddr = __px_vaddr(vm->scratch[0]);
+	vaddr = __px_vaddr(vm->scratch[0], NULL);
 
 	memcpy(out, vaddr, sizeof(*out));
 	if (memchr_inv(vaddr, *out, PAGE_SIZE)) {
diff --git a/drivers/gpu/drm/i915/gt/gen6_ppgtt.c b/drivers/gpu/drm/i915/gt/gen6_ppgtt.c
index 78ad7d8a8bcc..8d12e9334861 100644
--- a/drivers/gpu/drm/i915/gt/gen6_ppgtt.c
+++ b/drivers/gpu/drm/i915/gt/gen6_ppgtt.c
@@ -105,7 +105,7 @@ static void gen6_ppgtt_clear_range(struct i915_address_space *vm,
 		 * entries back to scratch.
 		 */
 
-		vaddr = px_vaddr(pt);
+		vaddr = px_vaddr(pt, NULL);
 		memset32(vaddr + pte, scratch_pte, count);
 
 		pte = 0;
@@ -128,7 +128,7 @@ static void gen6_ppgtt_insert_entries(struct i915_address_space *vm,
 
 	GEM_BUG_ON(!pd->entry[act_pt]);
 
-	vaddr = px_vaddr(i915_pt_entry(pd, act_pt));
+	vaddr = px_vaddr(i915_pt_entry(pd, act_pt), NULL);
 	do {
 		GEM_BUG_ON(sg_dma_len(iter.sg) < I915_GTT_PAGE_SIZE);
 		vaddr[act_pte] = pte_encode | GEN6_PTE_ADDR_ENCODE(iter.dma);
@@ -144,7 +144,7 @@ static void gen6_ppgtt_insert_entries(struct i915_address_space *vm,
 		}
 
 		if (++act_pte == GEN6_PTES) {
-			vaddr = px_vaddr(i915_pt_entry(pd, ++act_pt));
+			vaddr = px_vaddr(i915_pt_entry(pd, ++act_pt), NULL);
 			act_pte = 0;
 		}
 	} while (1);
diff --git a/drivers/gpu/drm/i915/gt/gen8_ppgtt.c b/drivers/gpu/drm/i915/gt/gen8_ppgtt.c
index f67e0332ccbc..e2f1dfc48d43 100644
--- a/drivers/gpu/drm/i915/gt/gen8_ppgtt.c
+++ b/drivers/gpu/drm/i915/gt/gen8_ppgtt.c
@@ -237,7 +237,7 @@ static u64 __gen8_ppgtt_clear(struct i915_address_space * const vm,
 			    atomic_read(&pt->used));
 			GEM_BUG_ON(!count || count >= atomic_read(&pt->used));
 
-			vaddr = px_vaddr(pt);
+			vaddr = px_vaddr(pt, NULL);
 			memset64(vaddr + gen8_pd_index(start, 0),
 				 vm->scratch[0]->encode,
 				 count);
@@ -367,9 +367,10 @@ gen8_ppgtt_insert_pte(struct i915_ppgtt *ppgtt,
 	struct i915_page_directory *pd;
 	const gen8_pte_t pte_encode = gen8_pte_encode(0, cache_level, flags);
 	gen8_pte_t *vaddr;
+	bool needs_flush;
 
 	pd = i915_pd_entry(pdp, gen8_pd_index(idx, 2));
-	vaddr = px_vaddr(i915_pt_entry(pd, gen8_pd_index(idx, 1)));
+	vaddr = px_vaddr(i915_pt_entry(pd, gen8_pd_index(idx, 1)), &needs_flush);
 	do {
 		GEM_BUG_ON(sg_dma_len(iter->sg) < I915_GTT_PAGE_SIZE);
 		vaddr[gen8_pd_index(idx, 0)] = pte_encode | iter->dma;
@@ -395,11 +396,14 @@ gen8_ppgtt_insert_pte(struct i915_ppgtt *ppgtt,
 				pd = pdp->entry[gen8_pd_index(idx, 2)];
 			}
 
-			clflush_cache_range(vaddr, PAGE_SIZE);
-			vaddr = px_vaddr(i915_pt_entry(pd, gen8_pd_index(idx, 1)));
+			if (needs_flush)
+				clflush_cache_range(vaddr, PAGE_SIZE);
+			vaddr = px_vaddr(i915_pt_entry(pd, gen8_pd_index(idx, 1)),
+					 &needs_flush);
 		}
 	} while (1);
-	clflush_cache_range(vaddr, PAGE_SIZE);
+	if (needs_flush)
+		clflush_cache_range(vaddr, PAGE_SIZE);
 
 	return idx;
 }
@@ -412,6 +416,7 @@ static void gen8_ppgtt_insert_huge(struct i915_vma *vma,
 	const gen8_pte_t pte_encode = gen8_pte_encode(0, cache_level, flags);
 	unsigned int rem = sg_dma_len(iter->sg);
 	u64 start = vma->node.start;
+	bool needs_flush;
 
 	GEM_BUG_ON(!i915_vm_is_4lvl(vma->vm));
 
@@ -434,7 +439,7 @@ static void gen8_ppgtt_insert_huge(struct i915_vma *vma,
 			encode |= GEN8_PDE_PS_2M;
 			page_size = I915_GTT_PAGE_SIZE_2M;
 
-			vaddr = px_vaddr(pd);
+			vaddr = px_vaddr(pd, &needs_flush);
 		} else {
 			struct i915_page_table *pt =
 				i915_pt_entry(pd, __gen8_pte_index(start, 1));
@@ -449,7 +454,7 @@ static void gen8_ppgtt_insert_huge(struct i915_vma *vma,
 			     rem >= (I915_PDES - index) * I915_GTT_PAGE_SIZE))
 				maybe_64K = __gen8_pte_index(start, 1);
 
-			vaddr = px_vaddr(pt);
+			vaddr = px_vaddr(pt, &needs_flush);
 		}
 
 		do {
@@ -482,7 +487,8 @@ static void gen8_ppgtt_insert_huge(struct i915_vma *vma,
 			}
 		} while (rem >= page_size && index < I915_PDES);
 
-		clflush_cache_range(vaddr, PAGE_SIZE);
+		if (needs_flush)
+			clflush_cache_range(vaddr, PAGE_SIZE);
 
 		/*
 		 * Is it safe to mark the 2M block as 64K? -- Either we have
@@ -496,7 +502,7 @@ static void gen8_ppgtt_insert_huge(struct i915_vma *vma,
 		      !iter->sg && IS_ALIGNED(vma->node.start +
 					      vma->node.size,
 					      I915_GTT_PAGE_SIZE_2M)))) {
-			vaddr = px_vaddr(pd);
+			vaddr = px_vaddr(pd, NULL);
 			vaddr[maybe_64K] |= GEN8_PDE_IPS_64K;
 			page_size = I915_GTT_PAGE_SIZE_64K;
 
@@ -513,7 +519,7 @@ static void gen8_ppgtt_insert_huge(struct i915_vma *vma,
 				u16 i;
 
 				encode = vma->vm->scratch[0]->encode;
-				vaddr = px_vaddr(i915_pt_entry(pd, maybe_64K));
+				vaddr = px_vaddr(i915_pt_entry(pd, maybe_64K), NULL);
 
 				for (i = 1; i < index; i += 16)
 					memset64(vaddr + i, encode, 15);
diff --git a/drivers/gpu/drm/i915/gt/intel_gtt.c b/drivers/gpu/drm/i915/gt/intel_gtt.c
index 2605bfd39a15..eee8338e330b 100644
--- a/drivers/gpu/drm/i915/gt/intel_gtt.c
+++ b/drivers/gpu/drm/i915/gt/intel_gtt.c
@@ -176,12 +176,19 @@ void clear_pages(struct i915_vma *vma)
 	memset(&vma->page_sizes, 0, sizeof(vma->page_sizes));
 }
 
-void *__px_vaddr(struct drm_i915_gem_object *p)
+void *__px_vaddr(struct drm_i915_gem_object *p, bool *needs_flush)
 {
 	enum i915_map_type type;
+	void *vaddr;
 
 	GEM_BUG_ON(!i915_gem_object_has_pages(p));
-	return page_unpack_bits(p->mm.mapping, &type);
+
+	vaddr = page_unpack_bits(p->mm.mapping, &type);
+
+	if (needs_flush)
+		*needs_flush = type != I915_MAP_WC;
+
+	return vaddr;
 }
 
 dma_addr_t __px_dma(struct drm_i915_gem_object *p)
@@ -199,15 +206,18 @@ struct page *__px_page(struct drm_i915_gem_object *p)
 void
 fill_page_dma(struct drm_i915_gem_object *p, const u64 val, unsigned int count)
 {
-	void *vaddr = __px_vaddr(p);
+	bool needs_flush;
+	void *vaddr;
 
+	vaddr = __px_vaddr(p, &needs_flush);
 	memset64(vaddr, val, count);
-	clflush_cache_range(vaddr, PAGE_SIZE);
+	if (needs_flush)
+		clflush_cache_range(vaddr, PAGE_SIZE);
 }
 
 static void poison_scratch_page(struct drm_i915_gem_object *scratch)
 {
-	void *vaddr = __px_vaddr(scratch);
+	void *vaddr = __px_vaddr(scratch, NULL);
 	u8 val;
 
 	val = 0;
diff --git a/drivers/gpu/drm/i915/gt/intel_gtt.h b/drivers/gpu/drm/i915/gt/intel_gtt.h
index bdbdfded60cc..d96bd19d1b47 100644
--- a/drivers/gpu/drm/i915/gt/intel_gtt.h
+++ b/drivers/gpu/drm/i915/gt/intel_gtt.h
@@ -176,8 +176,8 @@ struct page *__px_page(struct drm_i915_gem_object *p);
 dma_addr_t __px_dma(struct drm_i915_gem_object *p);
 #define px_dma(px) (__px_dma(px_base(px)))
 
-void *__px_vaddr(struct drm_i915_gem_object *p);
-#define px_vaddr(px) (__px_vaddr(px_base(px)))
+void *__px_vaddr(struct drm_i915_gem_object *p, bool *needs_flush);
+#define px_vaddr(px, needs_flush) (__px_vaddr(px_base(px), needs_flush))
 
 #define px_pt(px) \
 	__px_choose_expr(px, struct i915_page_table *, __x, \
diff --git a/drivers/gpu/drm/i915/gt/intel_ppgtt.c b/drivers/gpu/drm/i915/gt/intel_ppgtt.c
index 8e7b77cc4594..2d74ae950e4b 100644
--- a/drivers/gpu/drm/i915/gt/intel_ppgtt.c
+++ b/drivers/gpu/drm/i915/gt/intel_ppgtt.c
@@ -85,10 +85,12 @@ write_dma_entry(struct drm_i915_gem_object * const pdma,
 		const unsigned short idx,
 		const u64 encoded_entry)
 {
-	u64 * const vaddr = __px_vaddr(pdma);
+	bool needs_flush;
+	u64 * const vaddr = __px_vaddr(pdma, &needs_flush);
 
 	vaddr[idx] = encoded_entry;
-	clflush_cache_range(&vaddr[idx], sizeof(u64));
+	if (needs_flush)
+		clflush_cache_range(&vaddr[idx], sizeof(u64));
 }
 
 void
diff --git a/drivers/gpu/drm/i915/selftests/i915_perf.c b/drivers/gpu/drm/i915/selftests/i915_perf.c
index 6a7abb3e2bb5..6698750ffe8d 100644
--- a/drivers/gpu/drm/i915/selftests/i915_perf.c
+++ b/drivers/gpu/drm/i915/selftests/i915_perf.c
@@ -307,7 +307,7 @@ static int live_noa_gpr(void *arg)
 	}
 
 	/* Poison the ce->vm so we detect writes not to the GGTT gt->scratch */
-	scratch = __px_vaddr(ce->vm->scratch[0]);
+	scratch = __px_vaddr(ce->vm->scratch[0], NULL);
 	memset(scratch, POISON_FREE, PAGE_SIZE);
 
 	rq = intel_context_create_request(ce);
-- 
2.26.2

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply related	[flat|nested] 208+ messages in thread

* [RFC PATCH 101/162] drm/i915/gtt/dg1: add PTE_LM plumbing for PPGTT
  2020-11-27 12:04 [RFC PATCH 000/162] DG1 + LMEM enabling Matthew Auld
                   ` (99 preceding siblings ...)
  2020-11-27 12:06 ` [RFC PATCH 100/162] drm/i915/gtt: make flushing conditional Matthew Auld
@ 2020-11-27 12:06 ` Matthew Auld
  2020-11-27 13:35   ` [Intel-gfx] " Chris Wilson
  2020-11-27 12:06 ` [RFC PATCH 102/162] drm/i915/gtt/dg1: add PTE_LM plumbing for GGTT Matthew Auld
                   ` (60 subsequent siblings)
  161 siblings, 1 reply; 208+ messages in thread
From: Matthew Auld @ 2020-11-27 12:06 UTC (permalink / raw)
  To: intel-gfx
  Cc: Abdiel Janulgue, dri-devel, Venkata Sandeep Dhanalakota,
	Daniele Ceraolo Spurio, Niranjana Vishwanathapura

For the PTEs we get an LM bit to signal whether the page resides in
SMEM or LMEM.
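
Condensed from the gen12_pte_encode() hunk below:

  gen8_pte_t pte = addr | _PAGE_PRESENT | _PAGE_RW;

  if (flags & PTE_LM)
          pte |= GEN12_PPGTT_PTE_LM; /* bit 11: page is in LMEM */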

Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Abdiel Janulgue <abdiel.janulgue@linux.intel.com>
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Signed-off-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
Signed-off-by: Venkata Sandeep Dhanalakota <venkata.s.dhanalakota@intel.com>
---
 drivers/gpu/drm/i915/gt/gen8_ppgtt.c  | 35 ++++++++++++++++++++++-----
 drivers/gpu/drm/i915/gt/intel_gtt.h   |  3 +++
 drivers/gpu/drm/i915/gt/intel_ppgtt.c |  4 +++
 3 files changed, 36 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/gen8_ppgtt.c b/drivers/gpu/drm/i915/gt/gen8_ppgtt.c
index e2f1dfc48d43..b6fcebeef02a 100644
--- a/drivers/gpu/drm/i915/gt/gen8_ppgtt.c
+++ b/drivers/gpu/drm/i915/gt/gen8_ppgtt.c
@@ -5,6 +5,7 @@
 
 #include <linux/log2.h>
 
+#include "gem/i915_gem_lmem.h"
 #include "gen8_ppgtt.h"
 #include "i915_scatterlist.h"
 #include "i915_trace.h"
@@ -50,6 +51,21 @@ static u64 gen8_pte_encode(dma_addr_t addr,
 	return pte;
 }
 
+static u64 gen12_pte_encode(dma_addr_t addr,
+			    enum i915_cache_level level,
+			    u32 flags)
+{
+	gen8_pte_t pte = addr | _PAGE_PRESENT | _PAGE_RW;
+
+	if (unlikely(flags & PTE_READ_ONLY))
+		pte &= ~_PAGE_RW;
+
+	if (flags & PTE_LM)
+		pte |= GEN12_PPGTT_PTE_LM;
+
+	return pte;
+}
+
 static void gen8_ppgtt_notify_vgt(struct i915_ppgtt *ppgtt, bool create)
 {
 	struct drm_i915_private *i915 = ppgtt->vm.i915;
@@ -365,7 +381,7 @@ gen8_ppgtt_insert_pte(struct i915_ppgtt *ppgtt,
 		      u32 flags)
 {
 	struct i915_page_directory *pd;
-	const gen8_pte_t pte_encode = gen8_pte_encode(0, cache_level, flags);
+	const gen8_pte_t pte_encode = ppgtt->vm.pte_encode(0, cache_level, flags);
 	gen8_pte_t *vaddr;
 	bool needs_flush;
 
@@ -413,7 +429,7 @@ static void gen8_ppgtt_insert_huge(struct i915_vma *vma,
 				   enum i915_cache_level cache_level,
 				   u32 flags)
 {
-	const gen8_pte_t pte_encode = gen8_pte_encode(0, cache_level, flags);
+	const gen8_pte_t pte_encode = vma->vm->pte_encode(0, cache_level, flags);
 	unsigned int rem = sg_dma_len(iter->sg);
 	u64 start = vma->node.start;
 	bool needs_flush;
@@ -558,6 +574,7 @@ static void gen8_ppgtt_insert(struct i915_address_space *vm,
 
 static int gen8_init_scratch(struct i915_address_space *vm)
 {
+	u32 pte_flags = vm->has_read_only;
 	int ret;
 	int i;
 
@@ -581,9 +598,12 @@ static int gen8_init_scratch(struct i915_address_space *vm)
 	if (ret)
 		return ret;
 
+	if (i915_gem_object_is_lmem(vm->scratch[0]))
+		pte_flags |= PTE_LM;
+
 	vm->scratch[0]->encode =
-		gen8_pte_encode(px_dma(vm->scratch[0]),
-				I915_CACHE_LLC, vm->has_read_only);
+		vm->pte_encode(px_dma(vm->scratch[0]),
+			       I915_CACHE_LLC, pte_flags);
 
 	for (i = 1; i <= vm->top; i++) {
 		struct drm_i915_gem_object *obj;
@@ -713,6 +733,11 @@ struct i915_ppgtt *gen8_ppgtt_create(struct intel_gt *gt)
 	else
 		ppgtt->vm.alloc_pt_dma = alloc_pt_dma;
 
+	if (INTEL_GEN(gt->i915) >= 12)
+		ppgtt->vm.pte_encode = gen12_pte_encode;
+	else
+		ppgtt->vm.pte_encode = gen8_pte_encode;
+
 	err = gen8_init_scratch(&ppgtt->vm);
 	if (err)
 		goto err_free;
@@ -734,8 +759,6 @@ struct i915_ppgtt *gen8_ppgtt_create(struct intel_gt *gt)
 	ppgtt->vm.allocate_va_range = gen8_ppgtt_alloc;
 	ppgtt->vm.clear_range = gen8_ppgtt_clear;
 
-	ppgtt->vm.pte_encode = gen8_pte_encode;
-
 	if (intel_vgpu_active(gt->i915))
 		gen8_ppgtt_notify_vgt(ppgtt, true);
 
diff --git a/drivers/gpu/drm/i915/gt/intel_gtt.h b/drivers/gpu/drm/i915/gt/intel_gtt.h
index d96bd19d1b47..f47899ef36f4 100644
--- a/drivers/gpu/drm/i915/gt/intel_gtt.h
+++ b/drivers/gpu/drm/i915/gt/intel_gtt.h
@@ -85,6 +85,8 @@ typedef u64 gen8_pte_t;
 #define BYT_PTE_SNOOPED_BY_CPU_CACHES	REG_BIT(2)
 #define BYT_PTE_WRITEABLE		REG_BIT(1)
 
+#define GEN12_PPGTT_PTE_LM (1 << 11)
+
 /*
  * Cacheability Control is a 4-bit value. The low three bits are stored in bits
  * 3:1 of the PTE, while the fourth bit is stored in bit 11 of the PTE.
@@ -268,6 +270,7 @@ struct i915_address_space {
 			  enum i915_cache_level level,
 			  u32 flags); /* Create a valid PTE */
 #define PTE_READ_ONLY	BIT(0)
+#define PTE_LM          BIT(1)
 
 	void (*allocate_va_range)(struct i915_address_space *vm,
 				  struct i915_vm_pt_stash *stash,
diff --git a/drivers/gpu/drm/i915/gt/intel_ppgtt.c b/drivers/gpu/drm/i915/gt/intel_ppgtt.c
index 2d74ae950e4b..731d8730fa5f 100644
--- a/drivers/gpu/drm/i915/gt/intel_ppgtt.c
+++ b/drivers/gpu/drm/i915/gt/intel_ppgtt.c
@@ -7,6 +7,8 @@
 
 #include "i915_trace.h"
 #include "intel_gtt.h"
+#include "gem/i915_gem_lmem.h"
+#include "gem/i915_gem_region.h"
 #include "gen6_ppgtt.h"
 #include "gen8_ppgtt.h"
 
@@ -193,6 +195,8 @@ void ppgtt_bind_vma(struct i915_address_space *vm,
 	pte_flags = 0;
 	if (i915_gem_object_is_readonly(vma->obj))
 		pte_flags |= PTE_READ_ONLY;
+	if (i915_gem_object_is_lmem(vma->obj))
+		pte_flags |= PTE_LM;
 
 	vm->insert_entries(vm, vma, cache_level, pte_flags);
 	wmb();
-- 
2.26.2

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply related	[flat|nested] 208+ messages in thread

* [RFC PATCH 102/162] drm/i915/gtt/dg1: add PTE_LM plumbing for GGTT
  2020-11-27 12:04 [RFC PATCH 000/162] DG1 + LMEM enabling Matthew Auld
                   ` (100 preceding siblings ...)
  2020-11-27 12:06 ` [RFC PATCH 101/162] drm/i915/gtt/dg1: add PTE_LM plumbing for PPGTT Matthew Auld
@ 2020-11-27 12:06 ` Matthew Auld
  2020-11-27 12:06 ` [RFC PATCH 103/162] drm/i915: allocate context from LMEM Matthew Auld
                   ` (59 subsequent siblings)
  161 siblings, 0 replies; 208+ messages in thread
From: Matthew Auld @ 2020-11-27 12:06 UTC (permalink / raw)
  To: intel-gfx; +Cc: Abdiel Janulgue, Daniele Ceraolo Spurio, dri-devel

Based on a patch from Michel Thierry.

Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Abdiel Janulgue <abdiel.janulgue@linux.intel.com>
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
---
 drivers/gpu/drm/i915/gt/intel_ggtt.c | 24 ++++++++++++++++++------
 drivers/gpu/drm/i915/gt/intel_gtt.h  |  3 ++-
 2 files changed, 20 insertions(+), 7 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_ggtt.c b/drivers/gpu/drm/i915/gt/intel_ggtt.c
index 4560e03067a7..26aa5debd7e9 100644
--- a/drivers/gpu/drm/i915/gt/intel_ggtt.c
+++ b/drivers/gpu/drm/i915/gt/intel_ggtt.c
@@ -10,6 +10,7 @@
 
 #include <drm/i915_drm.h>
 
+#include "gem/i915_gem_lmem.h"
 #include "intel_gt.h"
 #include "i915_drv.h"
 #include "i915_scatterlist.h"
@@ -180,7 +181,12 @@ static u64 gen8_ggtt_pte_encode(dma_addr_t addr,
 				enum i915_cache_level level,
 				u32 flags)
 {
-	return addr | _PAGE_PRESENT;
+	gen8_pte_t pte = addr | _PAGE_PRESENT;
+
+	if (flags & PTE_LM)
+		pte |= GEN12_GGTT_PTE_LM;
+
+	return pte;
 }
 
 static void gen8_set_pte(void __iomem *addr, gen8_pte_t pte)
@@ -192,13 +198,13 @@ static void gen8_ggtt_insert_page(struct i915_address_space *vm,
 				  dma_addr_t addr,
 				  u64 offset,
 				  enum i915_cache_level level,
-				  u32 unused)
+				  u32 flags)
 {
 	struct i915_ggtt *ggtt = i915_vm_to_ggtt(vm);
 	gen8_pte_t __iomem *pte =
 		(gen8_pte_t __iomem *)ggtt->gsm + offset / I915_GTT_PAGE_SIZE;
 
-	gen8_set_pte(pte, gen8_ggtt_pte_encode(addr, level, 0));
+	gen8_set_pte(pte, gen8_ggtt_pte_encode(addr, level, flags));
 
 	ggtt->invalidate(ggtt);
 }
@@ -208,7 +214,7 @@ static void gen8_ggtt_insert_entries(struct i915_address_space *vm,
 				     enum i915_cache_level level,
 				     u32 flags)
 {
-	const gen8_pte_t pte_encode = gen8_ggtt_pte_encode(0, level, 0);
+	const gen8_pte_t pte_encode = gen8_ggtt_pte_encode(0, level, flags);
 	struct i915_ggtt *ggtt = i915_vm_to_ggtt(vm);
 	gen8_pte_t __iomem *gte;
 	gen8_pte_t __iomem *end;
@@ -448,8 +454,10 @@ static void ggtt_bind_vma(struct i915_address_space *vm,
 
 	/* Applicable to VLV (gen8+ do not support RO in the GGTT) */
 	pte_flags = 0;
-	if (i915_gem_object_is_readonly(obj))
+	if (vma->vm->has_read_only && i915_gem_object_is_readonly(obj))
 		pte_flags |= PTE_READ_ONLY;
+	if (i915_gem_object_is_lmem(obj))
+		pte_flags |= PTE_LM;
 
 	vm->insert_entries(vm, vma, cache_level, pte_flags);
 	vma->page_sizes.gtt = I915_GTT_PAGE_SIZE;
@@ -765,6 +773,7 @@ static int ggtt_probe_common(struct i915_ggtt *ggtt, u64 size)
 	struct drm_i915_private *i915 = ggtt->vm.i915;
 	struct pci_dev *pdev = i915->drm.pdev;
 	phys_addr_t phys_addr;
+	u32 pte_flags = 0;
 	int ret;
 
 	/* For Modern GENs the PTEs and register space are split in the BAR */
@@ -794,9 +803,12 @@ static int ggtt_probe_common(struct i915_ggtt *ggtt, u64 size)
 		return ret;
 	}
 
+	if (i915_gem_object_is_lmem(ggtt->vm.scratch[0]))
+		pte_flags |= PTE_LM;
+
 	ggtt->vm.scratch[0]->encode =
 		ggtt->vm.pte_encode(px_dma(ggtt->vm.scratch[0]),
-				    I915_CACHE_NONE, 0);
+				    I915_CACHE_NONE, pte_flags);
 
 	return 0;
 }
diff --git a/drivers/gpu/drm/i915/gt/intel_gtt.h b/drivers/gpu/drm/i915/gt/intel_gtt.h
index f47899ef36f4..db3626c0ee20 100644
--- a/drivers/gpu/drm/i915/gt/intel_gtt.h
+++ b/drivers/gpu/drm/i915/gt/intel_gtt.h
@@ -85,7 +85,8 @@ typedef u64 gen8_pte_t;
 #define BYT_PTE_SNOOPED_BY_CPU_CACHES	REG_BIT(2)
 #define BYT_PTE_WRITEABLE		REG_BIT(1)
 
-#define GEN12_PPGTT_PTE_LM (1 << 11)
+#define GEN12_GGTT_PTE_LM	(1 << 1)
+#define GEN12_PPGTT_PTE_LM	(1 << 11)
 
 /*
  * Cacheability Control is a 4-bit value. The low three bits are stored in bits
-- 
2.26.2

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply related	[flat|nested] 208+ messages in thread

* [RFC PATCH 103/162] drm/i915: allocate context from LMEM
  2020-11-27 12:04 [RFC PATCH 000/162] DG1 + LMEM enabling Matthew Auld
                   ` (101 preceding siblings ...)
  2020-11-27 12:06 ` [RFC PATCH 102/162] drm/i915/gtt/dg1: add PTE_LM plumbing for GGTT Matthew Auld
@ 2020-11-27 12:06 ` Matthew Auld
  2020-11-27 13:37   ` [Intel-gfx] " Chris Wilson
  2020-11-27 12:06 ` [RFC PATCH 104/162] drm/i915: move engine scratch to LMEM Matthew Auld
                   ` (58 subsequent siblings)
  161 siblings, 1 reply; 208+ messages in thread
From: Matthew Auld @ 2020-11-27 12:06 UTC (permalink / raw)
  To: intel-gfx; +Cc: Abdiel Janulgue, dri-devel

Based on a patch from Michel Thierry.

Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Abdiel Janulgue <abdiel.janulgue@linux.intel.com>
---
 .../drm/i915/gt/intel_execlists_submission.c  | 31 ++++++++++++++++++-
 1 file changed, 30 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_execlists_submission.c b/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
index 582a9044727e..c640b90711fd 100644
--- a/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
+++ b/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
@@ -108,6 +108,8 @@
  */
 #include <linux/interrupt.h>
 
+#include "gem/i915_gem_lmem.h"
+
 #include "i915_drv.h"
 #include "i915_perf.h"
 #include "i915_trace.h"
@@ -4660,6 +4662,21 @@ static struct intel_timeline *pinned_timeline(struct intel_context *ce)
 						 page_unmask_bits(tl));
 }
 
+static int context_clear_lmem(struct drm_i915_gem_object *ctx_obj)
+{
+	void *vaddr;
+
+	vaddr = i915_gem_object_pin_map(ctx_obj, I915_MAP_WC);
+	if (IS_ERR(vaddr))
+		return PTR_ERR(vaddr);
+
+	memset64(vaddr, 0, ctx_obj->base.size / sizeof(u64));
+
+	i915_gem_object_unpin_map(ctx_obj);
+
+	return 0;
+}
+
 static int __execlists_context_alloc(struct intel_context *ce,
 				     struct intel_engine_cs *engine)
 {
@@ -4680,10 +4697,22 @@ static int __execlists_context_alloc(struct intel_context *ce,
 		context_size += PAGE_SIZE;
 	}
 
-	ctx_obj = i915_gem_object_create_shmem(engine->i915, context_size);
+	if (HAS_LMEM(engine->i915)) {
+		ctx_obj = i915_gem_object_create_lmem(engine->i915,
+						      context_size,
+						      I915_BO_ALLOC_CONTIGUOUS);
+	} else {
+		ctx_obj = i915_gem_object_create_shmem(engine->i915, context_size);
+	}
 	if (IS_ERR(ctx_obj))
 		return PTR_ERR(ctx_obj);
 
+	if (HAS_LMEM(engine->i915)) {
+		ret = context_clear_lmem(ctx_obj);
+		if (ret)
+			goto error_deref_obj;
+	}
+
 	vma = i915_vma_instance(ctx_obj, &engine->gt->ggtt->vm, NULL);
 	if (IS_ERR(vma)) {
 		ret = PTR_ERR(vma);
-- 
2.26.2

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply related	[flat|nested] 208+ messages in thread

* [RFC PATCH 104/162] drm/i915: move engine scratch to LMEM
  2020-11-27 12:04 [RFC PATCH 000/162] DG1 + LMEM enabling Matthew Auld
                   ` (102 preceding siblings ...)
  2020-11-27 12:06 ` [RFC PATCH 103/162] drm/i915: allocate context from LMEM Matthew Auld
@ 2020-11-27 12:06 ` Matthew Auld
  2020-11-27 12:06 ` [RFC PATCH 105/162] drm/i915: Provide a way to disable PCIe relaxed write ordering Matthew Auld
                   ` (57 subsequent siblings)
  161 siblings, 0 replies; 208+ messages in thread
From: Matthew Auld @ 2020-11-27 12:06 UTC (permalink / raw)
  To: intel-gfx; +Cc: Abdiel Janulgue, dri-devel

Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Abdiel Janulgue <abdiel.janulgue@linux.intel.com>
---
 drivers/gpu/drm/i915/gt/intel_gt.c | 14 +++++++++++---
 1 file changed, 11 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_gt.c b/drivers/gpu/drm/i915/gt/intel_gt.c
index 44f1d51e5ae5..caf2e72de1a6 100644
--- a/drivers/gpu/drm/i915/gt/intel_gt.c
+++ b/drivers/gpu/drm/i915/gt/intel_gt.c
@@ -4,6 +4,8 @@
  */
 
 #include "debugfs_gt.h"
+
+#include "gem/i915_gem_lmem.h"
 #include "i915_drv.h"
 #include "intel_context.h"
 #include "intel_gt.h"
@@ -342,9 +344,15 @@ static int intel_gt_init_scratch(struct intel_gt *gt, unsigned int size)
 	struct i915_vma *vma;
 	int ret;
 
-	obj = i915_gem_object_create_stolen(i915, size);
-	if (IS_ERR(obj))
-		obj = i915_gem_object_create_internal(i915, size);
+	if (HAS_LMEM(i915)) {
+		obj = i915_gem_object_create_lmem(i915, size,
+						  I915_BO_ALLOC_CONTIGUOUS |
+						  I915_BO_ALLOC_VOLATILE);
+	} else {
+		obj = i915_gem_object_create_stolen(i915, size);
+		if (IS_ERR(obj))
+			obj = i915_gem_object_create_internal(i915, size);
+	}
 	if (IS_ERR(obj)) {
 		DRM_ERROR("Failed to allocate scratch page\n");
 		return PTR_ERR(obj);
-- 
2.26.2

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply related	[flat|nested] 208+ messages in thread

* [RFC PATCH 105/162] drm/i915: Provide a way to disable PCIe relaxed write ordering
  2020-11-27 12:04 [RFC PATCH 000/162] DG1 + LMEM enabling Matthew Auld
                   ` (103 preceding siblings ...)
  2020-11-27 12:06 ` [RFC PATCH 104/162] drm/i915: move engine scratch to LMEM Matthew Auld
@ 2020-11-27 12:06 ` Matthew Auld
  2020-11-27 12:06 ` [RFC PATCH 106/162] drm/i915: i915 returns -EBUSY on thread contention Matthew Auld
                   ` (56 subsequent siblings)
  161 siblings, 0 replies; 208+ messages in thread
From: Matthew Auld @ 2020-11-27 12:06 UTC (permalink / raw)
  To: intel-gfx; +Cc: Abdiel Janulgue, Stuart Summers, dri-devel

From: Abdiel Janulgue <abdiel.janulgue@linux.intel.com>

For performance, writes over PCIe may not be strictly ordered by default.
Expose a kernel configuration option to disable relaxed ordering and turn
on strict ordering instead, for debug purposes.
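
Condensed from the probe hunk below:

  if (IS_ENABLED(CONFIG_DRM_I915_PCIE_STRICT_WRITE_ORDERING))
          pcie_capability_clear_word(i915->drm.pdev, PCI_EXP_DEVCTL,
                                     PCI_EXP_DEVCTL_RELAX_EN);
  else
          pcie_capability_set_word(i915->drm.pdev, PCI_EXP_DEVCTL,
                                   PCI_EXP_DEVCTL_RELAX_EN);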

Signed-off-by: Abdiel Janulgue <abdiel.janulgue@linux.intel.com>
Signed-off-by: Stuart Summers <stuart.summers@intel.com>
Cc: Matthew Auld <matthew.auld@intel.com>
---
 drivers/gpu/drm/i915/Kconfig.debug         | 11 +++++++++++
 drivers/gpu/drm/i915/intel_memory_region.c | 12 ++++++++++++
 2 files changed, 23 insertions(+)

diff --git a/drivers/gpu/drm/i915/Kconfig.debug b/drivers/gpu/drm/i915/Kconfig.debug
index 0fb7fd0ef717..65533cbbcb82 100644
--- a/drivers/gpu/drm/i915/Kconfig.debug
+++ b/drivers/gpu/drm/i915/Kconfig.debug
@@ -222,3 +222,14 @@ config DRM_I915_DEBUG_RUNTIME_PM
 	  driver loading, suspend and resume operations.
 
 	  If in doubt, say "N"
+
+config DRM_I915_PCIE_STRICT_WRITE_ORDERING
+	bool "Enable PCIe strict ordering"
+	depends on DRM_I915
+	default n
+	help
+	  Relaxed ordering in writes is enabled by default to improve system
+	  performance. Strict ordering can be selected instead to assist in
+	  debugging.
+
+	  If in doubt, say "N".
diff --git a/drivers/gpu/drm/i915/intel_memory_region.c b/drivers/gpu/drm/i915/intel_memory_region.c
index cea44ddebe46..043541d409bd 100644
--- a/drivers/gpu/drm/i915/intel_memory_region.c
+++ b/drivers/gpu/drm/i915/intel_memory_region.c
@@ -286,6 +286,18 @@ int intel_memory_regions_hw_probe(struct drm_i915_private *i915)
 {
 	int err, i;
 
+	/* All platforms currently have system memory */
+	GEM_BUG_ON(!HAS_REGION(i915, REGION_SMEM));
+
+	if (IS_DGFX(i915)) {
+		if (IS_ENABLED(CONFIG_DRM_I915_PCIE_STRICT_WRITE_ORDERING))
+			pcie_capability_clear_word(i915->drm.pdev, PCI_EXP_DEVCTL,
+						   PCI_EXP_DEVCTL_RELAX_EN);
+		else
+			pcie_capability_set_word(i915->drm.pdev, PCI_EXP_DEVCTL,
+						 PCI_EXP_DEVCTL_RELAX_EN);
+	}
+
 	for (i = 0; i < ARRAY_SIZE(i915->mm.regions); i++) {
 		struct intel_memory_region *mem = ERR_PTR(-ENODEV);
 		u16 type, instance;
-- 
2.26.2

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply related	[flat|nested] 208+ messages in thread

* [RFC PATCH 106/162] drm/i915: i915 returns -EBUSY on thread contention
  2020-11-27 12:04 [RFC PATCH 000/162] DG1 + LMEM enabling Matthew Auld
                   ` (104 preceding siblings ...)
  2020-11-27 12:06 ` [RFC PATCH 105/162] drm/i915: Provide a way to disable PCIe relaxed write ordering Matthew Auld
@ 2020-11-27 12:06 ` Matthew Auld
  2020-11-27 12:06 ` [RFC PATCH 107/162] drm/i915: setup GPU device lmem region Matthew Auld
                   ` (55 subsequent siblings)
  161 siblings, 0 replies; 208+ messages in thread
From: Matthew Auld @ 2020-11-27 12:06 UTC (permalink / raw)
  To: intel-gfx
  Cc: Abdiel Janulgue, Matthew Brost, Lucas De Marchi, Sudeep Dutt,
	dri-devel, CQ Tang, Venkata S Dhanalakota, Neel Desai, Francesco,
	Balestrieri, Niranjana Vishwanathapura

From: CQ Tang <cq.tang@intel.com>

Under high thread contention, the same object may already have been
pinned with a different mapping type. A new pin attempt will then hit
-EBUSY unless the FORCE flag is specified.

This error was observed on DG1 silicon during PO.
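
Condensed from the hunks below; the FORCE variant discards a previous
mapping of a different type instead of failing:

  cmd = i915_gem_object_pin_map(pool->obj, I915_MAP_FORCE_WC);
  if (IS_ERR(cmd)) {
          err = PTR_ERR(cmd);
          goto out_unpin;
  }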

Cc: Matthew Auld <matthew.auld@intel.com>
Cc: Lucas De Marchi <lucas.demarchi@intel.com>
Cc: Abdiel Janulgue <abdiel.janulgue@linux.intel.com>
Cc: Balestrieri, Francesco <francesco.balestrieri@intel.com>
Cc: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
Cc: Venkata S Dhanalakota <venkata.s.dhanalakota@intel.com>
Cc: Neel Desai <neel.desai@intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Cc: Sudeep Dutt <sudeep.dutt@intel.com>
Signed-off-by: CQ Tang <cq.tang@intel.com>
---
 drivers/gpu/drm/i915/gem/i915_gem_object_blt.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object_blt.c b/drivers/gpu/drm/i915/gem/i915_gem_object_blt.c
index b41b076f6864..1096f27627d4 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_object_blt.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_object_blt.c
@@ -57,7 +57,7 @@ struct i915_vma *intel_emit_vma_fill_blt(struct intel_context *ce,
 	/* we pinned the pool, mark it as such */
 	intel_gt_buffer_pool_mark_used(pool);
 
-	cmd = i915_gem_object_pin_map(pool->obj, I915_MAP_WC);
+	cmd = i915_gem_object_pin_map(pool->obj, I915_MAP_FORCE_WC);
 	if (IS_ERR(cmd)) {
 		err = PTR_ERR(cmd);
 		goto out_unpin;
@@ -297,7 +297,7 @@ struct i915_vma *intel_emit_vma_copy_blt(struct intel_context *ce,
 	/* we pinned the pool, mark it as such */
 	intel_gt_buffer_pool_mark_used(pool);
 
-	cmd = i915_gem_object_pin_map(pool->obj, I915_MAP_WC);
+	cmd = i915_gem_object_pin_map(pool->obj, I915_MAP_FORCE_WC);
 	if (IS_ERR(cmd)) {
 		err = PTR_ERR(cmd);
 		goto out_unpin;
-- 
2.26.2

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply related	[flat|nested] 208+ messages in thread

* [RFC PATCH 107/162] drm/i915: setup GPU device lmem region
  2020-11-27 12:04 [RFC PATCH 000/162] DG1 + LMEM enabling Matthew Auld
                   ` (105 preceding siblings ...)
  2020-11-27 12:06 ` [RFC PATCH 106/162] drm/i915: i915 returns -EBUSY on thread contention Matthew Auld
@ 2020-11-27 12:06 ` Matthew Auld
  2020-11-30 11:18   ` Chris Wilson
  2020-11-27 12:06 ` [RFC PATCH 108/162] drm/i915: Fix object page offset within a region Matthew Auld
                   ` (54 subsequent siblings)
  161 siblings, 1 reply; 208+ messages in thread
From: Matthew Auld @ 2020-11-27 12:06 UTC (permalink / raw)
  To: intel-gfx
  Cc: Abdiel Janulgue, Matthew Brost, Sudeep Dutt, Chris P Wilson,
	CQ Tang, Venkata S Dhanalakota, dri-devel, Neel Desai, Francesco,
	Balestrieri, Niranjana Vishwanathapura

From: CQ Tang <cq.tang@intel.com>

The lmem region needs to exclude the stolen part, which on DG1 starts
at GSMBASE.
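
Condensed from the setup_lmem() hunk below:

  /* Stolen starts from GSMBASE on DG1, so usable lmem ends there */
  lmem_size = intel_uncore_read64(uncore, GEN12_GSMBASE);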

Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Matthew Auld <matthew.auld@intel.com>
Cc: Abdiel Janulgue <abdiel.janulgue@linux.intel.com>
Cc: Chris P Wilson <chris.p.wilson@intel.com>
Cc: Balestrieri, Francesco <francesco.balestrieri@intel.com>
Cc: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
Cc: Venkata S Dhanalakota <venkata.s.dhanalakota@intel.com>
Cc: Neel Desai <neel.desai@intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Cc: Sudeep Dutt <sudeep.dutt@intel.com>
Signed-off-by: CQ Tang <cq.tang@intel.com>
---
 drivers/gpu/drm/i915/i915_reg.h          |  2 ++
 drivers/gpu/drm/i915/intel_region_lmem.c | 11 +++++++----
 2 files changed, 9 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
index 1af1966ac461..0e01ea0cb0a4 100644
--- a/drivers/gpu/drm/i915/i915_reg.h
+++ b/drivers/gpu/drm/i915/i915_reg.h
@@ -12066,6 +12066,8 @@ enum skl_power_gate {
 #define GEN12_LMEM_CFG_ADDR		_MMIO(0xcf58)
 #define   LMEM_ENABLE			(1 << 31)
 
+#define GEN12_GSMBASE			_MMIO(0x108100)
+
 /* gamt regs */
 #define GEN8_L3_LRA_1_GPGPU _MMIO(0x4dd4)
 #define   GEN8_L3_LRA_1_GPGPU_DEFAULT_VALUE_BDW  0x67F1427F /* max/min for LRA1/2 */
diff --git a/drivers/gpu/drm/i915/intel_region_lmem.c b/drivers/gpu/drm/i915/intel_region_lmem.c
index e98582c76de1..7f2b31d469b0 100644
--- a/drivers/gpu/drm/i915/intel_region_lmem.c
+++ b/drivers/gpu/drm/i915/intel_region_lmem.c
@@ -140,20 +140,23 @@ intel_setup_fake_lmem(struct drm_i915_private *i915)
 static struct intel_memory_region *
 setup_lmem(struct drm_i915_private *dev_priv)
 {
+	struct intel_uncore *uncore = &dev_priv->uncore;
 	struct pci_dev *pdev = dev_priv->drm.pdev;
 	struct intel_memory_region *mem;
 	resource_size_t io_start;
-	resource_size_t size;
+	resource_size_t lmem_size;
 
 	/* Enables Local Memory functionality in GAM */
 	I915_WRITE(GEN12_LMEM_CFG_ADDR, I915_READ(GEN12_LMEM_CFG_ADDR) | LMEM_ENABLE);
 
+	/* Stolen starts from GSMBASE on DG1 */
+	lmem_size = intel_uncore_read64(uncore, GEN12_GSMBASE);
+
 	io_start = pci_resource_start(pdev, 2);
-	size = pci_resource_len(pdev, 2);
 
 	mem = intel_memory_region_create(dev_priv,
 					 0,
-					 size,
+					 lmem_size,
 					 I915_GTT_PAGE_SIZE_4K,
 					 io_start,
 					 &intel_region_lmem_ops);
@@ -162,7 +165,7 @@ setup_lmem(struct drm_i915_private *dev_priv)
 		DRM_INFO("Intel graphics LMEM IO start: %llx\n",
 			 (u64)mem->io_start);
 		DRM_INFO("Intel graphics LMEM size: %llx\n",
-			 (u64)size);
+			 (u64)lmem_size);
 	}
 
 	return mem;
-- 
2.26.2

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply related	[flat|nested] 208+ messages in thread

* [RFC PATCH 108/162] drm/i915: Fix object page offset within a region
  2020-11-27 12:04 [RFC PATCH 000/162] DG1 + LMEM enabling Matthew Auld
                   ` (106 preceding siblings ...)
  2020-11-27 12:06 ` [RFC PATCH 107/162] drm/i915: setup GPU device lmem region Matthew Auld
@ 2020-11-27 12:06 ` Matthew Auld
  2020-11-27 12:06 ` [RFC PATCH 109/162] drm/i915: add i915_gem_object_is_devmem() function Matthew Auld
                   ` (53 subsequent siblings)
  161 siblings, 0 replies; 208+ messages in thread
From: Matthew Auld @ 2020-11-27 12:06 UTC (permalink / raw)
  To: intel-gfx
  Cc: Abdiel Janulgue, Sudeep Dutt, dri-devel, CQ Tang,
	Niranjana Vishwanathapura

From: CQ Tang <cq.tang@intel.com>

Adjust the page offset by the region's start DMA address. The
io_mapping covers only the region itself and is indexed from zero,
while the sg table holds absolute device addresses, so the region start
must be subtracted before mapping.
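
A worked example with invented addresses: if mem->region.start is
0x180000000 and an sg entry holds dma == 0x180004000, the io_mapping
offset must be 0x4000, since the mapping starts at the region's
io_start rather than at the absolute device address:

	offset = dma - mem->region.start;	/* 0x4000 */
	s = io_mapping_map_wc(&mem->iomap, offset, PAGE_SIZE);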

Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Matthew Auld <matthew.auld@intel.com>
Cc: Abdiel Janulgue <abdiel.janulgue@linux.intel.com>
Cc: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
Cc: Sudeep Dutt <sudeep.dutt@intel.com>
Signed-off-by: CQ Tang <cq.tang@intel.com>
---
 drivers/gpu/drm/i915/i915_gpu_error.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/i915_gpu_error.c b/drivers/gpu/drm/i915/i915_gpu_error.c
index d8cac4c5881f..16424755e89c 100644
--- a/drivers/gpu/drm/i915/i915_gpu_error.c
+++ b/drivers/gpu/drm/i915/i915_gpu_error.c
@@ -1051,7 +1051,9 @@ i915_vma_coredump_create(const struct intel_gt *gt,
 		for_each_sgt_daddr(dma, iter, vma->pages) {
 			void __iomem *s;
 
-			s = io_mapping_map_wc(&mem->iomap, dma, PAGE_SIZE);
+			s = io_mapping_map_wc(&mem->iomap,
+					      dma - mem->region.start,
+					      PAGE_SIZE);
 			ret = compress_page(compress,
 					    (void __force *)s, dst,
 					    true);
-- 
2.26.2


* [RFC PATCH 109/162] drm/i915: add i915_gem_object_is_devmem() function
  2020-11-27 12:04 [RFC PATCH 000/162] DG1 + LMEM enabling Matthew Auld
                   ` (107 preceding siblings ...)
  2020-11-27 12:06 ` [RFC PATCH 108/162] drm/i915: Fix object page offset within a region Matthew Auld
@ 2020-11-27 12:06 ` Matthew Auld
  2020-11-27 12:06 ` [RFC PATCH 110/162] drm/i915: finish memory region support for stolen objects Matthew Auld
                   ` (52 subsequent siblings)
  161 siblings, 0 replies; 208+ messages in thread
From: Matthew Auld @ 2020-11-27 12:06 UTC (permalink / raw)
  To: intel-gfx
  Cc: Abdiel Janulgue, Matthew Brost, Sudeep Dutt, Chris P Wilson,
	CQ Tang, Venkata S Dhanalakota, dri-devel, Francesco Balestrieri,
	Neel Desai, Niranjana Vishwanathapura

From: CQ Tang <cq.tang@intel.com>

We have three memory region types: INTEL_SMEM, INTEL_LMEM, and
INTEL_STOLEN. We also have two kinds of backing memory: system memory
and device memory (also called local memory).

A memory region of type INTEL_SMEM only ever contains system memory;
the other two region types can be backed by either system memory or
device memory.

This function distinguishes real device local memory from system memory
(including fake local memory and BIOS-stolen system memory) for the
INTEL_LMEM and INTEL_STOLEN region types.

PPGTT will program the PTE_LM bit based on this value.
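
An illustrative summary of how the two predicates differ after this
patch (assuming fake lmem regions keep is_devmem == false):

	/*
	 * region backing              is_lmem()  is_devmem()  PTE_LM
	 * local, real device memory   true       true         set
	 * local, fake (system) pages  true       false        clear
	 * stolen in device memory     true       true         set
	 * stolen in system memory     false      false        clear
	 */
	if (i915_gem_object_is_devmem(vma->obj))
		pte_flags |= PTE_LM;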

Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Matthew Auld <matthew.auld@intel.com>
Cc: Abdiel Janulgue <abdiel.janulgue@linux.intel.com>
Cc: Chris P Wilson <chris.p.wilson@intel.com>
Cc: Francesco Balestrieri <francesco.balestrieri@intel.com>
Cc: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
Cc: Venkata S Dhanalakota <venkata.s.dhanalakota@intel.com>
Cc: Neel Desai <neel.desai@intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Cc: Sudeep Dutt <sudeep.dutt@intel.com>
Signed-off-by: CQ Tang <cq.tang@intel.com>
---
 drivers/gpu/drm/i915/gem/i915_gem_lmem.c   | 11 ++++++++++-
 drivers/gpu/drm/i915/gem/i915_gem_lmem.h   |  1 +
 drivers/gpu/drm/i915/gt/intel_ggtt.c       |  2 +-
 drivers/gpu/drm/i915/gt/intel_ppgtt.c      |  2 +-
 drivers/gpu/drm/i915/intel_memory_region.h |  1 +
 drivers/gpu/drm/i915/intel_region_lmem.c   |  3 +++
 6 files changed, 17 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_lmem.c b/drivers/gpu/drm/i915/gem/i915_gem_lmem.c
index 840b68eb10d3..e56874e54fde 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_lmem.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_lmem.c
@@ -217,7 +217,16 @@ i915_gem_object_lmem_io_map_page_atomic(struct drm_i915_gem_object *obj,
 
 bool i915_gem_object_is_lmem(struct drm_i915_gem_object *obj)
 {
-	return obj->ops == &i915_gem_lmem_obj_ops;
+	struct intel_memory_region *region = obj->mm.region;
+
+	return region && (region->is_devmem || region->type == INTEL_MEMORY_LOCAL);
+}
+
+bool i915_gem_object_is_devmem(struct drm_i915_gem_object *obj)
+{
+	struct intel_memory_region *region = obj->mm.region;
+
+	return region && region->is_devmem;
 }
 
 struct drm_i915_gem_object *
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_lmem.h b/drivers/gpu/drm/i915/gem/i915_gem_lmem.h
index a24d94bc380f..a1b6a10050bf 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_lmem.h
+++ b/drivers/gpu/drm/i915/gem/i915_gem_lmem.h
@@ -21,6 +21,7 @@ i915_gem_object_lmem_io_map_page_atomic(struct drm_i915_gem_object *obj,
 					unsigned long n);
 
 bool i915_gem_object_is_lmem(struct drm_i915_gem_object *obj);
+bool i915_gem_object_is_devmem(struct drm_i915_gem_object *obj);
 
 struct drm_i915_gem_object *
 i915_gem_object_create_lmem(struct drm_i915_private *i915,
diff --git a/drivers/gpu/drm/i915/gt/intel_ggtt.c b/drivers/gpu/drm/i915/gt/intel_ggtt.c
index 26aa5debd7e9..eed5b640e493 100644
--- a/drivers/gpu/drm/i915/gt/intel_ggtt.c
+++ b/drivers/gpu/drm/i915/gt/intel_ggtt.c
@@ -456,7 +456,7 @@ static void ggtt_bind_vma(struct i915_address_space *vm,
 	pte_flags = 0;
 	if (vma->vm->has_read_only && i915_gem_object_is_readonly(obj))
 		pte_flags |= PTE_READ_ONLY;
-	if (i915_gem_object_is_lmem(obj))
+	if (i915_gem_object_is_devmem(obj))
 		pte_flags |= PTE_LM;
 
 	vm->insert_entries(vm, vma, cache_level, pte_flags);
diff --git a/drivers/gpu/drm/i915/gt/intel_ppgtt.c b/drivers/gpu/drm/i915/gt/intel_ppgtt.c
index 731d8730fa5f..34a02643bb75 100644
--- a/drivers/gpu/drm/i915/gt/intel_ppgtt.c
+++ b/drivers/gpu/drm/i915/gt/intel_ppgtt.c
@@ -195,7 +195,7 @@ void ppgtt_bind_vma(struct i915_address_space *vm,
 	pte_flags = 0;
 	if (i915_gem_object_is_readonly(vma->obj))
 		pte_flags |= PTE_READ_ONLY;
-	if (i915_gem_object_is_lmem(vma->obj))
+	if (i915_gem_object_is_devmem(vma->obj))
 		pte_flags |= PTE_LM;
 
 	vm->insert_entries(vm, vma, cache_level, pte_flags);
diff --git a/drivers/gpu/drm/i915/intel_memory_region.h b/drivers/gpu/drm/i915/intel_memory_region.h
index 20431d3ce490..ed827c770d47 100644
--- a/drivers/gpu/drm/i915/intel_memory_region.h
+++ b/drivers/gpu/drm/i915/intel_memory_region.h
@@ -92,6 +92,7 @@ struct intel_memory_region {
 	enum intel_region_id id;
 	char name[8];
 	struct intel_gt *gt; /* GT closest to this region. */
+	bool is_devmem;	/* true for device memory */
 
 	dma_addr_t remap_addr;
 
diff --git a/drivers/gpu/drm/i915/intel_region_lmem.c b/drivers/gpu/drm/i915/intel_region_lmem.c
index 7f2b31d469b0..939cf0d195a5 100644
--- a/drivers/gpu/drm/i915/intel_region_lmem.c
+++ b/drivers/gpu/drm/i915/intel_region_lmem.c
@@ -166,6 +166,9 @@ setup_lmem(struct drm_i915_private *dev_priv)
 			 (u64)mem->io_start);
 		DRM_INFO("Intel graphics LMEM size: %llx\n",
 			 (u64)lmem_size);
+
+		/* this is real device memory */
+		mem->is_devmem = true;
 	}
 
 	return mem;
-- 
2.26.2


* [RFC PATCH 110/162] drm/i915: finish memory region support for stolen objects.
  2020-11-27 12:04 [RFC PATCH 000/162] DG1 + LMEM enabling Matthew Auld
                   ` (108 preceding siblings ...)
  2020-11-27 12:06 ` [RFC PATCH 109/162] drm/i915: add i915_gem_object_is_devmem() function Matthew Auld
@ 2020-11-27 12:06 ` Matthew Auld
  2020-11-27 12:06 ` [RFC PATCH 111/162] drm/i915/lmem: support optional CPU clearing for special internal use Matthew Auld
                   ` (51 subsequent siblings)
  161 siblings, 0 replies; 208+ messages in thread
From: Matthew Auld @ 2020-11-27 12:06 UTC (permalink / raw)
  To: intel-gfx
  Cc: Abdiel Janulgue, Matthew Brost, Lucas De Marchi, Sudeep Dutt,
	Chris P Wilson, CQ Tang, Venkata S Dhanalakota, dri-devel,
	Neel Desai, Francesco Balestrieri, Niranjana Vishwanathapura

From: CQ Tang <cq.tang@intel.com>

The current stolen code has only partial memory region support. This
patch finishes the rest of it, so that object memory is allocated from
the stolen memory region.

However, three "global" variables are still kept for the display code
to access: "i915->dsm", "i915->dsm_reserved" and
"i915->stolen_usable_size".

This is to reduce the amount of code change. Also, there is only one
display per device, while there can be multiple stolen memory regions.
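
A short usage sketch of the reworked interface (error handling trimmed;
i915_stolen_region() is the per-device lookup helper used by the
display code in this patch):

	struct intel_memory_region *mem = i915_stolen_region(i915);
	struct drm_mm_node node = {};

	/* allocations now go through the region, not dev_priv */
	if (!i915_gem_stolen_insert_node(mem, &node, SZ_64K, 4096))
		i915_gem_stolen_remove_node(mem, &node);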

Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Matthew Auld <matthew.auld@intel.com>
Cc: Abdiel Janulgue <abdiel.janulgue@linux.intel.com>
Cc: Chris P Wilson <chris.p.wilson@intel.com>
Cc: Francesco Balestrieri <francesco.balestrieri@intel.com>
Cc: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
Cc: Venkata S Dhanalakota <venkata.s.dhanalakota@intel.com>
Cc: Neel Desai <neel.desai@intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Cc: Sudeep Dutt <sudeep.dutt@intel.com>
Signed-off-by: CQ Tang <cq.tang@intel.com>
Cc: Lucas De Marchi <lucas.demarchi@intel.com>
---
 drivers/gpu/drm/i915/display/intel_fbc.c   |  20 ++-
 drivers/gpu/drm/i915/gem/i915_gem_stolen.c | 185 ++++++++++-----------
 drivers/gpu/drm/i915/gem/i915_gem_stolen.h |   7 +-
 drivers/gpu/drm/i915/gt/selftest_reset.c   |   5 +-
 drivers/gpu/drm/i915/i915_drv.h            |   6 -
 drivers/gpu/drm/i915/intel_memory_region.h |   3 +
 6 files changed, 112 insertions(+), 114 deletions(-)

diff --git a/drivers/gpu/drm/i915/display/intel_fbc.c b/drivers/gpu/drm/i915/display/intel_fbc.c
index a5b072816a7b..2ad8ddc7e266 100644
--- a/drivers/gpu/drm/i915/display/intel_fbc.c
+++ b/drivers/gpu/drm/i915/display/intel_fbc.c
@@ -437,6 +437,7 @@ static int find_compression_threshold(struct drm_i915_private *dev_priv,
 				      unsigned int size,
 				      unsigned int fb_cpp)
 {
+	struct intel_memory_region *mem = i915_stolen_region(dev_priv);
 	int compression_threshold = 1;
 	int ret;
 	u64 end;
@@ -460,7 +461,7 @@ static int find_compression_threshold(struct drm_i915_private *dev_priv,
 	 */
 
 	/* Try to over-allocate to reduce reallocations and fragmentation. */
-	ret = i915_gem_stolen_insert_node_in_range(dev_priv, node, size <<= 1,
+	ret = i915_gem_stolen_insert_node_in_range(mem, node, size <<= 1,
 						   4096, 0, end);
 	if (ret == 0)
 		return compression_threshold;
@@ -471,7 +472,7 @@ static int find_compression_threshold(struct drm_i915_private *dev_priv,
 	    (fb_cpp == 2 && compression_threshold == 2))
 		return 0;
 
-	ret = i915_gem_stolen_insert_node_in_range(dev_priv, node, size >>= 1,
+	ret = i915_gem_stolen_insert_node_in_range(mem, node, size >>= 1,
 						   4096, 0, end);
 	if (ret && INTEL_GEN(dev_priv) <= 4) {
 		return 0;
@@ -486,6 +487,7 @@ static int find_compression_threshold(struct drm_i915_private *dev_priv,
 static int intel_fbc_alloc_cfb(struct drm_i915_private *dev_priv,
 			       unsigned int size, unsigned int fb_cpp)
 {
+	struct intel_memory_region *mem = i915_stolen_region(dev_priv);
 	struct intel_fbc *fbc = &dev_priv->fbc;
 	struct drm_mm_node *compressed_llb;
 	int ret;
@@ -515,7 +517,7 @@ static int intel_fbc_alloc_cfb(struct drm_i915_private *dev_priv,
 		if (!compressed_llb)
 			goto err_fb;
 
-		ret = i915_gem_stolen_insert_node(dev_priv, compressed_llb,
+		ret = i915_gem_stolen_insert_node(mem, compressed_llb,
 						  4096, 4096);
 		if (ret)
 			goto err_fb;
@@ -542,15 +544,16 @@ static int intel_fbc_alloc_cfb(struct drm_i915_private *dev_priv,
 
 err_fb:
 	kfree(compressed_llb);
-	i915_gem_stolen_remove_node(dev_priv, &fbc->compressed_fb);
+	i915_gem_stolen_remove_node(mem, &fbc->compressed_fb);
 err_llb:
-	if (drm_mm_initialized(&dev_priv->mm.stolen))
+	if (drm_mm_initialized(&mem->stolen))
 		drm_info_once(&dev_priv->drm, "not enough stolen space for compressed buffer (need %d more bytes), disabling. Hint: you may be able to increase stolen memory size in the BIOS to avoid this.\n", size);
 	return -ENOSPC;
 }
 
 static void __intel_fbc_cleanup_cfb(struct drm_i915_private *dev_priv)
 {
+	struct intel_memory_region *mem = i915_stolen_region(dev_priv);
 	struct intel_fbc *fbc = &dev_priv->fbc;
 
 	if (WARN_ON(intel_fbc_hw_is_active(dev_priv)))
@@ -560,11 +563,11 @@ static void __intel_fbc_cleanup_cfb(struct drm_i915_private *dev_priv)
 		return;
 
 	if (fbc->compressed_llb) {
-		i915_gem_stolen_remove_node(dev_priv, fbc->compressed_llb);
+		i915_gem_stolen_remove_node(mem, fbc->compressed_llb);
 		kfree(fbc->compressed_llb);
 	}
 
-	i915_gem_stolen_remove_node(dev_priv, &fbc->compressed_fb);
+	i915_gem_stolen_remove_node(mem, &fbc->compressed_fb);
 }
 
 void intel_fbc_cleanup_cfb(struct drm_i915_private *dev_priv)
@@ -1468,12 +1471,13 @@ static bool need_fbc_vtd_wa(struct drm_i915_private *dev_priv)
 void intel_fbc_init(struct drm_i915_private *dev_priv)
 {
 	struct intel_fbc *fbc = &dev_priv->fbc;
+	struct intel_memory_region *mem = i915_stolen_region(dev_priv);
 
 	INIT_WORK(&fbc->underrun_work, intel_fbc_underrun_work_fn);
 	mutex_init(&fbc->lock);
 	fbc->active = false;
 
-	if (!drm_mm_initialized(&dev_priv->mm.stolen))
+	if (!mem || !drm_mm_initialized(&mem->stolen))
 		mkwrite_device_info(dev_priv)->display.has_fbc = false;
 
 	if (need_fbc_vtd_wa(dev_priv))
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_stolen.c b/drivers/gpu/drm/i915/gem/i915_gem_stolen.c
index 25e3cc53316e..0ddf48e472a0 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_stolen.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_stolen.c
@@ -27,44 +27,44 @@
  * for is a boon.
  */
 
-int i915_gem_stolen_insert_node_in_range(struct drm_i915_private *i915,
+int i915_gem_stolen_insert_node_in_range(struct intel_memory_region *mem,
 					 struct drm_mm_node *node, u64 size,
 					 unsigned alignment, u64 start, u64 end)
 {
 	int ret;
 
-	if (!drm_mm_initialized(&i915->mm.stolen))
+	if (!drm_mm_initialized(&mem->stolen))
 		return -ENODEV;
 
 	/* WaSkipStolenMemoryFirstPage:bdw+ */
-	if (INTEL_GEN(i915) >= 8 && start < 4096)
+	if (INTEL_GEN(mem->i915) >= 8 && start < 4096)
 		start = 4096;
 
-	mutex_lock(&i915->mm.stolen_lock);
-	ret = drm_mm_insert_node_in_range(&i915->mm.stolen, node,
+	mutex_lock(&mem->mm_lock);
+	ret = drm_mm_insert_node_in_range(&mem->stolen, node,
 					  size, alignment, 0,
 					  start, end, DRM_MM_INSERT_BEST);
-	mutex_unlock(&i915->mm.stolen_lock);
+	mutex_unlock(&mem->mm_lock);
 
 	return ret;
 }
 
-int i915_gem_stolen_insert_node(struct drm_i915_private *i915,
+int i915_gem_stolen_insert_node(struct intel_memory_region *mem,
 				struct drm_mm_node *node, u64 size,
 				unsigned alignment)
 {
-	return i915_gem_stolen_insert_node_in_range(i915, node,
+	return i915_gem_stolen_insert_node_in_range(mem, node,
 						    size, alignment,
 						    I915_GEM_STOLEN_BIAS,
 						    U64_MAX);
 }
 
-void i915_gem_stolen_remove_node(struct drm_i915_private *i915,
+void i915_gem_stolen_remove_node(struct intel_memory_region *mem,
 				 struct drm_mm_node *node)
 {
-	mutex_lock(&i915->mm.stolen_lock);
+	mutex_lock(&mem->mm_lock);
 	drm_mm_remove_node(node);
-	mutex_unlock(&i915->mm.stolen_lock);
+	mutex_unlock(&mem->mm_lock);
 }
 
 static int i915_adjust_stolen(struct drm_i915_private *i915,
@@ -159,12 +159,12 @@ static int i915_adjust_stolen(struct drm_i915_private *i915,
 	return 0;
 }
 
-static void i915_gem_cleanup_stolen(struct drm_i915_private *i915)
+static void i915_gem_cleanup_stolen(struct intel_memory_region *mem)
 {
-	if (!drm_mm_initialized(&i915->mm.stolen))
+	if (!drm_mm_initialized(&mem->stolen))
 		return;
 
-	drm_mm_takedown(&i915->mm.stolen);
+	drm_mm_takedown(&mem->stolen);
 }
 
 static void g4x_get_stolen_reserved(struct drm_i915_private *i915,
@@ -374,14 +374,13 @@ static void icl_get_stolen_reserved(struct drm_i915_private *i915,
 	}
 }
 
-static int i915_gem_init_stolen(struct drm_i915_private *i915)
+static int i915_gem_init_stolen(struct intel_memory_region *mem)
 {
+	struct drm_i915_private *i915 = mem->i915;
 	struct intel_uncore *uncore = &i915->uncore;
 	resource_size_t reserved_base, stolen_top;
 	resource_size_t reserved_total, reserved_size;
 
-	mutex_init(&i915->mm.stolen_lock);
-
 	if (intel_vgpu_active(i915)) {
 		drm_notice(&i915->drm,
 			   "%s, disabling use of stolen memory\n",
@@ -396,10 +395,10 @@ static int i915_gem_init_stolen(struct drm_i915_private *i915)
 		return 0;
 	}
 
-	if (resource_size(&intel_graphics_stolen_res) == 0)
+	if (resource_size(&mem->region) == 0)
 		return 0;
 
-	i915->dsm = intel_graphics_stolen_res;
+	i915->dsm = mem->region;
 
 	if (i915_adjust_stolen(i915, &i915->dsm))
 		return 0;
@@ -492,7 +491,7 @@ static int i915_gem_init_stolen(struct drm_i915_private *i915)
 		resource_size(&i915->dsm) - reserved_total;
 
 	/* Basic memrange allocator for stolen space. */
-	drm_mm_init(&i915->mm.stolen, 0, i915->stolen_usable_size);
+	drm_mm_init(&mem->stolen, 0, i915->stolen_usable_size);
 
 	return 0;
 }
@@ -535,14 +534,14 @@ static void dbg_poison(struct i915_ggtt *ggtt,
 }
 
 static struct sg_table *
-i915_pages_create_for_stolen(struct drm_device *dev,
+i915_pages_create_for_stolen(struct drm_i915_gem_object *obj,
 			     resource_size_t offset, resource_size_t size)
 {
-	struct drm_i915_private *i915 = to_i915(dev);
+	struct intel_memory_region *mem = obj->mm.region;
 	struct sg_table *st;
 	struct scatterlist *sg;
 
-	GEM_BUG_ON(range_overflows(offset, size, resource_size(&i915->dsm)));
+	GEM_BUG_ON(range_overflows(offset, size, resource_size(&mem->region)));
 
 	/* We hide that we have no struct page backing our stolen object
 	 * by wrapping the contiguous physical allocation with a fake
@@ -562,7 +561,7 @@ i915_pages_create_for_stolen(struct drm_device *dev,
 	sg->offset = 0;
 	sg->length = size;
 
-	sg_dma_address(sg) = (dma_addr_t)i915->dsm.start + offset;
+	sg_dma_address(sg) = (dma_addr_t)mem->region.start + offset;
 	sg_dma_len(sg) = size;
 
 	return st;
@@ -571,7 +570,7 @@ i915_pages_create_for_stolen(struct drm_device *dev,
 static int i915_gem_object_get_pages_stolen(struct drm_i915_gem_object *obj)
 {
 	struct sg_table *pages =
-		i915_pages_create_for_stolen(obj->base.dev,
+		i915_pages_create_for_stolen(obj,
 					     obj->stolen->start,
 					     obj->stolen->size);
 	if (IS_ERR(pages))
@@ -590,118 +589,113 @@ static int i915_gem_object_get_pages_stolen(struct drm_i915_gem_object *obj)
 static void i915_gem_object_put_pages_stolen(struct drm_i915_gem_object *obj,
 					     struct sg_table *pages)
 {
-	/* Should only be called from i915_gem_object_release_stolen() */
+	struct intel_memory_region *mem = obj->mm.region;
+	struct drm_mm_node *stolen = fetch_and_zero(&obj->stolen);
+
+	GEM_BUG_ON(!mem);
+	GEM_BUG_ON(!stolen);
 
 	dbg_poison(&to_i915(obj->base.dev)->ggtt,
 		   sg_dma_address(pages->sgl),
 		   sg_dma_len(pages->sgl),
 		   POISON_FREE);
 
+	i915_gem_stolen_remove_node(mem, stolen);
+	kfree(stolen);
+
 	sg_free_table(pages);
 	kfree(pages);
 }
 
-static void
-i915_gem_object_release_stolen(struct drm_i915_gem_object *obj)
-{
-	struct drm_i915_private *i915 = to_i915(obj->base.dev);
-	struct drm_mm_node *stolen = fetch_and_zero(&obj->stolen);
-
-	GEM_BUG_ON(!stolen);
-
-	i915_gem_object_release_memory_region(obj);
-
-	i915_gem_stolen_remove_node(i915, stolen);
-	kfree(stolen);
-}
-
 static const struct drm_i915_gem_object_ops i915_gem_object_stolen_ops = {
 	.name = "i915_gem_object_stolen",
 	.get_pages = i915_gem_object_get_pages_stolen,
 	.put_pages = i915_gem_object_put_pages_stolen,
-	.release = i915_gem_object_release_stolen,
+	.release = i915_gem_object_release_memory_region,
 };
 
 static struct drm_i915_gem_object *
 __i915_gem_object_create_stolen(struct intel_memory_region *mem,
-				struct drm_mm_node *stolen)
+			       resource_size_t size,
+			       unsigned int flags)
 {
 	static struct lock_class_key lock_class;
+	struct drm_i915_private *i915 = mem->i915;
 	struct drm_i915_gem_object *obj;
-	unsigned int cache_level;
-	int err = -ENOMEM;
+
+	if (!drm_mm_initialized(&mem->stolen))
+		return ERR_PTR(-ENODEV);
+
+	if (size == 0)
+		return ERR_PTR(-EINVAL);
 
 	obj = i915_gem_object_alloc();
 	if (!obj)
-		goto err;
+		return ERR_PTR(-ENOMEM);
 
-	drm_gem_private_object_init(&mem->i915->drm, &obj->base, stolen->size);
-	i915_gem_object_init(obj, &i915_gem_object_stolen_ops, &lock_class, 0);
+	drm_gem_private_object_init(&i915->drm, &obj->base, size);
+	i915_gem_object_init(obj, &i915_gem_object_stolen_ops, &lock_class,
+			     flags);
 
-	obj->stolen = stolen;
 	obj->read_domains = I915_GEM_DOMAIN_CPU | I915_GEM_DOMAIN_GTT;
-	cache_level = HAS_LLC(mem->i915) ? I915_CACHE_LLC : I915_CACHE_NONE;
-	i915_gem_object_set_cache_coherency(obj, cache_level);
-
-	if (WARN_ON(!i915_gem_object_trylock(obj))) {
-		err = -EBUSY;
-		goto cleanup;
-	}
-
-	err = i915_gem_object_pin_pages(obj);
-	if (err) {
-		i915_gem_object_unlock(obj);
-		goto cleanup;
-	}
+	obj->cache_level = HAS_LLC(i915) ? I915_CACHE_LLC : I915_CACHE_NONE;
 
 	i915_gem_object_init_memory_region(obj, mem);
-	i915_gem_object_unlock(obj);
 
 	return obj;
-
-cleanup:
-	i915_gem_object_free(obj);
-err:
-	return ERR_PTR(err);
 }
 
 static struct drm_i915_gem_object *
-_i915_gem_object_create_stolen(struct intel_memory_region *mem,
-			       resource_size_t size,
-			       unsigned int flags)
+i915_gem_object_create_stolen_region(struct intel_memory_region *mem,
+				     resource_size_t size,
+				     unsigned int flags)
 {
-	struct drm_i915_private *i915 = mem->i915;
-	struct drm_i915_gem_object *obj;
+	struct drm_i915_gem_object *obj, *err;
 	struct drm_mm_node *stolen;
 	int ret;
 
-	if (!drm_mm_initialized(&i915->mm.stolen))
-		return ERR_PTR(-ENODEV);
-
-	if (size == 0)
-		return ERR_PTR(-EINVAL);
-
 	stolen = kzalloc(sizeof(*stolen), GFP_KERNEL);
 	if (!stolen)
 		return ERR_PTR(-ENOMEM);
 
-	ret = i915_gem_stolen_insert_node(i915, stolen, size, 4096);
+	ret = i915_gem_stolen_insert_node(mem, stolen, size,
+					  mem->min_page_size);
 	if (ret) {
-		obj = ERR_PTR(ret);
+		err = ERR_PTR(ret);
 		goto err_free;
 	}
 
-	obj = __i915_gem_object_create_stolen(mem, stolen);
-	if (IS_ERR(obj))
+	obj = __i915_gem_object_create_stolen(mem, size,
+					      I915_BO_ALLOC_CONTIGUOUS);
+	if (IS_ERR(obj)) {
+		err = obj;
 		goto err_remove;
+	}
+
+	/* must set before pin pages */
+	obj->stolen = stolen;
+
+	/* if pinning fails, caller needs to free stolen */
+	if (drm_WARN_ON(obj->base.dev, !i915_gem_object_trylock(obj))) {
+		ret = -EBUSY;
+		goto free_obj;
+	}
+	ret = i915_gem_object_pin_pages(obj);
+	i915_gem_object_unlock(obj);
+	if (ret) {
+		err = ERR_PTR(ret);
+		goto free_obj;
+	}
 
 	return obj;
 
+free_obj:
+	i915_gem_object_put(obj);
 err_remove:
-	i915_gem_stolen_remove_node(i915, stolen);
+	i915_gem_stolen_remove_node(mem, stolen);
 err_free:
 	kfree(stolen);
-	return obj;
+	return err;
 }
 
 struct intel_memory_region *i915_stolen_region(struct drm_i915_private *i915)
@@ -728,18 +722,18 @@ static int init_stolen(struct intel_memory_region *mem)
 	 * Initialise stolen early so that we may reserve preallocated
 	 * objects for the BIOS to KMS transition.
 	 */
-	return i915_gem_init_stolen(mem->i915);
+	return i915_gem_init_stolen(mem);
 }
 
 static void release_stolen(struct intel_memory_region *mem)
 {
-	i915_gem_cleanup_stolen(mem->i915);
+	i915_gem_cleanup_stolen(mem);
 }
 
 static const struct intel_memory_region_ops i915_region_stolen_ops = {
 	.init = init_stolen,
 	.release = release_stolen,
-	.create_object = _i915_gem_object_create_stolen,
+	.create_object = i915_gem_object_create_stolen_region,
 };
 
 struct intel_memory_region *i915_gem_stolen_setup(struct drm_i915_private *i915)
@@ -761,9 +755,6 @@ i915_gem_object_create_stolen_for_preallocated(struct drm_i915_private *i915,
 	struct drm_mm_node *stolen;
 	int ret;
 
-	if (!drm_mm_initialized(&i915->mm.stolen))
-		return ERR_PTR(-ENODEV);
-
 	drm_dbg(&i915->drm,
 		"creating preallocated stolen object: stolen_offset=%pa, size=%pa\n",
 		&stolen_offset, &size);
@@ -780,23 +771,27 @@ i915_gem_object_create_stolen_for_preallocated(struct drm_i915_private *i915,
 
 	stolen->start = stolen_offset;
 	stolen->size = size;
-	mutex_lock(&i915->mm.stolen_lock);
-	ret = drm_mm_reserve_node(&i915->mm.stolen, stolen);
-	mutex_unlock(&i915->mm.stolen_lock);
+	mutex_lock(&mem->mm_lock);
+	ret = drm_mm_reserve_node(&mem->stolen, stolen);
+	mutex_unlock(&mem->mm_lock);
 	if (ret) {
 		obj = ERR_PTR(ret);
 		goto err_free;
 	}
 
-	obj = __i915_gem_object_create_stolen(mem, stolen);
+	obj = __i915_gem_object_create_stolen(mem, size,
+					      I915_BO_ALLOC_CONTIGUOUS);
 	if (IS_ERR(obj))
 		goto err_stolen;
 
+	/* must set before pin pages */
+	obj->stolen = stolen;
+
 	i915_gem_object_set_cache_coherency(obj, I915_CACHE_NONE);
 	return obj;
 
 err_stolen:
-	i915_gem_stolen_remove_node(i915, stolen);
+	i915_gem_stolen_remove_node(mem, stolen);
 err_free:
 	kfree(stolen);
 	return obj;
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_stolen.h b/drivers/gpu/drm/i915/gem/i915_gem_stolen.h
index 67f6264f3ff9..f64a5552e56b 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_stolen.h
+++ b/drivers/gpu/drm/i915/gem/i915_gem_stolen.h
@@ -11,15 +11,16 @@
 struct drm_i915_private;
 struct drm_mm_node;
 struct drm_i915_gem_object;
+struct intel_memory_region;
 
-int i915_gem_stolen_insert_node(struct drm_i915_private *dev_priv,
+int i915_gem_stolen_insert_node(struct intel_memory_region *mem,
 				struct drm_mm_node *node, u64 size,
 				unsigned alignment);
-int i915_gem_stolen_insert_node_in_range(struct drm_i915_private *dev_priv,
+int i915_gem_stolen_insert_node_in_range(struct intel_memory_region *mem,
 					 struct drm_mm_node *node, u64 size,
 					 unsigned alignment, u64 start,
 					 u64 end);
-void i915_gem_stolen_remove_node(struct drm_i915_private *dev_priv,
+void i915_gem_stolen_remove_node(struct intel_memory_region *mem,
 				 struct drm_mm_node *node);
 struct intel_memory_region *i915_gem_stolen_setup(struct drm_i915_private *i915);
 
diff --git a/drivers/gpu/drm/i915/gt/selftest_reset.c b/drivers/gpu/drm/i915/gt/selftest_reset.c
index ef5aeebbeeb0..7f4fd49bdd73 100644
--- a/drivers/gpu/drm/i915/gt/selftest_reset.c
+++ b/drivers/gpu/drm/i915/gt/selftest_reset.c
@@ -20,6 +20,7 @@ __igt_reset_stolen(struct intel_gt *gt,
 {
 	struct i915_ggtt *ggtt = &gt->i915->ggtt;
 	const struct resource *dsm = &gt->i915->dsm;
+	struct intel_memory_region *mem = i915_stolen_region(gt->i915);
 	resource_size_t num_pages, page;
 	struct intel_engine_cs *engine;
 	intel_wakeref_t wakeref;
@@ -92,7 +93,7 @@ __igt_reset_stolen(struct intel_gt *gt,
 				      ggtt->error_capture.start,
 				      PAGE_SIZE);
 
-		if (!__drm_mm_interval_first(&gt->i915->mm.stolen,
+		if (!__drm_mm_interval_first(&mem->stolen,
 					     page << PAGE_SHIFT,
 					     ((page + 1) << PAGE_SHIFT) - 1))
 			memset32(s, STACK_MAGIC, PAGE_SIZE / sizeof(u32));
@@ -139,7 +140,7 @@ __igt_reset_stolen(struct intel_gt *gt,
 		x = crc32_le(0, in, PAGE_SIZE);
 
 		if (x != crc[page] &&
-		    !__drm_mm_interval_first(&gt->i915->mm.stolen,
+		    !__drm_mm_interval_first(&mem->stolen,
 					     page << PAGE_SHIFT,
 					     ((page + 1) << PAGE_SHIFT) - 1)) {
 			pr_debug("unused stolen page %pa modified by GPU reset\n",
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 13cb4936f15c..1366b53ac8c9 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -549,12 +549,6 @@ struct intel_l3_parity {
 };
 
 struct i915_gem_mm {
-	/** Memory allocator for GTT stolen memory */
-	struct drm_mm stolen;
-	/** Protects the usage of the GTT stolen memory allocator. This is
-	 * always the inner lock when overlapping with struct_mutex. */
-	struct mutex stolen_lock;
-
 	/* Protects bound_list/unbound_list and #drm_i915_gem_object.mm.link */
 	spinlock_t obj_lock;
 
diff --git a/drivers/gpu/drm/i915/intel_memory_region.h b/drivers/gpu/drm/i915/intel_memory_region.h
index ed827c770d47..b7a9e34faaf1 100644
--- a/drivers/gpu/drm/i915/intel_memory_region.h
+++ b/drivers/gpu/drm/i915/intel_memory_region.h
@@ -6,6 +6,7 @@
 #ifndef __INTEL_MEMORY_REGION_H__
 #define __INTEL_MEMORY_REGION_H__
 
+#include <drm/drm_mm.h>
 #include <linux/kref.h>
 #include <linux/ioport.h>
 #include <linux/mutex.h>
@@ -77,6 +78,8 @@ struct intel_memory_region {
 	/* For fake LMEM */
 	struct drm_mm_node fake_mappable;
 
+	struct drm_mm stolen;
+
 	struct i915_buddy_mm mm;
 	struct mutex mm_lock;
 
-- 
2.26.2


* [RFC PATCH 111/162] drm/i915/lmem: support optional CPU clearing for special internal use
  2020-11-27 12:04 [RFC PATCH 000/162] DG1 + LMEM enabling Matthew Auld
                   ` (109 preceding siblings ...)
  2020-11-27 12:06 ` [RFC PATCH 110/162] drm/i915: finish memory region support for stolen objects Matthew Auld
@ 2020-11-27 12:06 ` Matthew Auld
  2020-11-27 12:06 ` [RFC PATCH 112/162] drm/i915/guc: put all guc objects in lmem when available Matthew Auld
                   ` (50 subsequent siblings)
  161 siblings, 0 replies; 208+ messages in thread
From: Matthew Auld @ 2020-11-27 12:06 UTC (permalink / raw)
  To: intel-gfx; +Cc: dri-devel

For some internal device local-memory objects it would be useful to have
an option to CPU-clear the pages when gathering the backing store. Note
that this may happen before the blitter is usable, which is the case for
some internal GuC objects.
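
A usage sketch, mirroring what the GuC patch later in the series does
(the flag combination is assumed to be valid for kernel-internal
objects only):

	obj = i915_gem_object_create_lmem(i915, size,
					  I915_BO_ALLOC_CPU_CLEAR |
					  I915_BO_ALLOC_CONTIGUOUS);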

Signed-off-by: Matthew Auld <matthew.auld@intel.com>
---
 .../gpu/drm/i915/gem/i915_gem_object_types.h  |  6 +-
 drivers/gpu/drm/i915/gem/i915_gem_region.c    | 20 +++++
 .../drm/i915/selftests/intel_memory_region.c  | 90 ++++++++++++++++++-
 3 files changed, 113 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object_types.h b/drivers/gpu/drm/i915/gem/i915_gem_object_types.h
index 115ad32c303f..8d639509b78b 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_object_types.h
+++ b/drivers/gpu/drm/i915/gem/i915_gem_object_types.h
@@ -166,10 +166,12 @@ struct drm_i915_gem_object {
 #define I915_BO_ALLOC_CONTIGUOUS BIT(0)
 #define I915_BO_ALLOC_VOLATILE   BIT(1)
 #define I915_BO_ALLOC_STRUCT_PAGE BIT(2)
+#define I915_BO_ALLOC_CPU_CLEAR  BIT(3)
 #define I915_BO_ALLOC_FLAGS (I915_BO_ALLOC_CONTIGUOUS | \
 			     I915_BO_ALLOC_VOLATILE | \
-			     I915_BO_ALLOC_STRUCT_PAGE)
-#define I915_BO_READONLY         BIT(3)
+			     I915_BO_ALLOC_STRUCT_PAGE | \
+			     I915_BO_ALLOC_CPU_CLEAR)
+#define I915_BO_READONLY         BIT(4)
 
 	/*
 	 * Is the object to be mapped as read-only to the GPU
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_region.c b/drivers/gpu/drm/i915/gem/i915_gem_region.c
index 8f352ba6202d..e497ff374b13 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_region.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_region.c
@@ -95,6 +95,26 @@ i915_gem_object_get_pages_buddy(struct drm_i915_gem_object *obj)
 	sg_mark_end(sg);
 	i915_sg_trim(st);
 
+	/* Intended for kernel internal use only */
+	if (obj->flags & I915_BO_ALLOC_CPU_CLEAR) {
+		struct scatterlist *sg;
+		unsigned long i;
+
+		for_each_sg(st->sgl, sg, st->nents, i) {
+			unsigned int length;
+			void __iomem *vaddr;
+			dma_addr_t daddr;
+
+			daddr = sg_dma_address(sg);
+			daddr -= mem->region.start;
+			length = sg_dma_len(sg);
+
+			vaddr = io_mapping_map_wc(&mem->iomap, daddr, length);
+			memset64(vaddr, 0, length / sizeof(u64));
+			io_mapping_unmap(vaddr);
+		}
+	}
+
 	__i915_gem_object_set_pages(obj, st, sg_page_sizes);
 
 	return 0;
diff --git a/drivers/gpu/drm/i915/selftests/intel_memory_region.c b/drivers/gpu/drm/i915/selftests/intel_memory_region.c
index 7acb94e0e5fe..93e067951e0f 100644
--- a/drivers/gpu/drm/i915/selftests/intel_memory_region.c
+++ b/drivers/gpu/drm/i915/selftests/intel_memory_region.c
@@ -361,7 +361,7 @@ static int igt_cpu_check(struct drm_i915_gem_object *obj, u32 dword, u32 val)
 	if (err)
 		return err;
 
-	ptr = i915_gem_object_pin_map_unlocked(obj, I915_MAP_WC);
+	ptr = i915_gem_object_pin_map(obj, I915_MAP_WC);
 	if (IS_ERR(ptr))
 		return PTR_ERR(ptr);
 
@@ -441,7 +441,9 @@ static int igt_gpu_write(struct i915_gem_context *ctx,
 		if (err)
 			break;
 
+		i915_gem_object_lock(obj, NULL);
 		err = igt_cpu_check(obj, dword, rng);
+		i915_gem_object_unlock(obj);
 		if (err)
 			break;
 	} while (!__igt_timeout(end_time, NULL));
@@ -542,6 +544,91 @@ static int igt_lmem_create_migrate(void *arg)
 
 	return err;
 }
+
+static int igt_lmem_create_cleared_cpu(void *arg)
+{
+	struct drm_i915_private *i915 = arg;
+	I915_RND_STATE(prng);
+	IGT_TIMEOUT(end_time);
+	u32 size, val, i;
+	int err;
+
+	i915_gem_drain_freed_objects(i915);
+
+	size = max_t(u32, PAGE_SIZE, i915_prandom_u32_max_state(SZ_32M, &prng));
+	size = round_up(size, PAGE_SIZE);
+	i = 0;
+
+	do {
+		struct drm_i915_gem_object *obj;
+		void __iomem *vaddr;
+		unsigned int flags;
+		unsigned long n;
+		u32 dword;
+
+		/*
+		 * Alternate between cleared and uncleared allocations, while
+		 * also dirtying the pages each time to check that they either
+		 * remain dirty or are indeed cleared. Allocations should be
+		 * deterministic.
+		 */
+
+		flags = I915_BO_ALLOC_CPU_CLEAR;
+		if (i & 1)
+			flags = 0;
+		else
+			val = 0;
+
+		obj = i915_gem_object_create_lmem(i915, size, flags);
+		if (IS_ERR(obj))
+			return PTR_ERR(obj);
+
+		i915_gem_object_lock(obj, NULL);
+		err = i915_gem_object_pin_pages(obj);
+		if (err)
+			goto out_put;
+
+		dword = i915_prandom_u32_max_state(PAGE_SIZE / sizeof(u32),
+						   &prng);
+
+		err = igt_cpu_check(obj, dword, val);
+		if (err) {
+			pr_err("%s failed with size=%u, flags=%u\n",
+			       __func__, size, flags);
+			goto out_unpin;
+		}
+
+		vaddr = i915_gem_object_pin_map(obj, I915_MAP_WC);
+		if (IS_ERR(vaddr)) {
+			err = PTR_ERR(vaddr);
+			goto out_unpin;
+		}
+
+		val = prandom_u32_state(&prng);
+
+		for (n = 0; n < obj->base.size >> PAGE_SHIFT; ++n) {
+			memset32(vaddr + n * PAGE_SIZE, val,
+				 PAGE_SIZE / sizeof(u32));
+		}
+
+		i915_gem_object_unpin_map(obj);
+out_unpin:
+		i915_gem_object_unpin_pages(obj);
+		__i915_gem_object_put_pages(obj);
+out_put:
+		i915_gem_object_unlock(obj);
+		i915_gem_object_put(obj);
+
+		if (err)
+			break;
+		++i;
+	} while (!__igt_timeout(end_time, NULL));
+
+	pr_info("%s completed (%u) iterations\n", __func__, i);
+
+	return err;
+}
+
 static int igt_lmem_write_gpu(void *arg)
 {
 	struct drm_i915_private *i915 = arg;
@@ -1125,6 +1212,7 @@ int intel_memory_region_live_selftests(struct drm_i915_private *i915)
 {
 	static const struct i915_subtest tests[] = {
 		SUBTEST(igt_lmem_create),
+		SUBTEST(igt_lmem_create_cleared_cpu),
 		SUBTEST(igt_lmem_write_cpu),
 		SUBTEST(igt_lmem_write_gpu),
 		SUBTEST(igt_smem_create_migrate),
-- 
2.26.2


* [RFC PATCH 112/162] drm/i915/guc: put all guc objects in lmem when available
  2020-11-27 12:04 [RFC PATCH 000/162] DG1 + LMEM enabling Matthew Auld
                   ` (110 preceding siblings ...)
  2020-11-27 12:06 ` [RFC PATCH 111/162] drm/i915/lmem: support optional CPU clearing for special internal use Matthew Auld
@ 2020-11-27 12:06 ` Matthew Auld
  2020-11-27 12:06 ` [RFC PATCH 113/162] drm/i915: Create stolen memory region from local memory Matthew Auld
                   ` (49 subsequent siblings)
  161 siblings, 0 replies; 208+ messages in thread
From: Matthew Auld @ 2020-11-27 12:06 UTC (permalink / raw)
  To: intel-gfx
  Cc: Abdiel Janulgue, dri-devel, Daniele Ceraolo Spurio,
	Radoslaw Szwichtenberg, Vinay Belgaumkar, Michal Wajdeczko

From: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>

The firmware binary has to be loaded from lmem, and the recommendation
is to put all other GuC objects in there as well. Note that we don't
fall back to system memory if an lmem allocation fails: all objects are
allocated during driver load, and if we have issues with lmem at that
point something is seriously wrong with the system, so there is no point
in trying to handle it.
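
Note that i915_gem_object_lmem_io_map() added below asserts a
contiguous object, since it maps a single span of the BAR; a rough
sketch of the RSA read-back it enables (locals elided):

	vaddr = i915_gem_object_lmem_io_map(uc_fw->obj, page_idx,
					    page_off + size);
	memcpy(dst, vaddr + page_off, size);
	io_mapping_unmap(vaddr);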

Cc: Matthew Auld <matthew.auld@intel.com>
Cc: Abdiel Janulgue <abdiel.janulgue@linux.intel.com>
Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Vinay Belgaumkar <vinay.belgaumkar@intel.com>
Cc: Radoslaw Szwichtenberg <radoslaw.szwichtenberg@intel.com>
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Michal Wajdeczko <michal.wajdeczko@intel.com> #v1
---
 drivers/gpu/drm/i915/gem/i915_gem_lmem.c  | 41 +++++++++++++++++++++++
 drivers/gpu/drm/i915/gem/i915_gem_lmem.h  |  8 +++++
 drivers/gpu/drm/i915/gt/uc/intel_guc.c    |  9 ++++-
 drivers/gpu/drm/i915/gt/uc/intel_guc_fw.c | 11 ++++--
 drivers/gpu/drm/i915/gt/uc/intel_huc.c    | 14 ++++++--
 drivers/gpu/drm/i915/gt/uc/intel_uc_fw.c  | 35 ++++++++++++++++---
 6 files changed, 107 insertions(+), 11 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_lmem.c b/drivers/gpu/drm/i915/gem/i915_gem_lmem.c
index e56874e54fde..71c07e1f6f26 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_lmem.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_lmem.c
@@ -215,6 +215,21 @@ i915_gem_object_lmem_io_map_page_atomic(struct drm_i915_gem_object *obj,
 	return io_mapping_map_atomic_wc(&obj->mm.region->iomap, offset);
 }
 
+void __iomem *
+i915_gem_object_lmem_io_map(struct drm_i915_gem_object *obj,
+			    unsigned long n,
+			    unsigned long size)
+{
+	resource_size_t offset;
+
+	GEM_BUG_ON(!i915_gem_object_is_contiguous(obj));
+
+	offset = i915_gem_object_get_dma_address(obj, n);
+	offset -= obj->mm.region->region.start;
+
+	return io_mapping_map_wc(&obj->mm.region->iomap, offset, size);
+}
+
 bool i915_gem_object_is_lmem(struct drm_i915_gem_object *obj)
 {
 	struct intel_memory_region *region = obj->mm.region;
@@ -229,6 +244,32 @@ bool i915_gem_object_is_devmem(struct drm_i915_gem_object *obj)
 	return region && region->is_devmem;
 }
 
+struct drm_i915_gem_object *
+i915_gem_object_create_lmem_from_data(struct drm_i915_private *i915,
+				      const void *data, size_t size)
+{
+	struct drm_i915_gem_object *obj;
+	void *map;
+
+	obj = i915_gem_object_create_lmem(i915,
+					  round_up(size, PAGE_SIZE),
+					  I915_BO_ALLOC_CONTIGUOUS);
+	if (IS_ERR(obj))
+		return obj;
+
+	map = i915_gem_object_pin_map_unlocked(obj, I915_MAP_WC);
+	if (IS_ERR(map)) {
+		i915_gem_object_put(obj);
+		return map;
+	}
+
+	memcpy(map, data, size);
+
+	i915_gem_object_unpin_map(obj);
+
+	return obj;
+}
+
 struct drm_i915_gem_object *
 i915_gem_object_create_lmem(struct drm_i915_private *i915,
 			    resource_size_t size,
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_lmem.h b/drivers/gpu/drm/i915/gem/i915_gem_lmem.h
index a1b6a10050bf..e11e0545e39c 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_lmem.h
+++ b/drivers/gpu/drm/i915/gem/i915_gem_lmem.h
@@ -14,6 +14,10 @@ struct intel_memory_region;
 
 extern const struct drm_i915_gem_object_ops i915_gem_lmem_obj_ops;
 
+void __iomem *
+i915_gem_object_lmem_io_map(struct drm_i915_gem_object *obj,
+			    unsigned long n,
+			    unsigned long size);
 void __iomem *i915_gem_object_lmem_io_map_page(struct drm_i915_gem_object *obj,
 					       unsigned long n);
 void __iomem *
@@ -23,6 +27,10 @@ i915_gem_object_lmem_io_map_page_atomic(struct drm_i915_gem_object *obj,
 bool i915_gem_object_is_lmem(struct drm_i915_gem_object *obj);
 bool i915_gem_object_is_devmem(struct drm_i915_gem_object *obj);
 
+struct drm_i915_gem_object *
+i915_gem_object_create_lmem_from_data(struct drm_i915_private *i915,
+				      const void *data, size_t size);
+
 struct drm_i915_gem_object *
 i915_gem_object_create_lmem(struct drm_i915_private *i915,
 			    resource_size_t size,
diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc.c b/drivers/gpu/drm/i915/gt/uc/intel_guc.c
index b54b9de31c3e..703726825c50 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc.c
@@ -3,6 +3,7 @@
  * Copyright © 2014-2019 Intel Corporation
  */
 
+#include "gem/i915_gem_lmem.h"
 #include "gt/intel_gt.h"
 #include "gt/intel_gt_irq.h"
 #include "gt/intel_gt_pm_irq.h"
@@ -650,7 +651,13 @@ struct i915_vma *intel_guc_allocate_vma(struct intel_guc *guc, u32 size)
 	u64 flags;
 	int ret;
 
-	obj = i915_gem_object_create_shmem(gt->i915, size);
+	if (HAS_LMEM(gt->i915))
+		obj = i915_gem_object_create_lmem(gt->i915, size,
+						  I915_BO_ALLOC_CPU_CLEAR |
+						  I915_BO_ALLOC_CONTIGUOUS);
+	else
+		obj = i915_gem_object_create_shmem(gt->i915, size);
+
 	if (IS_ERR(obj))
 		return ERR_CAST(obj);
 
diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_fw.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_fw.c
index f9d0907ea1a5..8790052f1562 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc_fw.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_fw.c
@@ -41,7 +41,7 @@ static void guc_prepare_xfer(struct intel_uncore *uncore)
 }
 
 /* Copy RSA signature from the fw image to HW for verification */
-static void guc_xfer_rsa(struct intel_uc_fw *guc_fw,
+static int guc_xfer_rsa(struct intel_uc_fw *guc_fw,
 			 struct intel_uncore *uncore)
 {
 	u32 rsa[UOS_RSA_SCRATCH_COUNT];
@@ -49,10 +49,13 @@ static void guc_xfer_rsa(struct intel_uc_fw *guc_fw,
 	int i;
 
 	copied = intel_uc_fw_copy_rsa(guc_fw, rsa, sizeof(rsa));
-	GEM_BUG_ON(copied < sizeof(rsa));
+	if (copied < sizeof(rsa))
+		return -ENOMEM;
 
 	for (i = 0; i < UOS_RSA_SCRATCH_COUNT; i++)
 		intel_uncore_write(uncore, UOS_RSA_SCRATCH(i), rsa[i]);
+
+	return 0;
 }
 
 /*
@@ -142,7 +145,9 @@ int intel_guc_fw_upload(struct intel_guc *guc)
 	 * by the DMA engine in one operation, whereas the RSA signature is
 	 * loaded via MMIO.
 	 */
-	guc_xfer_rsa(&guc->fw, uncore);
+	ret = guc_xfer_rsa(&guc->fw, uncore);
+	if (ret)
+		goto out;
 
 	/*
 	 * Current uCode expects the code to be loaded at 8k; locations below
diff --git a/drivers/gpu/drm/i915/gt/uc/intel_huc.c b/drivers/gpu/drm/i915/gt/uc/intel_huc.c
index 56d2144dc6a0..c70bd024f1e1 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_huc.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_huc.c
@@ -87,17 +87,25 @@ static int intel_huc_rsa_data_create(struct intel_huc *huc)
 									vma->obj, true));
 	if (IS_ERR(vaddr)) {
 		i915_vma_unpin_and_release(&vma, 0);
-		return PTR_ERR(vaddr);
+		err = PTR_ERR(vaddr);
+		goto unpin_out;
 	}
 
 	copied = intel_uc_fw_copy_rsa(&huc->fw, vaddr, vma->size);
-	GEM_BUG_ON(copied < huc->fw.rsa_size);
-
 	i915_gem_object_unpin_map(vma->obj);
 
+	if (copied < huc->fw.rsa_size) {
+		err = -ENOMEM;
+		goto unpin_out;
+	}
+
 	huc->rsa_data = vma;
 
 	return 0;
+
+unpin_out:
+	i915_vma_unpin_and_release(&vma, 0);
+	return err;
 }
 
 static void intel_huc_rsa_data_destroy(struct intel_huc *huc)
diff --git a/drivers/gpu/drm/i915/gt/uc/intel_uc_fw.c b/drivers/gpu/drm/i915/gt/uc/intel_uc_fw.c
index b05076d190cc..795eca2bd5b4 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_uc_fw.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_uc_fw.c
@@ -7,6 +7,7 @@
 #include <linux/firmware.h>
 #include <drm/drm_print.h>
 
+#include "gem/i915_gem_lmem.h"
 #include "intel_uc_fw.h"
 #include "intel_uc_fw_abi.h"
 #include "i915_drv.h"
@@ -371,7 +372,11 @@ int intel_uc_fw_fetch(struct intel_uc_fw *uc_fw)
 	if (uc_fw->type == INTEL_UC_FW_TYPE_GUC)
 		uc_fw->private_data_size = css->private_data_size;
 
-	obj = i915_gem_object_create_shmem_from_data(i915, fw->data, fw->size);
+	if (HAS_LMEM(i915))
+		obj = i915_gem_object_create_lmem_from_data(i915, fw->data, fw->size);
+	else
+		obj = i915_gem_object_create_shmem_from_data(i915, fw->data, fw->size);
+
 	if (IS_ERR(obj)) {
 		err = PTR_ERR(obj);
 		goto fail;
@@ -420,14 +425,19 @@ static void uc_fw_bind_ggtt(struct intel_uc_fw *uc_fw)
 		.pages = obj->mm.pages,
 		.vm = &ggtt->vm,
 	};
+	u32 pte_flags = 0;
 
 	GEM_BUG_ON(!i915_gem_object_has_pinned_pages(obj));
 	GEM_BUG_ON(dummy.node.size > ggtt->uc_fw.size);
 
 	/* uc_fw->obj cache domains were not controlled across suspend */
-	drm_clflush_sg(dummy.pages);
+	if (i915_gem_object_has_struct_page(obj))
+		drm_clflush_sg(dummy.pages);
+
+	if (i915_gem_object_is_lmem(obj))
+		pte_flags |= PTE_LM;
 
-	ggtt->vm.insert_entries(&ggtt->vm, &dummy, I915_CACHE_NONE, 0);
+	ggtt->vm.insert_entries(&ggtt->vm, &dummy, I915_CACHE_NONE, pte_flags);
 }
 
 static void uc_fw_unbind_ggtt(struct intel_uc_fw *uc_fw)
@@ -592,7 +602,24 @@ size_t intel_uc_fw_copy_rsa(struct intel_uc_fw *uc_fw, void *dst, u32 max_len)
 
 	GEM_BUG_ON(!intel_uc_fw_is_available(uc_fw));
 
-	return sg_pcopy_to_buffer(pages->sgl, pages->nents, dst, size, offset);
+	if (i915_gem_object_is_lmem(uc_fw->obj)) {
+		unsigned long page_idx = offset >> PAGE_SHIFT;
+		unsigned int page_off = offset_in_page(offset);
+		void __iomem *vaddr;
+
+		vaddr = i915_gem_object_lmem_io_map(uc_fw->obj,
+						    page_idx,
+						    page_off + size);
+		if (!vaddr)
+			return 0;
+
+		memcpy(dst, vaddr + page_off, size);
+		io_mapping_unmap(vaddr);
+		return size;
+	} else {
+		return sg_pcopy_to_buffer(pages->sgl, pages->nents,
+					  dst, size, offset);
+	}
 }
 
 /**
-- 
2.26.2


* [RFC PATCH 113/162] drm/i915: Create stolen memory region from local memory
  2020-11-27 12:04 [RFC PATCH 000/162] DG1 + LMEM enabling Matthew Auld
                   ` (111 preceding siblings ...)
  2020-11-27 12:06 ` [RFC PATCH 112/162] drm/i915/guc: put all guc objects in lmem when available Matthew Auld
@ 2020-11-27 12:06 ` Matthew Auld
  2020-12-07 13:39   ` [Intel-gfx] " Jani Nikula
  2020-11-27 12:06 ` [RFC PATCH 114/162] drm/i915/lmem: Bypass aperture when lmem is available Matthew Auld
                   ` (48 subsequent siblings)
  161 siblings, 1 reply; 208+ messages in thread
From: Matthew Auld @ 2020-11-27 12:06 UTC (permalink / raw)
  To: intel-gfx
  Cc: Abdiel Janulgue, Matthew Brost, Lucas De Marchi, Sudeep Dutt,
	Chris P Wilson, CQ Tang, Venkata S Dhanalakota, dri-devel,
	Neel Desai, Francesco Balestrieri, Niranjana Vishwanathapura

From: CQ Tang <cq.tang@intel.com>

Add "REGION_STOLEN" device info to dg1, create stolen memory
region from upper portion of local device memory, starting
from DSMBASE.

The memory region is marked with "is_devmem=true".
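
An illustrative BAR 2 layout after this patch (no claim about the exact
register values, which are read at probe time):

	/*
	 * [0, GSMBASE)         "lmem" region, sized earlier in the series
	 * [DSMBASE, BAR 2 end) "stolen-local" region, is_devmem = true
	 *
	 * CPU access to stolen-local goes through
	 * pci_resource_start(pdev, 2) + DSMBASE.
	 */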

Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Matthew Auld <matthew.auld@intel.com>
Cc: Abdiel Janulgue <abdiel.janulgue@linux.intel.com>
Cc: Chris P Wilson <chris.p.wilson@intel.com>
Cc: Francesco Balestrieri <francesco.balestrieri@intel.com>
Cc: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
Cc: Venkata S Dhanalakota <venkata.s.dhanalakota@intel.com>
Cc: Neel Desai <neel.desai@intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Cc: Sudeep Dutt <sudeep.dutt@intel.com>
Signed-off-by: CQ Tang <cq.tang@intel.com>
Cc: Lucas De Marchi <lucas.demarchi@intel.com>
---
 drivers/gpu/drm/i915/gem/i915_gem_lmem.c   |  4 +-
 drivers/gpu/drm/i915/gem/i915_gem_lmem.h   |  7 +++
 drivers/gpu/drm/i915/gem/i915_gem_stolen.c | 56 +++++++++++++++++++++-
 drivers/gpu/drm/i915/i915_pci.c            |  2 +-
 drivers/gpu/drm/i915/i915_reg.h            |  1 +
 drivers/gpu/drm/i915/intel_memory_region.c |  5 ++
 drivers/gpu/drm/i915/intel_memory_region.h |  2 +-
 7 files changed, 71 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_lmem.c b/drivers/gpu/drm/i915/gem/i915_gem_lmem.c
index 71c07e1f6f26..b2fd2bc862c0 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_lmem.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_lmem.c
@@ -111,8 +111,8 @@ int i915_gem_object_lmem_pread(struct drm_i915_gem_object *obj,
 	return ret;
 }
 
-static int i915_gem_object_lmem_pwrite(struct drm_i915_gem_object *obj,
-				       const struct drm_i915_gem_pwrite *arg)
+int i915_gem_object_lmem_pwrite(struct drm_i915_gem_object *obj,
+				const struct drm_i915_gem_pwrite *arg)
 {
 	struct drm_i915_private *i915 = to_i915(obj->base.dev);
 	struct intel_runtime_pm *rpm = &i915->runtime_pm;
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_lmem.h b/drivers/gpu/drm/i915/gem/i915_gem_lmem.h
index e11e0545e39c..c59aa6c014c7 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_lmem.h
+++ b/drivers/gpu/drm/i915/gem/i915_gem_lmem.h
@@ -11,9 +11,16 @@
 struct drm_i915_private;
 struct drm_i915_gem_object;
 struct intel_memory_region;
+struct drm_i915_gem_pread;
+struct drm_i915_gem_pwrite;
 
 extern const struct drm_i915_gem_object_ops i915_gem_lmem_obj_ops;
 
+int i915_gem_object_lmem_pread(struct drm_i915_gem_object *obj,
+			       const struct drm_i915_gem_pread *args);
+int i915_gem_object_lmem_pwrite(struct drm_i915_gem_object *obj,
+				const struct drm_i915_gem_pwrite *args);
+
 void __iomem *
 i915_gem_object_lmem_io_map(struct drm_i915_gem_object *obj,
 			    unsigned long n,
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_stolen.c b/drivers/gpu/drm/i915/gem/i915_gem_stolen.c
index 0ddf48e472a0..633745336f40 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_stolen.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_stolen.c
@@ -10,6 +10,7 @@
 #include <drm/drm_mm.h>
 #include <drm/i915_drm.h>
 
+#include "gem/i915_gem_lmem.h"
 #include "gem/i915_gem_region.h"
 #include "i915_drv.h"
 #include "i915_gem_stolen.h"
@@ -121,6 +122,14 @@ static int i915_adjust_stolen(struct drm_i915_private *i915,
 		}
 	}
 
+	/*
+	 * With device local memory, we don't need to check the address range,
+	 * this is device memory physical address, could overlap with system
+	 * memory.
+	 */
+	if (HAS_LMEM(i915))
+		return 0;
+
 	/*
 	 * Verify that nothing else uses this physical address. Stolen
 	 * memory should be reserved by the BIOS and hidden from the
@@ -607,7 +616,7 @@ static void i915_gem_object_put_pages_stolen(struct drm_i915_gem_object *obj,
 	kfree(pages);
 }
 
-static const struct drm_i915_gem_object_ops i915_gem_object_stolen_ops = {
+static struct drm_i915_gem_object_ops i915_gem_object_stolen_ops = {
 	.name = "i915_gem_object_stolen",
 	.get_pages = i915_gem_object_get_pages_stolen,
 	.put_pages = i915_gem_object_put_pages_stolen,
@@ -716,7 +725,19 @@ i915_gem_object_create_stolen(struct drm_i915_private *i915,
 
 static int init_stolen(struct intel_memory_region *mem)
 {
-	intel_memory_region_set_name(mem, "stolen");
+	if (mem->type == INTEL_MEMORY_STOLEN_SYSTEM)
+		intel_memory_region_set_name(mem, "stolen-system");
+	else
+		intel_memory_region_set_name(mem, "stolen-local");
+
+	if (HAS_LMEM(mem->i915)) {
+		i915_gem_object_stolen_ops.pread = i915_gem_object_lmem_pread;
+		i915_gem_object_stolen_ops.pwrite = i915_gem_object_lmem_pwrite;
+		if (!io_mapping_init_wc(&mem->iomap,
+					mem->io_start,
+					resource_size(&mem->region)))
+			return -EIO;
+	}
 
 	/*
 	 * Initialise stolen early so that we may reserve preallocated
@@ -736,8 +757,39 @@ static const struct intel_memory_region_ops i915_region_stolen_ops = {
 	.create_object = i915_gem_object_create_stolen_region,
 };
 
+static
+struct intel_memory_region *setup_lmem_stolen(struct drm_i915_private *i915)
+{
+	struct intel_uncore *uncore = &i915->uncore;
+	struct pci_dev *pdev = i915->drm.pdev;
+	struct intel_memory_region *mem;
+	resource_size_t io_start;
+	resource_size_t lmem_size;
+	u64 lmem_base;
+
+	lmem_base = intel_uncore_read64(uncore, GEN12_DSMBASE);
+	lmem_size = pci_resource_len(pdev, 2) - lmem_base;
+	io_start = pci_resource_start(pdev, 2) + lmem_base;
+
+	mem = intel_memory_region_create(i915, lmem_base, lmem_size,
+					 I915_GTT_PAGE_SIZE_4K, io_start,
+					 &i915_region_stolen_ops);
+	if (!IS_ERR(mem)) {
+		DRM_INFO("Intel graphics stolen LMEM: %pR\n", &mem->region);
+		DRM_INFO("Intel graphics stolen LMEM IO start: %llx\n",
+			 (u64)mem->io_start);
+		/* this is real device memory */
+		mem->is_devmem = true;
+	}
+
+	return mem;
+}
+
 struct intel_memory_region *i915_gem_stolen_setup(struct drm_i915_private *i915)
 {
+	if (HAS_LMEM(i915))
+		return setup_lmem_stolen(i915);
+
 	return intel_memory_region_create(i915,
 					  intel_graphics_stolen_res.start,
 					  resource_size(&intel_graphics_stolen_res),
diff --git a/drivers/gpu/drm/i915/i915_pci.c b/drivers/gpu/drm/i915/i915_pci.c
index 8243178a56f9..c3d9b36ef651 100644
--- a/drivers/gpu/drm/i915/i915_pci.c
+++ b/drivers/gpu/drm/i915/i915_pci.c
@@ -907,7 +907,7 @@ static const struct intel_device_info rkl_info = {
 
 #define GEN12_DGFX_FEATURES \
 	GEN12_FEATURES, \
-	.memory_regions = REGION_SMEM | REGION_LMEM, \
+	.memory_regions = REGION_SMEM | REGION_LMEM | REGION_STOLEN_LMEM, \
 	.has_master_unit_irq = 1, \
 	.has_llc = 0, \
 	.has_snoop = 1, \
diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
index 0e01ea0cb0a4..3c8350f108e4 100644
--- a/drivers/gpu/drm/i915/i915_reg.h
+++ b/drivers/gpu/drm/i915/i915_reg.h
@@ -12067,6 +12067,7 @@ enum skl_power_gate {
 #define   LMEM_ENABLE			(1 << 31)
 
 #define GEN12_GSMBASE			_MMIO(0x108100)
+#define GEN12_DSMBASE			_MMIO(0x1080C0)
 
 /* gamt regs */
 #define GEN8_L3_LRA_1_GPGPU _MMIO(0x4dd4)
diff --git a/drivers/gpu/drm/i915/intel_memory_region.c b/drivers/gpu/drm/i915/intel_memory_region.c
index 043541d409bd..c7a1d84e7ee8 100644
--- a/drivers/gpu/drm/i915/intel_memory_region.c
+++ b/drivers/gpu/drm/i915/intel_memory_region.c
@@ -19,6 +19,10 @@ const struct intel_memory_region_info intel_region_map[] = {
                .class = INTEL_MEMORY_STOLEN_SYSTEM,
                .instance = 0,
        },
+       [INTEL_REGION_STOLEN_LMEM] = {
+	       .class = INTEL_MEMORY_STOLEN_LOCAL,
+	       .instance = 0,
+       },
 };
 
 struct intel_memory_region *
@@ -311,6 +315,7 @@ int intel_memory_regions_hw_probe(struct drm_i915_private *i915)
 		case INTEL_MEMORY_SYSTEM:
 			mem = i915_gem_shmem_setup(i915);
 			break;
+		case INTEL_MEMORY_STOLEN_LOCAL: /* fallthrough */
 		case INTEL_MEMORY_STOLEN_SYSTEM:
 			mem = i915_gem_stolen_setup(i915);
 			break;
diff --git a/drivers/gpu/drm/i915/intel_memory_region.h b/drivers/gpu/drm/i915/intel_memory_region.h
index b7a9e34faaf1..8da82cb2afe3 100644
--- a/drivers/gpu/drm/i915/intel_memory_region.h
+++ b/drivers/gpu/drm/i915/intel_memory_region.h
@@ -93,7 +93,7 @@ struct intel_memory_region {
 	u16 type;
 	u16 instance;
 	enum intel_region_id id;
-	char name[8];
+	char name[16];
 	struct intel_gt *gt; /* GT closest to this region. */
 	bool is_devmem;	/* true for device memory */
 
-- 
2.26.2


* [RFC PATCH 114/162] drm/i915/lmem: Bypass aperture when lmem is available
  2020-11-27 12:04 [RFC PATCH 000/162] DG1 + LMEM enabling Matthew Auld
                   ` (112 preceding siblings ...)
  2020-11-27 12:06 ` [RFC PATCH 113/162] drm/i915: Create stolen memory region from local memory Matthew Auld
@ 2020-11-27 12:06 ` Matthew Auld
  2020-11-27 12:06 ` [RFC PATCH 115/162] drm/i915/lmem: reset the lmem buffer created by fbdev Matthew Auld
                   ` (47 subsequent siblings)
  161 siblings, 0 replies; 208+ messages in thread
From: Matthew Auld @ 2020-11-27 12:06 UTC (permalink / raw)
  To: intel-gfx
  Cc: Anusha Srivatsa, Lucas De Marchi, dri-devel, CQ Tang,
	Daniele Ceraolo Spurio, Dhinakaran Pandiyan, Chris P Wilson,
	Daniel Vetter

From: Anusha Srivatsa <anusha.srivatsa@intel.com>

When local memory is available, we rely on direct CPU access through
lmem instead of the mappable aperture, which discrete parts such as DG1
no longer provide.

Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: Dhinakaran Pandiyan <dhinakaran.pandiyan@intel.com>
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Chris P Wilson <chris.p.wilson@intel.com>
Cc: Daniel Vetter <daniel.vetter@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: CQ Tang <cq.tang@intel.com>
Signed-off-by: Anusha Srivatsa <anusha.srivatsa@intel.com>
Cc: Lucas De Marchi <lucas.demarchi@intel.com>
---
 drivers/gpu/drm/i915/display/intel_fbdev.c | 23 +++++++++++++++-------
 drivers/gpu/drm/i915/i915_vma.c            | 19 ++++++++++++------
 2 files changed, 29 insertions(+), 13 deletions(-)
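
As background for the i915_vma_pin_iomap() change below: when the object
lives in device memory we can map the lmem BAR directly instead of going
through the GGTT aperture. A sketch of what i915_gem_object_lmem_io_map()
boils down to, based on the lmem helpers used in this series (details may
differ from the actual implementation):

	void __iomem *
	i915_gem_object_lmem_io_map(struct drm_i915_gem_object *obj,
				    unsigned long n, unsigned long size)
	{
		resource_size_t offset;

		/* translate the object's page into an offset in the lmem BAR */
		offset = i915_gem_object_get_dma_address(obj, n);
		offset -= obj->mm.region->region.start;

		return io_mapping_map_wc(&obj->mm.region->iomap, offset, size);
	}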

diff --git a/drivers/gpu/drm/i915/display/intel_fbdev.c b/drivers/gpu/drm/i915/display/intel_fbdev.c
index 831e99e0785c..65539fab6269 100644
--- a/drivers/gpu/drm/i915/display/intel_fbdev.c
+++ b/drivers/gpu/drm/i915/display/intel_fbdev.c
@@ -41,6 +41,7 @@
 #include <drm/drm_fb_helper.h>
 #include <drm/drm_fourcc.h>
 
+#include "gem/i915_gem_lmem.h"
 #include "i915_drv.h"
 #include "intel_display_types.h"
 #include "intel_fbdev.h"
@@ -137,14 +138,22 @@ static int intelfb_alloc(struct drm_fb_helper *helper,
 	size = mode_cmd.pitches[0] * mode_cmd.height;
 	size = PAGE_ALIGN(size);
 
-	/* If the FB is too big, just don't use it since fbdev is not very
-	 * important and we should probably use that space with FBC or other
-	 * features. */
 	obj = ERR_PTR(-ENODEV);
-	if (size * 2 < dev_priv->stolen_usable_size)
-		obj = i915_gem_object_create_stolen(dev_priv, size);
-	if (IS_ERR(obj))
-		obj = i915_gem_object_create_shmem(dev_priv, size);
+	if (HAS_LMEM(dev_priv)) {
+		obj = i915_gem_object_create_lmem(dev_priv, size,
+						  I915_BO_ALLOC_CONTIGUOUS);
+	} else {
+		/*
+		 * If the FB is too big, just don't use it since fbdev is not very
+		 * important and we should probably use that space with FBC or other
+		 * features.
+		 */
+		if (size * 2 < dev_priv->stolen_usable_size)
+			obj = i915_gem_object_create_stolen(dev_priv, size);
+		if (IS_ERR(obj))
+			obj = i915_gem_object_create_shmem(dev_priv, size);
+	}
+
 	if (IS_ERR(obj)) {
 		drm_err(&dev_priv->drm, "failed to allocate framebuffer\n");
 		return PTR_ERR(obj);
diff --git a/drivers/gpu/drm/i915/i915_vma.c b/drivers/gpu/drm/i915/i915_vma.c
index 82f60cc43a90..59fe82af48b2 100644
--- a/drivers/gpu/drm/i915/i915_vma.c
+++ b/drivers/gpu/drm/i915/i915_vma.c
@@ -27,6 +27,7 @@
 
 #include "display/intel_frontbuffer.h"
 
+#include "gem/i915_gem_lmem.h"
 #include "gt/intel_engine.h"
 #include "gt/intel_engine_heartbeat.h"
 #include "gt/intel_gt.h"
@@ -448,9 +449,11 @@ void __iomem *i915_vma_pin_iomap(struct i915_vma *vma)
 	void __iomem *ptr;
 	int err;
 
-	if (GEM_WARN_ON(!i915_vma_is_map_and_fenceable(vma))) {
-		err = -ENODEV;
-		goto err;
+	if (!i915_gem_object_is_devmem(vma->obj)) {
+		if (GEM_WARN_ON(!i915_vma_is_map_and_fenceable(vma))) {
+			err = -ENODEV;
+			goto err;
+		}
 	}
 
 	GEM_BUG_ON(!i915_vma_is_ggtt(vma));
@@ -458,9 +461,13 @@ void __iomem *i915_vma_pin_iomap(struct i915_vma *vma)
 
 	ptr = READ_ONCE(vma->iomap);
 	if (ptr == NULL) {
-		ptr = io_mapping_map_wc(&i915_vm_to_ggtt(vma->vm)->iomap,
-					vma->node.start,
-					vma->node.size);
+		if (i915_gem_object_is_devmem(vma->obj))
+			ptr = i915_gem_object_lmem_io_map(vma->obj, 0,
+							  vma->obj->base.size);
+		else
+			ptr = io_mapping_map_wc(&i915_vm_to_ggtt(vma->vm)->iomap,
+						vma->node.start,
+						vma->node.size);
 		if (ptr == NULL) {
 			err = -ENOMEM;
 			goto err;
-- 
2.26.2


* [RFC PATCH 115/162] drm/i915/lmem: reset the lmem buffer created by fbdev
  2020-11-27 12:04 [RFC PATCH 000/162] DG1 + LMEM enabling Matthew Auld
                   ` (113 preceding siblings ...)
  2020-11-27 12:06 ` [RFC PATCH 114/162] drm/i915/lmem: Bypass aperture when lmem is available Matthew Auld
@ 2020-11-27 12:06 ` Matthew Auld
  2020-11-27 12:06 ` [RFC PATCH 116/162] drm/i915/dsb: Enable lmem for dsb Matthew Auld
                   ` (46 subsequent siblings)
  161 siblings, 0 replies; 208+ messages in thread
From: Matthew Auld @ 2020-11-27 12:06 UTC (permalink / raw)
  To: intel-gfx; +Cc: Daniel Vetter, Animesh Manna, dri-devel

From: Animesh Manna <animesh.manna@intel.com>

A newly created lmem buffer for fbdev needs to be reset, otherwise it
contains old garbage data. The same logic was already present for
stolen memory; extend it to cover lmem.

Cc: Daniel Vetter <daniel.vetter@intel.com>
Signed-off-by: Animesh Manna <animesh.manna@intel.com>
---
 drivers/gpu/drm/i915/display/intel_fbdev.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
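
For reference, when the framebuffer lives in lmem the CPU sees it through
a WC I/O mapping, so the clear has to go through the __iomem accessors; a
minimal sketch (the vma here is illustrative):

	void __iomem *vaddr = i915_vma_pin_iomap(vma);

	if (!IS_ERR(vaddr))
		memset_io(vaddr, 0, vma->size); /* plain memset() is not valid on __iomem */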

diff --git a/drivers/gpu/drm/i915/display/intel_fbdev.c b/drivers/gpu/drm/i915/display/intel_fbdev.c
index 65539fab6269..6bd3bbe42bf0 100644
--- a/drivers/gpu/drm/i915/display/intel_fbdev.c
+++ b/drivers/gpu/drm/i915/display/intel_fbdev.c
@@ -280,7 +280,7 @@ static int intelfb_create(struct drm_fb_helper *helper,
 	 * If the object is stolen however, it will be full of whatever
 	 * garbage was left in there.
 	 */
-	if (vma->obj->stolen && !prealloc)
+	if ((vma->obj->stolen || HAS_LMEM(dev_priv)) && !prealloc)
 		memset_io(info->screen_base, 0, info->screen_size);
 
 	/* Use default scratch pixmap (info->pixmap.flags = FB_PIXMAP_SYSTEM) */
-- 
2.26.2


* [RFC PATCH 116/162] drm/i915/dsb: Enable lmem for dsb
  2020-11-27 12:04 [RFC PATCH 000/162] DG1 + LMEM enabling Matthew Auld
                   ` (114 preceding siblings ...)
  2020-11-27 12:06 ` [RFC PATCH 115/162] drm/i915/lmem: reset the lmem buffer created by fbdev Matthew Auld
@ 2020-11-27 12:06 ` Matthew Auld
  2020-11-27 12:06 ` [RFC PATCH 117/162] drm/i915: Reintroduce mem->reserved Matthew Auld
                   ` (45 subsequent siblings)
  161 siblings, 0 replies; 208+ messages in thread
From: Matthew Auld @ 2020-11-27 12:06 UTC (permalink / raw)
  To: intel-gfx; +Cc: Animesh Manna, Lucas De Marchi, dri-devel

From: Animesh Manna <animesh.manna@intel.com>

For dgfx, the DSB should use local memory instead of system memory.
Local memory brings a performance improvement since it is close to the
GPU, and it also avoids multiple GPUs contending for system memory.

Use the LMEM API to create the gem object needed for the DSB command
buffer.

Cc: Jani Nikula <jani.nikula@linux.intel.com>
Cc: Ramalingam C <ramalingam.c@intel.com>
Signed-off-by: Animesh Manna <animesh.manna@intel.com>
Cc: Lucas De Marchi <lucas.demarchi@intel.com>
---
 drivers/gpu/drm/i915/display/intel_dsb.c | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/display/intel_dsb.c b/drivers/gpu/drm/i915/display/intel_dsb.c
index 857126822a88..73795e415ad5 100644
--- a/drivers/gpu/drm/i915/display/intel_dsb.c
+++ b/drivers/gpu/drm/i915/display/intel_dsb.c
@@ -6,6 +6,7 @@
 
 #include "i915_drv.h"
 #include "intel_display_types.h"
+#include "gem/i915_gem_lmem.h"
 
 #define DSB_BUF_SIZE    (2 * PAGE_SIZE)
 
@@ -278,7 +279,11 @@ void intel_dsb_prepare(struct intel_crtc_state *crtc_state)
 
 	wakeref = intel_runtime_pm_get(&i915->runtime_pm);
 
-	obj = i915_gem_object_create_internal(i915, DSB_BUF_SIZE);
+	if (HAS_LMEM(i915))
+		obj = i915_gem_object_create_lmem(i915, DSB_BUF_SIZE, 0);
+	else
+		obj = i915_gem_object_create_internal(i915, DSB_BUF_SIZE);
+
 	if (IS_ERR(obj)) {
 		drm_err(&i915->drm, "Gem object creation failed\n");
 		kfree(dsb);
-- 
2.26.2


* [RFC PATCH 117/162] drm/i915: Reintroduce mem->reserved
  2020-11-27 12:04 [RFC PATCH 000/162] DG1 + LMEM enabling Matthew Auld
                   ` (115 preceding siblings ...)
  2020-11-27 12:06 ` [RFC PATCH 116/162] drm/i915/dsb: Enable lmem for dsb Matthew Auld
@ 2020-11-27 12:06 ` Matthew Auld
  2020-11-27 12:06 ` [RFC PATCH 118/162] drm/i915/dg1: Reserve first 1MB of local memory Matthew Auld
                   ` (44 subsequent siblings)
  161 siblings, 0 replies; 208+ messages in thread
From: Matthew Auld @ 2020-11-27 12:06 UTC (permalink / raw)
  To: intel-gfx; +Cc: Abdiel Janulgue, dri-devel

From: Abdiel Janulgue <abdiel.janulgue@linux.intel.com>

In the following patch we need to reserve regions inaccessible to the
driver during initialization, so add back mem->reserved for collecting
such regions.

Cc: Imre Deak <imre.deak@intel.com>
Signed-off-by: Abdiel Janulgue <abdiel.janulgue@linux.intel.com>
---
 drivers/gpu/drm/i915/intel_memory_region.c    |  2 +
 drivers/gpu/drm/i915/intel_memory_region.h    |  2 +
 .../drm/i915/selftests/intel_memory_region.c  | 89 +++++++++++++++++++
 3 files changed, 93 insertions(+)
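
A minimal sketch of the intended reservation flow (offset/size are
illustrative):

	LIST_HEAD(blocks);
	int err;

	/* carve [offset, offset + size) out of the buddy mm at init time */
	err = i915_buddy_alloc_range(&mem->mm, &blocks, offset, size);
	if (!err)
		list_splice_tail(&blocks, &mem->reserved);

	/* intel_memory_region_release_buddy() later frees mem->reserved */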

diff --git a/drivers/gpu/drm/i915/intel_memory_region.c b/drivers/gpu/drm/i915/intel_memory_region.c
index c7a1d84e7ee8..554fdd7735a8 100644
--- a/drivers/gpu/drm/i915/intel_memory_region.c
+++ b/drivers/gpu/drm/i915/intel_memory_region.c
@@ -203,6 +203,7 @@ int intel_memory_region_init_buddy(struct intel_memory_region *mem)
 
 void intel_memory_region_release_buddy(struct intel_memory_region *mem)
 {
+	i915_buddy_free_list(&mem->mm, &mem->reserved);
 	i915_buddy_fini(&mem->mm);
 }
 
@@ -232,6 +233,7 @@ intel_memory_region_create(struct drm_i915_private *i915,
 	mutex_init(&mem->objects.lock);
 	INIT_LIST_HEAD(&mem->objects.list);
 	INIT_LIST_HEAD(&mem->objects.purgeable);
+	INIT_LIST_HEAD(&mem->reserved);
 
 	mutex_init(&mem->mm_lock);
 
diff --git a/drivers/gpu/drm/i915/intel_memory_region.h b/drivers/gpu/drm/i915/intel_memory_region.h
index 8da82cb2afe3..0bfc1fa36f74 100644
--- a/drivers/gpu/drm/i915/intel_memory_region.h
+++ b/drivers/gpu/drm/i915/intel_memory_region.h
@@ -97,6 +97,8 @@ struct intel_memory_region {
 	struct intel_gt *gt; /* GT closest to this region. */
 	bool is_devmem;	/* true for device memory */
 
+	struct list_head reserved;
+
 	dma_addr_t remap_addr;
 
 	struct {
diff --git a/drivers/gpu/drm/i915/selftests/intel_memory_region.c b/drivers/gpu/drm/i915/selftests/intel_memory_region.c
index 93e067951e0f..9df0a4f657c1 100644
--- a/drivers/gpu/drm/i915/selftests/intel_memory_region.c
+++ b/drivers/gpu/drm/i915/selftests/intel_memory_region.c
@@ -134,6 +134,94 @@ static void igt_object_release(struct drm_i915_gem_object *obj)
 	i915_gem_object_put(obj);
 }
 
+static int igt_reserve_range(struct intel_memory_region *mem,
+			     struct list_head *reserved,
+			     u64 offset,
+			     u64 size)
+{
+	int ret;
+	LIST_HEAD(blocks);
+
+	ret = i915_buddy_alloc_range(&mem->mm, &blocks, offset, size);
+	if (!ret)
+		list_splice_tail(&blocks, reserved);
+
+	return ret;
+}
+
+static int igt_mock_reserve(void *arg)
+{
+	struct drm_i915_gem_object *obj;
+	struct intel_memory_region *mem = arg;
+	resource_size_t avail = resource_size(&mem->region);
+	I915_RND_STATE(prng);
+	LIST_HEAD(objects);
+	LIST_HEAD(reserved);
+	u32 i, offset, count, *order;
+	u64 allocated, cur_avail;
+	const u32 chunk_size = SZ_32M;
+	int err = 0;
+
+	count = avail / chunk_size;
+	order = i915_random_order(count, &prng);
+	if (!order)
+		return 0;
+
+	/* Reserve a bunch of ranges within the region */
+	for (i = 0; i < count; ++i) {
+		u64 start = order[i] * chunk_size;
+		u64 size = i915_prandom_u32_max_state(chunk_size, &prng);
+
+		/* Allow for some really big holes */
+		if (!size)
+			continue;
+
+		size = round_up(size, PAGE_SIZE);
+		offset = igt_random_offset(&prng, 0, chunk_size, size,
+					   PAGE_SIZE);
+
+		err = igt_reserve_range(mem, &reserved, start + offset, size);
+		if (err) {
+			pr_err("%s failed to reserve range", __func__);
+			goto out_close;
+		}
+
+		/* XXX: maybe sanity check the block range here? */
+		avail -= size;
+	}
+
+	/* Try to see if we can allocate from the remaining space */
+	allocated = 0;
+	cur_avail = avail;
+	do {
+		u64 size = i915_prandom_u32_max_state(cur_avail, &prng);
+
+		size = max_t(u64, round_up(size, PAGE_SIZE), (u64)PAGE_SIZE);
+		obj = igt_object_create(mem, &objects, size, 0);
+
+		if (IS_ERR(obj)) {
+			if (PTR_ERR(obj) == -ENXIO)
+				break;
+
+			err = PTR_ERR(obj);
+			goto out_close;
+		}
+		cur_avail -= size;
+		allocated += size;
+	} while (1);
+
+	if (allocated != avail) {
+		pr_err("%s mismatch between allocation and free space", __func__);
+		err = -EINVAL;
+	}
+
+out_close:
+	kfree(order);
+	close_objects(mem, &objects);
+	i915_buddy_free_list(&mem->mm, &reserved);
+	return err;
+}
+
 static int igt_mock_contiguous(void *arg)
 {
 	struct intel_memory_region *mem = arg;
@@ -1180,6 +1268,7 @@ static int igt_lmem_pages_migrate(void *arg)
 int intel_memory_region_mock_selftests(void)
 {
 	static const struct i915_subtest tests[] = {
+		SUBTEST(igt_mock_reserve),
 		SUBTEST(igt_mock_fill),
 		SUBTEST(igt_mock_contiguous),
 		SUBTEST(igt_mock_splintered_region),
-- 
2.26.2


* [RFC PATCH 118/162] drm/i915/dg1: Reserve first 1MB of local memory
  2020-11-27 12:04 [RFC PATCH 000/162] DG1 + LMEM enabling Matthew Auld
                   ` (116 preceding siblings ...)
  2020-11-27 12:06 ` [RFC PATCH 117/162] drm/i915: Reintroduce mem->reserved Matthew Auld
@ 2020-11-27 12:06 ` Matthew Auld
  2020-11-27 13:52   ` [Intel-gfx] " Chris Wilson
  2020-11-27 12:06 ` [RFC PATCH 119/162] drm/i915/dg1: Read OPROM via SPI controller Matthew Auld
                   ` (43 subsequent siblings)
  161 siblings, 1 reply; 208+ messages in thread
From: Matthew Auld @ 2020-11-27 12:06 UTC (permalink / raw)
  To: intel-gfx; +Cc: dri-devel

From: Imre Deak <imre.deak@intel.com>

On DG1 A0/B0 steppings the first 1MB of local memory must be reserved.
One reason for this is that the 0xA0000-0xB0000 range is not accessible
by the display, probably because this region is redirected to another
memory location for legacy VGA compatibility.

BSpec: 50586
Testcase: igt/kms_big_fb/linear-64bpp-rotate-0
Signed-off-by: Imre Deak <imre.deak@intel.com>
---
 drivers/gpu/drm/i915/intel_region_lmem.c | 52 ++++++++++++++++++++++++
 1 file changed, 52 insertions(+)
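
For context, the VGA hole at [0xA0000, 0xB0000) lies entirely within the
first 1MB (0x100000), so reserving [0, SZ_1M) keeps display buffers clear
of it. A hypothetical helper, purely to illustrate the arithmetic (not
part of the patch):

	static bool overlaps_vga_hole(u64 start, u64 size)
	{
		return start < 0xB0000ULL && start + size > 0xA0000ULL;
	}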

diff --git a/drivers/gpu/drm/i915/intel_region_lmem.c b/drivers/gpu/drm/i915/intel_region_lmem.c
index 939cf0d195a5..eafef7034680 100644
--- a/drivers/gpu/drm/i915/intel_region_lmem.c
+++ b/drivers/gpu/drm/i915/intel_region_lmem.c
@@ -137,6 +137,48 @@ intel_setup_fake_lmem(struct drm_i915_private *i915)
 	return mem;
 }
 
+static void get_legacy_lowmem_region(struct intel_uncore *uncore,
+				     u64 *start, u32 *size)
+{
+	*start = 0;
+	*size = 0;
+
+	if (!IS_DG1_REVID(uncore->i915, DG1_REVID_A0, DG1_REVID_B0))
+		return;
+
+	*size = SZ_1M;
+
+	DRM_DEBUG_DRIVER("LMEM: reserved legacy low-memory [0x%llx-0x%llx]\n",
+			 *start, *start + *size);
+}
+
+static int reserve_lowmem_region(struct intel_uncore *uncore,
+				 struct intel_memory_region *mem)
+{
+	u64 reserve_start;
+	u64 reserve_end;
+	u64 region_start;
+	u32 region_size;
+	int ret;
+
+	get_legacy_lowmem_region(uncore, &region_start, &region_size);
+	reserve_start = region_start;
+	reserve_end = region_start + region_size;
+
+	if (!reserve_end)
+		return 0;
+
+	DRM_INFO("LMEM: reserving low-memory region [0x%llx-0x%llx]\n",
+		 reserve_start, reserve_end);
+	ret = i915_buddy_alloc_range(&mem->mm, &mem->reserved,
+				     reserve_start,
+				     reserve_end - reserve_start);
+	if (ret)
+		DRM_ERROR("LMEM: reserving low memory region failed\n");
+
+	return ret;
+}
+
 static struct intel_memory_region *
 setup_lmem(struct drm_i915_private *dev_priv)
 {
@@ -160,6 +202,16 @@ setup_lmem(struct drm_i915_private *dev_priv)
 					 I915_GTT_PAGE_SIZE_4K,
 					 io_start,
 					 &intel_region_lmem_ops);
+	if (!IS_ERR(mem)) {
+		int err;
+
+		err = reserve_lowmem_region(uncore, mem);
+		if (err) {
+			intel_memory_region_put(mem);
+			return ERR_PTR(err);
+		}
+	}
+
 	if (!IS_ERR(mem)) {
 		DRM_INFO("Intel graphics LMEM: %pR\n", &mem->region);
 		DRM_INFO("Intel graphics LMEM IO start: %llx\n",
-- 
2.26.2


* [RFC PATCH 119/162] drm/i915/dg1: Read OPROM via SPI controller
  2020-11-27 12:04 [RFC PATCH 000/162] DG1 + LMEM enabling Matthew Auld
                   ` (117 preceding siblings ...)
  2020-11-27 12:06 ` [RFC PATCH 118/162] drm/i915/dg1: Reserve first 1MB of local memory Matthew Auld
@ 2020-11-27 12:06 ` Matthew Auld
  2020-11-30 10:16   ` [Intel-gfx] " Jani Nikula
  2020-11-27 12:06 ` [RFC PATCH 120/162] drm/i915/oprom: Basic sanitization Matthew Auld
                   ` (42 subsequent siblings)
  161 siblings, 1 reply; 208+ messages in thread
From: Matthew Auld @ 2020-11-27 12:06 UTC (permalink / raw)
  To: intel-gfx; +Cc: Lucas De Marchi, dri-devel, Jon Bloomfield, Tomas Winkler

From: Clint Taylor <clinton.a.taylor@intel.com>

Read the OPROM via the SPI controller through MMIO and find the VBT
entry, since we can't use the OpRegion, and PCI mapping may not work on
some systems because the BIOS does not leave the Option ROM mapped.

Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: Tomas Winkler <tomas.winkler@intel.com>
Cc: Jon Bloomfield <jon.bloomfield@intel.com>
Signed-off-by: Clint Taylor <clinton.a.taylor@intel.com>
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
---
 drivers/gpu/drm/i915/display/intel_bios.c | 80 +++++++++++++++++++++--
 drivers/gpu/drm/i915/i915_reg.h           |  8 +++
 2 files changed, 82 insertions(+), 6 deletions(-)
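
A note on the magic compare in the search loop below: "$VBT" is read as a
little-endian u32, i.e.

	/* '$' = 0x24, 'V' = 0x56, 'B' = 0x42, 'T' = 0x54, so on little-endian
	 * hardware *((const u32 *)"$VBT") == 0x54425624, which is what each
	 * 4-byte SPI read is compared against.
	 */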

diff --git a/drivers/gpu/drm/i915/display/intel_bios.c b/drivers/gpu/drm/i915/display/intel_bios.c
index 4cc949b228f2..91044fc52acb 100644
--- a/drivers/gpu/drm/i915/display/intel_bios.c
+++ b/drivers/gpu/drm/i915/display/intel_bios.c
@@ -2086,6 +2086,66 @@ bool intel_bios_is_valid_vbt(const void *buf, size_t size)
 	return vbt;
 }
 
+static struct vbt_header *spi_oprom_get_vbt(struct drm_i915_private *dev_priv)
+{
+	u32 count, data, found, store = 0;
+	u32 static_region, oprom_offset;
+	u32 oprom_size = 0x200000;
+	u16 vbt_size;
+	u32 *vbt;
+
+	static_region = I915_READ(SPI_STATIC_REGIONS);
+	static_region &= OPTIONROM_SPI_REGIONID_MASK;
+	I915_WRITE(PRIMARY_SPI_REGIONID, static_region);
+
+	oprom_offset = I915_READ(OROM_OFFSET);
+	oprom_offset &= OROM_OFFSET_MASK;
+
+	for (count = 0; count < oprom_size; count += 4) {
+		I915_WRITE(PRIMARY_SPI_ADDRESS, oprom_offset + count);
+		data = I915_READ(PRIMARY_SPI_TRIGGER);
+
+		if (data == *((const u32 *)"$VBT")) {
+			found = oprom_offset + count;
+			break;
+		}
+	}
+
+	if (count >= oprom_size)
+		goto err_not_found;
+
+	/* Get VBT size and allocate space for the VBT */
+	I915_WRITE(PRIMARY_SPI_ADDRESS, found +
+		   offsetof(struct vbt_header, vbt_size));
+	vbt_size = I915_READ(PRIMARY_SPI_TRIGGER);
+	vbt_size &= 0xffff;
+
+	vbt = kzalloc(vbt_size, GFP_KERNEL);
+	if (!vbt) {
+		DRM_ERROR("Unable to allocate %u bytes for VBT storage\n",
+			  vbt_size);
+		goto err_not_found;
+	}
+
+	for (count = 0; count < vbt_size; count += 4) {
+		I915_WRITE(PRIMARY_SPI_ADDRESS, found + count);
+		data = I915_READ(PRIMARY_SPI_TRIGGER);
+		*(vbt + store++) = data;
+	}
+
+	if (!intel_bios_is_valid_vbt(vbt, vbt_size))
+		goto err_free_vbt;
+
+	DRM_DEBUG_KMS("Found valid VBT in SPI flash\n");
+
+	return (struct vbt_header *)vbt;
+
+err_free_vbt:
+	kfree(vbt);
+err_not_found:
+	return NULL;
+}
+
 static struct vbt_header *oprom_get_vbt(struct drm_i915_private *dev_priv)
 {
 	struct pci_dev *pdev = dev_priv->drm.pdev;
@@ -2135,6 +2195,8 @@ static struct vbt_header *oprom_get_vbt(struct drm_i915_private *dev_priv)
 
 	pci_unmap_rom(pdev, oprom);
 
+	DRM_DEBUG_KMS("Found valid VBT in PCI ROM\n");
+
 	return vbt;
 
 err_free_vbt:
@@ -2169,17 +2231,23 @@ void intel_bios_init(struct drm_i915_private *dev_priv)
 
 	init_vbt_defaults(dev_priv);
 
-	/* If the OpRegion does not have VBT, look in PCI ROM. */
+	/*
+	 * If the OpRegion does not have VBT, look in SPI flash through MMIO or
+	 * PCI mapping
+	 */
+	if (!vbt && IS_DGFX(dev_priv)) {
+		oprom_vbt = spi_oprom_get_vbt(dev_priv);
+		vbt = oprom_vbt;
+	}
+
 	if (!vbt) {
 		oprom_vbt = oprom_get_vbt(dev_priv);
-		if (!oprom_vbt)
-			goto out;
-
 		vbt = oprom_vbt;
-
-		drm_dbg_kms(&dev_priv->drm, "Found valid VBT in PCI ROM\n");
 	}
 
+	if (!vbt)
+		goto out;
+
 	bdb = get_bdb_header(vbt);
 
 	drm_dbg_kms(&dev_priv->drm,
diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
index 3c8350f108e4..f00289574ac8 100644
--- a/drivers/gpu/drm/i915/i915_reg.h
+++ b/drivers/gpu/drm/i915/i915_reg.h
@@ -12413,6 +12413,14 @@ enum skl_power_gate {
 #define   DP_PIN_ASSIGNMENT_MASK(idx)		(0xf << ((idx) * 4))
 #define   DP_PIN_ASSIGNMENT(idx, x)		((x) << ((idx) * 4))
 
+#define PRIMARY_SPI_TRIGGER			_MMIO(0x102040)
+#define PRIMARY_SPI_ADDRESS			_MMIO(0x102080)
+#define PRIMARY_SPI_REGIONID			_MMIO(0x102084)
+#define SPI_STATIC_REGIONS			_MMIO(0x102090)
+#define   OPTIONROM_SPI_REGIONID_MASK		REG_GENMASK(7, 0)
+#define OROM_OFFSET				_MMIO(0x1020c0)
+#define   OROM_OFFSET_MASK			REG_GENMASK(20, 16)
+
 /* This register controls the Display State Buffer (DSB) engines. */
 #define _DSBSL_INSTANCE_BASE		0x70B00
 #define DSBSL_INSTANCE(pipe, id)	(_DSBSL_INSTANCE_BASE + \
-- 
2.26.2


* [RFC PATCH 120/162] drm/i915/oprom: Basic sanitization
  2020-11-27 12:04 [RFC PATCH 000/162] DG1 + LMEM enabling Matthew Auld
                   ` (118 preceding siblings ...)
  2020-11-27 12:06 ` [RFC PATCH 119/162] drm/i915/dg1: Read OPROM via SPI controller Matthew Auld
@ 2020-11-27 12:06 ` Matthew Auld
  2020-11-30 10:24   ` [Intel-gfx] " Jani Nikula
  2020-11-27 12:06 ` [RFC PATCH 121/162] drm/i915: WA for zero memory channel Matthew Auld
                   ` (41 subsequent siblings)
  161 siblings, 1 reply; 208+ messages in thread
From: Matthew Auld @ 2020-11-27 12:06 UTC (permalink / raw)
  To: intel-gfx; +Cc: Jani Nikula, Anshuman Gupta, Uma Shankar, dri-devel

From: Anshuman Gupta <anshuman.gupta@intel.com>

Sanitize the OPROM header, CPD signature and OPROM PCI version.
The OPROM_HEADER, EXPANSION_ROM_HEADER and OPROM_MEU_BLOB structures
and PCI struct offsets are provided by their GSC counterparts.
These are yet to be documented in BSpec.
After successful sanitization, extract the VBT from the opregion
image.

Cc: Jani Nikula <jani.nikula@intel.com>
Cc: Uma Shankar <uma.shankar@intel.com>
Signed-off-by: Anshuman Gupta <anshuman.gupta@intel.com>
---
 drivers/gpu/drm/i915/display/intel_bios.c     |  49 +++--
 drivers/gpu/drm/i915/display/intel_opregion.c | 169 ++++++++++++++++++
 drivers/gpu/drm/i915/display/intel_opregion.h |  31 +++-
 3 files changed, 221 insertions(+), 28 deletions(-)
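
A worked example of the image-size arithmetic the parser relies on
(simplified from the patch; the value is illustrative): the PCI data
structure stores the image length in 512-byte units, so

	u8 size_512 = parse_ptr[pcistructoffset + PCI_IMAGE_LENGTH_OFFSET];
	u32 img_len = size_512 * OPROM_BYTE_BOUNDARY; /* e.g. 0x10 * 512 = 8 KiB */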

diff --git a/drivers/gpu/drm/i915/display/intel_bios.c b/drivers/gpu/drm/i915/display/intel_bios.c
index 91044fc52acb..358576bc0be2 100644
--- a/drivers/gpu/drm/i915/display/intel_bios.c
+++ b/drivers/gpu/drm/i915/display/intel_bios.c
@@ -2088,37 +2088,36 @@ bool intel_bios_is_valid_vbt(const void *buf, size_t size)
 
 static struct vbt_header *spi_oprom_get_vbt(struct drm_i915_private *dev_priv)
 {
-	u32 count, data, found, store = 0;
-	u32 static_region, oprom_offset;
-	u32 oprom_size = 0x200000;
-	u16 vbt_size;
-	u32 *vbt;
-
-	static_region = I915_READ(SPI_STATIC_REGIONS);
-	static_region &= OPTIONROM_SPI_REGIONID_MASK;
-	I915_WRITE(PRIMARY_SPI_REGIONID, static_region);
+	u32 count, found;
+	u32 *vbt, *oprom_opreg = NULL;
+	u16 vbt_size, opreg_size;
+	u8 *parse_ptr;
 
-	oprom_offset = I915_READ(OROM_OFFSET);
-	oprom_offset &= OROM_OFFSET_MASK;
+	if (intel_oprom_verify_signature(&oprom_opreg, &opreg_size, dev_priv)) {
+		drm_err(&dev_priv->drm, "oprom signature verification failed\n");
+		goto err_not_found;
+	}
 
-	for (count = 0; count < oprom_size; count += 4) {
-		I915_WRITE(PRIMARY_SPI_ADDRESS, oprom_offset + count);
-		data = I915_READ(PRIMARY_SPI_TRIGGER);
+	if (!oprom_opreg) {
+		drm_err(&dev_priv->drm, "opregion not found\n");
+		goto err_not_found;
+	}
 
-		if (data == *((const u32 *)"$VBT")) {
-			found = oprom_offset + count;
+	for (count = 0; count < opreg_size; count += 4) {
+		if (oprom_opreg[count / 4] == *((const u32 *)"$VBT")) {
+			found = count;
 			break;
 		}
 	}
 
-	if (count >= oprom_size)
+	if (count >= opreg_size) {
+		drm_err(&dev_priv->drm, "VBT not found in opregion\n");
 		goto err_not_found;
+	}
 
 	/* Get VBT size and allocate space for the VBT */
-	I915_WRITE(PRIMARY_SPI_ADDRESS, found +
-		   offsetof(struct vbt_header, vbt_size));
-	vbt_size = I915_READ(PRIMARY_SPI_TRIGGER);
-	vbt_size &= 0xffff;
+	parse_ptr = (u8 *)oprom_opreg + found;
+	vbt_size = ((struct vbt_header *)parse_ptr)->vbt_size;
 
 	vbt = kzalloc(vbt_size, GFP_KERNEL);
 	if (!vbt) {
@@ -2127,16 +2126,12 @@ static struct vbt_header *spi_oprom_get_vbt(struct drm_i915_private *dev_priv)
 		goto err_not_found;
 	}
 
-	for (count = 0; count < vbt_size; count += 4) {
-		I915_WRITE(PRIMARY_SPI_ADDRESS, found + count);
-		data = I915_READ(PRIMARY_SPI_TRIGGER);
-		*(vbt + store++) = data;
-	}
-
+	memcpy(vbt, parse_ptr, vbt_size);
 	if (!intel_bios_is_valid_vbt(vbt, vbt_size))
 		goto err_free_vbt;
 
 	DRM_DEBUG_KMS("Found valid VBT in SPI flash\n");
+	kfree(oprom_opreg);
 
 	return (struct vbt_header *)vbt;
 
diff --git a/drivers/gpu/drm/i915/display/intel_opregion.c b/drivers/gpu/drm/i915/display/intel_opregion.c
index 4f77cf849171..81e5946393dd 100644
--- a/drivers/gpu/drm/i915/display/intel_opregion.c
+++ b/drivers/gpu/drm/i915/display/intel_opregion.c
@@ -983,6 +983,175 @@ int intel_opregion_setup(struct drm_i915_private *dev_priv)
 	return err;
 }
 
+static int oprom_image_parse_helper(u8 *parse_ptr, u8 *last_img, u8 *code_type,
+				    struct drm_i915_private *i915)
+{
+	u8 size_512_bytes;
+
+	if (((union oprom_header *)parse_ptr)->signature != OPROM_IMAGE_MAGIC) {
+		drm_err(&i915->drm, "Wrong OPROM header signature.\n");
+		return -EINVAL;
+	}
+
+	size_512_bytes = parse_ptr[((struct expansion_rom_header *)parse_ptr)->pcistructoffset + PCI_IMAGE_LENGTH_OFFSET];
+	*code_type = parse_ptr[((struct expansion_rom_header *)parse_ptr)->pcistructoffset + PCI_CODE_TYPE_OFFSET];
+	*last_img = parse_ptr[((struct expansion_rom_header *)parse_ptr)->pcistructoffset + PCI_LAST_IMAGE_INDICATOR_OFFSET];
+
+	return size_512_bytes;
+}
+
+static void spi_read_oprom_helper(size_t len, u32 offset, u32 *buf,
+				  struct drm_i915_private *dev_priv)
+{
+	u32 count, data;
+
+	for (count = 0; count < len; count += 4) {
+		I915_WRITE(PRIMARY_SPI_ADDRESS, offset + count);
+		data = I915_READ(PRIMARY_SPI_TRIGGER);
+		buf[count / 4] = data;
+	}
+}
+
+/**
+ *	+        DASH+G OPROM IMAGE LAYOUT           +
+ *	+--------+-------+---------------------------+
+ *	| Offset | Value |   ROM Header Fields       +-----> Image 1 (CSS)
+ *	+--------------------------------------------+
+ *	|    0h  |  55h  |   ROM Signature Byte1     |
+ *	|    1h  |  AAh  |   ROM Signature Byte2     |
+ *	|    2h  |  xx   |        Reserved           |
+ *	|  18+19h|  xx   |  Ptr to PCI DataStructure |
+ *	+----------------+---------------------------+
+ *	|           PCI Data Structure               |
+ *	+--------------------------------------------+
+ *	|    .       .             .                 |
+ *	|    .       .             .                 |
+ *	|    10  +  xx   +     Image Length          |
+ *	|    14  +  xx   +     Code Type             |
+ *	|    15  +  xx   +  Last Image Indicator     |
+ *	|    .       .             .                 |
+ *	+--------------------------------------------+
+ *	|               MEU BLOB                     |
+ *	+--------------------------------------------+
+ *	|              CPD Header                    |
+ *	|              CPD Entry                     |
+ *	|              Reserved                      |
+ *	|           SignedDataPart1                  |
+ *	|              PublicKey                     |
+ *	|            RSA Signature                   |
+ *	|           SignedDataPart2                  |
+ *	|            IFWI Metadata                   |
+ *	+--------+-------+---------------------------+
+ *	|    .   |   .   |         .                 |
+ *	|    .   |   .   |         .                 |
+ *	+--------------------------------------------+
+ *	| Offset | Value |   ROM Header Fields       +-----> Image 2 (Config Data) (Offset: 0x800)
+ *	+--------------------------------------------+
+ *	|    0h  |  55h  |   ROM Signature Byte1     |
+ *	|    1h  |  AAh  |   ROM Signature Byte2     |
+ *	|    2h  |  xx   |        Reserved           |
+ *	|  18+19h|  xx   |  Ptr to PCI DataStructure |
+ *	+----------------+---------------------------+
+ *	|           PCI Data Structure               |
+ *	+--------------------------------------------+
+ *	|    .       .             .                 |
+ *	|    .       .             .                 |
+ *	|    10  +  xx   +     Image Length          |
+ *	|    14  +  xx   +      Code Type            |
+ *	|    15  +  xx   +   Last Image Indicator    |
+ *	|    .       .             .                 |
+ *	|    1A  +  3C   + Ptr to Opregion Signature |
+ *	|    .       .             .                 |
+ *	|    .       .             .                 |
+ *	|   83Ch + IntelGraphicsMem                  | <---+ Opregion Signature
+ *	+--------+-----------------------------------+
+ *
+ * intel_oprom_verify_signature() verify OPROM signature.
+ * @opreg: pointer to opregion buffer output.
+ * @opreg_size: pointer to opregion size output.
+ * @dev_priv: i915 device.
+ */
+int
+intel_oprom_verify_signature(u32 **opreg, u16 *opreg_size,
+			     struct drm_i915_private *dev_priv)
+{
+	u8 img_sig[sizeof(OPREGION_SIGNATURE)];
+	u8 code_type, last_img;
+	u32 static_region, offset;
+	u32 *oprom_img, *oprom_img_hdr;
+	u16 opreg_base, img_len;
+	u8 *parse_ptr;
+	int img_size;
+	int ret = -EINVAL;
+
+	/* initialize SPI to read the OPROM */
+	static_region = I915_READ(SPI_STATIC_REGIONS);
+	static_region &= OPTIONROM_SPI_REGIONID_MASK;
+	I915_WRITE(PRIMARY_SPI_REGIONID, static_region);
+	/* read OPROM offset in SPI flash */
+	offset = I915_READ(OROM_OFFSET);
+	offset &= OROM_OFFSET_MASK;
+
+	oprom_img_hdr = kzalloc(OPROM_INITIAL_READ_SIZE, GFP_KERNEL);
+	if (!oprom_img_hdr)
+		return -ENOMEM;
+
+	do {
+		spi_read_oprom_helper(OPROM_INITIAL_READ_SIZE, offset,
+				      oprom_img_hdr, dev_priv);
+		img_size = oprom_image_parse_helper((u8 *)oprom_img_hdr, &last_img,
+						    &code_type, dev_priv);
+		if (img_size <= 0) {
+			ret = -EINVAL;
+			goto err_free_hdr;
+		}
+
+		img_len = img_size * OPROM_BYTE_BOUNDARY;
+		oprom_img = kzalloc(img_len, GFP_KERNEL);
+		if (!oprom_img) {
+			ret = -ENOMEM;
+			goto err_free_hdr;
+		}
+
+		spi_read_oprom_helper(img_len, offset, oprom_img, dev_priv);
+		parse_ptr = (u8 *)oprom_img;
+		offset = offset + img_len;
+
+		/* opregion base offset */
+		opreg_base = ((struct expansion_rom_header *)parse_ptr)->opregion_base;
+		/* CPD or opreg signature is present at opregion_base offset */
+		memcpy(img_sig, parse_ptr + opreg_base, sizeof(OPREGION_SIGNATURE));
+
+		if (!memcmp(img_sig, OPREGION_SIGNATURE, sizeof(OPREGION_SIGNATURE) - 1)) {
+			*opreg = oprom_img;
+			*opreg_size = img_len;
+			drm_dbg_kms(&dev_priv->drm, "Found opregion image\n");
+			ret = 0;
+			break;
+		} else if (!memcmp(img_sig, CPD_SIGNATURE, NUM_CPD_BYTES)) {
+			if (code_type != OPROM_CSS_CODE_TYPE) {
+				drm_err(&dev_priv->drm, "Invalid OPROM\n");
+				ret = -EINVAL;
+				goto err_free_img;
+			}
+			drm_dbg_kms(&dev_priv->drm, "Found CSS image\n");
+			/* proceed here onwards for signature authentication */
+			kfree(oprom_img);
+			continue;
+		}
+
+	} while (last_img != LAST_IMG_INDICATOR);
+
+	return ret;
+
+err_free_img:
+	kfree(oprom_img);
+err_free_hdr:
+	kfree(oprom_img_hdr);
+
+	return ret;
+}
+
 static int intel_use_opregion_panel_type_callback(const struct dmi_system_id *id)
 {
 	DRM_INFO("Using panel type from OpRegion on %s\n", id->ident);
diff --git a/drivers/gpu/drm/i915/display/intel_opregion.h b/drivers/gpu/drm/i915/display/intel_opregion.h
index 4aa68ffbd30e..4e2eeadf101e 100644
--- a/drivers/gpu/drm/i915/display/intel_opregion.h
+++ b/drivers/gpu/drm/i915/display/intel_opregion.h
@@ -54,6 +54,34 @@ struct intel_opregion {
 
 #define OPREGION_SIZE            (8 * 1024)
 
+#define CPD_SIGNATURE "$CPD"                  /* CPD Signature */
+#define NUM_CPD_BYTES 4
+#define PCI_IMAGE_LENGTH_OFFSET 0x10
+#define PCI_CODE_TYPE_OFFSET 0x14
+#define PCI_LAST_IMAGE_INDICATOR_OFFSET 0x15
+#define LAST_IMG_INDICATOR 0x80
+#define OPROM_IMAGE_MAGIC 0xAA55       /* Little Endian */
+#define OPROM_CSS_CODE_TYPE 0xF0
+#define OPROM_BYTE_BOUNDARY 512        /* OPROM image sizes are indicated in 512 byte boundaries */
+#define OPROM_INITIAL_READ_SIZE 60     /* Read 60 bytes to compute the Img Len from PCI structure */
+
+union oprom_header {
+	u32 data;
+	struct {
+		u16 signature;  /* Offset[0x0]: Header 0x55 0xAA */
+		u8 sizein512bytes;
+		u8 reserved;
+	};
+};
+
+struct expansion_rom_header {
+	union oprom_header header;      /* Offset[0x0]: Oprom Header */
+	u16 vbiospostoffset;    /* Offset[0x4]: pointer to VBIOS entry point */
+	u8 resvd[0x12];
+	u16 pcistructoffset;    /* Offset[0x18]: Contains pointer PCI Data Structure */
+	u16 opregion_base;      /* Offset[0x1A]: Offset to Opregion Base start */
+};
+
 #ifdef CONFIG_ACPI
 
 int intel_opregion_setup(struct drm_i915_private *dev_priv);
@@ -118,5 +146,6 @@ static inline int intel_opregion_get_panel_type(struct drm_i915_private *dev)
 }
 
 #endif /* CONFIG_ACPI */
-
+int intel_oprom_verify_signature(u32 **opreg, u16 *opreg_size,
+				 struct drm_i915_private *i915);
 #endif
-- 
2.26.2


* [RFC PATCH 121/162] drm/i915: WA for zero memory channel
  2020-11-27 12:04 [RFC PATCH 000/162] DG1 + LMEM enabling Matthew Auld
                   ` (119 preceding siblings ...)
  2020-11-27 12:06 ` [RFC PATCH 120/162] drm/i915/oprom: Basic sanitization Matthew Auld
@ 2020-11-27 12:06 ` Matthew Auld
  2020-11-27 12:06 ` [RFC PATCH 122/162] drm/i915/dg1: Compute MEM Bandwidth using MCHBAR Matthew Auld
                   ` (40 subsequent siblings)
  161 siblings, 0 replies; 208+ messages in thread
From: Matthew Auld @ 2020-11-27 12:06 UTC (permalink / raw)
  To: intel-gfx
  Cc: Lucas De Marchi, dri-devel, Stanislav Lisovskiy,
	Daniele Ceraolo Spurio, Rodrigo Vivi

From: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>

Commit c457d9cf256e ("drm/i915: Make sure we have enough memory
bandwidth on ICL") assumes that we always have a non-zero
dram_info->channels and uses it as a divisor. We need the number of
memory channels to be at least 1 for sane bandwidth-limit checking,
even when PCode returns 0, so let's force it to 1 in this case.

Cc: Stanislav Lisovskiy <stanislav.lisovskiy@intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
---
 drivers/gpu/drm/i915/display/intel_bw.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
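
To spell out the failure mode being worked around (a sketch, not driver
code): with qi.num_channels == 0 the derived values collapse to zero and
every bandwidth check fails, e.g.

	num_channels = qi.num_channels;	/* 0 from PCode */
	deinterleave = DIV_ROUND_UP(num_channels, is_y_tile ? 4 : 2); /* 0 */
	/* anything scaled by num_channels downstream is 0 as well, so
	 * clamping with max_t(u8, 1, qi.num_channels) keeps the math sane
	 */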

diff --git a/drivers/gpu/drm/i915/display/intel_bw.c b/drivers/gpu/drm/i915/display/intel_bw.c
index bd060404d249..9e7971ce24b3 100644
--- a/drivers/gpu/drm/i915/display/intel_bw.c
+++ b/drivers/gpu/drm/i915/display/intel_bw.c
@@ -222,7 +222,7 @@ static int icl_get_bw_info(struct drm_i915_private *dev_priv, const struct intel
 			    "Failed to get memory subsystem information, ignoring bandwidth limits");
 		return ret;
 	}
-	num_channels = qi.num_channels;
+	num_channels = max_t(u8, 1, qi.num_channels);
 
 	deinterleave = DIV_ROUND_UP(num_channels, is_y_tile ? 4 : 2);
 	dclk_max = icl_sagv_max_dclk(&qi);
-- 
2.26.2


* [RFC PATCH 122/162] drm/i915/dg1: Compute MEM Bandwidth using MCHBAR
  2020-11-27 12:04 [RFC PATCH 000/162] DG1 + LMEM enabling Matthew Auld
                   ` (120 preceding siblings ...)
  2020-11-27 12:06 ` [RFC PATCH 121/162] drm/i915: WA for zero memory channel Matthew Auld
@ 2020-11-27 12:06 ` Matthew Auld
  2020-11-27 12:06 ` [RFC PATCH 123/162] drm/i915/dg1: Double memory bandwidth available Matthew Auld
                   ` (39 subsequent siblings)
  161 siblings, 0 replies; 208+ messages in thread
From: Matthew Auld @ 2020-11-27 12:06 UTC (permalink / raw)
  To: intel-gfx; +Cc: dri-devel, Jani Saarinen

From: Clint Taylor <clinton.a.taylor@intel.com>

The PUNIT FW is currently returning 0 for all memory bandwidth
parameters. Read the values directly from MCHBAR offsets 0x5918 and
0x4000/0x4004 instead. This is a temporary WA until the PUNIT FW
returns valid values.

Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: Matt Roper <matthew.d.roper@intel.com>
Cc: Jani Saarinen <jani.saarinen@intel.com>
Signed-off-by: Clint Taylor <clinton.a.taylor@intel.com>
---
 drivers/gpu/drm/i915/display/intel_bw.c | 54 ++++++++++++++++++++++++-
 1 file changed, 53 insertions(+), 1 deletion(-)
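
A worked example of the DCLK decode below (register values are
illustrative):

	/* ratio = (val & DG1_QCLK_RATIO_MASK) >> DG1_QCLK_RATIO_SHIFT, e.g. 12
	 * ref   = (val & DG1_QCLK_REFERENCE) ? 6 : 8  (16.666 MHz granularity)
	 * dclk  = 12 * 6 = 72, i.e. 72 * 16.666 MHz ~= 1200 MHz
	 */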

diff --git a/drivers/gpu/drm/i915/display/intel_bw.c b/drivers/gpu/drm/i915/display/intel_bw.c
index 9e7971ce24b3..5244ae77226d 100644
--- a/drivers/gpu/drm/i915/display/intel_bw.c
+++ b/drivers/gpu/drm/i915/display/intel_bw.c
@@ -90,6 +90,53 @@ static int icl_pcode_read_mem_global_info(struct drm_i915_private *dev_priv,
 	return 0;
 }
 
+#define SA_PERF_STATUS_0_0_0_MCHBAR_PC _MMIO(MCHBAR_MIRROR_BASE_SNB + 0x5918)
+#define  DG1_QCLK_RATIO_MASK (0xFF << 2)
+#define  DG1_QCLK_RATIO_SHIFT 2
+#define  DG1_QCLK_REFERENCE (1 << 10)
+
+#define MCHBAR_CH0_CR_TC_PRE_0_0_0_MCHBAR _MMIO(MCHBAR_MIRROR_BASE_SNB + 0x4000)
+#define MCHBAR_CH0_CR_TC_PRE_0_0_0_MCHBAR_HIGH _MMIO(MCHBAR_MIRROR_BASE_SNB + 0x4004)
+#define MCHBAR_CH1_CR_TC_PRE_0_0_0_MCHBAR _MMIO(MCHBAR_MIRROR_BASE_SNB + 0x4400)
+#define MCHBAR_CH1_CR_TC_PRE_0_0_0_MCHBAR_HIGH _MMIO(MCHBAR_MIRROR_BASE_SNB + 0x4404)
+#define  DG1_DRAM_T_RCD_MASK (0x7F << 9)
+#define  DG1_DRAM_T_RCD_SHIFT 9
+#define  DG1_DRAM_T_RDPRE_MASK (0x3F << 11)
+#define  DG1_DRAM_T_RDPRE_SHIFT 11
+#define  DG1_DRAM_T_RAS_MASK (0xFF << 1)
+#define  DG1_DRAM_T_RAS_SHIFT 1
+#define  DG1_DRAM_T_RP_MASK (0x7F << 0)
+#define  DG1_DRAM_T_RP_SHIFT 0
+
+static int dg1_mchbar_read_qgv_point_info(struct drm_i915_private *dev_priv,
+					  struct intel_qgv_point *sp,
+					  int point)
+{
+	u32 val = 0;
+	u32 dclk_ratio = 0, dclk_reference = 0;
+
+	val = I915_READ(SA_PERF_STATUS_0_0_0_MCHBAR_PC);
+	dclk_ratio = (val & DG1_QCLK_RATIO_MASK) >> DG1_QCLK_RATIO_SHIFT;
+	if (val & DG1_QCLK_REFERENCE)
+		dclk_reference = 6; /* 6 * 16.666 MHz = 100 MHz */
+	else
+		dclk_reference = 8; /* 8 * 16.666 MHz = 133 MHz */
+	sp->dclk = dclk_ratio * dclk_reference;
+	if (sp->dclk == 0)
+		return -EINVAL;
+
+	val = I915_READ(MCHBAR_CH0_CR_TC_PRE_0_0_0_MCHBAR);
+	sp->t_rp = (val & DG1_DRAM_T_RP_MASK) >> DG1_DRAM_T_RP_SHIFT;
+	sp->t_rdpre = (val & DG1_DRAM_T_RDPRE_MASK) >> DG1_DRAM_T_RDPRE_SHIFT;
+
+	val = I915_READ(MCHBAR_CH0_CR_TC_PRE_0_0_0_MCHBAR_HIGH);
+	sp->t_rcd = (val & DG1_DRAM_T_RCD_MASK) >> DG1_DRAM_T_RCD_SHIFT;
+	sp->t_ras = (val & DG1_DRAM_T_RAS_MASK) >> DG1_DRAM_T_RAS_SHIFT;
+
+	sp->t_rc = sp->t_rp + sp->t_ras;
+	return 0;
+}
+
 static int icl_pcode_read_qgv_point_info(struct drm_i915_private *dev_priv,
 					 struct intel_qgv_point *sp,
 					 int point)
@@ -153,7 +200,12 @@ static int icl_get_qgv_points(struct drm_i915_private *dev_priv,
 		struct intel_qgv_point *sp = &qi->points[i];
 
 		ret = icl_pcode_read_qgv_point_info(dev_priv, sp, i);
-		if (ret)
+		if (IS_DG1(dev_priv) && (ret || sp->dclk == 0)) {
+			drm_dbg_kms(&dev_priv->drm, "Failed to get memory subsystem information via pcode. IFWI needs update. Trying with MCHBAR\n");
+			ret = dg1_mchbar_read_qgv_point_info(dev_priv, sp, i);
+			if (ret)
+				return ret;
+		} else if (ret)
 			return ret;
 
 		drm_dbg_kms(&dev_priv->drm,
-- 
2.26.2


* [RFC PATCH 123/162] drm/i915/dg1: Double memory bandwidth available
  2020-11-27 12:04 [RFC PATCH 000/162] DG1 + LMEM enabling Matthew Auld
                   ` (121 preceding siblings ...)
  2020-11-27 12:06 ` [RFC PATCH 122/162] drm/i915/dg1: Compute MEM Bandwidth using MCHBAR Matthew Auld
@ 2020-11-27 12:06 ` Matthew Auld
  2020-11-27 12:06 ` [RFC PATCH 124/162] drm/i915/lmem: allocate HWSP in lmem Matthew Auld
                   ` (38 subsequent siblings)
  161 siblings, 0 replies; 208+ messages in thread
From: Matthew Auld @ 2020-11-27 12:06 UTC (permalink / raw)
  To: intel-gfx; +Cc: Swati Sharma, dri-devel

From: Clint Taylor <clinton.a.taylor@intel.com>

Use the MCHBAR gear type information to compute the available memory
bandwidth during the MCHBAR-based calculations.

Cc: Swati Sharma <swati2.sharma@intel.com>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Clint Taylor <clinton.a.taylor@intel.com>
---
 drivers/gpu/drm/i915/display/intel_bw.c | 8 ++++++++
 1 file changed, 8 insertions(+)
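
For reference on the doubling (the gear semantics here are assumed from
the WA description, not from BSpec): in gear 2 the DRAM transfers data
twice per DCLK, so the QGV point's dclk is doubled before the bandwidth
math, e.g.

	/* gear 1: dclk = 72 (~1200 MHz); gear 2: dclk = 144 (~2400 MT/s effective) */
	if ((val & ICL_GEAR_TYPE_MASK) >> ICL_GEAR_TYPE_SHIFT)
		sp->dclk *= 2;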

diff --git a/drivers/gpu/drm/i915/display/intel_bw.c b/drivers/gpu/drm/i915/display/intel_bw.c
index 5244ae77226d..37fef3b5cb58 100644
--- a/drivers/gpu/drm/i915/display/intel_bw.c
+++ b/drivers/gpu/drm/i915/display/intel_bw.c
@@ -108,6 +108,9 @@ static int icl_pcode_read_mem_global_info(struct drm_i915_private *dev_priv,
 #define  DG1_DRAM_T_RP_MASK (0x7F << 0)
 #define  DG1_DRAM_T_RP_SHIFT 0
 
+#define  ICL_GEAR_TYPE_MASK (0x01 << 16)
+#define  ICL_GEAR_TYPE_SHIFT 16
+
 static int dg1_mchbar_read_qgv_point_info(struct drm_i915_private *dev_priv,
 					  struct intel_qgv_point *sp,
 					  int point)
@@ -122,6 +125,11 @@ static int dg1_mchbar_read_qgv_point_info(struct drm_i915_private *dev_priv,
 	else
 		dclk_reference = 8; /* 8 * 16.666 MHz = 133 MHz */
 	sp->dclk = dclk_ratio * dclk_reference;
+
+	val = I915_READ(SKL_MC_BIOS_DATA_0_0_0_MCHBAR_PCU);
+	if ((val & ICL_GEAR_TYPE_MASK) >> ICL_GEAR_TYPE_SHIFT)
+		sp->dclk *= 2;
+
 	if (sp->dclk == 0)
 		return -EINVAL;
 
-- 
2.26.2


* [RFC PATCH 124/162] drm/i915/lmem: allocate HWSP in lmem
  2020-11-27 12:04 [RFC PATCH 000/162] DG1 + LMEM enabling Matthew Auld
                   ` (122 preceding siblings ...)
  2020-11-27 12:06 ` [RFC PATCH 123/162] drm/i915/dg1: Double memory bandwidth available Matthew Auld
@ 2020-11-27 12:06 ` Matthew Auld
  2020-11-27 13:55   ` [Intel-gfx] " Chris Wilson
  2020-11-27 12:06 ` [RFC PATCH 125/162] drm/i915/lmem: Limit block size to 4G Matthew Auld
                   ` (37 subsequent siblings)
  161 siblings, 1 reply; 208+ messages in thread
From: Matthew Auld @ 2020-11-27 12:06 UTC (permalink / raw)
  To: intel-gfx
  Cc: Abdiel Janulgue, Michel Thierry, dri-devel, Daniele Ceraolo Spurio

From: Michel Thierry <michel.thierry@intel.com>

Signed-off-by: Michel Thierry <michel.thierry@intel.com>
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Abdiel Janulgue <abdiel.janulgue@linux.intel.com>
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
---
 drivers/gpu/drm/i915/gt/intel_engine_cs.c | 10 +++++++++-
 drivers/gpu/drm/i915/gt/intel_timeline.c  |  8 +++++++-
 2 files changed, 16 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_engine_cs.c b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
index 0ba020346566..9e0394b06f38 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine_cs.c
+++ b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
@@ -25,6 +25,7 @@
 #include <drm/drm_print.h>
 
 #include "gem/i915_gem_context.h"
+#include "gem/i915_gem_lmem.h"
 
 #include "i915_drv.h"
 
@@ -657,7 +658,14 @@ static int init_status_page(struct intel_engine_cs *engine)
 	 * in GFP_DMA32 for i965, and no earlier physical address users had
 	 * access to more than 4G.
 	 */
-	obj = i915_gem_object_create_internal(engine->i915, PAGE_SIZE);
+	if (HAS_LMEM(engine->i915)) {
+		obj = i915_gem_object_create_lmem(engine->i915,
+						  PAGE_SIZE,
+						  I915_BO_ALLOC_CONTIGUOUS |
+						  I915_BO_ALLOC_VOLATILE);
+	} else {
+		obj = i915_gem_object_create_internal(engine->i915, PAGE_SIZE);
+	}
 	if (IS_ERR(obj)) {
 		drm_err(&engine->i915->drm,
 			"Failed to allocate status page\n");
diff --git a/drivers/gpu/drm/i915/gt/intel_timeline.c b/drivers/gpu/drm/i915/gt/intel_timeline.c
index 065943781586..589559b526eb 100644
--- a/drivers/gpu/drm/i915/gt/intel_timeline.c
+++ b/drivers/gpu/drm/i915/gt/intel_timeline.c
@@ -6,6 +6,7 @@
 
 #include "i915_drv.h"
 
+#include "gem/i915_gem_lmem.h"
 #include "i915_active.h"
 #include "i915_syncmap.h"
 #include "intel_gt.h"
@@ -34,7 +35,12 @@ static int __hwsp_alloc(struct intel_gt *gt, struct intel_timeline_hwsp *hwsp)
 	int type;
 	int ret;
 
-	obj = i915_gem_object_create_internal(i915, PAGE_SIZE);
+	if (HAS_LMEM(i915))
+		obj = i915_gem_object_create_lmem(i915, PAGE_SIZE,
+						  I915_BO_ALLOC_CONTIGUOUS |
+						  I915_BO_ALLOC_VOLATILE);
+	else
+		obj = i915_gem_object_create_internal(i915, PAGE_SIZE);
 	if (IS_ERR(obj))
 		return PTR_ERR(obj);
 
-- 
2.26.2


* [RFC PATCH 125/162] drm/i915/lmem: Limit block size to 4G
  2020-11-27 12:04 [RFC PATCH 000/162] DG1 + LMEM enabling Matthew Auld
                   ` (123 preceding siblings ...)
  2020-11-27 12:06 ` [RFC PATCH 124/162] drm/i915/lmem: allocate HWSP in lmem Matthew Auld
@ 2020-11-27 12:06 ` Matthew Auld
  2020-11-27 14:02   ` [Intel-gfx] " Chris Wilson
  2020-11-27 12:06 ` [RFC PATCH 126/162] drm/i915/gem: Update shmem available memory Matthew Auld
                   ` (36 subsequent siblings)
  161 siblings, 1 reply; 208+ messages in thread
From: Matthew Auld @ 2020-11-27 12:06 UTC (permalink / raw)
  To: intel-gfx
  Cc: Niranjana Vishwanathapura, Venkata Sandeep Dhanalakota, CQ Tang,
	dri-devel

From: Venkata Sandeep Dhanalakota <venkata.s.dhanalakota@intel.com>

When allocating pages for an lmem object of size 4G or greater,
we allocate memory blocks from the buddy system. In this scenario
the buddy system can allocate blocks of size >= 4G, which require
more than 32 bits to represent the block size. With such blocks we
run into an issue during sg list construction, because the
sg->length field is only 32 bits wide.

Hence limit the maximum allowed block size to less than 4G.

Cc: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
Cc: Matthew Auld <matthew.auld@intel.com>
Cc: CQ Tang <cq.tang@intel.com>
Signed-off-by: Venkata Sandeep Dhanalakota <venkata.s.dhanalakota@intel.com>
---
 drivers/gpu/drm/i915/intel_memory_region.c | 13 ++++++++++++-
 1 file changed, 12 insertions(+), 1 deletion(-)
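
The clamp arithmetic, worked through assuming chunk_size == PAGE_SIZE
(4K):

	/* block size = chunk_size << order, and sg->length is a 32-bit field,
	 * so the largest safe block is 2G:
	 *   max_order = (ilog2(SZ_4G) - 1) - ilog2(SZ_4K) = 31 - 12 = 19
	 *   SZ_4K << 19 == SZ_2G  (fits in 32 bits, whereas SZ_4G would not)
	 */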

diff --git a/drivers/gpu/drm/i915/intel_memory_region.c b/drivers/gpu/drm/i915/intel_memory_region.c
index 554fdd7735a8..371cd88ff6d8 100644
--- a/drivers/gpu/drm/i915/intel_memory_region.c
+++ b/drivers/gpu/drm/i915/intel_memory_region.c
@@ -101,6 +101,7 @@ __intel_memory_region_get_pages_buddy(struct intel_memory_region *mem,
 				      struct list_head *blocks)
 {
 	unsigned int min_order = 0;
+	unsigned int max_order;
 	unsigned long n_pages;
 
 	GEM_BUG_ON(!IS_ALIGNED(size, mem->mm.chunk_size));
@@ -121,6 +122,16 @@ __intel_memory_region_get_pages_buddy(struct intel_memory_region *mem,
 
 	n_pages = size >> ilog2(mem->mm.chunk_size);
 
+	/*
+	 * When allocating pages for an lmem object of size > 4G
+	 * the memory blocks allocated from buddy system could be
+	 * from sizes greater than 4G requiring > 32b to represent
+	 * block size. But those blocks cannot be used in sg list
+	 * construction(in caller) as sg->length is only 32b wide.
+	 * Hence limiting the block size to 4G.
+	 */
+	max_order = (ilog2(SZ_4G) - 1) - ilog2(mem->mm.chunk_size);
+
 	mutex_lock(&mem->mm_lock);
 
 	do {
@@ -128,7 +139,7 @@ __intel_memory_region_get_pages_buddy(struct intel_memory_region *mem,
 		unsigned int order;
 		bool retry = true;
 retry:
-		order = fls(n_pages) - 1;
+		order = min_t(u32, (fls(n_pages) - 1), max_order);
 		GEM_BUG_ON(order > mem->mm.max_order);
 		GEM_BUG_ON(order < min_order);
 
-- 
2.26.2


* [RFC PATCH 126/162] drm/i915/gem: Update shmem available memory
  2020-11-27 12:04 [RFC PATCH 000/162] DG1 + LMEM enabling Matthew Auld
                   ` (124 preceding siblings ...)
  2020-11-27 12:06 ` [RFC PATCH 125/162] drm/i915/lmem: Limit block size to 4G Matthew Auld
@ 2020-11-27 12:06 ` Matthew Auld
  2020-11-27 14:04   ` Chris Wilson
  2020-11-27 12:06 ` [RFC PATCH 127/162] drm/i915: Allow non-uniform subslices in gen12+ Matthew Auld
                   ` (35 subsequent siblings)
  161 siblings, 1 reply; 208+ messages in thread
From: Matthew Auld @ 2020-11-27 12:06 UTC (permalink / raw)
  To: intel-gfx; +Cc: Bommu Krishnaiah, Zbigniew Kempczyński, CQ Tang, dri-devel

From: Bommu Krishnaiah <krishnaiah.bommu@intel.com>

Update the shmem available-memory accounting in "intel_memory_region".

Signed-off-by: Bommu Krishnaiah <krishnaiah.bommu@intel.com>
Cc: Zbigniew Kempczyński <zbigniew.kempczynski@intel.com>
Cc: CQ Tang <cq.tang@intel.com>
---
 drivers/gpu/drm/i915/gem/i915_gem_shmem.c | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_shmem.c b/drivers/gpu/drm/i915/gem/i915_gem_shmem.c
index b4dd7a709800..f4bac72b3ccd 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_shmem.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_shmem.c
@@ -30,6 +30,7 @@ static int shmem_get_pages(struct drm_i915_gem_object *obj)
 	struct drm_i915_private *i915 = to_i915(obj->base.dev);
 	struct intel_memory_region *mem = obj->mm.region;
 	const unsigned long page_count = obj->base.size / PAGE_SIZE;
+	resource_size_t size = obj->base.size;
 	unsigned long i;
 	struct address_space *mapping;
 	struct sg_table *st;
@@ -184,6 +185,8 @@ static int shmem_get_pages(struct drm_i915_gem_object *obj)
 
 	__i915_gem_object_set_pages(obj, st, sg_page_sizes);
 
+	mem->avail -= size;
+
 	return 0;
 
 err_sg:
@@ -298,6 +301,8 @@ __i915_gem_object_release_shmem(struct drm_i915_gem_object *obj,
 
 void i915_gem_object_put_pages_shmem(struct drm_i915_gem_object *obj, struct sg_table *pages)
 {
+	struct intel_memory_region *mem = obj->mm.region;
+	resource_size_t size = obj->base.size;
 	struct sgt_iter sgt_iter;
 	struct pagevec pvec;
 	struct page *page;
@@ -326,6 +331,8 @@ void i915_gem_object_put_pages_shmem(struct drm_i915_gem_object *obj, struct sg_
 		check_release_pagevec(&pvec);
 	obj->mm.dirty = false;
 
+	mem->avail += size;
+
 	sg_free_table(pages);
 	kfree(pages);
 }
-- 
2.26.2


* [RFC PATCH 127/162] drm/i915: Allow non-uniform subslices in gen12+
  2020-11-27 12:04 [RFC PATCH 000/162] DG1 + LMEM enabling Matthew Auld
                   ` (125 preceding siblings ...)
  2020-11-27 12:06 ` [RFC PATCH 126/162] drm/i915/gem: Update shmem available memory Matthew Auld
@ 2020-11-27 12:06 ` Matthew Auld
  2020-11-27 12:06 ` [RFC PATCH 128/162] drm/i915/dg1: intel_memory_region_evict() changes for eviction Matthew Auld
                   ` (34 subsequent siblings)
  161 siblings, 0 replies; 208+ messages in thread
From: Matthew Auld @ 2020-11-27 12:06 UTC (permalink / raw)
  To: intel-gfx
  Cc: Stuart Summers, Harish Chegondi, Daniele Ceraolo Spurio, dri-devel

From: Stuart Summers <stuart.summers@intel.com>

The current implementation of intel_sseu_set_subslices() only takes
the number of bits per subslice stride and copies those in based on
the given slice. For all known use cases this works fine. But in the
event of faulty hardware or some future use case, extract each
slice's subslice bits explicitly before handing them to
intel_sseu_set_subslices(), to ensure all subslices are correctly
calculated.

Cc: Harish Chegondi <harish.chegondi@intel.com>
Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Suggested-by: Harish Chegondi <harish.chegondi@intel.com>
Signed-off-by: Stuart Summers <stuart.summers@intel.com>
---
 drivers/gpu/drm/i915/gt/intel_sseu.c | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)
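
To illustrate the per-slice extraction (assuming max_subslices == 4):
slice 1's subslice bits occupy bits [7:4] of ss_en, so

	ss_mask = ss_en >> (1 * sseu->max_subslices);	/* bits [7:4] down to [3:0] */
	ss_mask &= GENMASK(sseu->max_subslices - 1, 0);	/* drop slice 2+ bits */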

diff --git a/drivers/gpu/drm/i915/gt/intel_sseu.c b/drivers/gpu/drm/i915/gt/intel_sseu.c
index 8a72e0fe34ca..b8a945166d32 100644
--- a/drivers/gpu/drm/i915/gt/intel_sseu.c
+++ b/drivers/gpu/drm/i915/gt/intel_sseu.c
@@ -104,6 +104,7 @@ static u16 compute_eu_total(const struct sseu_dev_info *sseu)
 static void gen11_compute_sseu_info(struct sseu_dev_info *sseu,
 				    u8 s_en, u32 ss_en, u16 eu_en)
 {
+	u32 ss_mask;
 	int s, ss;
 
 	/* ss_en represents entire subslice mask across all slices */
@@ -116,7 +117,10 @@ static void gen11_compute_sseu_info(struct sseu_dev_info *sseu,
 
 		sseu->slice_mask |= BIT(s);
 
-		intel_sseu_set_subslices(sseu, s, ss_en);
+		ss_mask = ss_en >> (s * sseu->max_subslices);
+		ss_mask &= GENMASK(sseu->max_subslices - 1, 0);
+
+		intel_sseu_set_subslices(sseu, s, ss_mask);
 
 		for (ss = 0; ss < sseu->max_subslices; ss++)
 			if (intel_sseu_has_subslice(sseu, s, ss))
-- 
2.26.2


* [RFC PATCH 128/162] drm/i915/dg1: intel_memory_region_evict() changes for eviction
  2020-11-27 12:04 [RFC PATCH 000/162] DG1 + LMEM enabling Matthew Auld
                   ` (126 preceding siblings ...)
  2020-11-27 12:06 ` [RFC PATCH 127/162] drm/i915: Allow non-uniform subslices in gen12+ Matthew Auld
@ 2020-11-27 12:06 ` Matthew Auld
  2020-11-27 14:07   ` [Intel-gfx] " Chris Wilson
  2020-11-27 12:06 ` [RFC PATCH 129/162] drm/i915/dg1: i915_gem_object_memcpy(..) infrastructure Matthew Auld
                   ` (33 subsequent siblings)
  161 siblings, 1 reply; 208+ messages in thread
From: Matthew Auld @ 2020-11-27 12:06 UTC (permalink / raw)
  To: intel-gfx; +Cc: CQ Tang, dri-devel

From: CQ Tang <cq.tang@intel.com>

Function i915_gem_shrink_memory_region() is renamed to
intel_memory_region_evict() and moved from i915_gem_shrinker.c
to intel_memory_region.c. The function now handles local memory
swapping, in addition to evicting purgeable objects only.

When an object is selected from the list, i915_gem_object_unbind()
might fail if the object's vma is pinned; in that case the error
-EBUSY is returned from this function.

The new code uses similar logic to i915_gem_shrink().

Signed-off-by: CQ Tang <cq.tang@intel.com>
---
 .../gpu/drm/i915/gem/i915_gem_object_types.h  |  1 -
 drivers/gpu/drm/i915/gem/i915_gem_shrinker.c  | 58 -----------
 drivers/gpu/drm/i915/gem/i915_gem_shrinker.h  |  2 -
 drivers/gpu/drm/i915/i915_gem.c               |  8 +-
 drivers/gpu/drm/i915/intel_memory_region.c    | 95 +++++++++++++++++--
 .../drm/i915/selftests/intel_memory_region.c  |  3 +-
 6 files changed, 94 insertions(+), 73 deletions(-)
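
A hypothetical caller, to show how the -EBUSY case described above
surfaces (the call site is sketched, not taken from the series):

	err = intel_memory_region_evict(mem, target);
	if (err == -EBUSY) {
		/* a pinned vma made i915_gem_object_unbind() fail, so the
		 * target could not be met; nothing more to evict right now
		 */
		return err;
	}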

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object_types.h b/drivers/gpu/drm/i915/gem/i915_gem_object_types.h
index 8d639509b78b..517a606ade8d 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_object_types.h
+++ b/drivers/gpu/drm/i915/gem/i915_gem_object_types.h
@@ -237,7 +237,6 @@ struct drm_i915_gem_object {
 		 * region->obj_lock.
 		 */
 		struct list_head region_link;
-		struct list_head tmp_link;
 
 		struct sg_table *pages;
 		void *mapping;
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_shrinker.c b/drivers/gpu/drm/i915/gem/i915_gem_shrinker.c
index 4d346df8fd5b..27674048f17d 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_shrinker.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_shrinker.c
@@ -272,64 +272,6 @@ unsigned long i915_gem_shrink_all(struct drm_i915_private *i915)
 	return freed;
 }
 
-int i915_gem_shrink_memory_region(struct intel_memory_region *mem,
-				  resource_size_t target)
-{
-	struct drm_i915_private *i915 = mem->i915;
-	struct drm_i915_gem_object *obj;
-	resource_size_t purged;
-	LIST_HEAD(purgeable);
-	int err = -ENOSPC;
-
-	intel_gt_retire_requests(&i915->gt);
-
-	purged = 0;
-
-	mutex_lock(&mem->objects.lock);
-
-	while ((obj = list_first_entry_or_null(&mem->objects.purgeable,
-					       typeof(*obj),
-					       mm.region_link))) {
-		list_move_tail(&obj->mm.region_link, &purgeable);
-
-		if (!i915_gem_object_has_pages(obj))
-			continue;
-
-		if (i915_gem_object_is_framebuffer(obj))
-			continue;
-
-		if (!kref_get_unless_zero(&obj->base.refcount))
-			continue;
-
-		mutex_unlock(&mem->objects.lock);
-
-		if (!i915_gem_object_unbind(obj, I915_GEM_OBJECT_UNBIND_ACTIVE)) {
-			if (i915_gem_object_trylock(obj)) {
-				__i915_gem_object_put_pages(obj);
-				if (!i915_gem_object_has_pages(obj)) {
-					purged += obj->base.size;
-					if (!i915_gem_object_is_volatile(obj))
-						obj->mm.madv = __I915_MADV_PURGED;
-				}
-				i915_gem_object_unlock(obj);
-			}
-		}
-
-		i915_gem_object_put(obj);
-
-		mutex_lock(&mem->objects.lock);
-
-		if (purged >= target) {
-			err = 0;
-			break;
-		}
-	}
-
-	list_splice_tail(&purgeable, &mem->objects.purgeable);
-	mutex_unlock(&mem->objects.lock);
-	return err;
-}
-
 static unsigned long
 i915_gem_shrinker_count(struct shrinker *shrinker, struct shrink_control *sc)
 {
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_shrinker.h b/drivers/gpu/drm/i915/gem/i915_gem_shrinker.h
index c945f3b587d6..7c1e648a8b44 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_shrinker.h
+++ b/drivers/gpu/drm/i915/gem/i915_gem_shrinker.h
@@ -31,7 +31,5 @@ void i915_gem_driver_register__shrinker(struct drm_i915_private *i915);
 void i915_gem_driver_unregister__shrinker(struct drm_i915_private *i915);
 void i915_gem_shrinker_taints_mutex(struct drm_i915_private *i915,
 				    struct mutex *mutex);
-int i915_gem_shrink_memory_region(struct intel_memory_region *mem,
-				  resource_size_t target);
 
 #endif /* __I915_GEM_SHRINKER_H__ */
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index bf67f323a1ae..85cbdb8e2bb8 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -1008,12 +1008,12 @@ i915_gem_madvise_ioctl(struct drm_device *dev, void *data,
 
 		switch (obj->mm.madv) {
 		case I915_MADV_WILLNEED:
-			list_move(&obj->mm.region_link,
-				  &obj->mm.region->objects.list);
+			list_move_tail(&obj->mm.region_link,
+				       &obj->mm.region->objects.list);
 			break;
 		default:
-			list_move(&obj->mm.region_link,
-				  &obj->mm.region->objects.purgeable);
+			list_move_tail(&obj->mm.region_link,
+				       &obj->mm.region->objects.purgeable);
 			break;
 		}
 
diff --git a/drivers/gpu/drm/i915/intel_memory_region.c b/drivers/gpu/drm/i915/intel_memory_region.c
index 371cd88ff6d8..185eab497803 100644
--- a/drivers/gpu/drm/i915/intel_memory_region.c
+++ b/drivers/gpu/drm/i915/intel_memory_region.c
@@ -3,6 +3,7 @@
  * Copyright © 2019 Intel Corporation
  */
 
+#include "gt/intel_gt_requests.h"
 #include "intel_memory_region.h"
 #include "i915_drv.h"
 
@@ -94,6 +95,90 @@ __intel_memory_region_put_block_buddy(struct i915_buddy_block *block)
 	__intel_memory_region_put_pages_buddy(block->private, &blocks);
 }
 
+static int intel_memory_region_evict(struct intel_memory_region *mem,
+				     resource_size_t target)
+{
+	struct drm_i915_private *i915 = mem->i915;
+	struct list_head still_in_list;
+	struct drm_i915_gem_object *obj;
+	struct list_head *phases[] = {
+		&mem->objects.purgeable,
+		&mem->objects.list,
+		NULL,
+	};
+	struct list_head **phase;
+	resource_size_t found;
+	int pass;
+
+	intel_gt_retire_requests(&i915->gt);
+
+	found = 0;
+	pass = 0;
+	phase = phases;
+
+next:
+	INIT_LIST_HEAD(&still_in_list);
+	mutex_lock(&mem->objects.lock);
+
+	while (found < target &&
+		(obj = list_first_entry_or_null(*phase,
+						typeof(*obj),
+						mm.region_link))) {
+		list_move_tail(&obj->mm.region_link, &still_in_list);
+
+		if (!i915_gem_object_has_pages(obj))
+			continue;
+
+		if (i915_gem_object_is_framebuffer(obj))
+			continue;
+
+		/*
+		 * For IOMEM region, only swap user space objects.
+		 * kernel objects are bound and causes a lot of unbind
+		 * warning message in driver.
+		 * FIXME: swap kernel object as well.
+		 */
+		if (i915_gem_object_type_has(obj, I915_GEM_OBJECT_HAS_IOMEM)
+		    && !obj->base.handle_count)
+			continue;
+
+		if (!kref_get_unless_zero(&obj->base.refcount))
+			continue;
+
+		mutex_unlock(&mem->objects.lock);
+
+		if (!i915_gem_object_unbind(obj, I915_GEM_OBJECT_UNBIND_ACTIVE)) {
+			if (i915_gem_object_trylock(obj)) {
+				__i915_gem_object_put_pages(obj);
+				/* May arrive from get_pages on another bo */
+				if (!i915_gem_object_has_pages(obj)) {
+					found += obj->base.size;
+					if (obj->mm.madv == I915_MADV_DONTNEED)
+						obj->mm.madv = __I915_MADV_PURGED;
+				}
+				i915_gem_object_unlock(obj);
+			}
+		}
+
+		i915_gem_object_put(obj);
+		mutex_lock(&mem->objects.lock);
+
+		if (found >= target)
+			break;
+	}
+	list_splice_tail(&still_in_list, *phase);
+	mutex_unlock(&mem->objects.lock);
+
+	if (found < target) {
+		pass++;
+		phase++;
+		if (*phase)
+			goto next;
+	}
+
+	return (found < target) ? -ENOSPC : 0;
+}
+
 int
 __intel_memory_region_get_pages_buddy(struct intel_memory_region *mem,
 				      resource_size_t size,
@@ -137,7 +222,7 @@ __intel_memory_region_get_pages_buddy(struct intel_memory_region *mem,
 	do {
 		struct i915_buddy_block *block;
 		unsigned int order;
-		bool retry = true;
+
 retry:
 		order = min_t(u32, (fls(n_pages) - 1), max_order);
 		GEM_BUG_ON(order > mem->mm.max_order);
@@ -152,19 +237,15 @@ __intel_memory_region_get_pages_buddy(struct intel_memory_region *mem,
 				resource_size_t target;
 				int err;
 
-				if (!retry)
-					goto err_free_blocks;
-
 				target = n_pages * mem->mm.chunk_size;
 
 				mutex_unlock(&mem->mm_lock);
-				err = i915_gem_shrink_memory_region(mem,
-								    target);
+				err = intel_memory_region_evict(mem,
+								target);
 				mutex_lock(&mem->mm_lock);
 				if (err)
 					goto err_free_blocks;
 
-				retry = false;
 				goto retry;
 			}
 		} while (1);
diff --git a/drivers/gpu/drm/i915/selftests/intel_memory_region.c b/drivers/gpu/drm/i915/selftests/intel_memory_region.c
index 9df0a4f657c1..4b007ed48d2f 100644
--- a/drivers/gpu/drm/i915/selftests/intel_memory_region.c
+++ b/drivers/gpu/drm/i915/selftests/intel_memory_region.c
@@ -1093,7 +1093,8 @@ static void igt_mark_evictable(struct drm_i915_gem_object *obj)
 {
 	i915_gem_object_unpin_pages(obj);
 	obj->mm.madv = I915_MADV_DONTNEED;
-	list_move(&obj->mm.region_link, &obj->mm.region->objects.purgeable);
+	list_move_tail(&obj->mm.region_link,
+		       &obj->mm.region->objects.purgeable);
 }
 
 static int igt_mock_shrink(void *arg)
-- 
2.26.2

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply related	[flat|nested] 208+ messages in thread

* [RFC PATCH 129/162] drm/i915/dg1: i915_gem_object_memcpy(..) infrastructure
  2020-11-27 12:04 [RFC PATCH 000/162] DG1 + LMEM enabling Matthew Auld
                   ` (127 preceding siblings ...)
  2020-11-27 12:06 ` [RFC PATCH 128/162] drm/i915/dg1: intel_memory_region_evict() changes for eviction Matthew Auld
@ 2020-11-27 12:06 ` Matthew Auld
  2020-11-27 12:06 ` [RFC PATCH 130/162] drm/i915/dg1: Eviction logic Matthew Auld
                   ` (32 subsequent siblings)
  161 siblings, 0 replies; 208+ messages in thread
From: Matthew Auld @ 2020-11-27 12:06 UTC (permalink / raw)
  To: intel-gfx; +Cc: CQ Tang, dri-devel

From: CQ Tang <cq.tang@intel.com>

i915_gem_object_memcpy() copies the pages from a source object to a
destination object using memcpy. If the source and destination are not
the same size, only the smaller number of pages is copied.

The same page-mapping mechanism as pread/pwrite is used for the
per-page reads and writes.
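
In outline the copy loop is as below (a standalone model, with
try_wc_copy() as a hypothetical stand-in for i915_memcpy_from_wc(),
which fails when the CPU lacks the required support):

#include <stdbool.h>
#include <string.h>

#define PAGE_SIZE 4096

/* Hypothetical stand-in for i915_memcpy_from_wc(): returns false when
 * the fast write-combining read path is unavailable, in which case the
 * caller falls back to a plain memcpy. */
static bool try_wc_copy(void *dst, const void *src, size_t len)
{
    (void)dst; (void)src; (void)len;
    return false; /* pretend no movntdqa support */
}

static void copy_pages(void *dst_base, const void *src_base,
                       unsigned long npages, bool src_is_lmem)
{
    unsigned long i;

    for (i = 0; i < npages; i++) {
        void *d = (char *)dst_base + i * PAGE_SIZE;
        const void *s = (const char *)src_base + i * PAGE_SIZE;

        /* Prefer the WC read path only when the source is lmem. */
        if (!src_is_lmem || !try_wc_copy(d, s, PAGE_SIZE))
            memcpy(d, s, PAGE_SIZE);
    }
}

int main(void)
{
    static char src[2 * PAGE_SIZE] = "hello", dst[2 * PAGE_SIZE];

    copy_pages(dst, src, 2, true);
    return dst[0] == 'h' ? 0 : 1;
}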

Signed-off-by: CQ Tang <cq.tang@intel.com>
---
 drivers/gpu/drm/i915/gem/i915_gem_object.c | 151 +++++++++++++++++++++
 drivers/gpu/drm/i915/gem/i915_gem_object.h |   2 +
 2 files changed, 153 insertions(+)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object.c b/drivers/gpu/drm/i915/gem/i915_gem_object.c
index 89b530841126..65690e3bf648 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_object.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_object.c
@@ -30,11 +30,13 @@
 #include "i915_drv.h"
 #include "i915_gem_clflush.h"
 #include "i915_gem_context.h"
+#include "i915_gem_lmem.h"
 #include "i915_gem_mman.h"
 #include "i915_gem_object.h"
 #include "i915_gem_object_blt.h"
 #include "i915_gem_region.h"
 #include "i915_globals.h"
+#include "i915_memcpy.h"
 #include "i915_trace.h"
 
 static struct i915_global_object {
@@ -449,6 +451,155 @@ int i915_gem_object_migrate(struct drm_i915_gem_object *obj,
 	return err;
 }
 
+struct object_memcpy_info {
+	struct drm_i915_gem_object *obj;
+	intel_wakeref_t wakeref;
+	bool write;
+	int clflush;
+	struct page *page;
+	void *vaddr;
+	void *(*get_vaddr)(struct object_memcpy_info *info,
+			   unsigned long idx);
+	void (*put_vaddr)(struct object_memcpy_info *info);
+};
+
+static
+void *lmem_get_vaddr(struct object_memcpy_info *info, unsigned long idx)
+{
+	info->vaddr = i915_gem_object_lmem_io_map_page(info->obj, idx);
+	return info->vaddr;
+}
+
+static
+void lmem_put_vaddr(struct object_memcpy_info *info)
+{
+	io_mapping_unmap(info->vaddr);
+}
+
+static
+void *smem_get_vaddr(struct object_memcpy_info *info, unsigned long idx)
+{
+	info->page = i915_gem_object_get_page(info->obj, (unsigned int)idx);
+	info->vaddr = kmap(info->page);
+	if (info->clflush & CLFLUSH_BEFORE)
+		drm_clflush_virt_range(info->vaddr, PAGE_SIZE);
+	return info->vaddr;
+}
+
+static
+void smem_put_vaddr(struct object_memcpy_info *info)
+{
+	if (info->clflush & CLFLUSH_AFTER)
+		drm_clflush_virt_range(info->vaddr, PAGE_SIZE);
+	kunmap(info->page);
+}
+
+static int
+i915_gem_object_prepare_memcpy(struct drm_i915_gem_object *obj,
+			       struct object_memcpy_info *info,
+			       bool write)
+{
+	struct drm_i915_private *i915 = to_i915(obj->base.dev);
+	int ret;
+
+	assert_object_held(obj);
+	ret = i915_gem_object_wait(obj,
+				   I915_WAIT_INTERRUPTIBLE,
+				   MAX_SCHEDULE_TIMEOUT);
+	if (ret)
+		return ret;
+
+	ret = i915_gem_object_pin_pages(obj);
+	if (ret)
+		return ret;
+
+	if (i915_gem_object_is_lmem(obj)) {
+		ret = i915_gem_object_set_to_wc_domain(obj, write);
+		if (!ret) {
+			info->wakeref =
+				intel_runtime_pm_get(&i915->runtime_pm);
+			info->get_vaddr = lmem_get_vaddr;
+			info->put_vaddr = lmem_put_vaddr;
+		}
+	} else {
+		if (write)
+			ret = i915_gem_object_prepare_write(obj,
+							    &info->clflush);
+		else
+			ret = i915_gem_object_prepare_read(obj,
+							   &info->clflush);
+
+		if (!ret) {
+			i915_gem_object_finish_access(obj);
+			info->get_vaddr = smem_get_vaddr;
+			info->put_vaddr = smem_put_vaddr;
+		}
+	}
+
+	if (!ret) {
+		info->obj = obj;
+		info->write = write;
+	} else {
+		i915_gem_object_unpin_pages(obj);
+	}
+
+	return ret;
+}
+
+static void
+i915_gem_object_finish_memcpy(struct object_memcpy_info *info)
+{
+	struct drm_i915_private *i915 = to_i915(info->obj->base.dev);
+
+	if (i915_gem_object_is_lmem(info->obj)) {
+		intel_runtime_pm_put(&i915->runtime_pm, info->wakeref);
+	} else {
+		if (info->write) {
+			i915_gem_object_flush_frontbuffer(info->obj,
+							  ORIGIN_CPU);
+			info->obj->mm.dirty = true;
+		}
+	}
+	i915_gem_object_unpin_pages(info->obj);
+}
+
+int i915_gem_object_memcpy(struct drm_i915_gem_object *dst,
+			   struct drm_i915_gem_object *src)
+{
+	struct object_memcpy_info sinfo, dinfo;
+	void *svaddr, *dvaddr;
+	unsigned long npages;
+	int i, ret;
+
+	ret = i915_gem_object_prepare_memcpy(src, &sinfo, false);
+	if (ret)
+		return ret;
+
+	ret = i915_gem_object_prepare_memcpy(dst, &dinfo, true);
+	if (ret)
+		goto finish_src;
+
+	npages = src->base.size / PAGE_SIZE;
+	for (i = 0; i < npages; i++) {
+		svaddr = sinfo.get_vaddr(&sinfo, i);
+		dvaddr = dinfo.get_vaddr(&dinfo, i);
+
+		/* a performance optimization */
+		if (!i915_gem_object_is_lmem(src) ||
+		    !i915_memcpy_from_wc(dvaddr, svaddr, PAGE_SIZE))
+			memcpy(dvaddr, svaddr, PAGE_SIZE);
+
+		dinfo.put_vaddr(&dinfo);
+		sinfo.put_vaddr(&sinfo);
+	}
+
+	i915_gem_object_finish_memcpy(&dinfo);
+finish_src:
+	i915_gem_object_finish_memcpy(&sinfo);
+
+	return ret;
+}
+
 static bool gpu_write_needs_clflush(struct drm_i915_gem_object *obj)
 {
 	return !(obj->cache_level == I915_CACHE_NONE ||
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object.h b/drivers/gpu/drm/i915/gem/i915_gem_object.h
index 1a1aa71a4494..175258106642 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_object.h
+++ b/drivers/gpu/drm/i915/gem/i915_gem_object.h
@@ -57,6 +57,8 @@ int i915_gem_object_migrate(struct drm_i915_gem_object *obj,
 			    struct i915_gem_ww_ctx *ww,
 			    struct intel_context *ce,
 			    enum intel_region_id id);
+int i915_gem_object_memcpy(struct drm_i915_gem_object *dst,
+			   struct drm_i915_gem_object *src);
 
 void i915_gem_flush_free_objects(struct drm_i915_private *i915);
 
-- 
2.26.2

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply related	[flat|nested] 208+ messages in thread

* [RFC PATCH 130/162] drm/i915/dg1: Eviction logic
  2020-11-27 12:04 [RFC PATCH 000/162] DG1 + LMEM enabling Matthew Auld
                   ` (128 preceding siblings ...)
  2020-11-27 12:06 ` [RFC PATCH 129/162] drm/i915/dg1: i915_gem_object_memcpy(..) infrastructure Matthew Auld
@ 2020-11-27 12:06 ` Matthew Auld
  2020-11-27 12:06 ` [RFC PATCH 131/162] drm/i915/dg1: Add enable_eviction modparam Matthew Auld
                   ` (31 subsequent siblings)
  161 siblings, 0 replies; 208+ messages in thread
From: Matthew Auld @ 2020-11-27 12:06 UTC (permalink / raw)
  To: intel-gfx; +Cc: CQ Tang, dri-devel

From: CQ Tang <cq.tang@intel.com>

When an object is pinned, get_pages() is called to allocate memory in
a region. If memory pages are not available, region eviction is
triggered to find other objects in the same region that can be evicted.
The selected object is passed to a put_pages() call to free its memory
pages; whether the pages are first swapped out to system memory depends
on whether the object is marked as WILLNEED.

After being swapped out, the object is treated as if it has no pages
allocated at all.

Similarly, when an object is pinned and memory pages are allocated from
a region, the object is checked for a previous swap-out; if one is
found, the saved contents are swapped back into the newly allocated
pages.

For this initial swapping code, i915_gem_object_memcpy() is used to
copy the pages.
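
The object lifecycle can be modelled in plain C as a pair of
transitions on a per-object swapto pointer; the sketch below mirrors
the patch's naming but is an illustration, not the driver code:

#include <stdlib.h>
#include <string.h>
#include <assert.h>

struct object {
    size_t size;
    void *pages;  /* lmem backing, NULL when not resident */
    void *swapto; /* system-memory shadow copy, NULL when resident */
};

/* put_pages path: save contents to a shadow buffer, then drop lmem. */
static int swap_out(struct object *obj)
{
    assert(obj->pages && !obj->swapto);
    obj->swapto = malloc(obj->size);
    if (!obj->swapto)
        return -1;
    memcpy(obj->swapto, obj->pages, obj->size);
    free(obj->pages); /* models releasing the lmem blocks */
    obj->pages = NULL;
    return 0;
}

/* get_pages path: reallocate lmem, then restore the saved contents. */
static int swap_in(struct object *obj)
{
    assert(!obj->pages && obj->swapto);
    obj->pages = malloc(obj->size);
    if (!obj->pages)
        return -1;
    memcpy(obj->pages, obj->swapto, obj->size);
    free(obj->swapto);
    obj->swapto = NULL;
    return 0;
}

int main(void)
{
    struct object obj = { 4096, malloc(4096), NULL };

    memset(obj.pages, 0xaa, obj.size);
    swap_out(&obj); /* object now "has no pages" */
    swap_in(&obj);  /* contents restored into fresh pages */
    free(obj.pages);
    return 0;
}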

Signed-off-by: CQ Tang <cq.tang@intel.com>
---
 drivers/gpu/drm/i915/gem/i915_gem_object.c    |  12 +-
 drivers/gpu/drm/i915/gem/i915_gem_object.h    |   2 +
 .../gpu/drm/i915/gem/i915_gem_object_types.h  |   6 +
 drivers/gpu/drm/i915/gem/i915_gem_pages.c     |   1 -
 drivers/gpu/drm/i915/gem/i915_gem_region.c    | 139 +++++++++++++++++-
 drivers/gpu/drm/i915/intel_memory_region.c    |   6 +
 6 files changed, 162 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object.c b/drivers/gpu/drm/i915/gem/i915_gem_object.c
index 65690e3bf648..7cb5f137522f 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_object.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_object.c
@@ -178,6 +178,8 @@ static void __i915_gem_free_object_rcu(struct rcu_head *head)
 		container_of(head, typeof(*obj), rcu);
 	struct drm_i915_private *i915 = to_i915(obj->base.dev);
 
+	/* Reset shared reservation object */
+	obj->base.resv = &obj->base._resv;
 	dma_resv_fini(&obj->base._resv);
 	i915_gem_object_free(obj);
 
@@ -185,7 +187,7 @@ static void __i915_gem_free_object_rcu(struct rcu_head *head)
 	atomic_dec(&i915->mm.free_count);
 }
 
-static void __i915_gem_object_free_mmaps(struct drm_i915_gem_object *obj)
+void __i915_gem_object_free_mmaps(struct drm_i915_gem_object *obj)
 {
 	/* Skip serialisation and waking the device if known to be not used. */
 
@@ -287,6 +289,14 @@ static void i915_gem_free_object(struct drm_gem_object *gem_obj)
 
 	GEM_BUG_ON(i915_gem_object_is_framebuffer(obj));
 
+	/*
+	 * If object had been swapped out, free the hidden object.
+	 */
+	if (obj->swapto) {
+		i915_gem_object_put(obj->swapto);
+		obj->swapto = NULL;
+	}
+
 	/*
 	 * Before we free the object, make sure any pure RCU-only
 	 * read-side critical sections are complete, e.g.
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object.h b/drivers/gpu/drm/i915/gem/i915_gem_object.h
index 175258106642..ee1914ed2070 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_object.h
+++ b/drivers/gpu/drm/i915/gem/i915_gem_object.h
@@ -366,6 +366,8 @@ void __i915_gem_object_set_pages(struct drm_i915_gem_object *obj,
 int ____i915_gem_object_get_pages(struct drm_i915_gem_object *obj);
 int __i915_gem_object_get_pages(struct drm_i915_gem_object *obj);
 
+void __i915_gem_object_free_mmaps(struct drm_i915_gem_object *obj);
+
 static inline int __must_check
 i915_gem_object_pin_pages(struct drm_i915_gem_object *obj)
 {
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object_types.h b/drivers/gpu/drm/i915/gem/i915_gem_object_types.h
index 517a606ade8d..e9f42d3137b3 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_object_types.h
+++ b/drivers/gpu/drm/i915/gem/i915_gem_object_types.h
@@ -316,6 +316,12 @@ struct drm_i915_gem_object {
 
 		void *gvt_info;
 	};
+
+	/**
+	 * object to swap-to if non-null.
+	 */
+	bool do_swapping;
+	struct drm_i915_gem_object *swapto;
 };
 
 static inline struct drm_i915_gem_object *
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_pages.c b/drivers/gpu/drm/i915/gem/i915_gem_pages.c
index 2cdb7cf63383..d0f3da0925f5 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_pages.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_pages.c
@@ -231,7 +231,6 @@ __i915_gem_object_unset_pages(struct drm_i915_gem_object *obj)
 	}
 
 	__i915_gem_object_reset_page_iter(obj);
-	obj->mm.page_sizes.phys = obj->mm.page_sizes.sg = 0;
 
 	return pages;
 }
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_region.c b/drivers/gpu/drm/i915/gem/i915_gem_region.c
index e497ff374b13..a437538cd872 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_region.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_region.c
@@ -7,11 +7,135 @@
 #include "i915_gem_region.h"
 #include "i915_drv.h"
 #include "i915_trace.h"
+#include "i915_gem_mman.h"
+
+static int
+i915_gem_object_swapout_pages(struct drm_i915_gem_object *obj,
+			      struct sg_table *pages, unsigned int sizes)
+{
+	struct drm_i915_private *i915 = to_i915(obj->base.dev);
+	struct drm_i915_gem_object *dst, *src;
+	int err;
+
+	GEM_BUG_ON(obj->swapto);
+	GEM_BUG_ON(i915_gem_object_has_pages(obj));
+	GEM_BUG_ON(obj->mm.madv != I915_MADV_WILLNEED);
+	GEM_BUG_ON(obj->mm.region->type != INTEL_MEMORY_LOCAL);
+
+	assert_object_held(obj);
+
+	/* create a shadow object on smem region */
+	dst = i915_gem_object_create_shmem(i915, obj->base.size);
+	if (IS_ERR(dst))
+		return PTR_ERR(dst);
+
+	/* Share the dma-resv between the shadow- and the parent object */
+	dst->base.resv = obj->base.resv;
+	assert_object_held(dst);
+
+	/*
+	 * create working object on the same region as 'obj',
+	 * if 'obj' is used directly, it is set pages and is pinned
+	 * again, other thread may wrongly use 'obj' pages.
+	 */
+	src = i915_gem_object_create_region(obj->mm.region,
+					    obj->base.size, 0);
+	if (IS_ERR(src)) {
+		i915_gem_object_put(dst);
+		return PTR_ERR(src);
+	}
+
+	/* set and pin working object pages */
+	i915_gem_object_lock_isolated(src);
+	__i915_gem_object_set_pages(src, pages, sizes);
+	__i915_gem_object_pin_pages(src);
+
+	/* copying the pages */
+	err = i915_gem_object_memcpy(dst, src);
+
+	__i915_gem_object_unpin_pages(src);
+	__i915_gem_object_unset_pages(src);
+	i915_gem_object_unlock(src);
+	i915_gem_object_put(src);
+
+	if (!err)
+		obj->swapto = dst;
+	else
+		i915_gem_object_put(dst);
+
+	return err;
+}
+
+static int
+i915_gem_object_swapin_pages(struct drm_i915_gem_object *obj,
+			     struct sg_table *pages, unsigned int sizes)
+{
+	struct drm_i915_gem_object *dst, *src;
+	int err;
+
+	GEM_BUG_ON(!obj->swapto);
+	GEM_BUG_ON(i915_gem_object_has_pages(obj));
+	GEM_BUG_ON(obj->mm.madv != I915_MADV_WILLNEED);
+	GEM_BUG_ON(obj->mm.region->type != INTEL_MEMORY_LOCAL);
+
+	assert_object_held(obj);
+
+	src = obj->swapto;
+
+	/*
+	 * create working object on the same region as 'obj',
+	 * if 'obj' is used directly, it is set pages and is pinned
+	 * again, other thread may wrongly use 'obj' pages.
+	 */
+	dst = i915_gem_object_create_region(obj->mm.region,
+					    obj->base.size, 0);
+	if (IS_ERR(dst)) {
+		err = PTR_ERR(dst);
+		return err;
+	}
+
+	/* @src is sharing @obj's reservation object */
+	assert_object_held(src);
+
+	/* set and pin working object pages */
+	i915_gem_object_lock_isolated(dst);
+	__i915_gem_object_set_pages(dst, pages, sizes);
+	__i915_gem_object_pin_pages(dst);
+
+	/* copying the pages */
+	err = i915_gem_object_memcpy(dst, src);
+
+	__i915_gem_object_unpin_pages(dst);
+	__i915_gem_object_unset_pages(dst);
+	i915_gem_object_unlock(dst);
+	i915_gem_object_put(dst);
+
+	if (!err) {
+		obj->swapto = NULL;
+		i915_gem_object_put(src);
+	}
+
+	return err;
+}
 
 void
 i915_gem_object_put_pages_buddy(struct drm_i915_gem_object *obj,
 				struct sg_table *pages)
 {
+	/* if need to save the page contents, swap them out */
+	if (obj->do_swapping) {
+		unsigned int sizes = obj->mm.page_sizes.phys;
+
+		GEM_BUG_ON(obj->mm.madv != I915_MADV_WILLNEED);
+		GEM_BUG_ON(i915_gem_object_is_volatile(obj));
+
+		if (i915_gem_object_swapout_pages(obj, pages, sizes)) {
+			/* swapout failed, keep the pages */
+			__i915_gem_object_set_pages(obj, pages, sizes);
+			return;
+		}
+	}
+
 	__intel_memory_region_put_pages_buddy(obj->mm.region, &obj->mm.blocks);
 
 	obj->mm.dirty = false;
@@ -95,8 +219,19 @@ i915_gem_object_get_pages_buddy(struct drm_i915_gem_object *obj)
 	sg_mark_end(sg);
 	i915_sg_trim(st);
 
-	/* Intended for kernel internal use only */
-	if (obj->flags & I915_BO_ALLOC_CPU_CLEAR) {
+	/* if we saved the page contents, swap them in */
+	if (obj->swapto) {
+		GEM_BUG_ON(i915_gem_object_is_volatile(obj));
+
+		ret = i915_gem_object_swapin_pages(obj, st,
+						   sg_page_sizes);
+		if (ret) {
+			/* swapin failed, free the pages */
+			__intel_memory_region_put_pages_buddy(mem, blocks);
+			ret = -ENXIO;
+			goto err_free_sg;
+		}
+	} else if (obj->flags & I915_BO_ALLOC_CPU_CLEAR) {
 		struct scatterlist *sg;
 		unsigned long i;
 
diff --git a/drivers/gpu/drm/i915/intel_memory_region.c b/drivers/gpu/drm/i915/intel_memory_region.c
index 185eab497803..afcd6fe6eaff 100644
--- a/drivers/gpu/drm/i915/intel_memory_region.c
+++ b/drivers/gpu/drm/i915/intel_memory_region.c
@@ -147,6 +147,11 @@ static int intel_memory_region_evict(struct intel_memory_region *mem,
 
 		mutex_unlock(&mem->objects.lock);
 
+		/* tell callee to do swapping */
+		if (i915_gem_object_type_has(obj, I915_GEM_OBJECT_HAS_IOMEM)
+		    && pass == 1)
+			obj->do_swapping = true;
+
 		if (!i915_gem_object_unbind(obj, I915_GEM_OBJECT_UNBIND_ACTIVE)) {
 			if (i915_gem_object_trylock(obj)) {
 				__i915_gem_object_put_pages(obj);
@@ -160,6 +165,7 @@ static int intel_memory_region_evict(struct intel_memory_region *mem,
 			}
 		}
 
+		obj->do_swapping = false;
 		i915_gem_object_put(obj);
 		mutex_lock(&mem->objects.lock);
 
-- 
2.26.2

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply related	[flat|nested] 208+ messages in thread

* [RFC PATCH 131/162] drm/i915/dg1: Add enable_eviction modparam
  2020-11-27 12:04 [RFC PATCH 000/162] DG1 + LMEM enabling Matthew Auld
                   ` (129 preceding siblings ...)
  2020-11-27 12:06 ` [RFC PATCH 130/162] drm/i915/dg1: Eviction logic Matthew Auld
@ 2020-11-27 12:06 ` Matthew Auld
  2020-11-30 12:20   ` Jani Nikula
  2020-11-27 12:06 ` [RFC PATCH 132/162] drm/i915/dg1: Add lmem_size modparam Matthew Auld
                   ` (30 subsequent siblings)
  161 siblings, 1 reply; 208+ messages in thread
From: Matthew Auld @ 2020-11-27 12:06 UTC (permalink / raw)
  To: intel-gfx; +Cc: CQ Tang, Sudeep Dutt, dri-devel

From: CQ Tang <cq.tang@intel.com>

The enable_eviction modparam controls whether eviction is enabled; it
is enabled by default.
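
Module parameters are exposed under /sys/module/i915/parameters, so a
minimal (hypothetical) userspace check of the current value could be:

#include <stdio.h>

int main(void)
{
    FILE *f = fopen("/sys/module/i915/parameters/enable_eviction", "r");
    int c;

    if (!f) {
        perror("enable_eviction");
        return 1;
    }
    c = fgetc(f); /* 'Y' or 'N' for a bool parameter */
    printf("enable_eviction: %c\n", c);
    fclose(f);
    return 0;
}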

Signed-off-by: Sudeep Dutt <sudeep.dutt@intel.com>
Signed-off-by: CQ Tang <cq.tang@intel.com>
---
 drivers/gpu/drm/i915/gem/i915_gem_object.c | 1 +
 drivers/gpu/drm/i915/gem/i915_gem_region.c | 5 +++++
 drivers/gpu/drm/i915/i915_params.c         | 3 +++
 drivers/gpu/drm/i915/i915_params.h         | 1 +
 drivers/gpu/drm/i915/intel_memory_region.c | 2 +-
 5 files changed, 11 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object.c b/drivers/gpu/drm/i915/gem/i915_gem_object.c
index 7cb5f137522f..46d0f8731db0 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_object.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_object.c
@@ -293,6 +293,7 @@ static void i915_gem_free_object(struct drm_gem_object *gem_obj)
 	 * If object had been swapped out, free the hidden object.
 	 */
 	if (obj->swapto) {
+		GEM_BUG_ON(!i915->params.enable_eviction);
 		i915_gem_object_put(obj->swapto);
 		obj->swapto = NULL;
 	}
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_region.c b/drivers/gpu/drm/i915/gem/i915_gem_region.c
index a437538cd872..e1793c5f8d8c 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_region.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_region.c
@@ -21,6 +21,7 @@ i915_gem_object_swapout_pages(struct drm_i915_gem_object *obj,
 	GEM_BUG_ON(i915_gem_object_has_pages(obj));
 	GEM_BUG_ON(obj->mm.madv != I915_MADV_WILLNEED);
 	GEM_BUG_ON(obj->mm.region->type != INTEL_MEMORY_LOCAL);
+	GEM_BUG_ON(!i915->params.enable_eviction);
 
 	assert_object_held(obj);
 
@@ -70,6 +71,7 @@ static int
 i915_gem_object_swapin_pages(struct drm_i915_gem_object *obj,
 			     struct sg_table *pages, unsigned int sizes)
 {
+	struct drm_i915_private *i915 = to_i915(obj->base.dev);
 	struct drm_i915_gem_object *dst, *src;
 	int err;
 
@@ -77,6 +79,7 @@ i915_gem_object_swapin_pages(struct drm_i915_gem_object *obj,
 	GEM_BUG_ON(i915_gem_object_has_pages(obj));
 	GEM_BUG_ON(obj->mm.madv != I915_MADV_WILLNEED);
 	GEM_BUG_ON(obj->mm.region->type != INTEL_MEMORY_LOCAL);
+	GEM_BUG_ON(!i915->params.enable_eviction);
 
 	assert_object_held(obj);
 
@@ -146,6 +149,7 @@ i915_gem_object_put_pages_buddy(struct drm_i915_gem_object *obj,
 int
 i915_gem_object_get_pages_buddy(struct drm_i915_gem_object *obj)
 {
+	struct drm_i915_private *i915 = to_i915(obj->base.dev);
 	struct intel_memory_region *mem = obj->mm.region;
 	struct list_head *blocks = &obj->mm.blocks;
 	resource_size_t size = obj->base.size;
@@ -222,6 +226,7 @@ i915_gem_object_get_pages_buddy(struct drm_i915_gem_object *obj)
 	/* if we saved the page contents, swap them in */
 	if (obj->swapto) {
 		GEM_BUG_ON(i915_gem_object_is_volatile(obj));
+		GEM_BUG_ON(!i915->params.enable_eviction);
 
 		ret = i915_gem_object_swapin_pages(obj, st,
 						   sg_page_sizes);
diff --git a/drivers/gpu/drm/i915/i915_params.c b/drivers/gpu/drm/i915/i915_params.c
index 7f139ea4a90b..bb1ebb6ece95 100644
--- a/drivers/gpu/drm/i915/i915_params.c
+++ b/drivers/gpu/drm/i915/i915_params.c
@@ -197,6 +197,9 @@ i915_param_named_unsafe(fake_lmem_start, ulong, 0400,
 	"Fake LMEM start offset (default: 0)");
 #endif
 
+i915_param_named_unsafe(enable_eviction, bool, 0600,
+	"Enable memcpy based eviction which does not rely on DMA resv refactoring)");
+
 static __always_inline void _print_param(struct drm_printer *p,
 					 const char *name,
 					 const char *type,
diff --git a/drivers/gpu/drm/i915/i915_params.h b/drivers/gpu/drm/i915/i915_params.h
index 330c03e2b4f7..87df407d9afb 100644
--- a/drivers/gpu/drm/i915/i915_params.h
+++ b/drivers/gpu/drm/i915/i915_params.h
@@ -72,6 +72,7 @@ struct drm_printer;
 	param(char *, force_probe, CONFIG_DRM_I915_FORCE_PROBE, 0400) \
 	param(unsigned long, fake_lmem_start, 0, 0400) \
 	/* leave bools at the end to not create holes */ \
+	param(bool, enable_eviction, true, 0600) \
 	param(bool, enable_hangcheck, true, 0600) \
 	param(bool, load_detect_test, false, 0600) \
 	param(bool, force_reset_modeset_test, false, 0600) \
diff --git a/drivers/gpu/drm/i915/intel_memory_region.c b/drivers/gpu/drm/i915/intel_memory_region.c
index afcd6fe6eaff..57f01ef16628 100644
--- a/drivers/gpu/drm/i915/intel_memory_region.c
+++ b/drivers/gpu/drm/i915/intel_memory_region.c
@@ -175,7 +175,7 @@ static int intel_memory_region_evict(struct intel_memory_region *mem,
 	list_splice_tail(&still_in_list, *phase);
 	mutex_unlock(&mem->objects.lock);
 
-	if (found < target) {
+	if (found < target && i915->params.enable_eviction) {
 		pass++;
 		phase++;
 		if (*phase)
-- 
2.26.2

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply related	[flat|nested] 208+ messages in thread

* [RFC PATCH 132/162] drm/i915/dg1: Add lmem_size modparam
  2020-11-27 12:04 [RFC PATCH 000/162] DG1 + LMEM enabling Matthew Auld
                   ` (130 preceding siblings ...)
  2020-11-27 12:06 ` [RFC PATCH 131/162] drm/i915/dg1: Add enable_eviction modparam Matthew Auld
@ 2020-11-27 12:06 ` Matthew Auld
  2020-11-27 12:06 ` [RFC PATCH 133/162] drm/i915/dg1: Track swap in/out stats via debugfs Matthew Auld
                   ` (29 subsequent siblings)
  161 siblings, 0 replies; 208+ messages in thread
From: Matthew Auld @ 2020-11-27 12:06 UTC (permalink / raw)
  To: intel-gfx; +Cc: CQ Tang, dri-devel

From: CQ Tang <cq.tang@intel.com>

lmem_size is used to limit the amount of local memory the driver uses.
The default is to use all of the lmem the hardware makes available;
when set, this modparam gives the limit in MB.
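
The clamp itself is simple arithmetic; a standalone sketch of the size
selection, with min_t() replaced by a plain helper and 0 meaning "use
all hardware lmem", is:

#include <stdio.h>
#include <stdint.h>

#define SZ_1M (1024ULL * 1024ULL)

static uint64_t min_u64(uint64_t a, uint64_t b) { return a < b ? a : b; }

/* param_mb == 0 means "use all hardware-visible lmem". */
static uint64_t effective_lmem_size(uint64_t hw_lmem_size,
                                    unsigned int param_mb)
{
    if (param_mb > 0)
        return min_u64(hw_lmem_size, (uint64_t)param_mb * SZ_1M);
    return hw_lmem_size;
}

int main(void)
{
    /* 8 GiB of lmem limited to 256 MiB by the modparam. */
    printf("%llu\n", (unsigned long long)
           effective_lmem_size(8ULL << 30, 256));
    return 0;
}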

Signed-off-by: CQ Tang <cq.tang@intel.com>
---
 drivers/gpu/drm/i915/i915_params.c       | 3 +++
 drivers/gpu/drm/i915/i915_params.h       | 1 +
 drivers/gpu/drm/i915/intel_region_lmem.c | 4 ++++
 3 files changed, 8 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_params.c b/drivers/gpu/drm/i915/i915_params.c
index bb1ebb6ece95..264de32f3d6a 100644
--- a/drivers/gpu/drm/i915/i915_params.c
+++ b/drivers/gpu/drm/i915/i915_params.c
@@ -200,6 +200,9 @@ i915_param_named_unsafe(fake_lmem_start, ulong, 0400,
 i915_param_named_unsafe(enable_eviction, bool, 0600,
 	"Enable memcpy based eviction which does not rely on DMA resv refactoring)");
 
+i915_param_named_unsafe(lmem_size, uint, 0400,
+	"Change lmem size for each region. (default: 0, all memory)");
+
 static __always_inline void _print_param(struct drm_printer *p,
 					 const char *name,
 					 const char *type,
diff --git a/drivers/gpu/drm/i915/i915_params.h b/drivers/gpu/drm/i915/i915_params.h
index 87df407d9afb..be6979e7feda 100644
--- a/drivers/gpu/drm/i915/i915_params.h
+++ b/drivers/gpu/drm/i915/i915_params.h
@@ -71,6 +71,7 @@ struct drm_printer;
 	param(int, enable_dpcd_backlight, -1, 0600) \
 	param(char *, force_probe, CONFIG_DRM_I915_FORCE_PROBE, 0400) \
 	param(unsigned long, fake_lmem_start, 0, 0400) \
+	param(unsigned int, lmem_size, 0, 0400) \
 	/* leave bools at the end to not create holes */ \
 	param(bool, enable_eviction, true, 0600) \
 	param(bool, enable_hangcheck, true, 0600) \
diff --git a/drivers/gpu/drm/i915/intel_region_lmem.c b/drivers/gpu/drm/i915/intel_region_lmem.c
index eafef7034680..1cdb6354b968 100644
--- a/drivers/gpu/drm/i915/intel_region_lmem.c
+++ b/drivers/gpu/drm/i915/intel_region_lmem.c
@@ -196,6 +196,10 @@ setup_lmem(struct drm_i915_private *dev_priv)
 
 	io_start = pci_resource_start(pdev, 2);
 
+	if (dev_priv->params.lmem_size > 0)
+		lmem_size = min_t(resource_size_t, lmem_size,
+				  mul_u32_u32(dev_priv->params.lmem_size, SZ_1M));
+
 	mem = intel_memory_region_create(dev_priv,
 					 0,
 					 lmem_size,
-- 
2.26.2

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply related	[flat|nested] 208+ messages in thread

* [RFC PATCH 133/162] drm/i915/dg1: Track swap in/out stats via debugfs
  2020-11-27 12:04 [RFC PATCH 000/162] DG1 + LMEM enabling Matthew Auld
                   ` (131 preceding siblings ...)
  2020-11-27 12:06 ` [RFC PATCH 132/162] drm/i915/dg1: Add lmem_size modparam Matthew Auld
@ 2020-11-27 12:06 ` Matthew Auld
  2020-11-27 14:09   ` [Intel-gfx] " Chris Wilson
  2020-11-27 12:06 ` [RFC PATCH 134/162] drm/i915/dg1: Measure swap in/out timing stats Matthew Auld
                   ` (28 subsequent siblings)
  161 siblings, 1 reply; 208+ messages in thread
From: Matthew Auld @ 2020-11-27 12:06 UTC (permalink / raw)
  To: intel-gfx; +Cc: Sudeep Dutt, dri-devel

From: Sudeep Dutt <sudeep.dutt@intel.com>

cat /sys/kernel/debug/dri/0/i915_gem_objects
num_bytes_swapped_out 94170000 num_bytes_swapped_in 56120000
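
A tool scraping these counters only needs to match the printed format
above; for example:

#include <stdio.h>

int main(void)
{
    const char *line =
        "num_bytes_swapped_out 94170000 num_bytes_swapped_in 56120000";
    long out_bytes, in_bytes;

    if (sscanf(line, "num_bytes_swapped_out %ld num_bytes_swapped_in %ld",
               &out_bytes, &in_bytes) == 2)
        printf("out=%ld in=%ld\n", out_bytes, in_bytes);
    return 0;
}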

Signed-off-by: Sudeep Dutt <sudeep.dutt@intel.com>
---
 drivers/gpu/drm/i915/gem/i915_gem_region.c | 6 ++++++
 drivers/gpu/drm/i915/i915_debugfs.c        | 3 +++
 drivers/gpu/drm/i915/i915_drv.h            | 3 +++
 3 files changed, 12 insertions(+)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_region.c b/drivers/gpu/drm/i915/gem/i915_gem_region.c
index e1793c5f8d8c..ed108dbcb34e 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_region.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_region.c
@@ -64,6 +64,9 @@ i915_gem_object_swapout_pages(struct drm_i915_gem_object *obj,
 	else
 		i915_gem_object_put(dst);
 
+	if (!err)
+		atomic_long_add(sizes, &i915->num_bytes_swapped_out);
+
 	return err;
 }
 
@@ -118,6 +121,9 @@ i915_gem_object_swapin_pages(struct drm_i915_gem_object *obj,
 		i915_gem_object_put(src);
 	}
 
+	if (!err)
+		atomic_long_add(sizes, &i915->num_bytes_swapped_in);
+
 	return err;
 }
 
diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
index 6d1482c82694..1b7e9b6ab660 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -372,6 +372,9 @@ static int i915_gem_object_info(struct seq_file *m, void *data)
 	for_each_memory_region(mr, i915, id)
 		seq_printf(m, "%s: total:%pa, available:%pa bytes\n",
 			   mr->name, &mr->total, &mr->avail);
+	seq_printf(m, "num_bytes_swapped_out %ld num_bytes_swapped_in %ld\n",
+		   atomic_long_read(&i915->num_bytes_swapped_out),
+		   atomic_long_read(&i915->num_bytes_swapped_in));
 	seq_putc(m, '\n');
 
 	print_context_stats(m, i915);
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 1366b53ac8c9..7b1e95d494e6 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -1214,6 +1214,9 @@ struct drm_i915_private {
 	 * NOTE: This is the dri1/ums dungeon, don't add stuff here. Your patch
 	 * will be rejected. Instead look for a better place.
 	 */
+
+	atomic_long_t num_bytes_swapped_out;
+	atomic_long_t num_bytes_swapped_in;
 };
 
 static inline struct drm_i915_private *to_i915(const struct drm_device *dev)
-- 
2.26.2

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply related	[flat|nested] 208+ messages in thread

* [RFC PATCH 134/162] drm/i915/dg1: Measure swap in/out timing stats
  2020-11-27 12:04 [RFC PATCH 000/162] DG1 + LMEM enabling Matthew Auld
                   ` (132 preceding siblings ...)
  2020-11-27 12:06 ` [RFC PATCH 133/162] drm/i915/dg1: Track swap in/out stats via debugfs Matthew Auld
@ 2020-11-27 12:06 ` Matthew Auld
  2020-11-27 14:11   ` [Intel-gfx] " Chris Wilson
  2020-11-27 12:06 ` [RFC PATCH 135/162] drm/i915: define intel_partial_pages_for_sg_table Matthew Auld
                   ` (27 subsequent siblings)
  161 siblings, 1 reply; 208+ messages in thread
From: Matthew Auld @ 2020-11-27 12:06 UTC (permalink / raw)
  To: intel-gfx; +Cc: Sudeep Dutt, dri-devel

From: Sudeep Dutt <sudeep.dutt@intel.com>

Signed-off-by: Sudeep Dutt <sudeep.dutt@intel.com>
---
 drivers/gpu/drm/i915/gem/i915_gem_region.c | 16 ++++++++++++++--
 drivers/gpu/drm/i915/i915_debugfs.c        |  3 +++
 drivers/gpu/drm/i915/i915_drv.h            |  2 ++
 3 files changed, 19 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_region.c b/drivers/gpu/drm/i915/gem/i915_gem_region.c
index ed108dbcb34e..4fab9f6b4bee 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_region.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_region.c
@@ -15,6 +15,7 @@ i915_gem_object_swapout_pages(struct drm_i915_gem_object *obj,
 {
 	struct drm_i915_private *i915 = to_i915(obj->base.dev);
 	struct drm_i915_gem_object *dst, *src;
+	unsigned long start, diff, msec;
 	int err;
 
 	GEM_BUG_ON(obj->swapto);
@@ -24,6 +25,7 @@ i915_gem_object_swapout_pages(struct drm_i915_gem_object *obj,
 	GEM_BUG_ON(!i915->params.enable_eviction);
 
 	assert_object_held(obj);
+	start = jiffies;
 
 	/* create a shadow object on smem region */
 	dst = i915_gem_object_create_shmem(i915, obj->base.size);
@@ -64,8 +66,12 @@ i915_gem_object_swapout_pages(struct drm_i915_gem_object *obj,
 	else
 		i915_gem_object_put(dst);
 
-	if (!err)
+	if (!err) {
+		diff = jiffies - start;
+		msec = diff * 1000 / HZ;
+		atomic_long_add(msec, &i915->time_swap_out_ms);
 		atomic_long_add(sizes, &i915->num_bytes_swapped_out);
+	}
 
 	return err;
 }
@@ -76,6 +82,7 @@ i915_gem_object_swapin_pages(struct drm_i915_gem_object *obj,
 {
 	struct drm_i915_private *i915 = to_i915(obj->base.dev);
 	struct drm_i915_gem_object *dst, *src;
+	unsigned long start, diff, msec;
 	int err;
 
 	GEM_BUG_ON(!obj->swapto);
@@ -85,6 +92,7 @@ i915_gem_object_swapin_pages(struct drm_i915_gem_object *obj,
 	GEM_BUG_ON(!i915->params.enable_eviction);
 
 	assert_object_held(obj);
+	start = jiffies;
 
 	src = obj->swapto;
 
@@ -121,8 +129,12 @@ i915_gem_object_swapin_pages(struct drm_i915_gem_object *obj,
 		i915_gem_object_put(src);
 	}
 
-	if (!err)
+	if (!err) {
+		diff = jiffies - start;
+		msec = diff * 1000 / HZ;
+		atomic_long_add(msec, &i915->time_swap_in_ms);
 		atomic_long_add(sizes, &i915->num_bytes_swapped_in);
+	}
 
 	return err;
 }
diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
index 1b7e9b6ab660..2bf51dd9de7c 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -375,6 +375,9 @@ static int i915_gem_object_info(struct seq_file *m, void *data)
 	seq_printf(m, "num_bytes_swapped_out %ld num_bytes_swapped_in %ld\n",
 		   atomic_long_read(&i915->num_bytes_swapped_out),
 		   atomic_long_read(&i915->num_bytes_swapped_in));
+	seq_printf(m, "time_swap_out_msec %ld time_swap_in_msec %ld\n",
+		   atomic_long_read(&i915->time_swap_out_ms),
+		   atomic_long_read(&i915->time_swap_in_ms));
 	seq_putc(m, '\n');
 
 	print_context_stats(m, i915);
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 7b1e95d494e6..10823abab224 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -1217,6 +1217,8 @@ struct drm_i915_private {
 
 	atomic_long_t num_bytes_swapped_out;
 	atomic_long_t num_bytes_swapped_in;
+	atomic_long_t time_swap_out_ms;
+	atomic_long_t time_swap_in_ms;
 };
 
 static inline struct drm_i915_private *to_i915(const struct drm_device *dev)
-- 
2.26.2

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply related	[flat|nested] 208+ messages in thread

* [RFC PATCH 135/162] drm/i915: define intel_partial_pages_for_sg_table
  2020-11-27 12:04 [RFC PATCH 000/162] DG1 + LMEM enabling Matthew Auld
                   ` (133 preceding siblings ...)
  2020-11-27 12:06 ` [RFC PATCH 134/162] drm/i915/dg1: Measure swap in/out timing stats Matthew Auld
@ 2020-11-27 12:06 ` Matthew Auld
  2020-11-27 12:06 ` [RFC PATCH 136/162] drm/i915: create and destroy dummy vma Matthew Auld
                   ` (26 subsequent siblings)
  161 siblings, 0 replies; 208+ messages in thread
From: Matthew Auld @ 2020-11-27 12:06 UTC (permalink / raw)
  To: intel-gfx; +Cc: CQ Tang, dri-devel

From: Ramalingam C <ramalingam.c@intel.com>

Add a function to retrieve partial pages from an object, starting at
the given page offset. It is factored out of intel_partial_pages() so
it can also be used by the window blt copy feature introduced in
forthcoming patches.

The function takes the sg_table to be filled in with pages and also
passes back a pointer to the last scatterlist entry used, so the
caller can trim or re-terminate the table as needed.
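
Conceptually the helper walks the source scatterlist from the
requested page offset and emits coalesced runs until the page count is
exhausted. A standalone model over plain arrays (not the real sg API)
is:

#include <stdio.h>

struct run { int seg, first_page, npages; };

/* segs[i] = length of source segment i, in pages. Fill 'out' with up
 * to max_runs runs covering 'count' pages starting at page 'offset'. */
static int partial_runs(const int *segs, int nsegs, int offset, int count,
                        struct run *out, int max_runs)
{
    int n = 0;
    int i;

    for (i = 0; i < nsegs && count > 0 && n < max_runs; i++) {
        int len;

        if (offset >= segs[i]) {    /* skip whole segments before start */
            offset -= segs[i];
            continue;
        }
        len = segs[i] - offset;
        if (len > count)
            len = count;
        out[n].seg = i;
        out[n].first_page = offset;
        out[n].npages = len;
        n++;
        count -= len;
        offset = 0;                 /* later segments start at page 0 */
    }
    return n;                       /* number of runs emitted */
}

int main(void)
{
    int segs[] = { 4, 8, 2 };       /* a 14-page object in 3 segments */
    struct run out[4];
    int i, n = partial_runs(segs, 3, 3, 6, out, 4);

    for (i = 0; i < n; i++)
        printf("seg %d +%d x%d pages\n",
               out[i].seg, out[i].first_page, out[i].npages);
    return 0;
}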

Signed-off-by: Ramalingam C <ramalingam.c@intel.com>
Cc: Matthew Auld <matthew.auld@intel.com>
Cc: CQ Tang <cq.tang@intel.com>
---
 drivers/gpu/drm/i915/gt/intel_ggtt.c | 59 +++++++++++++++++-----------
 drivers/gpu/drm/i915/gt/intel_gtt.h  |  4 ++
 2 files changed, 40 insertions(+), 23 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_ggtt.c b/drivers/gpu/drm/i915/gt/intel_ggtt.c
index eed5b640e493..21804c4cef9c 100644
--- a/drivers/gpu/drm/i915/gt/intel_ggtt.c
+++ b/drivers/gpu/drm/i915/gt/intel_ggtt.c
@@ -1383,25 +1383,17 @@ intel_remap_pages(struct intel_remapped_info *rem_info,
 	return ERR_PTR(ret);
 }
 
-static noinline struct sg_table *
-intel_partial_pages(const struct i915_ggtt_view *view,
-		    struct drm_i915_gem_object *obj)
+void intel_partial_pages_for_sg_table(struct drm_i915_gem_object *obj,
+				      struct sg_table *st,
+				      u32 obj_offset, u32 page_count,
+				      struct scatterlist **sgl)
 {
-	struct sg_table *st;
 	struct scatterlist *sg, *iter;
-	unsigned int count = view->partial.size;
 	unsigned int offset;
-	int ret = -ENOMEM;
 
-	st = kmalloc(sizeof(*st), GFP_KERNEL);
-	if (!st)
-		goto err_st_alloc;
+	GEM_BUG_ON(!st);
 
-	ret = sg_alloc_table(st, count, GFP_KERNEL);
-	if (ret)
-		goto err_sg_alloc;
-
-	iter = i915_gem_object_get_sg_dma(obj, view->partial.offset, &offset, true);
+	iter = i915_gem_object_get_sg_dma(obj, obj_offset, &offset, true);
 	GEM_BUG_ON(!iter);
 
 	sg = st->sgl;
@@ -1410,30 +1402,51 @@ intel_partial_pages(const struct i915_ggtt_view *view,
 		unsigned int len;
 
 		len = min(sg_dma_len(iter) - (offset << PAGE_SHIFT),
-			  count << PAGE_SHIFT);
+			  page_count << PAGE_SHIFT);
+
 		sg_set_page(sg, NULL, len, 0);
 		sg_dma_address(sg) =
 			sg_dma_address(iter) + (offset << PAGE_SHIFT);
 		sg_dma_len(sg) = len;
 
 		st->nents++;
-		count -= len >> PAGE_SHIFT;
-		if (count == 0) {
+		page_count -= len >> PAGE_SHIFT;
+		if (page_count == 0) {
 			sg_mark_end(sg);
-			i915_sg_trim(st); /* Drop any unused tail entries. */
+			if (sgl)
+				*sgl = sg;
 
-			return st;
+			return;
 		}
 
 		sg = __sg_next(sg);
 		iter = __sg_next(iter);
 		offset = 0;
 	} while (1);
+}
 
-err_sg_alloc:
-	kfree(st);
-err_st_alloc:
-	return ERR_PTR(ret);
+static noinline struct sg_table *
+intel_partial_pages(const struct i915_ggtt_view *view,
+		    struct drm_i915_gem_object *obj)
+{
+	struct sg_table *st;
+	int ret;
+
+	st = kmalloc(sizeof(*st), GFP_KERNEL);
+	if (!st)
+		return ERR_PTR(-ENOMEM);
+
+	ret = sg_alloc_table(st, view->partial.size, GFP_KERNEL);
+	if (ret) {
+		kfree(st);
+		return ERR_PTR(ret);
+	}
+
+	intel_partial_pages_for_sg_table(obj, st, view->partial.offset,
+					 view->partial.size, NULL);
+	i915_sg_trim(st);
+
+	return st;
 }
 
 static int
diff --git a/drivers/gpu/drm/i915/gt/intel_gtt.h b/drivers/gpu/drm/i915/gt/intel_gtt.h
index db3626c0ee20..37d2c692c0af 100644
--- a/drivers/gpu/drm/i915/gt/intel_gtt.h
+++ b/drivers/gpu/drm/i915/gt/intel_gtt.h
@@ -506,6 +506,10 @@ static inline bool i915_ggtt_has_aperture(const struct i915_ggtt *ggtt)
 	return ggtt->mappable_end > 0;
 }
 
+void intel_partial_pages_for_sg_table(struct drm_i915_gem_object *obj,
+				      struct sg_table *st,
+				      u32 obj_offset, u32 page_count,
+				      struct scatterlist **sgl);
 int i915_ppgtt_init_hw(struct intel_gt *gt);
 
 struct i915_ppgtt *i915_ppgtt_create(struct intel_gt *gt);
-- 
2.26.2

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply related	[flat|nested] 208+ messages in thread

* [RFC PATCH 136/162] drm/i915: create and destroy dummy vma
  2020-11-27 12:04 [RFC PATCH 000/162] DG1 + LMEM enabling Matthew Auld
                   ` (134 preceding siblings ...)
  2020-11-27 12:06 ` [RFC PATCH 135/162] drm/i915: define intel_partial_pages_for_sg_table Matthew Auld
@ 2020-11-27 12:06 ` Matthew Auld
  2020-11-27 12:06 ` [RFC PATCH 137/162] drm/i915: blt copy between objs using pre-created vma windows Matthew Auld
                   ` (25 subsequent siblings)
  161 siblings, 0 replies; 208+ messages in thread
From: Matthew Auld @ 2020-11-27 12:06 UTC (permalink / raw)
  To: intel-gfx; +Cc: CQ Tang, dri-devel

From: Ramalingam C <ramalingam.c@intel.com>

Define functions for window_blt_copy to create and destroy dummy vmas,
which do not have any associated objects.

The window_blt_copy feature associates these dummy vmas with a set of
pages at runtime, creates the PTEs, and submits the copy to the
blitter.
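
In outline, the pair of helpers builds and tears down a vma shell with
no backing object; a sketch with stand-in types (not the driver's
struct i915_vma) is:

#include <stdlib.h>

struct window_vma {
    void *vm;     /* address space the window lives in */
    void *obj;    /* always NULL: no backing object */
    unsigned long size;
    unsigned long min_page_size;
};

static struct window_vma *alloc_window_vma(void *vm, unsigned long size,
                                           unsigned long min_page_size)
{
    struct window_vma *vma = calloc(1, sizeof(*vma));

    if (!vma)
        return NULL;
    vma->vm = vm;    /* models i915_vm_get(vm) */
    vma->obj = NULL; /* the defining property of a dummy vma */
    vma->size = size;
    vma->min_page_size = min_page_size;
    return vma;
}

static void destroy_window_vma(struct window_vma *vma)
{
    /* models i915_vm_put() + i915_vma_free() */
    free(vma);
}

int main(void)
{
    /* (void *)1 is a fake vm handle for the sketch only */
    struct window_vma *vma = alloc_window_vma((void *)1, 4ul << 20, 4096);

    destroy_window_vma(vma);
    return 0;
}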

Signed-off-by: Ramalingam C <ramalingam.c@intel.com>
Cc: Matthew Auld <matthew.auld@intel.com>
Cc: CQ Tang <cq.tang@intel.com>
---
 drivers/gpu/drm/i915/i915_vma.c | 38 +++++++++++++++++++++++++++++++++
 drivers/gpu/drm/i915/i915_vma.h |  6 ++++++
 2 files changed, 44 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_vma.c b/drivers/gpu/drm/i915/i915_vma.c
index 59fe82af48b2..5537950e310f 100644
--- a/drivers/gpu/drm/i915/i915_vma.c
+++ b/drivers/gpu/drm/i915/i915_vma.c
@@ -100,6 +100,44 @@ static void __i915_vma_retire(struct i915_active *ref)
 	i915_vma_put(active_to_vma(ref));
 }
 
+struct i915_vma *
+i915_alloc_window_vma(struct drm_i915_private *i915,
+		      struct i915_address_space *vm, u64 size,
+		      u64 min_page_size)
+{
+	struct i915_vma *vma;
+
+	vma = i915_vma_alloc();
+	if (!vma)
+		return ERR_PTR(-ENOMEM);
+
+	kref_init(&vma->ref);
+	mutex_init(&vma->pages_mutex);
+	vma->vm = i915_vm_get(vm);
+	vma->ops = &vm->vma_ops;
+	vma->obj = NULL;
+	vma->resv = NULL;
+	vma->size = size;
+	vma->display_alignment = I915_GTT_MIN_ALIGNMENT;
+	vma->page_sizes.sg = min_page_size;
+
+	i915_active_init(&vma->active, __i915_vma_active, __i915_vma_retire);
+	INIT_LIST_HEAD(&vma->closed_link);
+
+	GEM_BUG_ON(!IS_ALIGNED(vma->size, I915_GTT_PAGE_SIZE));
+	GEM_BUG_ON(i915_is_ggtt(vm));
+
+	return vma;
+}
+
+void i915_destroy_window_vma(struct i915_vma *vma)
+{
+	i915_active_fini(&vma->active);
+	i915_vm_put(vma->vm);
+	mutex_destroy(&vma->pages_mutex);
+	i915_vma_free(vma);
+}
+
 static struct i915_vma *
 vma_create(struct drm_i915_gem_object *obj,
 	   struct i915_address_space *vm,
diff --git a/drivers/gpu/drm/i915/i915_vma.h b/drivers/gpu/drm/i915/i915_vma.h
index 2db4f25b8d5f..f595fe706010 100644
--- a/drivers/gpu/drm/i915/i915_vma.h
+++ b/drivers/gpu/drm/i915/i915_vma.h
@@ -44,6 +44,12 @@ i915_vma_instance(struct drm_i915_gem_object *obj,
 		  struct i915_address_space *vm,
 		  const struct i915_ggtt_view *view);
 
+struct i915_vma *
+i915_alloc_window_vma(struct drm_i915_private *i915,
+		      struct i915_address_space *vm, u64 size,
+		      u64 min_page_size);
+void i915_destroy_window_vma(struct i915_vma *vma);
+
 void i915_vma_unpin_and_release(struct i915_vma **p_vma, unsigned int flags);
 #define I915_VMA_RELEASE_MAP BIT(0)
 
-- 
2.26.2

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply related	[flat|nested] 208+ messages in thread

* [RFC PATCH 137/162] drm/i915: blt copy between objs using pre-created vma windows
  2020-11-27 12:04 [RFC PATCH 000/162] DG1 + LMEM enabling Matthew Auld
                   ` (135 preceding siblings ...)
  2020-11-27 12:06 ` [RFC PATCH 136/162] drm/i915: create and destroy dummy vma Matthew Auld
@ 2020-11-27 12:06 ` Matthew Auld
  2020-11-27 14:19   ` [Intel-gfx] " Chris Wilson
  2020-11-27 12:06 ` [RFC PATCH 138/162] drm/i915/dg1: Eliminate eviction mutex Matthew Auld
                   ` (24 subsequent siblings)
  161 siblings, 1 reply; 208+ messages in thread
From: Matthew Auld @ 2020-11-27 12:06 UTC (permalink / raw)
  To: intel-gfx; +Cc: Jani Nikula, dri-devel, CQ Tang, Daniel Vetter

From: Ramalingam C <ramalingam.c@intel.com>

To avoid the locking issues in vma handling during the blt copy of
objects, dummy vmas with sg_tables of size BLT_WINDOW_SZ (holding
BLT_WINDOW_SZ / PAGE_SIZE scatterlist entries) are created at driver
load for the source and destination objects.

Two sets of these vmas are created: one set for lmem and another for
smem, so that the contents of objects can be blt copied whether they
belong to the same memory region or to different ones.

When a blitter copy is required between objects, pages totalling at
most BLT_WINDOW_SZ are assigned to the dummy vma->pages of both the
source and destination objects in their corresponding windows.

PTEs are then created at runtime for the attached pages, batch
commands for the pages in the window are emitted into the BCS ring,
and the request is submitted to BCS0.

The above process runs in a loop until all pages of the source object
have been copied into the destination object's pages.
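
Stripped of the GPU submission details, this is a classic
bounded-window copy; a userspace model (window and sizes arbitrary,
one memcpy standing in for one blitter request) is:

#include <string.h>

#define WINDOW_SZ 8 /* stands in for BLT_WINDOW_SZ */

/* Copy src into dst one window at a time, as the blitter loop does. */
static void window_copy(char *dst, const char *src, size_t size)
{
    size_t copied = 0;

    while (copied < size) {
        size_t cur = size - copied;

        if (cur > WINDOW_SZ)
            cur = WINDOW_SZ; /* bind at most one window of pages */
        memcpy(dst + copied, src + copied, cur); /* models one request */
        copied += cur;
    }
}

int main(void)
{
    char src[20], dst[20];

    memset(src, 'x', sizeof(src));
    window_copy(dst, src, sizeof(src));
    return memcmp(dst, src, sizeof(dst)) ? 1 : 0;
}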

Signed-off-by: Ramalingam C <ramalingam.c@intel.com>
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Suggested-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Matthew Auld <matthew.auld@intel.com>
Cc: CQ Tang <cq.tang@intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
---
 drivers/gpu/drm/i915/gem/i915_gem_object.c | 355 +++++++++++++++++++++
 drivers/gpu/drm/i915/gem/i915_gem_object.h |   5 +
 drivers/gpu/drm/i915/i915_drv.c            |  11 +
 drivers/gpu/drm/i915/i915_drv.h            |   6 +
 4 files changed, 377 insertions(+)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object.c b/drivers/gpu/drm/i915/gem/i915_gem_object.c
index 46d0f8731db0..3943a184fbe3 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_object.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_object.c
@@ -22,11 +22,13 @@
  *
  */
 
+#include <drm/drm_print.h>
 #include <linux/sched/mm.h>
 
 #include "display/intel_frontbuffer.h"
 #include "gt/intel_gt.h"
 #include "gt/intel_gt_requests.h"
+#include "gt/intel_ring.h"
 #include "i915_drv.h"
 #include "i915_gem_clflush.h"
 #include "i915_gem_context.h"
@@ -718,6 +720,359 @@ static const struct drm_gem_object_funcs i915_gem_object_funcs = {
 	.export = i915_gem_prime_export,
 };
 
+#define BLT_WINDOW_SZ SZ_4M
+static int i915_alloc_vm_range(struct i915_vma *vma)
+{
+	struct i915_vm_pt_stash stash = {};
+	int err;
+	struct i915_gem_ww_ctx ww;
+
+	err = i915_vm_alloc_pt_stash(vma->vm, &stash, vma->size);
+	if (err)
+		return err;
+
+	for_i915_gem_ww(&ww, err, false) {
+		err = i915_vm_lock_objects(vma->vm, &ww);
+		if (err)
+			continue;
+
+		dma_resv_assert_held(&vma->vm->resv);
+
+		err = i915_vm_map_pt_stash(vma->vm, &stash);
+		if (err)
+			continue;
+
+		vma->vm->allocate_va_range(vma->vm, &stash,
+					   vma->node.start, vma->size);
+
+		set_bit(I915_VMA_ALLOC_BIT, __i915_vma_flags(vma));
+		/* Implicit unlock */
+	}
+
+	i915_vm_free_pt_stash(vma->vm, &stash);
+
+	return err;
+}
+
+static inline void i915_insert_vma_pages(struct i915_vma *vma, bool is_lmem)
+{
+	enum i915_cache_level cache_level = I915_CACHE_NONE;
+
+	vma->vm->insert_entries(vma->vm, vma, cache_level,
+				is_lmem ? PTE_LM : 0);
+	wmb();
+}
+
+static struct i915_vma *
+i915_window_vma_init(struct drm_i915_private *i915,
+		     struct intel_memory_region *mem)
+{
+	struct intel_context *ce = i915->gt.engine[BCS0]->blitter_context;
+	struct i915_address_space *vm = ce->vm;
+	struct i915_vma *vma;
+	int ret;
+
+	vma = i915_alloc_window_vma(i915, vm, BLT_WINDOW_SZ,
+				    mem->min_page_size);
+	if (IS_ERR(vma)) {
+		DRM_ERROR("window vma alloc failed(%ld)\n", PTR_ERR(vma));
+		return vma;
+	}
+
+	vma->pages = kmalloc(sizeof(*vma->pages), GFP_KERNEL);
+	if (!vma->pages) {
+		ret = -ENOMEM;
+		DRM_ERROR("page alloc failed. %d", ret);
+		goto err_page;
+	}
+
+	ret = sg_alloc_table(vma->pages, BLT_WINDOW_SZ / PAGE_SIZE,
+			     GFP_KERNEL);
+	if (ret) {
+		DRM_ERROR("sg alloc table failed(%d)", ret);
+		goto err_sg_table;
+	}
+
+	mutex_lock(&vm->mutex);
+	ret = drm_mm_insert_node_in_range(&vm->mm, &vma->node,
+					  BLT_WINDOW_SZ, BLT_WINDOW_SZ,
+					  I915_COLOR_UNEVICTABLE,
+					  0, vm->total,
+					  DRM_MM_INSERT_LOW);
+	mutex_unlock(&vm->mutex);
+	if (ret) {
+		DRM_ERROR("drm_mm_insert_node_in_range failed. %d\n", ret);
+		goto err_mm_node;
+	}
+
+	ret = i915_alloc_vm_range(vma);
+	if (ret) {
+		DRM_ERROR("src: Page table alloc failed(%d)\n", ret);
+		goto err_alloc;
+	}
+
+	return vma;
+
+err_alloc:
+	mutex_lock(&vm->mutex);
+	drm_mm_remove_node(&vma->node);
+	mutex_unlock(&vm->mutex);
+err_mm_node:
+	sg_free_table(vma->pages);
+err_sg_table:
+	kfree(vma->pages);
+err_page:
+	i915_destroy_window_vma(vma);
+
+	return ERR_PTR(ret);
+}
+
+static void i915_window_vma_teardown(struct i915_vma *vma)
+{
+	vma->vm->clear_range(vma->vm, vma->node.start, vma->size);
+	drm_mm_remove_node(&vma->node);
+	sg_free_table(vma->pages);
+	kfree(vma->pages);
+	i915_destroy_window_vma(vma);
+}
+
+int i915_setup_blt_windows(struct drm_i915_private *i915)
+{
+	struct intel_memory_region *lmem_region =
+		intel_memory_region_by_type(i915, INTEL_MEMORY_LOCAL);
+	struct intel_memory_region *smem_region =
+		intel_memory_region_by_type(i915, INTEL_MEMORY_SYSTEM);
+	struct i915_vma *lmem[2];
+	struct i915_vma *smem[2];
+	int ret, i;
+
+	if (intel_gt_is_wedged(&i915->gt)) {
+		drm_dbg(&i915->drm, "GT0 is wedged; BCS0 not available\n");
+		return -EIO;
+	}
+
+	if (!i915->gt.engine[BCS0]) {
+		DRM_DEBUG("No BCS0 engine, hence blt evict is not setup\n");
+		return 0;
+	}
+
+	mutex_init(&i915->mm.window_mutex);
+	for (i = 0; i < ARRAY_SIZE(lmem); i++) {
+		lmem[i] = i915_window_vma_init(i915, lmem_region);
+		if (IS_ERR_OR_NULL(lmem[i])) {
+			ret = PTR_ERR(lmem[i]);
+			DRM_ERROR("Err for lmem[%d]. %d\n", i, ret);
+			if (i--)
+				for (; i >= 0; i--)
+					i915_window_vma_teardown(lmem[i]);
+			return ret;
+		}
+		i915->mm.lmem_window[i] = lmem[i];
+		GEM_BUG_ON(!i915->mm.lmem_window[i]);
+	}
+
+	for (i = 0; i < ARRAY_SIZE(smem); i++) {
+		smem[i] = i915_window_vma_init(i915, smem_region);
+		if (IS_ERR_OR_NULL(smem[i])) {
+			ret = PTR_ERR(smem[i]);
+			DRM_ERROR("Err for smem[%d]. %d\n", i, ret);
+			if (i--)
+				for (; i >= 0; i--)
+					i915_window_vma_teardown(smem[i]);
+			for (i = 0; i < ARRAY_SIZE(lmem); i++)
+				i915_window_vma_teardown(lmem[i]);
+			return ret;
+		}
+		i915->mm.smem_window[i] = smem[i];
+		GEM_BUG_ON(!i915->mm.smem_window[i]);
+	}
+
+	return 0;
+}
+
+void i915_teardown_blt_windows(struct drm_i915_private *i915)
+{
+	int i;
+
+	for (i = 0; i < ARRAY_SIZE(i915->mm.lmem_window); i++) {
+		if (!i915->mm.lmem_window[i])
+			continue;
+		i915_window_vma_teardown(i915->mm.lmem_window[i]);
+	}
+	for (i = 0; i < ARRAY_SIZE(i915->mm.smem_window); i++) {
+		if (!i915->mm.smem_window[i])
+			continue;
+		i915_window_vma_teardown(i915->mm.smem_window[i]);
+	}
+	mutex_destroy(&i915->mm.window_mutex);
+}
+
+static int i915_window_blt_copy_prepare_obj(struct drm_i915_gem_object *obj)
+{
+	int ret;
+
+	ret = i915_gem_object_wait(obj,
+				   I915_WAIT_INTERRUPTIBLE,
+				   MAX_SCHEDULE_TIMEOUT);
+	if (ret)
+		return ret;
+
+	return i915_gem_object_pin_pages(obj);
+}
+
+static int
+i915_window_blt_copy_batch_prepare(struct i915_request *rq,
+				   struct i915_vma *src,
+				   struct i915_vma *dst, size_t size)
+{
+	u32 *cmd;
+
+	GEM_BUG_ON(size > BLT_WINDOW_SZ);
+	cmd = intel_ring_begin(rq, 10);
+	if (IS_ERR(cmd))
+		return PTR_ERR(cmd);
+
+	GEM_BUG_ON(size >> PAGE_SHIFT > S16_MAX);
+	GEM_BUG_ON(INTEL_GEN(rq->engine->i915) < 9);
+
+	*cmd++ = GEN9_XY_FAST_COPY_BLT_CMD | (10 - 2);
+	*cmd++ = BLT_DEPTH_32 | PAGE_SIZE;
+	*cmd++ = 0;
+	*cmd++ = size >> PAGE_SHIFT << 16 | PAGE_SIZE / 4;
+	*cmd++ = lower_32_bits(dst->node.start);
+	*cmd++ = upper_32_bits(dst->node.start);
+	*cmd++ = 0;
+	*cmd++ = PAGE_SIZE;
+	*cmd++ = lower_32_bits(src->node.start);
+	*cmd++ = upper_32_bits(src->node.start);
+	intel_ring_advance(rq, cmd);
+
+	return 0;
+}
+
+int i915_window_blt_copy(struct drm_i915_gem_object *dst,
+			 struct drm_i915_gem_object *src)
+{
+	struct drm_i915_private *i915 = to_i915(src->base.dev);
+	struct intel_context *ce = i915->gt.engine[BCS0]->blitter_context;
+	bool src_is_lmem = i915_gem_object_is_lmem(src);
+	bool dst_is_lmem = i915_gem_object_is_lmem(dst);
+	struct scatterlist *last_sgl;
+	struct i915_vma *src_vma, *dst_vma;
+	struct i915_request *rq;
+	u64 cur_win_sz, blt_copied, offset;
+	long timeout;
+	u32 size;
+	int err;
+
+	src_vma = src_is_lmem ? i915->mm.lmem_window[0] :
+				i915->mm.smem_window[0];
+	dst_vma = dst_is_lmem ? i915->mm.lmem_window[1] :
+				i915->mm.smem_window[1];
+
+	if (!src_vma || !dst_vma)
+		return -ENODEV;
+
+	blt_copied = 0;
+
+	err = i915_window_blt_copy_prepare_obj(src);
+	if (err)
+		return err;
+
+	err = i915_window_blt_copy_prepare_obj(dst);
+	if (err) {
+		i915_gem_object_unpin_pages(src);
+		return err;
+	}
+
+	mutex_lock(&i915->mm.window_mutex);
+	src_vma->obj = src;
+	dst_vma->obj = dst;
+	do {
+		cur_win_sz = min_t(u64, BLT_WINDOW_SZ,
+				   (src->base.size - blt_copied));
+		offset = blt_copied >> PAGE_SHIFT;
+		size = ALIGN(cur_win_sz, src->mm.region->min_page_size) >>
+		       PAGE_SHIFT;
+		intel_partial_pages_for_sg_table(src, src_vma->pages, offset,
+						 size, &last_sgl);
+
+		/*
+		 * Inserting pages into the vm expects pages covering the
+		 * full length of the VMA, but we may have pages for less
+		 * than vma_size. Hence alter the vma size to match the
+		 * total size of the pages attached.
+		 */
+		src_vma->size = size << PAGE_SHIFT;
+		i915_insert_vma_pages(src_vma, src_is_lmem);
+		sg_unmark_end(last_sgl);
+
+		/*
+		 * The source obj size could be smaller than the dst obj
+		 * size, due to the varying min_page_size of the mem regions
+		 * the objs belong to. But when we insert the pages into the
+		 * vm, the total size of the pages is supposed to be a
+		 * multiple of the min page size of that mem region.
+		 */
+		size = ALIGN(cur_win_sz, dst->mm.region->min_page_size) >>
+		       PAGE_SHIFT;
+		intel_partial_pages_for_sg_table(dst, dst_vma->pages, offset,
+						 size, &last_sgl);
+
+		dst_vma->size = size << PAGE_SHIFT;
+		i915_insert_vma_pages(dst_vma, dst_is_lmem);
+		sg_unmark_end(last_sgl);
+
+		rq = i915_request_create(ce);
+		if (IS_ERR(rq)) {
+			err = PTR_ERR(rq);
+			break;
+		}
+		if (rq->engine->emit_init_breadcrumb) {
+			err = rq->engine->emit_init_breadcrumb(rq);
+			if (unlikely(err)) {
+				DRM_ERROR("init_breadcrumb failed. %d\n", err);
+				break;
+			}
+		}
+		err = i915_window_blt_copy_batch_prepare(rq, src_vma, dst_vma,
+							 cur_win_sz);
+		if (err) {
+			DRM_ERROR("Batch preparation failed. %d\n", err);
+			i915_request_set_error_once(rq, -EIO);
+		}
+
+		i915_request_get(rq);
+		i915_request_add(rq);
+
+		timeout = i915_request_wait(rq, 0, MAX_SCHEDULE_TIMEOUT);
+		if (timeout < 0) {
+			DRM_ERROR("BLT request did not complete. %ld\n",
+				  timeout);
+			err = timeout;
+			i915_request_put(rq);
+			break;
+		}
+
+		blt_copied += cur_win_sz;
+		err = 0;
+		i915_request_put(rq);
+		flush_work(&i915->gt.engine[BCS0]->retire_work);
+	} while (src->base.size != blt_copied);
+
+	src_vma->size = BLT_WINDOW_SZ;
+	dst_vma->size = BLT_WINDOW_SZ;
+	src_vma->obj = NULL;
+	dst_vma->obj = NULL;
+	mutex_unlock(&i915->mm.window_mutex);
+
+	dst->mm.dirty = true;
+	i915_gem_object_unpin_pages(src);
+	i915_gem_object_unpin_pages(dst);
+
+	return err;
+}
+
 #if IS_ENABLED(CONFIG_DRM_I915_SELFTEST)
 #include "selftests/huge_gem_object.c"
 #include "selftests/huge_pages.c"
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object.h b/drivers/gpu/drm/i915/gem/i915_gem_object.h
index ee1914ed2070..52a36b4052f0 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_object.h
+++ b/drivers/gpu/drm/i915/gem/i915_gem_object.h
@@ -598,4 +598,9 @@ static inline int i915_gem_object_userptr_submit_done(struct drm_i915_gem_object
 static inline void i915_gem_object_userptr_submit_fini(struct drm_i915_gem_object *obj) { GEM_BUG_ON(1); }
 #endif
 
+int i915_window_blt_copy(struct drm_i915_gem_object *dst,
+			 struct drm_i915_gem_object *src);
+int i915_setup_blt_windows(struct drm_i915_private *i915);
+void i915_teardown_blt_windows(struct drm_i915_private *i915);
+
 #endif
diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
index f4540c048cd9..683643b211fa 100644
--- a/drivers/gpu/drm/i915/i915_drv.c
+++ b/drivers/gpu/drm/i915/i915_drv.c
@@ -891,6 +891,12 @@ int i915_driver_probe(struct pci_dev *pdev, const struct pci_device_id *ent)
 
 	i915_driver_register(i915);
 
+	if (HAS_LMEM(i915)) {
+		ret = i915_setup_blt_windows(i915);
+		if (ret)
+			goto out_cleanup_drv_register;
+	}
+
 	enable_rpm_wakeref_asserts(&i915->runtime_pm);
 
 	i915_welcome_messages(i915);
@@ -899,6 +905,8 @@ int i915_driver_probe(struct pci_dev *pdev, const struct pci_device_id *ent)
 
 	return 0;
 
+out_cleanup_drv_register:
+	i915_driver_unregister(i915);
 out_cleanup_gem:
 	i915_gem_suspend(i915);
 	i915_gem_driver_remove(i915);
@@ -931,6 +939,9 @@ int i915_driver_probe(struct pci_dev *pdev, const struct pci_device_id *ent)
 
 void i915_driver_remove(struct drm_i915_private *i915)
 {
+	if (HAS_LMEM(i915))
+		i915_teardown_blt_windows(i915);
+
 	disable_rpm_wakeref_asserts(&i915->runtime_pm);
 
 	i915_driver_unregister(i915);
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 10823abab224..07da059640a1 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -595,6 +595,12 @@ struct i915_gem_mm {
 	/* shrinker accounting, also useful for userland debugging */
 	u64 shrink_memory;
 	u32 shrink_count;
+
+	struct i915_vma *lmem_window[2];
+	struct i915_vma *smem_window[2];
+
+	/* To protect above two set of vmas */
+	struct mutex window_mutex;
 };
 
 #define I915_IDLE_ENGINES_TIMEOUT (200) /* in ms */
-- 
2.26.2
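
A note on the pattern above: the copy loop moves an arbitrarily large
object through a fixed-size mappable window, one blitter request per
chunk. A minimal, self-contained sketch of that control flow in plain
C, with memcpy() standing in for the bind-window / emit
XY_FAST_COPY_BLT / wait sequence, and the BLT_WINDOW_SZ value assumed:

	#include <stddef.h>
	#include <string.h>

	#define BLT_WINDOW_SZ (8 * 1024 * 1024) /* assumed window size */

	static void window_copy(void *dst, const void *src, size_t size)
	{
		size_t copied = 0;

		while (copied < size) {
			size_t chunk = size - copied;

			if (chunk > BLT_WINDOW_SZ)
				chunk = BLT_WINDOW_SZ;

			/* In the driver: rebind the src/dst window vmas
			 * at offset 'copied', emit one XY_FAST_COPY_BLT
			 * and wait for the request to complete.
			 */
			memcpy((char *)dst + copied,
			       (const char *)src + copied, chunk);

			copied += chunk;
		}
	}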


* [RFC PATCH 138/162] drm/i915/dg1: Eliminate eviction mutex
  2020-11-27 12:04 [RFC PATCH 000/162] DG1 + LMEM enabling Matthew Auld
                   ` (136 preceding siblings ...)
  2020-11-27 12:06 ` [RFC PATCH 137/162] drm/i915: blt copy between objs using pre-created vma windows Matthew Auld
@ 2020-11-27 12:06 ` Matthew Auld
  2020-11-27 12:06 ` [RFC PATCH 139/162] drm/i915/dg1: Keep engine awake across whole blit Matthew Auld
                   ` (23 subsequent siblings)
  161 siblings, 0 replies; 208+ messages in thread
From: Matthew Auld @ 2020-11-27 12:06 UTC (permalink / raw)
  To: intel-gfx; +Cc: dri-devel, Tvrtko Ursulin

From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

We can eliminate the current evict window mutex, held over the whole
eviction process, and replace it with a wait queue which takes over the
role of co-ordinating access to pre-configured window copy vmas.

Apart from the global lock not being held over the whole of the copy,
an additional benefit is that, since we have two pairs of copy windows,
two evict operations can now progress independently. (One swap-in plus
one swap-out.)

Also consolidate some of the eviction code into helper functions for
readability and fix cleanup if emit_init_breadcrumb fails.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
---
 drivers/gpu/drm/i915/gem/i915_gem_object.c | 144 ++++++++++++---------
 drivers/gpu/drm/i915/i915_drv.h            |   2 +-
 2 files changed, 85 insertions(+), 61 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object.c b/drivers/gpu/drm/i915/gem/i915_gem_object.c
index 3943a184fbe3..34bbefa6d67f 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_object.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_object.c
@@ -856,7 +856,8 @@ int i915_setup_blt_windows(struct drm_i915_private *i915)
 		return 0;
 	}
 
-	mutex_init(&i915->mm.window_mutex);
+	init_waitqueue_head(&i915->mm.window_queue);
+
 	for (i = 0; i < ARRAY_SIZE(lmem); i++) {
 		lmem[i] = i915_window_vma_init(i915, lmem_region);
 		if (IS_ERR_OR_NULL(lmem[i])) {
@@ -904,7 +905,6 @@ void i915_teardown_blt_windows(struct drm_i915_private *i915)
 			continue;
 		i915_window_vma_teardown(i915->mm.smem_window[i]);
 	}
-	mutex_destroy(&i915->mm.window_mutex);
 }
 
 static int i915_window_blt_copy_prepare_obj(struct drm_i915_gem_object *obj)
@@ -950,6 +950,36 @@ i915_window_blt_copy_batch_prepare(struct i915_request *rq,
 	return 0;
 }
 
+static void prepare_vma(struct i915_vma *vma,
+			struct drm_i915_gem_object *obj,
+			u32 offset,
+			u32 chunk,
+			bool is_lmem)
+{
+	struct scatterlist *sgl;
+	u32 size;
+
+	/*
+	 * The source obj size could be smaller than the dst obj size,
+	 * due to the varying min_page_size of the mem regions the objs
+	 * belong to. But when we insert the pages into the vm, the
+	 * total size of the pages is supposed to be a multiple of the
+	 * min page size of that mem region.
+	 */
+	size = ALIGN(chunk, obj->mm.region->min_page_size) >> PAGE_SHIFT;
+	intel_partial_pages_for_sg_table(obj, vma->pages, offset, size, &sgl);
+
+	/*
+	 * Inserting pages into the vm expects pages covering the full
+	 * length of the VMA, but we may have pages for less than
+	 * vma_size. Hence alter the vma size to match the total size
+	 * of the pages attached.
+	 */
+	vma->size = size << PAGE_SHIFT;
+	i915_insert_vma_pages(vma, is_lmem);
+	sg_unmark_end(sgl);
+}
+
 int i915_window_blt_copy(struct drm_i915_gem_object *dst,
 			 struct drm_i915_gem_object *src)
 {
@@ -957,24 +987,10 @@ int i915_window_blt_copy(struct drm_i915_gem_object *dst,
 	struct intel_context *ce = i915->gt.engine[BCS0]->blitter_context;
 	bool src_is_lmem = i915_gem_object_is_lmem(src);
 	bool dst_is_lmem = i915_gem_object_is_lmem(dst);
-	struct scatterlist *last_sgl;
-	struct i915_vma *src_vma, *dst_vma;
-	struct i915_request *rq;
-	u64 cur_win_sz, blt_copied, offset;
-	long timeout;
-	u32 size;
+	u64 remain = src->base.size, offset = 0;
+	struct i915_vma *src_vma, *dst_vma, **ps, **pd;
 	int err;
 
-	src_vma = src_is_lmem ? i915->mm.lmem_window[0] :
-				i915->mm.smem_window[0];
-	dst_vma = dst_is_lmem ? i915->mm.lmem_window[1] :
-				i915->mm.smem_window[1];
-
-	if (!src_vma || !dst_vma)
-		return -ENODEV;
-
-	blt_copied = 0;
-
 	err = i915_window_blt_copy_prepare_obj(src);
 	if (err)
 		return err;
@@ -985,43 +1001,42 @@ int i915_window_blt_copy(struct drm_i915_gem_object *dst,
 		return err;
 	}
 
-	mutex_lock(&i915->mm.window_mutex);
+	ps = src_is_lmem ? &i915->mm.lmem_window[0] :
+			   &i915->mm.smem_window[0];
+	pd = dst_is_lmem ? &i915->mm.lmem_window[1] :
+			   &i915->mm.smem_window[1];
+
+	spin_lock(&i915->mm.window_queue.lock);
+
+	err = wait_event_interruptible_locked(i915->mm.window_queue,
+					      *ps && *pd);
+	if (err) {
+		spin_unlock(&i915->mm.window_queue.lock);
+		i915_gem_object_unpin_pages(src);
+		i915_gem_object_unpin_pages(dst);
+		return err;
+	}
+
+	src_vma = *ps;
+	dst_vma = *pd;
+
 	src_vma->obj = src;
 	dst_vma->obj = dst;
-	do {
-		cur_win_sz = min_t(u64, BLT_WINDOW_SZ,
-				   (src->base.size - blt_copied));
-		offset = blt_copied >> PAGE_SHIFT;
-		size = ALIGN(cur_win_sz, src->mm.region->min_page_size) >>
-		       PAGE_SHIFT;
-		intel_partial_pages_for_sg_table(src, src_vma->pages, offset,
-						 size, &last_sgl);
 
-		/*
-		 * Insert pages into vm, expects the pages to the full
-		 * length of VMA. But we may have the pages of <= vma_size.
-		 * Hence altering the vma size to match the total size of
-		 * the pages attached.
-		 */
-		src_vma->size = size << PAGE_SHIFT;
-		i915_insert_vma_pages(src_vma, src_is_lmem);
-		sg_unmark_end(last_sgl);
+	*ps = NULL;
+	*pd = NULL;
 
-		/*
-		 * Source obj size could be smaller than the dst obj size,
-		 * due to the varying min_page_size of the mem regions the
-		 * obj belongs to. But when we insert the pages into vm,
-		 * the total size of the pages supposed to be multiples of
-		 * the min page size of that mem region.
-		 */
-		size = ALIGN(cur_win_sz, dst->mm.region->min_page_size) >>
-		       PAGE_SHIFT;
-		intel_partial_pages_for_sg_table(dst, dst_vma->pages, offset,
-						 size, &last_sgl);
+	spin_unlock(&i915->mm.window_queue.lock);
+
+	do {
+		struct i915_request *rq;
+		long timeout;
+		u32 chunk;
 
-		dst_vma->size = size << PAGE_SHIFT;
-		i915_insert_vma_pages(dst_vma, dst_is_lmem);
-		sg_unmark_end(last_sgl);
+		chunk = min_t(u64, BLT_WINDOW_SZ, remain);
+
+		prepare_vma(src_vma, src, offset, chunk, src_is_lmem);
+		prepare_vma(dst_vma, dst, offset, chunk, dst_is_lmem);
 
 		rq = i915_request_create(ce);
 		if (IS_ERR(rq)) {
@@ -1032,11 +1047,14 @@ int i915_window_blt_copy(struct drm_i915_gem_object *dst,
 			err = rq->engine->emit_init_breadcrumb(rq);
 			if (unlikely(err)) {
 				DRM_ERROR("init_breadcrumb failed. %d\n", err);
+				i915_request_set_error_once(rq, err);
+				__i915_request_skip(rq);
+				i915_request_add(rq);
 				break;
 			}
 		}
 		err = i915_window_blt_copy_batch_prepare(rq, src_vma, dst_vma,
-							 cur_win_sz);
+							 chunk);
 		if (err) {
 			DRM_ERROR("Batch preparation failed. %d\n", err);
 			i915_request_set_error_once(rq, -EIO);
@@ -1045,26 +1063,32 @@ int i915_window_blt_copy(struct drm_i915_gem_object *dst,
 		i915_request_get(rq);
 		i915_request_add(rq);
 
-		timeout = i915_request_wait(rq, 0, MAX_SCHEDULE_TIMEOUT);
-		if (timeout < 0) {
+		if (!err)
+			timeout = i915_request_wait(rq, 0,
+						    MAX_SCHEDULE_TIMEOUT);
+		i915_request_put(rq);
+		if (!err && timeout < 0) {
 			DRM_ERROR("BLT Request is not completed. %ld\n",
 				  timeout);
 			err = timeout;
-			i915_request_put(rq);
 			break;
 		}
 
-		blt_copied += cur_win_sz;
-		err = 0;
-		i915_request_put(rq);
-		flush_work(&i915->gt.engine[BCS0]->retire_work);
-	} while (src->base.size != blt_copied);
+		remain -= chunk;
+		offset += chunk >> PAGE_SHIFT;
+
+		flush_work(&ce->engine->retire_work);
+	} while (remain);
 
+	spin_lock(&i915->mm.window_queue.lock);
 	src_vma->size = BLT_WINDOW_SZ;
 	dst_vma->size = BLT_WINDOW_SZ;
 	src_vma->obj = NULL;
 	dst_vma->obj = NULL;
-	mutex_unlock(&i915->mm.window_mutex);
+	*ps = src_vma;
+	*pd = dst_vma;
+	wake_up_locked(&i915->mm.window_queue);
+	spin_unlock(&i915->mm.window_queue.lock);
 
 	dst->mm.dirty = true;
 	i915_gem_object_unpin_pages(src);
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 07da059640a1..82f431cc38cd 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -600,7 +600,7 @@ struct i915_gem_mm {
 	struct i915_vma *smem_window[2];
 
 	/* To protect above two set of vmas */
-	struct mutex window_mutex;
+	wait_queue_head_t window_queue;
 };
 
 #define I915_IDLE_ENGINES_TIMEOUT (200) /* in ms */
-- 
2.26.2
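
A note on the locking change in this patch: the conversion replaces a
long-held mutex with a claim/release scheme, where a window slot is
NULLed while checked out and waiters sleep on the queue until a slot
is returned. A condensed kernel-style sketch of the scheme (the struct
and helper names here are illustrative, not the driver's):

	#include <linux/spinlock.h>
	#include <linux/wait.h>

	struct window_pool {
		wait_queue_head_t queue;
		void *slot;	/* NULL while checked out */
	};

	/* Sleeps until a slot is free; returns NULL if interrupted. */
	static void *claim_window(struct window_pool *p)
	{
		void *w = NULL;

		spin_lock(&p->queue.lock);
		/* Drops queue.lock while sleeping, retakes it on wake. */
		if (!wait_event_interruptible_locked(p->queue, p->slot)) {
			w = p->slot;
			p->slot = NULL;
		}
		spin_unlock(&p->queue.lock);

		return w;
	}

	static void release_window(struct window_pool *p, void *w)
	{
		spin_lock(&p->queue.lock);
		p->slot = w;
		wake_up_locked(&p->queue);
		spin_unlock(&p->queue.lock);
	}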


* [RFC PATCH 139/162] drm/i915/dg1: Keep engine awake across whole blit
  2020-11-27 12:04 [RFC PATCH 000/162] DG1 + LMEM enabling Matthew Auld
                   ` (137 preceding siblings ...)
  2020-11-27 12:06 ` [RFC PATCH 138/162] drm/i915/dg1: Eliminate eviction mutex Matthew Auld
@ 2020-11-27 12:06 ` Matthew Auld
  2020-11-27 12:06 ` [RFC PATCH 140/162] drm/i915: window_blt_copy is used for swapin and swapout Matthew Auld
                   ` (22 subsequent siblings)
  161 siblings, 0 replies; 208+ messages in thread
From: Matthew Auld @ 2020-11-27 12:06 UTC (permalink / raw)
  To: intel-gfx; +Cc: dri-devel, Tvrtko Ursulin

From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

Hold the blitter engine power reference across the whole copy operation
for efficiency, so the engine is not parked and re-woken between
chunks.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
---
 drivers/gpu/drm/i915/gem/i915_gem_object.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object.c b/drivers/gpu/drm/i915/gem/i915_gem_object.c
index 34bbefa6d67f..c84443e01ef1 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_object.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_object.c
@@ -1028,6 +1028,8 @@ int i915_window_blt_copy(struct drm_i915_gem_object *dst,
 
 	spin_unlock(&i915->mm.window_queue.lock);
 
+	intel_engine_pm_get(ce->engine);
+
 	do {
 		struct i915_request *rq;
 		long timeout;
@@ -1080,6 +1082,8 @@ int i915_window_blt_copy(struct drm_i915_gem_object *dst,
 		flush_work(&ce->engine->retire_work);
 	} while (remain);
 
+	intel_engine_pm_put(ce->engine);
+
 	spin_lock(&i915->mm.window_queue.lock);
 	src_vma->size = BLT_WINDOW_SZ;
 	dst_vma->size = BLT_WINDOW_SZ;
-- 
2.26.2


* [RFC PATCH 140/162] drm/i915: window_blt_copy is used for swapin and swapout
  2020-11-27 12:04 [RFC PATCH 000/162] DG1 + LMEM enabling Matthew Auld
                   ` (138 preceding siblings ...)
  2020-11-27 12:06 ` [RFC PATCH 139/162] drm/i915/dg1: Keep engine awake across whole blit Matthew Auld
@ 2020-11-27 12:06 ` Matthew Auld
  2020-11-27 14:20   ` [Intel-gfx] " Chris Wilson
  2020-11-27 12:06 ` [RFC PATCH 141/162] drm/i915: Lmem eviction statistics by category Matthew Auld
                   ` (21 subsequent siblings)
  161 siblings, 1 reply; 208+ messages in thread
From: Matthew Auld @ 2020-11-27 12:06 UTC (permalink / raw)
  To: intel-gfx; +Cc: CQ Tang, dri-devel

From: Ramalingam C <ramalingam.c@intel.com>

The window_blt_copy feature is now used for swapin and swapout,
selected via the i915 module parameter enable_eviction.

Signed-off-by: Ramalingam C <ramalingam.c@intel.com>
Cc: Matthew Auld <matthew.auld@intel.com>
Cc: CQ Tang <cq.tang@intel.com>
---
 drivers/gpu/drm/i915/gem/i915_gem_region.c | 14 ++++++++++----
 drivers/gpu/drm/i915/i915_drv.c            |  4 ++--
 drivers/gpu/drm/i915/i915_params.c         |  6 ++++--
 drivers/gpu/drm/i915/i915_params.h         |  2 +-
 4 files changed, 17 insertions(+), 9 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_region.c b/drivers/gpu/drm/i915/gem/i915_gem_region.c
index 4fab9f6b4bee..f9ff0aa31752 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_region.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_region.c
@@ -16,7 +16,7 @@ i915_gem_object_swapout_pages(struct drm_i915_gem_object *obj,
 	struct drm_i915_private *i915 = to_i915(obj->base.dev);
 	struct drm_i915_gem_object *dst, *src;
 	unsigned long start, diff, msec;
-	int err;
+	int err = -EINVAL;
 
 	GEM_BUG_ON(obj->swapto);
 	GEM_BUG_ON(i915_gem_object_has_pages(obj));
@@ -54,7 +54,10 @@ i915_gem_object_swapout_pages(struct drm_i915_gem_object *obj,
 	__i915_gem_object_pin_pages(src);
 
 	/* copying the pages */
-	err = i915_gem_object_memcpy(dst, src);
+	if (i915->params.enable_eviction >= 2)
+		err = i915_window_blt_copy(dst, src);
+	if (err && i915->params.enable_eviction != 2)
+		err = i915_gem_object_memcpy(dst, src);
 
 	__i915_gem_object_unpin_pages(src);
 	__i915_gem_object_unset_pages(src);
@@ -83,7 +86,7 @@ i915_gem_object_swapin_pages(struct drm_i915_gem_object *obj,
 	struct drm_i915_private *i915 = to_i915(obj->base.dev);
 	struct drm_i915_gem_object *dst, *src;
 	unsigned long start, diff, msec;
-	int err;
+	int err = -EINVAL;
 
 	GEM_BUG_ON(!obj->swapto);
 	GEM_BUG_ON(i915_gem_object_has_pages(obj));
@@ -117,7 +120,10 @@ i915_gem_object_swapin_pages(struct drm_i915_gem_object *obj,
 	__i915_gem_object_pin_pages(dst);
 
 	/* copying the pages */
-	err = i915_gem_object_memcpy(dst, src);
+	if (i915->params.enable_eviction >= 2)
+		err = i915_window_blt_copy(dst, src);
+	if (err && i915->params.enable_eviction != 2)
+		err = i915_gem_object_memcpy(dst, src);
 
 	__i915_gem_object_unpin_pages(dst);
 	__i915_gem_object_unset_pages(dst);
diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
index 683643b211fa..78b528e89486 100644
--- a/drivers/gpu/drm/i915/i915_drv.c
+++ b/drivers/gpu/drm/i915/i915_drv.c
@@ -891,7 +891,7 @@ int i915_driver_probe(struct pci_dev *pdev, const struct pci_device_id *ent)
 
 	i915_driver_register(i915);
 
-	if (HAS_LMEM(i915)) {
+	if (HAS_LMEM(i915) && i915->params.enable_eviction >= 2) {
 		ret = i915_setup_blt_windows(i915);
 		if (ret)
 			goto out_cleanup_drv_register;
@@ -939,7 +939,7 @@ int i915_driver_probe(struct pci_dev *pdev, const struct pci_device_id *ent)
 
 void i915_driver_remove(struct drm_i915_private *i915)
 {
-	if (HAS_LMEM(i915))
+	if (HAS_LMEM(i915) && i915->params.enable_eviction >= 2)
 		i915_teardown_blt_windows(i915);
 
 	disable_rpm_wakeref_asserts(&i915->runtime_pm);
diff --git a/drivers/gpu/drm/i915/i915_params.c b/drivers/gpu/drm/i915/i915_params.c
index 264de32f3d6a..9fa58ed76614 100644
--- a/drivers/gpu/drm/i915/i915_params.c
+++ b/drivers/gpu/drm/i915/i915_params.c
@@ -197,8 +197,10 @@ i915_param_named_unsafe(fake_lmem_start, ulong, 0400,
 	"Fake LMEM start offset (default: 0)");
 #endif
 
-i915_param_named_unsafe(enable_eviction, bool, 0600,
-	"Enable memcpy based eviction which does not rely on DMA resv refactoring)");
+i915_param_named_unsafe(enable_eviction, uint, 0600,
+	"Enable eviction which does not rely on DMA resv refactoring "
+	"(0=disabled, 1=memcpy based only, 2=blt based only, "
+	"3=blt based but falls back to memcpy based [default])");
 
 i915_param_named_unsafe(lmem_size, uint, 0400,
 	"Change lmem size for each region. (default: 0, all memory)");
diff --git a/drivers/gpu/drm/i915/i915_params.h b/drivers/gpu/drm/i915/i915_params.h
index be6979e7feda..c835e592ee5f 100644
--- a/drivers/gpu/drm/i915/i915_params.h
+++ b/drivers/gpu/drm/i915/i915_params.h
@@ -72,8 +72,8 @@ struct drm_printer;
 	param(char *, force_probe, CONFIG_DRM_I915_FORCE_PROBE, 0400) \
 	param(unsigned long, fake_lmem_start, 0, 0400) \
 	param(unsigned int, lmem_size, 0, 0400) \
+	param(unsigned int, enable_eviction, 3, 0600) \
 	/* leave bools at the end to not create holes */ \
-	param(bool, enable_eviction, true, 0600) \
 	param(bool, enable_hangcheck, true, 0600) \
 	param(bool, load_detect_test, false, 0600) \
 	param(bool, force_reset_modeset_test, false, 0600) \
-- 
2.26.2
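
A note on the modparam handling: the two-branch sequence in the swap
paths encodes the whole enable_eviction policy. Restated as a
standalone helper (blt_copy() and memcpy_copy() are hypothetical
stand-ins for i915_window_blt_copy() and i915_gem_object_memcpy()):

	#include <linux/errno.h>

	extern int blt_copy(void);	/* i915_window_blt_copy() */
	extern int memcpy_copy(void);	/* i915_gem_object_memcpy() */

	/* 0=disabled, 1=memcpy only, 2=blt only, 3=blt then memcpy */
	static int evict_copy(unsigned int enable_eviction)
	{
		int err = -EINVAL;

		if (enable_eviction >= 2)	/* modes 2 and 3 */
			err = blt_copy();
		if (err && enable_eviction != 2) /* modes 1 and 3 */
			err = memcpy_copy();

		return err;	/* stays -EINVAL when disabled (0) */
	}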


* [RFC PATCH 141/162] drm/i915: Lmem eviction statistics by category
  2020-11-27 12:04 [RFC PATCH 000/162] DG1 + LMEM enabling Matthew Auld
                   ` (139 preceding siblings ...)
  2020-11-27 12:06 ` [RFC PATCH 140/162] drm/i915: window_blt_copy is used for swapin and swapout Matthew Auld
@ 2020-11-27 12:06 ` Matthew Auld
  2020-11-27 14:21   ` [Intel-gfx] " Chris Wilson
  2020-11-27 12:06 ` [RFC PATCH 142/162] drm/i915/gem/selftest: test and measure window based blt cpy Matthew Auld
                   ` (20 subsequent siblings)
  161 siblings, 1 reply; 208+ messages in thread
From: Matthew Auld @ 2020-11-27 12:06 UTC (permalink / raw)
  To: intel-gfx; +Cc: CQ Tang, dri-devel

From: Ramalingam C <ramalingam.c@intel.com>

The number of bytes swapped in and out is captured for both blitter
and memcpy based evictions, along with the time taken for each.

Debugfs is extended to report the eviction statistics for both
methods, including the rate of transfer.

Signed-off-by: Ramalingam C <ramalingam.c@intel.com>
Cc: Matthew Auld <matthew.auld@intel.com>
Cc: CQ Tang <cq.tang@intel.com>
---
 drivers/gpu/drm/i915/gem/i915_gem_region.c | 32 +++++++++++++---
 drivers/gpu/drm/i915/i915_debugfs.c        | 43 +++++++++++++++++++---
 drivers/gpu/drm/i915/i915_drv.h            |  5 +++
 3 files changed, 68 insertions(+), 12 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_region.c b/drivers/gpu/drm/i915/gem/i915_gem_region.c
index f9ff0aa31752..1ec6528498c8 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_region.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_region.c
@@ -16,6 +16,7 @@ i915_gem_object_swapout_pages(struct drm_i915_gem_object *obj,
 	struct drm_i915_private *i915 = to_i915(obj->base.dev);
 	struct drm_i915_gem_object *dst, *src;
 	unsigned long start, diff, msec;
+	bool blt_completed = false;
 	int err = -EINVAL;
 
 	GEM_BUG_ON(obj->swapto);
@@ -54,8 +55,11 @@ i915_gem_object_swapout_pages(struct drm_i915_gem_object *obj,
 	__i915_gem_object_pin_pages(src);
 
 	/* copying the pages */
-	if (i915->params.enable_eviction >= 2)
+	if (i915->params.enable_eviction >= 2) {
 		err = i915_window_blt_copy(dst, src);
+		if (!err)
+			blt_completed = true;
+	}
 	if (err && i915->params.enable_eviction != 2)
 		err = i915_gem_object_memcpy(dst, src);
 
@@ -72,8 +76,14 @@ i915_gem_object_swapout_pages(struct drm_i915_gem_object *obj,
 	if (!err) {
 		diff = jiffies - start;
 		msec = diff * 1000 / HZ;
-		atomic_long_add(msec, &i915->time_swap_out_ms);
-		atomic_long_add(sizes, &i915->num_bytes_swapped_out);
+		if (blt_completed) {
+			atomic_long_add(sizes, &i915->num_bytes_swapped_out);
+			atomic_long_add(msec, &i915->time_swap_out_ms);
+		} else {
+			atomic_long_add(sizes,
+					&i915->num_bytes_swapped_out_memcpy);
+			atomic_long_add(msec, &i915->time_swap_out_ms_memcpy);
+		}
 	}
 
 	return err;
@@ -86,6 +96,7 @@ i915_gem_object_swapin_pages(struct drm_i915_gem_object *obj,
 	struct drm_i915_private *i915 = to_i915(obj->base.dev);
 	struct drm_i915_gem_object *dst, *src;
 	unsigned long start, diff, msec;
+	bool blt_completed = false;
 	int err = -EINVAL;
 
 	GEM_BUG_ON(!obj->swapto);
@@ -120,8 +131,11 @@ i915_gem_object_swapin_pages(struct drm_i915_gem_object *obj,
 	__i915_gem_object_pin_pages(dst);
 
 	/* copying the pages */
-	if (i915->params.enable_eviction >= 2)
+	if (i915->params.enable_eviction >= 2) {
 		err = i915_window_blt_copy(dst, src);
+		if (!err)
+			blt_completed = true;
+	}
 	if (err && i915->params.enable_eviction != 2)
 		err = i915_gem_object_memcpy(dst, src);
 
@@ -138,8 +152,14 @@ i915_gem_object_swapin_pages(struct drm_i915_gem_object *obj,
 	if (!err) {
 		diff = jiffies - start;
 		msec = diff * 1000 / HZ;
-		atomic_long_add(msec, &i915->time_swap_in_ms);
-		atomic_long_add(sizes, &i915->num_bytes_swapped_in);
+		if (blt_completed) {
+			atomic_long_add(sizes, &i915->num_bytes_swapped_in);
+			atomic_long_add(msec, &i915->time_swap_in_ms);
+		} else {
+			atomic_long_add(sizes,
+					&i915->num_bytes_swapped_in_memcpy);
+			atomic_long_add(msec, &i915->time_swap_in_ms_memcpy);
+		}
 	}
 
 	return err;
diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
index 2bf51dd9de7c..983030ac39e1 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -364,6 +364,7 @@ static int i915_gem_object_info(struct seq_file *m, void *data)
 	struct drm_i915_private *i915 = node_to_i915(m->private);
 	struct intel_memory_region *mr;
 	enum intel_region_id id;
+	u64 time, bytes, rate;
 
 	seq_printf(m, "%u shrinkable [%u free] objects, %llu bytes\n",
 		   i915->mm.shrink_count,
@@ -372,12 +373,42 @@ static int i915_gem_object_info(struct seq_file *m, void *data)
 	for_each_memory_region(mr, i915, id)
 		seq_printf(m, "%s: total:%pa, available:%pa bytes\n",
 			   mr->name, &mr->total, &mr->avail);
-	seq_printf(m, "num_bytes_swapped_out %ld num_bytes_swapped_in %ld\n",
-		   atomic_long_read(&i915->num_bytes_swapped_out),
-		   atomic_long_read(&i915->num_bytes_swapped_in));
-	seq_printf(m, "time_swap_out_msec %ld time_swap_in_msec %ld\n",
-		   atomic_long_read(&i915->time_swap_out_ms),
-		   atomic_long_read(&i915->time_swap_in_ms));
+
+	time = atomic_long_read(&i915->time_swap_out_ms);
+	bytes = atomic_long_read(&i915->num_bytes_swapped_out);
+	if (time)
+		rate = div64_u64(bytes * 1000, time * 1024 * 1024);
+	else
+		rate = 0;
+	seq_printf(m, "BLT: swapout %llu Bytes in %llu mSec(%llu MB/Sec)\n",
+		   bytes, time, rate);
+
+	time = atomic_long_read(&i915->time_swap_in_ms);
+	bytes = atomic_long_read(&i915->num_bytes_swapped_in);
+	if (time)
+		rate = div64_u64(bytes * 1000, time * 1024 * 1024);
+	else
+		rate = 0;
+	seq_printf(m, "BLT: swapin %llu Bytes in %llu mSec(%llu MB/Sec)\n",
+		   bytes, time, rate);
+
+	time = atomic_long_read(&i915->time_swap_out_ms_memcpy);
+	bytes = atomic_long_read(&i915->num_bytes_swapped_out_memcpy);
+	if (time)
+		rate = div64_u64(bytes * 1000, time * 1024 * 1024);
+	else
+		rate = 0;
+	seq_printf(m, "Memcpy: swapout %llu Bytes in %llu mSec(%llu MB/Sec)\n",
+		   bytes, time, rate);
+
+	time = atomic_long_read(&i915->time_swap_in_ms_memcpy);
+	bytes = atomic_long_read(&i915->num_bytes_swapped_in_memcpy);
+	if (time)
+		rate = div64_u64(bytes * 1000, time * 1024 * 1024);
+	else
+		rate = 0;
+	seq_printf(m, "Memcpy: swapin %llu Bytes in %llu mSec(%llu MB/Sec)\n",
+		   bytes, time, rate);
 	seq_putc(m, '\n');
 
 	print_context_stats(m, i915);
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 82f431cc38cd..6f0ab363bdee 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -1225,6 +1225,11 @@ struct drm_i915_private {
 	atomic_long_t num_bytes_swapped_in;
 	atomic_long_t time_swap_out_ms;
 	atomic_long_t time_swap_in_ms;
+
+	atomic_long_t num_bytes_swapped_out_memcpy;
+	atomic_long_t num_bytes_swapped_in_memcpy;
+	atomic_long_t time_swap_out_ms_memcpy;
+	atomic_long_t time_swap_in_ms_memcpy;
 };
 
 static inline struct drm_i915_private *to_i915(const struct drm_device *dev)
-- 
2.26.2
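
A note on the statistics: all four seq_printf() blocks above apply the
same rate formula, and although the label reads "MB/Sec" the divisor
of 1024 * 1024 makes the unit MiB/s. The computation isolated, with a
worked example:

	#include <stdint.h>

	/* bytes copied and elapsed milliseconds -> MiB per second */
	static uint64_t rate_mib_s(uint64_t bytes, uint64_t ms)
	{
		return ms ? (bytes * 1000) / (ms * 1024 * 1024) : 0;
	}

	/*
	 * e.g. 512 MiB in 2000 ms:
	 * (536870912 * 1000) / (2000 * 1048576) = 256 MiB/s
	 */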


* [RFC PATCH 142/162] drm/i915/gem/selftest: test and measure window based blt cpy
  2020-11-27 12:04 [RFC PATCH 000/162] DG1 + LMEM enabling Matthew Auld
                   ` (140 preceding siblings ...)
  2020-11-27 12:06 ` [RFC PATCH 141/162] drm/i915: Lmem eviction statistics by category Matthew Auld
@ 2020-11-27 12:06 ` Matthew Auld
  2020-11-27 12:06 ` [RFC PATCH 143/162] drm/i915: suspend/resume eviction Matthew Auld
                   ` (19 subsequent siblings)
  161 siblings, 0 replies; 208+ messages in thread
From: Matthew Auld @ 2020-11-27 12:06 UTC (permalink / raw)
  To: intel-gfx; +Cc: CQ Tang, dri-devel

From: Ramalingam C <ramalingam.c@intel.com>

The live_blt_evict selftest creates lmem and smem objects and copies
the lmem object into the smem object using the window based blt copy
that backs lmem eviction.

Object sizes ranging from 4K to 64M are tested, covering different
scenarios with respect to the window size.

Signed-off-by: Ramalingam C <ramalingam.c@intel.com>
Cc: Matthew Auld <matthew.auld@intel.com>
Cc: CQ Tang <cq.tang@intel.com>
---
 .../i915/gem/selftests/i915_gem_object_blt.c  | 166 ++++++++++++++++++
 .../drm/i915/selftests/i915_live_selftests.h  |   1 +
 2 files changed, 167 insertions(+)

diff --git a/drivers/gpu/drm/i915/gem/selftests/i915_gem_object_blt.c b/drivers/gpu/drm/i915/gem/selftests/i915_gem_object_blt.c
index ee9496f3d11d..4f7941dea291 100644
--- a/drivers/gpu/drm/i915/gem/selftests/i915_gem_object_blt.c
+++ b/drivers/gpu/drm/i915/gem/selftests/i915_gem_object_blt.c
@@ -16,6 +16,7 @@
 #include "selftests/mock_drm.h"
 #include "huge_gem_object.h"
 #include "mock_context.h"
+#include "gem/i915_gem_region.h"
 
 static int wrap_ktime_compare(const void *A, const void *B)
 {
@@ -568,6 +569,171 @@ static int igt_copy_blt_ctx0(void *arg)
 	return test_copy_engines(arg, igt_copy_blt_thread, SINGLE_CTX);
 }
 
+static int __igt_obj_window_blt_copy(struct drm_i915_private *i915,
+				     struct intel_memory_region *src_mem,
+				     struct intel_memory_region *dst_mem,
+				     u64 size)
+{
+	struct drm_i915_gem_object *src, *dst;
+	ktime_t t0, t1;
+	u32 *vaddr, i;
+	int err;
+
+	src = i915_gem_object_create_region(src_mem, size, 0);
+	if (IS_ERR(src)) {
+		err = PTR_ERR(src);
+		goto err;
+	}
+	size = max_t(u64, size, src->base.size);
+	i915_gem_object_lock_isolated(src);
+
+	dst = i915_gem_object_create_region(dst_mem, size, 0);
+	if (IS_ERR(dst)) {
+		err = PTR_ERR(dst);
+		goto err_put_src;
+	}
+
+	i915_gem_object_lock_isolated(dst);
+
+	vaddr = i915_gem_object_pin_map(src,
+					i915_coherent_map_type(i915, src, true));
+	if (IS_ERR(vaddr)) {
+		err = PTR_ERR(vaddr);
+		pr_err("Failed at pin map of src. %d\n", err);
+		goto err_put_dst;
+	}
+
+	for (i = 0; i < size / sizeof(u32); i++)
+		vaddr[i] = i;
+
+	if (!(src->cache_coherent & I915_BO_CACHE_COHERENT_FOR_READ))
+		src->cache_dirty = true;
+
+	vaddr = i915_gem_object_pin_map(dst,
+					i915_coherent_map_type(i915, dst, true));
+	if (IS_ERR(vaddr)) {
+		err = PTR_ERR(vaddr);
+		pr_err("Failed at pin map of dst. %d\n", err);
+		goto err_unpin_src;
+	}
+	memset32(vaddr, 0xdeadbeaf, size / sizeof(u32));
+
+	if (!(dst->cache_coherent & I915_BO_CACHE_COHERENT_FOR_WRITE))
+		dst->cache_dirty = true;
+
+	/*
+	 * FIXME: Blitter based eviction is failing occasionally due to
+	 * the trylock approach. To avoid selftest failures due to
+	 * trylocks, we are adding retries with a delay in between.
+	 * Retry count and delay are chosen on a trial-and-error basis.
+	 * As soon as trylocks are removed from blt eviction, we should
+	 * remove these retry attempts.
+	 */
+#define WINDOW_BLT_COPY_RETRY		3
+	for (i = 0; i <= WINDOW_BLT_COPY_RETRY; i++) {
+		t0 = ktime_get();
+		err = i915_window_blt_copy(dst, src);
+		if (err == -EBUSY)
+			msleep(1);
+		else
+			break;
+	}
+
+	if (err)
+		goto err_unpin_dst;
+
+	t1 = ktime_sub(ktime_get(), t0);
+	pr_info("blt of %zd KiB at %lld MiB/s\n", src->base.size >> 10,
+		div64_u64(mul_u32_u32(src->base.size, 1000 * 1000 * 1000),
+			  t1) >> 20);
+
+	for (i = 0; i < size / sizeof(u32); i += 17) {
+		if (!(dst->cache_coherent & I915_BO_CACHE_COHERENT_FOR_READ))
+			drm_clflush_virt_range(&vaddr[i], sizeof(vaddr[i]));
+
+		if (vaddr[i] != i) {
+			pr_err("vaddr[%u]=%x, expected=%x\n", i,
+			       vaddr[i], i);
+			err = -EINVAL;
+			goto err_unpin_dst;
+		}
+	}
+
+err_unpin_dst:
+	i915_gem_object_unpin_map(dst);
+err_unpin_src:
+	i915_gem_object_unpin_map(src);
+err_put_dst:
+	i915_gem_object_unlock(dst);
+	i915_gem_object_put(dst);
+err_put_src:
+	i915_gem_object_unlock(src);
+	i915_gem_object_put(src);
+err:
+	if (err == -ENODEV)
+		err = 0;
+	return err;
+}
+
+static int igt_obj_window_blt_copy(void *data)
+{
+	struct drm_i915_private *i915 = data;
+	u64 size[] = {SZ_2K, SZ_4K, SZ_64K, SZ_4M, SZ_8M + SZ_2K, SZ_64M};
+	struct intel_memory_region *lmem =
+		intel_memory_region_by_type(i915, INTEL_MEMORY_LOCAL);
+	struct intel_memory_region *smem =
+		intel_memory_region_by_type(i915, INTEL_MEMORY_SYSTEM);
+	int i, ret;
+
+	for (i = 0; i < ARRAY_SIZE(size); i++) {
+		ret =  __igt_obj_window_blt_copy(i915, lmem, lmem, size[i]);
+		if (ret < 0) {
+			pr_err("%s: Failed at lmem->lmem size: %llu, err: %d\n",
+			       __func__, size[i], ret);
+			break;
+		}
+		ret =  __igt_obj_window_blt_copy(i915, smem, smem, size[i]);
+		if (ret < 0) {
+			pr_err("%s: Failed at smem->smem size: %llu, err: %d\n",
+			       __func__, size[i], ret);
+			break;
+		}
+		ret =  __igt_obj_window_blt_copy(i915, lmem, smem, size[i]);
+		if (ret < 0) {
+			pr_err("%s: Failed at lmem->smem size: %llu, err: %d\n",
+			       __func__, size[i], ret);
+			break;
+		}
+
+		ret =  __igt_obj_window_blt_copy(i915, smem, lmem, size[i]);
+		if (ret < 0) {
+			pr_err("%s: Failed at smem->lmem size: %llu, err: %d\n",
+			       __func__, size[i], ret);
+			break;
+		}
+	}
+
+	return ret;
+}
+
+int i915_obj_window_blt_copy_live_selftests(struct drm_i915_private *i915)
+{
+	static const struct i915_subtest tests[] = {
+		SUBTEST(igt_obj_window_blt_copy),
+	};
+
+	if (intel_gt_is_wedged(&i915->gt))
+		return 0;
+
+	if (!HAS_ENGINE(&i915->gt, BCS0))
+		return 0;
+
+	if (!HAS_LMEM(i915))
+		return 0;
+
+	return i915_live_subtests(tests, i915);
+}
+
 int i915_gem_object_blt_live_selftests(struct drm_i915_private *i915)
 {
 	static const struct i915_subtest tests[] = {
diff --git a/drivers/gpu/drm/i915/selftests/i915_live_selftests.h b/drivers/gpu/drm/i915/selftests/i915_live_selftests.h
index a92c0e9b7e6b..2bf900f5d8b0 100644
--- a/drivers/gpu/drm/i915/selftests/i915_live_selftests.h
+++ b/drivers/gpu/drm/i915/selftests/i915_live_selftests.h
@@ -39,6 +39,7 @@ selftest(hugepages, i915_gem_huge_page_live_selftests)
 selftest(gem_contexts, i915_gem_context_live_selftests)
 selftest(gem_execbuf, i915_gem_execbuffer_live_selftests)
 selftest(blt, i915_gem_object_blt_live_selftests)
+selftest(win_blt_copy, i915_obj_window_blt_copy_live_selftests)
 selftest(client, i915_gem_client_blt_live_selftests)
 selftest(reset, intel_reset_live_selftests)
 selftest(memory_region, intel_memory_region_live_selftests)
-- 
2.26.2
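
A note on the retry loop: stripped of the timing setup, the FIXME
workaround above amounts to the usual bounded retry-on-contention
idiom (names as in the diff):

	int i, err;

	for (i = 0; i <= WINDOW_BLT_COPY_RETRY; i++) {
		err = i915_window_blt_copy(dst, src);
		if (err != -EBUSY)
			break;
		msleep(1);	/* back off, let the contended lock go */
	}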


* [RFC PATCH 143/162] drm/i915: suspend/resume eviction
  2020-11-27 12:04 [RFC PATCH 000/162] DG1 + LMEM enabling Matthew Auld
                   ` (141 preceding siblings ...)
  2020-11-27 12:06 ` [RFC PATCH 142/162] drm/i915/gem/selftest: test and measure window based blt cpy Matthew Auld
@ 2020-11-27 12:06 ` Matthew Auld
  2020-11-27 14:22   ` Chris Wilson
  2020-11-27 12:07 ` [RFC PATCH 144/162] drm/i915: Reset blitter context when unpark engine Matthew Auld
                   ` (18 subsequent siblings)
  161 siblings, 1 reply; 208+ messages in thread
From: Matthew Auld @ 2020-11-27 12:06 UTC (permalink / raw)
  To: intel-gfx; +Cc: Venkata Ramana Nayana, CQ Tang, dri-devel

From: Venkata Ramana Nayana <venkata.ramana.nayana@intel.com>

As the initial phase of the implementation, when the system is idle,
user objects are copied from LMEM to SMEM during suspend and restored
on resume. The present implementation uses memcpy based eviction for
the swapout/swapin of objects. To test the functionality, suspend is
initiated from an igt application.

Signed-off-by: Venkata Ramana Nayana <venkata.ramana.nayana@intel.com>
Cc: CQ Tang <cq.tang@intel.com>
---
 .../gpu/drm/i915/gem/i915_gem_object_types.h  |  3 +
 drivers/gpu/drm/i915/i915_drv.c               | 83 +++++++++++++++++++
 2 files changed, 86 insertions(+)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object_types.h b/drivers/gpu/drm/i915/gem/i915_gem_object_types.h
index e9f42d3137b3..331d113f7d5b 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_object_types.h
+++ b/drivers/gpu/drm/i915/gem/i915_gem_object_types.h
@@ -322,6 +322,9 @@ struct drm_i915_gem_object {
 	 */
 	bool do_swapping;
 	struct drm_i915_gem_object *swapto;
+
+	/** mark evicted object during suspend */
+	bool evicted;
 };
 
 static inline struct drm_i915_gem_object *
diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
index 78b528e89486..e8c4931fc818 100644
--- a/drivers/gpu/drm/i915/i915_drv.c
+++ b/drivers/gpu/drm/i915/i915_drv.c
@@ -1102,11 +1102,86 @@ static int i915_drm_prepare(struct drm_device *dev)
 	return 0;
 }
 
+static int intel_dmem_evict_buffers(struct drm_device *dev, bool in_suspend)
+{
+	struct drm_i915_private *i915 = to_i915(dev);
+	struct drm_i915_gem_object *obj;
+	struct intel_memory_region *mem;
+	int id, ret = 0;
+
+	/*
+	 * FIXME: Presently using memcpy;
+	 * will replace with the blitter
+	 * once its issues are fixed.
+	 */
+	i915->params.enable_eviction = 1;
+
+	for_each_memory_region(mem, i915, id) {
+		struct list_head still_in_list;
+		INIT_LIST_HEAD(&still_in_list);
+		if (mem->type == INTEL_MEMORY_LOCAL && mem->total) {
+			mutex_lock(&mem->objects.lock);
+			while ((obj =  list_first_entry_or_null(&mem->objects.list,
+						typeof(*obj),
+						mm.region_link))) {
+
+				list_move_tail(&obj->mm.region_link, &still_in_list);
+
+				if (!i915_gem_object_has_pages(obj) && in_suspend)
+					continue;
+
+				/* Ignore previously evicted objects */
+				if (obj->swapto && in_suspend)
+					continue;
+
+				mutex_unlock(&mem->objects.lock);
+
+				if (in_suspend)
+					i915_gem_object_unbind(obj, 0);
+
+				if (in_suspend) {
+					obj->swapto = NULL;
+					obj->evicted = false;
+					obj->do_swapping = true;
+					ret = __i915_gem_object_put_pages(obj);
+					obj->do_swapping = false;
+					if (ret) {
+						/*
+						 * FIXME: internal ctx objects still pinned
+						 * returning as BUSY. Presently just evicting
+						 * the user objects, will fix it later
+						 */
+						obj->evicted = false;
+						ret = 0;
+					} else
+						obj->evicted = true;
+				} else {
+					if (obj->swapto && obj->evicted) {
+						ret = i915_gem_object_pin_pages(obj);
+						if (ret) {
+							i915_gem_object_put(obj);
+						} else {
+							i915_gem_object_unpin_pages(obj);
+							obj->evicted = false;
+						}
+					}
+				}
+				mutex_lock(&mem->objects.lock);
+			}
+			list_splice_tail(&still_in_list, &mem->objects.list);
+			mutex_unlock(&mem->objects.lock);
+		}
+	}
+	i915->params.enable_eviction = 3;
+	return ret;
+}
+
 static int i915_drm_suspend(struct drm_device *dev)
 {
 	struct drm_i915_private *dev_priv = to_i915(dev);
 	struct pci_dev *pdev = dev_priv->drm.pdev;
 	pci_power_t opregion_target_state;
+	int ret = 0;
 
 	disable_rpm_wakeref_asserts(&dev_priv->runtime_pm);
 
@@ -1138,6 +1213,10 @@ static int i915_drm_suspend(struct drm_device *dev)
 
 	intel_fbdev_set_suspend(dev, FBINFO_STATE_SUSPENDED, true);
 
+	ret = intel_dmem_evict_buffers(dev, true);
+	if (ret)
+		return ret;
+
 	dev_priv->suspend_count++;
 
 	intel_csr_ucode_suspend(dev_priv);
@@ -1263,6 +1342,10 @@ static int i915_drm_resume(struct drm_device *dev)
 
 	drm_mode_config_reset(dev);
 
+	ret = intel_dmem_evict_buffers(dev, false);
+	if (ret)
+		DRM_ERROR("i915_resume:i915_gem_object_pin_pages failed with err=%d\n", ret);
+
 	i915_gem_resume(dev_priv);
 
 	intel_modeset_init_hw(dev_priv);
-- 
2.26.2
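
A note on the flow: stripped of locking and error handling,
intel_dmem_evict_buffers() amounts to the following walk, in sketch
form (swap_out()/swap_in() are stand-ins for the put_pages/pin_pages
sequences in the diff):

	for_each_memory_region(mem, i915, id) {
		if (mem->type != INTEL_MEMORY_LOCAL || !mem->total)
			continue;

		list_for_each_entry(obj, &mem->objects.list,
				    mm.region_link) {
			if (in_suspend) {
				/* unbind, then put_pages with
				 * obj->do_swapping set, copying the
				 * backing store to an SMEM shadow
				 * (obj->swapto) */
				swap_out(obj);
			} else if (obj->evicted) {
				/* pin_pages repopulates from the
				 * shadow; drop the temporary pin */
				swap_in(obj);
			}
		}
	}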


* [RFC PATCH 144/162] drm/i915: Reset blitter context when unpark engine
  2020-11-27 12:04 [RFC PATCH 000/162] DG1 + LMEM enabling Matthew Auld
                   ` (142 preceding siblings ...)
  2020-11-27 12:06 ` [RFC PATCH 143/162] drm/i915: suspend/resume eviction Matthew Auld
@ 2020-11-27 12:07 ` Matthew Auld
  2020-11-27 14:26   ` Chris Wilson
  2020-11-27 12:07 ` [RFC PATCH 145/162] drm/i915/dg1: Add dedicated context for blitter eviction Matthew Auld
                   ` (17 subsequent siblings)
  161 siblings, 1 reply; 208+ messages in thread
From: Matthew Auld @ 2020-11-27 12:07 UTC (permalink / raw)
  To: intel-gfx; +Cc: Venkata Ramana Nayana, dri-devel

From: Venkata Ramana Nayana <venkata.ramana.nayana@intel.com>

Currently we only reset the kernel_context when unparking an engine. We
also need to do this for the copy engine's blitter context.

Signed-off-by: Venkata Ramana Nayana <venkata.ramana.nayana@intel.com>
---
 drivers/gpu/drm/i915/gt/intel_engine_pm.c | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/drivers/gpu/drm/i915/gt/intel_engine_pm.c b/drivers/gpu/drm/i915/gt/intel_engine_pm.c
index 1b2009b4dcb7..69c8ea70d1e8 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine_pm.c
+++ b/drivers/gpu/drm/i915/gt/intel_engine_pm.c
@@ -66,6 +66,11 @@ static int __engine_unpark(struct intel_wakeref *wf)
 		ce->ops->reset(ce);
 	}
 
+	if (engine->class == COPY_ENGINE_CLASS) {
+		ce = engine->blitter_context;
+		ce->ops->reset(ce);
+	}
+
 	if (engine->unpark)
 		engine->unpark(engine);
 
-- 
2.26.2

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply related	[flat|nested] 208+ messages in thread

* [RFC PATCH 145/162] drm/i915/dg1: Add dedicated context for blitter eviction
  2020-11-27 12:04 [RFC PATCH 000/162] DG1 + LMEM enabling Matthew Auld
                   ` (143 preceding siblings ...)
  2020-11-27 12:07 ` [RFC PATCH 144/162] drm/i915: Reset blitter context when unpark engine Matthew Auld
@ 2020-11-27 12:07 ` Matthew Auld
  2020-11-27 12:07 ` [RFC PATCH 146/162] drm/i915/pm: suspend and restore ppgtt mapping Matthew Auld
                   ` (16 subsequent siblings)
  161 siblings, 0 replies; 208+ messages in thread
From: Matthew Auld @ 2020-11-27 12:07 UTC (permalink / raw)
  To: intel-gfx; +Cc: CQ Tang, dri-devel, Tvrtko Ursulin

From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

Without a dedicated context there can be a "deadlock" due to an
inversion between object clearing and eviction on the shared blitter
context timeline.

Clearing a newly allocated object emits its request, but to execute
the request, something may need to be evicted in order to make space
for the new VMA. When the eviction code emits its copy request it will
sit after the buffer clear one in the ringbuffer, and so neither can
complete.

If we add a dedicated context for eviction then we can decouple the
two and break the "deadlock".

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: CQ Tang <cq.tang@intel.com>
Cc: Ramalingam C <ramalingam.c@intel.com>
---
 drivers/gpu/drm/i915/gem/i915_gem_object.c   |  4 +-
 drivers/gpu/drm/i915/gt/intel_engine.h       |  2 +
 drivers/gpu/drm/i915/gt/intel_engine_cs.c    | 40 ++++++++++++++++++--
 drivers/gpu/drm/i915/gt/intel_engine_pm.c    |  9 +++--
 drivers/gpu/drm/i915/gt/intel_engine_types.h |  1 +
 5 files changed, 47 insertions(+), 9 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object.c b/drivers/gpu/drm/i915/gem/i915_gem_object.c
index c84443e01ef1..ddb448f275eb 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_object.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_object.c
@@ -767,7 +767,7 @@ static struct i915_vma *
 i915_window_vma_init(struct drm_i915_private *i915,
 		     struct intel_memory_region *mem)
 {
-	struct intel_context *ce = i915->gt.engine[BCS0]->blitter_context;
+	struct intel_context *ce = i915->gt.engine[BCS0]->evict_context;
 	struct i915_address_space *vm = ce->vm;
 	struct i915_vma *vma;
 	int ret;
@@ -984,7 +984,7 @@ int i915_window_blt_copy(struct drm_i915_gem_object *dst,
 			 struct drm_i915_gem_object *src)
 {
 	struct drm_i915_private *i915 = to_i915(src->base.dev);
-	struct intel_context *ce = i915->gt.engine[BCS0]->blitter_context;
+	struct intel_context *ce = i915->gt.engine[BCS0]->evict_context;
 	bool src_is_lmem = i915_gem_object_is_lmem(src);
 	bool dst_is_lmem = i915_gem_object_is_lmem(dst);
 	u64 remain = src->base.size, offset = 0;
diff --git a/drivers/gpu/drm/i915/gt/intel_engine.h b/drivers/gpu/drm/i915/gt/intel_engine.h
index 188c5ff6dc64..623a6876dca5 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine.h
+++ b/drivers/gpu/drm/i915/gt/intel_engine.h
@@ -188,6 +188,8 @@ intel_write_status_page(struct intel_engine_cs *engine, int reg, u32 value)
 #define I915_GEM_HWS_SEQNO_ADDR		(I915_GEM_HWS_SEQNO * sizeof(u32))
 #define I915_GEM_HWS_BLITTER		0x42
 #define I915_GEM_HWS_BLITTER_ADDR	(I915_GEM_HWS_BLITTER * sizeof(u32))
+#define I915_GEM_HWS_EVICT		0x44
+#define I915_GEM_HWS_EVICT_ADDR		(I915_GEM_HWS_EVICT * sizeof(u32))
 #define I915_GEM_HWS_SCRATCH		0x80
 
 #define I915_HWS_CSB_BUF0_INDEX		0x10
diff --git a/drivers/gpu/drm/i915/gt/intel_engine_cs.c b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
index 9e0394b06f38..a83af8775a64 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine_cs.c
+++ b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
@@ -874,6 +874,20 @@ create_blitter_context(struct intel_engine_cs *engine)
 	return ce;
 }
 
+static struct intel_context *
+create_evict_context(struct intel_engine_cs *engine)
+{
+	static struct lock_class_key evict;
+	struct intel_context *ce;
+
+	ce = create_pinned_context(engine, I915_GEM_HWS_EVICT_ADDR, &evict,
+				   "evict_context");
+	if (IS_ERR(ce))
+		return ce;
+
+	return ce;
+}
+
 /**
  * intel_engines_init_common - initialize common engine state which might require hw access
  * @engine: Engine to initialize.
@@ -912,22 +926,35 @@ static int engine_init_common(struct intel_engine_cs *engine)
 	engine->emit_fini_breadcrumb_dw = ret;
 
 	/*
-	 * The blitter context is used to quickly memset or migrate objects
-	 * in local memory, so it has to always be available.
+	 * The blitter and evict contexts are used to clear and migrate objects
+	 * in local memory so they have to always be available.
 	 */
 	if (engine->class == COPY_ENGINE_CLASS) {
 		ce = create_blitter_context(engine);
 		if (IS_ERR(ce)) {
 			ret = PTR_ERR(ce);
-			goto err_unpin;
+			goto err_blitter;
 		}
 
 		engine->blitter_context = ce;
+
+		if (HAS_LMEM(engine->i915)) {
+			ce = create_evict_context(engine);
+			if (IS_ERR(ce)) {
+				ret = PTR_ERR(ce);
+				goto err_evict;
+			}
+
+			engine->evict_context = ce;
+		}
 	}
 
 	return 0;
 
-err_unpin:
+err_evict:
+	intel_context_unpin(engine->blitter_context);
+	intel_context_put(engine->blitter_context);
+err_blitter:
 	intel_context_unpin(engine->kernel_context);
 err_context:
 	intel_context_put(engine->kernel_context);
@@ -986,6 +1013,11 @@ void intel_engine_cleanup_common(struct intel_engine_cs *engine)
 	if (engine->default_state)
 		fput(engine->default_state);
 
+	if (engine->evict_context) {
+		intel_context_unpin(engine->evict_context);
+		intel_context_put(engine->evict_context);
+	}
+
 	if (engine->blitter_context) {
 		intel_context_unpin(engine->blitter_context);
 		intel_context_put(engine->blitter_context);
diff --git a/drivers/gpu/drm/i915/gt/intel_engine_pm.c b/drivers/gpu/drm/i915/gt/intel_engine_pm.c
index 69c8ea70d1e8..a5ca95270e92 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine_pm.c
+++ b/drivers/gpu/drm/i915/gt/intel_engine_pm.c
@@ -66,10 +66,13 @@ static int __engine_unpark(struct intel_wakeref *wf)
 		ce->ops->reset(ce);
 	}
 
-	if (engine->class == COPY_ENGINE_CLASS) {
-		ce = engine->blitter_context;
+	ce = engine->blitter_context;
+	if (ce)
+		ce->ops->reset(ce);
+
+	ce = engine->evict_context;
+	if (ce)
 		ce->ops->reset(ce);
-	}
 
 	if (engine->unpark)
 		engine->unpark(engine);
diff --git a/drivers/gpu/drm/i915/gt/intel_engine_types.h b/drivers/gpu/drm/i915/gt/intel_engine_types.h
index cb2de4bf86ba..14e92423661b 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine_types.h
+++ b/drivers/gpu/drm/i915/gt/intel_engine_types.h
@@ -348,6 +348,7 @@ struct intel_engine_cs {
 
 	struct intel_context *kernel_context; /* pinned */
 	struct intel_context *blitter_context; /* pinned; exists for BCS only */
+	struct intel_context *evict_context; /* pinned; exists for BCS only */
 
 	intel_engine_mask_t saturated; /* submitting semaphores too late? */
 
-- 
2.26.2
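
A note on the inversion: the "deadlock" described above can be
pictured on the shared ring (illustrative only):

	/*
	 * Shared blitter context timeline (one ringbuffer):
	 *
	 *   [clear new object A]  <- cannot run until space is freed
	 *   [evict object B]      <- queued after the clear, never runs
	 *
	 * The clear depends on the eviction, but the eviction sits
	 * behind the clear on the same timeline. With a dedicated
	 * evict_context the eviction executes on its own timeline and
	 * the cycle is broken.
	 */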


* [RFC PATCH 146/162] drm/i915/pm: suspend and restore ppgtt mapping
  2020-11-27 12:04 [RFC PATCH 000/162] DG1 + LMEM enabling Matthew Auld
                   ` (144 preceding siblings ...)
  2020-11-27 12:07 ` [RFC PATCH 145/162] drm/i915/dg1: Add dedicated context for blitter eviction Matthew Auld
@ 2020-11-27 12:07 ` Matthew Auld
  2020-11-27 14:29   ` [Intel-gfx] " Chris Wilson
  2020-11-27 12:07 ` [RFC PATCH 147/162] drm/i915/gt: Allocate default ctx objects in SMEM Matthew Auld
                   ` (15 subsequent siblings)
  161 siblings, 1 reply; 208+ messages in thread
From: Matthew Auld @ 2020-11-27 12:07 UTC (permalink / raw)
  To: intel-gfx; +Cc: Venkata Ramana Nayana, CQ Tang, Prathap Kumar Valsan, dri-devel

From: Prathap Kumar Valsan <prathap.kumar.valsan@intel.com>

During suspend we will lose all page tables, as they are allocated in
LMEM. In order to make sure that contexts do not access corrupted page
tables after we restore, we evict all vmas that are bound to any vm,
including the kernel vm.

During resume, we restore the page tables back to the scratch page.

Signed-off-by: Prathap Kumar Valsan <prathap.kumar.valsan@intel.com>
Signed-off-by: Venkata Ramana Nayana <venkata.ramana.nayana@intel.com>
Cc: CQ Tang <cq.tang@intel.com>
---
 drivers/gpu/drm/i915/gt/gen8_ppgtt.c  |  13 ++++
 drivers/gpu/drm/i915/gt/gen8_ppgtt.h  |   2 +
 drivers/gpu/drm/i915/gt/intel_ppgtt.c |   4 +
 drivers/gpu/drm/i915/i915_drv.c       | 102 +++++++++++++++++++++++---
 4 files changed, 112 insertions(+), 9 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/gen8_ppgtt.c b/drivers/gpu/drm/i915/gt/gen8_ppgtt.c
index b6fcebeef02a..704cab807e0b 100644
--- a/drivers/gpu/drm/i915/gt/gen8_ppgtt.c
+++ b/drivers/gpu/drm/i915/gt/gen8_ppgtt.c
@@ -775,3 +775,16 @@ struct i915_ppgtt *gen8_ppgtt_create(struct intel_gt *gt)
 	kfree(ppgtt);
 	return ERR_PTR(err);
 }
+
+void gen8_restore_ppgtt_mappings(struct i915_address_space *vm)
+{
+	const unsigned int count = gen8_pd_top_count(vm);
+	int i;
+
+	for (i = 1; i <= vm->top; i++)
+		fill_px(vm->scratch[i], vm->scratch[i - 1]->encode);
+
+	fill_page_dma(px_base(i915_vm_to_ppgtt(vm)->pd),
+		      vm->scratch[vm->top]->encode, count);
+}
+
diff --git a/drivers/gpu/drm/i915/gt/gen8_ppgtt.h b/drivers/gpu/drm/i915/gt/gen8_ppgtt.h
index 76a08b9c1f5c..3fa4b95aaabd 100644
--- a/drivers/gpu/drm/i915/gt/gen8_ppgtt.h
+++ b/drivers/gpu/drm/i915/gt/gen8_ppgtt.h
@@ -6,8 +6,10 @@
 #ifndef __GEN8_PPGTT_H__
 #define __GEN8_PPGTT_H__
 
+struct i915_address_space;
 struct intel_gt;
 
+void gen8_restore_ppgtt_mappings(struct i915_address_space *vm);
 struct i915_ppgtt *gen8_ppgtt_create(struct intel_gt *gt);
 
 #endif
diff --git a/drivers/gpu/drm/i915/gt/intel_ppgtt.c b/drivers/gpu/drm/i915/gt/intel_ppgtt.c
index 34a02643bb75..9b3eacd12a7e 100644
--- a/drivers/gpu/drm/i915/gt/intel_ppgtt.c
+++ b/drivers/gpu/drm/i915/gt/intel_ppgtt.c
@@ -9,6 +9,8 @@
 #include "intel_gtt.h"
 #include "gem/i915_gem_lmem.h"
 #include "gem/i915_gem_region.h"
+#include "gem/i915_gem_context.h"
+#include "gem/i915_gem_region.h"
 #include "gen6_ppgtt.h"
 #include "gen8_ppgtt.h"
 
@@ -317,3 +319,5 @@ void ppgtt_init(struct i915_ppgtt *ppgtt, struct intel_gt *gt)
 	ppgtt->vm.vma_ops.set_pages   = ppgtt_set_pages;
 	ppgtt->vm.vma_ops.clear_pages = clear_pages;
 }
+
+
diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
index e8c4931fc818..7115f4db5043 100644
--- a/drivers/gpu/drm/i915/i915_drv.c
+++ b/drivers/gpu/drm/i915/i915_drv.c
@@ -64,6 +64,7 @@
 #include "gem/i915_gem_context.h"
 #include "gem/i915_gem_ioctls.h"
 #include "gem/i915_gem_mman.h"
+#include "gt/gen8_ppgtt.h"
 #include "gt/intel_gt.h"
 #include "gt/intel_gt_pm.h"
 #include "gt/intel_rc6.h"
@@ -1136,13 +1137,13 @@ static int intel_dmem_evict_buffers(struct drm_device *dev, bool in_suspend)
 
 				mutex_unlock(&mem->objects.lock);
 
-				if (in_suspend)
-					i915_gem_object_unbind(obj, 0);
-
 				if (in_suspend) {
 					obj->swapto = NULL;
 					obj->evicted = false;
 					obj->do_swapping = true;
+
+					i915_gem_object_unbind(obj, 0);
+
 					ret = __i915_gem_object_put_pages(obj);
 					obj->do_swapping = false;
 					if (ret) {
@@ -1176,6 +1177,43 @@ static int intel_dmem_evict_buffers(struct drm_device *dev, bool in_suspend)
 	return ret;
 }
 
+static int i915_gem_suspend_ppgtt_mappings(struct drm_i915_private *i915)
+{
+	struct i915_gem_context *ctx, *cn;
+	int ret;
+
+	spin_lock(&i915->gem.contexts.lock);
+	list_for_each_entry_safe(ctx, cn, &i915->gem.contexts.list, link) {
+		struct i915_address_space *vm;
+
+		if (!kref_get_unless_zero(&ctx->ref))
+			continue;
+		spin_unlock(&i915->gem.contexts.lock);
+
+		vm = i915_gem_context_get_vm_rcu(ctx);
+		mutex_lock(&vm->mutex);
+		ret = i915_gem_evict_vm(vm);
+		mutex_unlock(&vm->mutex);
+		if (ret) {
+			GEM_WARN_ON(ret);
+			i915_vm_put(vm);
+			i915_gem_context_put(ctx);
+			return ret;
+		}
+		i915_vm_put(vm);
+		spin_lock(&i915->gem.contexts.lock);
+		list_safe_reset_next(ctx, cn, link);
+		i915_gem_context_put(ctx);
+	}
+	spin_unlock(&i915->gem.contexts.lock);
+
+	mutex_lock(&i915->gt.vm->mutex);
+	ret = i915_gem_evict_vm(i915->gt.vm);
+	mutex_unlock(&i915->gt.vm->mutex);
+
+	return ret;
+}
+
 static int i915_drm_suspend(struct drm_device *dev)
 {
 	struct drm_i915_private *dev_priv = to_i915(dev);
@@ -1213,9 +1251,17 @@ static int i915_drm_suspend(struct drm_device *dev)
 
 	intel_fbdev_set_suspend(dev, FBINFO_STATE_SUSPENDED, true);
 
-	ret = intel_dmem_evict_buffers(dev, true);
-	if (ret)
-		return ret;
+	if (HAS_LMEM(dev_priv))	{
+		ret = intel_dmem_evict_buffers(dev, true);
+		if (ret)
+			return ret;
+
+		i915_teardown_blt_windows(dev_priv);
+
+		ret = i915_gem_suspend_ppgtt_mappings(dev_priv);
+		if (ret)
+			return ret;
+	}
 
 	dev_priv->suspend_count++;
 
@@ -1306,6 +1352,36 @@ int i915_suspend_switcheroo(struct drm_i915_private *i915, pm_message_t state)
 	return i915_drm_suspend_late(&i915->drm, false);
 }
 
+static void i915_gem_restore_ppgtt_mappings(struct drm_i915_private *i915)
+{
+	struct i915_gem_context *ctx, *cn;
+
+	spin_lock(&i915->gem.contexts.lock);
+
+	list_for_each_entry_safe(ctx, cn, &i915->gem.contexts.list, link) {
+		struct i915_address_space *vm;
+
+		if (!kref_get_unless_zero(&ctx->ref))
+			continue;
+
+		spin_unlock(&i915->gem.contexts.lock);
+
+		vm = i915_gem_context_get_vm_rcu(ctx);
+		mutex_lock(&vm->mutex);
+		gen8_restore_ppgtt_mappings(vm);
+		mutex_unlock(&vm->mutex);
+		i915_vm_put(vm);
+		spin_lock(&i915->gem.contexts.lock);
+		list_safe_reset_next(ctx, cn, link);
+		i915_gem_context_put(ctx);
+	}
+	spin_unlock(&i915->gem.contexts.lock);
+
+	mutex_lock(&i915->gt.vm->mutex);
+	gen8_restore_ppgtt_mappings(i915->gt.vm);
+	mutex_unlock(&i915->gt.vm->mutex);
+}
+
 static int i915_drm_resume(struct drm_device *dev)
 {
 	struct drm_i915_private *dev_priv = to_i915(dev);
@@ -1342,9 +1418,17 @@ static int i915_drm_resume(struct drm_device *dev)
 
 	drm_mode_config_reset(dev);
 
-	ret = intel_dmem_evict_buffers(dev, false);
-	if (ret)
-		DRM_ERROR("i915_resume:i915_gem_object_pin_pages failed with err=%d\n", ret);
+	if (HAS_LMEM(dev_priv)) {
+		i915_gem_restore_ppgtt_mappings(dev_priv);
+
+		ret = i915_setup_blt_windows(dev_priv);
+		if (ret)
+			GEM_BUG_ON(ret);
+
+		ret = intel_dmem_evict_buffers(dev, false);
+		if (ret)
+			DRM_ERROR("i915_resume:i915_gem_object_pin_pages failed with err=%d\n", ret);
+	}
 
 	i915_gem_resume(dev_priv);
 
-- 
2.26.2

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply related	[flat|nested] 208+ messages in thread

* [RFC PATCH 147/162] drm/i915/gt: Allocate default ctx objects in SMEM
  2020-11-27 12:04 [RFC PATCH 000/162] DG1 + LMEM enabling Matthew Auld
                   ` (145 preceding siblings ...)
  2020-11-27 12:07 ` [RFC PATCH 146/162] drm/i915/pm: suspend and restore ppgtt mapping Matthew Auld
@ 2020-11-27 12:07 ` Matthew Auld
  2020-11-27 14:30   ` Chris Wilson
  2020-11-27 12:07 ` [RFC PATCH 148/162] drm/i915: suspend/resume enable blitter eviction Matthew Auld
                   ` (14 subsequent siblings)
  161 siblings, 1 reply; 208+ messages in thread
From: Matthew Auld @ 2020-11-27 12:07 UTC (permalink / raw)
  To: intel-gfx; +Cc: Venkata Ramana Nayana, Prathap Kumar Valsan, dri-devel

From: Venkata Ramana Nayana <venkata.ramana.nayana@intel.com>

If the record-default context objects are created in LMEM, then on
suspend we pin the pages of the object (src) and use the blitter for
eviction. But request creation with the blitter context then tries to
pin the same default object again, to restore the context with the
default HW values, which leads to a deadlock. To avoid this it is
safe to keep these objects in SMEM.

Signed-off-by: Venkata Ramana Nayana <venkata.ramana.nayana@intel.com>
Cc: Prathap Kumar Valsan <prathap.kumar.valsan@intel.com>
---
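For reviewers: the placement rule boils down to the sketch below. The
helper name is made up for illustration and is not part of the diff;
the real check is open-coded twice in __execlists_context_alloc().

  /* Illustrative only: the record-default context object stays in SMEM
   * on DG1 until engine->default_state exists, so blitter-based suspend
   * eviction never has to pin the object it is busy recording from.
   */
  static bool ctx_obj_in_lmem(const struct intel_engine_cs *engine)
  {
  	if (!HAS_LMEM(engine->i915))
  		return false;
  	return !IS_DG1(engine->i915) || engine->default_state;
  }
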
 .../drm/i915/gt/intel_execlists_submission.c  | 25 +++++++++++++------
 1 file changed, 18 insertions(+), 7 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_execlists_submission.c b/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
index c640b90711fd..ee5732b436e3 100644
--- a/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
+++ b/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
@@ -4697,7 +4697,13 @@ static int __execlists_context_alloc(struct intel_context *ce,
 		context_size += PAGE_SIZE;
 	}
 
-	if (HAS_LMEM(engine->i915)) {
+	/* FIXME: temporary fix for allocating default ctx objects
+	 * in SMEM, to resolve suspend/resume issues while using
+	 * blitter-based eviction. Remove this once the upstream
+	 * changes land, where default objects use a shmemfs file.
+	 */
+	if (HAS_LMEM(engine->i915) &&
+	    (!IS_DG1(engine->i915) || engine->default_state)) {
 		ctx_obj = i915_gem_object_create_lmem(engine->i915,
 						      context_size,
 						      I915_BO_ALLOC_CONTIGUOUS);
@@ -4707,16 +4713,18 @@ static int __execlists_context_alloc(struct intel_context *ce,
 	if (IS_ERR(ctx_obj))
 		return PTR_ERR(ctx_obj);
 
-	if (HAS_LMEM(engine->i915)) {
+	i915_gem_object_lock_isolated(ctx_obj);
+	if (HAS_LMEM(engine->i915) &&
+	    (!IS_DG1(engine->i915) || engine->default_state)) {
 		ret = context_clear_lmem(ctx_obj);
 		if (ret)
-			goto error_deref_obj;
+			goto error_unlock;
 	}
 
 	vma = i915_vma_instance(ctx_obj, &engine->gt->ggtt->vm, NULL);
 	if (IS_ERR(vma)) {
 		ret = PTR_ERR(vma);
-		goto error_deref_obj;
+		goto error_unlock;
 	}
 
 	if (!page_mask_bits(ce->timeline)) {
@@ -4732,7 +4740,7 @@ static int __execlists_context_alloc(struct intel_context *ce,
 			tl = intel_timeline_create(engine->gt);
 		if (IS_ERR(tl)) {
 			ret = PTR_ERR(tl);
-			goto error_deref_obj;
+			goto error_unlock;
 		}
 
 		ce->timeline = tl;
@@ -4741,15 +4749,18 @@ static int __execlists_context_alloc(struct intel_context *ce,
 	ring = intel_engine_create_ring(engine, (unsigned long)ce->ring);
 	if (IS_ERR(ring)) {
 		ret = PTR_ERR(ring);
-		goto error_deref_obj;
+		goto error_unlock;
 	}
 
 	ce->ring = ring;
 	ce->state = vma;
 
+	i915_gem_object_unlock(ctx_obj);
+
 	return 0;
 
-error_deref_obj:
+error_unlock:
+	i915_gem_object_unlock(ctx_obj);
 	i915_gem_object_put(ctx_obj);
 	return ret;
 }
-- 
2.26.2

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply related	[flat|nested] 208+ messages in thread

* [RFC PATCH 148/162] drm/i915: suspend/resume enable blitter eviction
  2020-11-27 12:04 [RFC PATCH 000/162] DG1 + LMEM enabling Matthew Auld
                   ` (146 preceding siblings ...)
  2020-11-27 12:07 ` [RFC PATCH 147/162] drm/i915/gt: Allocate default ctx objects in SMEM Matthew Auld
@ 2020-11-27 12:07 ` Matthew Auld
  2020-11-27 14:32   ` [Intel-gfx] " Chris Wilson
  2020-11-27 12:07 ` [RFC PATCH 149/162] drm/i915: suspend/resume handling of perma-pinned objects Matthew Auld
                   ` (13 subsequent siblings)
  161 siblings, 1 reply; 208+ messages in thread
From: Matthew Auld @ 2020-11-27 12:07 UTC (permalink / raw)
  To: intel-gfx; +Cc: Venkata Ramana Nayana, Prathap Kumar Valsan, dri-devel

From: Venkata Ramana Nayana <venkata.ramana.nayana@intel.com>

On suspend, use blitter eviction before disabling the runtime
interrupts; on resume, use the blitter only after GEM resume has
happened.

Signed-off-by: Venkata Ramana Nayana <venkata.ramana.nayana@intel.com>
Cc: Prathap Kumar Valsan <prathap.kumar.valsan@intel.com>
---
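For reviewers, the resulting suspend ordering, much simplified (a
sketch, not the literal code):

  intel_dp_mst_suspend(dev_priv);
  if (HAS_LMEM(dev_priv)) {
  	intel_dmem_evict_buffers(dev, true);	/* blitter still usable */
  	i915_teardown_blt_windows(dev_priv);
  	i915_gem_suspend_ppgtt_mappings(dev_priv);
  }
  intel_runtime_pm_disable_interrupts(dev_priv);	/* only after eviction */

Resume is the mirror image: i915_gem_resume() first, then restore the
ppgtt mappings and re-create the blt windows before swapping buffers
back in.
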
 drivers/gpu/drm/i915/i915_drv.c | 36 +++++++++++++--------------------
 1 file changed, 14 insertions(+), 22 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
index 7115f4db5043..eb5383e4a30b 100644
--- a/drivers/gpu/drm/i915/i915_drv.c
+++ b/drivers/gpu/drm/i915/i915_drv.c
@@ -1110,13 +1110,6 @@ static int intel_dmem_evict_buffers(struct drm_device *dev, bool in_suspend)
 	struct intel_memory_region *mem;
 	int id, ret = 0;
 
-	/*
-	 * FIXME: Presently using memcpy,
-	 * will replace with blitter once
-	 * fix the issues.
-	 */
-	i915->params.enable_eviction = 1;
-
 	for_each_memory_region(mem, i915, id) {
 		struct list_head still_in_list;
 		INIT_LIST_HEAD(&still_in_list);
@@ -1173,7 +1166,6 @@ static int intel_dmem_evict_buffers(struct drm_device *dev, bool in_suspend)
 			mutex_unlock(&mem->objects.lock);
 		}
 	}
-	i915->params.enable_eviction = 3;
 	return ret;
 }
 
@@ -1235,6 +1227,18 @@ static int i915_drm_suspend(struct drm_device *dev)
 
 	intel_dp_mst_suspend(dev_priv);
 
+	if (HAS_LMEM(dev_priv))	{
+		ret = intel_dmem_evict_buffers(dev, true);
+		if (ret)
+			return ret;
+
+		i915_teardown_blt_windows(dev_priv);
+
+		ret = i915_gem_suspend_ppgtt_mappings(dev_priv);
+		if (ret)
+			return ret;
+	}
+
 	intel_runtime_pm_disable_interrupts(dev_priv);
 	intel_hpd_cancel_work(dev_priv);
 
@@ -1251,18 +1255,6 @@ static int i915_drm_suspend(struct drm_device *dev)
 
 	intel_fbdev_set_suspend(dev, FBINFO_STATE_SUSPENDED, true);
 
-	if (HAS_LMEM(dev_priv))	{
-		ret = intel_dmem_evict_buffers(dev, true);
-		if (ret)
-			return ret;
-
-		i915_teardown_blt_windows(dev_priv);
-
-		ret = i915_gem_suspend_ppgtt_mappings(dev_priv);
-		if (ret)
-			return ret;
-	}
-
 	dev_priv->suspend_count++;
 
 	intel_csr_ucode_suspend(dev_priv);
@@ -1418,6 +1410,8 @@ static int i915_drm_resume(struct drm_device *dev)
 
 	drm_mode_config_reset(dev);
 
+	i915_gem_resume(dev_priv);
+
 	if (HAS_LMEM(dev_priv)) {
 		i915_gem_restore_ppgtt_mappings(dev_priv);
 
@@ -1430,8 +1424,6 @@ static int i915_drm_resume(struct drm_device *dev)
 			DRM_ERROR("i915_resume:i915_gem_object_pin_pages failed with err=%d\n", ret);
 	}
 
-	i915_gem_resume(dev_priv);
-
 	intel_modeset_init_hw(dev_priv);
 	intel_init_clock_gating(dev_priv);
 	intel_hpd_init(dev_priv);
-- 
2.26.2

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply related	[flat|nested] 208+ messages in thread

* [RFC PATCH 149/162] drm/i915: suspend/resume handling of perma-pinned objects
  2020-11-27 12:04 [RFC PATCH 000/162] DG1 + LMEM enabling Matthew Auld
                   ` (147 preceding siblings ...)
  2020-11-27 12:07 ` [RFC PATCH 148/162] drm/i915: suspend/resume enable blitter eviction Matthew Auld
@ 2020-11-27 12:07 ` Matthew Auld
  2020-11-27 12:07 ` [RFC PATCH 150/162] drm/i915: need consider system BO snoop for dgfx Matthew Auld
                   ` (12 subsequent siblings)
  161 siblings, 0 replies; 208+ messages in thread
From: Matthew Auld @ 2020-11-27 12:07 UTC (permalink / raw)
  To: intel-gfx; +Cc: Venkata Ramana Nayana, Prathap Kumar Valsan, dri-devel

From: Venkata Ramana Nayana <venkata.ramana.nayana@intel.com>

Objects which are perma-pinned (like the GuC ones) are evicted with
memcpy. Since these objects always have pinned pages, the existing
swapout/swapin functions cannot be used for them.

Signed-off-by: Venkata Ramana Nayana <venkata.ramana.nayana@intel.com>
Cc: Prathap Kumar Valsan <prathap.kumar.valsan@intel.com>
---
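For reviewers, the intended call pattern for the new perma_pin
parameter, taken from the resume path (sketch, error handling elided):

  /* Step 1: memcpy the perma-pinned objects (e.g. GuC) back first. */
  ret = intel_dmem_evict_buffers(dev, false /* in_suspend */, true /* perma_pin */);

  i915_gem_resume(dev_priv);

  /* Step 2: swap the ordinary objects back in via the usual path. */
  ret = intel_dmem_evict_buffers(dev, false, false);

Suspend always passes perma_pin=false; the pinned-pages case is
detected there with i915_gem_object_has_pinned_pages().
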
 drivers/gpu/drm/i915/i915_drv.c | 105 +++++++++++++++++++++++++++-----
 1 file changed, 89 insertions(+), 16 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
index eb5383e4a30b..c8af68227020 100644
--- a/drivers/gpu/drm/i915/i915_drv.c
+++ b/drivers/gpu/drm/i915/i915_drv.c
@@ -1103,7 +1103,54 @@ static int i915_drm_prepare(struct drm_device *dev)
 	return 0;
 }
 
-static int intel_dmem_evict_buffers(struct drm_device *dev, bool in_suspend)
+static int i915_gem_perma_pinned_object_swapout(struct drm_i915_gem_object *obj)
+{
+	struct drm_i915_private *i915 = to_i915(obj->base.dev);
+	struct drm_i915_gem_object *dst;
+	int err = -EINVAL;
+
+	assert_object_held(obj);
+	dst = i915_gem_object_create_shmem(i915, obj->base.size);
+	if (IS_ERR(dst))
+		return PTR_ERR(dst);
+
+	i915_gem_object_lock_isolated(dst);
+	err = i915_gem_object_memcpy(dst, obj);
+	i915_gem_object_unlock(dst);
+
+	if (!err) {
+		obj->swapto = dst;
+		obj->evicted = true;
+	} else
+		i915_gem_object_put(dst);
+
+	return err;
+}
+
+static int i915_gem_perma_pinned_object_swapin(struct drm_i915_gem_object *obj)
+{
+	struct drm_i915_gem_object *src;
+	int err = -EINVAL;
+
+	assert_object_held(obj);
+	src = obj->swapto;
+
+	if (WARN_ON(!i915_gem_object_trylock(src)))
+		return -EBUSY;
+
+	err = i915_gem_object_memcpy(obj, src);
+	i915_gem_object_unlock(src);
+
+	if (!err) {
+		obj->swapto = NULL;
+		obj->evicted = false;
+		i915_gem_object_put(src);
+	}
+	return err;
+}
+
+static int intel_dmem_evict_buffers(struct drm_device *dev, bool in_suspend,
+				    bool perma_pin)
 {
 	struct drm_i915_private *i915 = to_i915(dev);
 	struct drm_i915_gem_object *obj;
@@ -1133,24 +1180,37 @@ static int intel_dmem_evict_buffers(struct drm_device *dev, bool in_suspend)
 				if (in_suspend) {
 					obj->swapto = NULL;
 					obj->evicted = false;
-					obj->do_swapping = true;
 
-					i915_gem_object_unbind(obj, 0);
+					ret = i915_gem_object_unbind(obj, 0);
+					if (ret || i915_gem_object_has_pinned_pages(obj)) {
+						if (!i915_gem_object_trylock(obj)) {
+							ret = -EBUSY;
+							goto next;
+						}
+						ret = i915_gem_perma_pinned_object_swapout(obj);
+						i915_gem_object_unlock(obj);
+						goto next;
+					}
 
+					obj->do_swapping = true;
 					ret = __i915_gem_object_put_pages(obj);
 					obj->do_swapping = false;
-					if (ret) {
-						/*
-						 * FIXME: internal ctx objects still pinned
-						 * returning as BUSY. Presently just evicting
-						 * the user objects, will fix it later
-						 */
+					if (ret)
 						obj->evicted = false;
-						ret = 0;
-					} else
+					else
 						obj->evicted = true;
 				} else {
-					if (obj->swapto && obj->evicted) {
+					if (i915_gem_object_has_pinned_pages(obj) && perma_pin) {
+						if (!i915_gem_object_trylock(obj)) {
+							ret = -EBUSY;
+							goto next;
+						}
+						ret = i915_gem_perma_pinned_object_swapin(obj);
+						/* FIXME: Where is this error message taken care of? */
+						i915_gem_object_unlock(obj);
+					}
+
+					if (obj->swapto && obj->evicted && !perma_pin) {
 						ret = i915_gem_object_pin_pages(obj);
 						if (ret) {
 							i915_gem_object_put(obj);
@@ -1160,7 +1220,10 @@ static int intel_dmem_evict_buffers(struct drm_device *dev, bool in_suspend)
 						}
 					}
 				}
+next:
 				mutex_lock(&mem->objects.lock);
+				if (ret)
+					break;
 			}
 			list_splice_tail(&still_in_list, &mem->objects.list);
 			mutex_unlock(&mem->objects.lock);
@@ -1228,7 +1291,7 @@ static int i915_drm_suspend(struct drm_device *dev)
 	intel_dp_mst_suspend(dev_priv);
 
 	if (HAS_LMEM(dev_priv))	{
-		ret = intel_dmem_evict_buffers(dev, true);
+		ret = intel_dmem_evict_buffers(dev, true, false);
 		if (ret)
 			return ret;
 
@@ -1410,6 +1473,14 @@ static int i915_drm_resume(struct drm_device *dev)
 
 	drm_mode_config_reset(dev);
 
+	if (HAS_LMEM(dev_priv)) {
+		ret = intel_dmem_evict_buffers(dev, false, true);
+		if (ret) {
+			DRM_ERROR("perma pinned obj's failed with err=%d\n", ret);
+			return ret;
+		}
+	}
+
 	i915_gem_resume(dev_priv);
 
 	if (HAS_LMEM(dev_priv)) {
@@ -1419,9 +1490,11 @@ static int i915_drm_resume(struct drm_device *dev)
 		if (ret)
 			GEM_BUG_ON(ret);
 
-		ret = intel_dmem_evict_buffers(dev, false);
-		if (ret)
-			DRM_ERROR("i915_resume:i915_gem_object_pin_pages failed with err=%d\n", ret);
+		ret = intel_dmem_evict_buffers(dev, false, false);
+		if (ret) {
+			DRM_ERROR("gem_object_pin_pages failed with err=%d\n", ret);
+			return ret;
+		}
 	}
 
 	intel_modeset_init_hw(dev_priv);
-- 
2.26.2

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply related	[flat|nested] 208+ messages in thread

* [RFC PATCH 150/162] drm/i915: need consider system BO snoop for dgfx
  2020-11-27 12:04 [RFC PATCH 000/162] DG1 + LMEM enabling Matthew Auld
                   ` (148 preceding siblings ...)
  2020-11-27 12:07 ` [RFC PATCH 149/162] drm/i915: suspend/resume handling of perma-pinned objects Matthew Auld
@ 2020-11-27 12:07 ` Matthew Auld
  2020-11-27 14:36   ` Chris Wilson
  2020-11-27 12:07 ` [RFC PATCH 151/162] drm/i915: move eviction to prepare hook Matthew Auld
                   ` (11 subsequent siblings)
  161 siblings, 1 reply; 208+ messages in thread
From: Matthew Auld @ 2020-11-27 12:07 UTC (permalink / raw)
  To: intel-gfx; +Cc: Sudeep Dutt, Chris P Wilson, CQ Tang, dri-devel

From: CQ Tang <cq.tang@intel.com>

When cache_level is NONE, we only check HAS_LLC(i915). But for DGFX
we additionally need to check HAS_SNOOP(i915) on system memory
objects in order to use I915_BO_CACHE_COHERENT_FOR_READ. On DG1,
has_llc=0 and has_snoop=1. Otherwise we set obj->cache_coherent=0
and take a performance hit.

Cc: Chris P Wilson <chris.p.wilson@intel.com>
Cc: Ramalingam C <ramalingam.c@intel.com>
Cc: Sudeep Dutt <sudeep.dutt@intel.com>
Cc: Matthew Auld <matthew.auld@intel.com>
Signed-off-by: CQ Tang <cq.tang@intel.com>
---
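For reviewers, spelling out what changes on DG1 (illustrative, not
part of the diff):

  /* DG1: HAS_LLC() == false, HAS_SNOOP() == true.
   *
   * i915_gem_object_set_cache_coherency(smem_obj, I915_CACHE_NONE);
   *	before: obj->cache_coherent == 0
   *	after:  obj->cache_coherent == I915_BO_CACHE_COHERENT_FOR_READ
   *
   * i915_gem_object_set_cache_coherency(lmem_obj, I915_CACHE_NONE);
   *	unchanged: obj->cache_coherent == 0 (lmem is never snooped)
   */
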
 drivers/gpu/drm/i915/gem/i915_gem_object.c | 16 +++++++++++++++-
 1 file changed, 15 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object.c b/drivers/gpu/drm/i915/gem/i915_gem_object.c
index ddb448f275eb..be603171c444 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_object.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_object.c
@@ -95,6 +95,20 @@ void i915_gem_object_init(struct drm_i915_gem_object *obj,
 	mutex_init(&obj->mm.get_dma_page.lock);
 }
 
+static bool i915_gem_object_use_llc(struct drm_i915_gem_object *obj)
+{
+	struct drm_i915_private *i915 = to_i915(obj->base.dev);
+
+	if (HAS_LLC(i915))
+		return true;
+
+	if (IS_DGFX(i915) && HAS_SNOOP(i915) &&
+	    !i915_gem_object_is_lmem(obj))
+		return true;
+
+	return false;
+}
+
 /**
  * Mark up the object's coherency levels for a given cache_level
  * @obj: #drm_i915_gem_object
@@ -108,7 +122,7 @@ void i915_gem_object_set_cache_coherency(struct drm_i915_gem_object *obj,
 	if (cache_level != I915_CACHE_NONE)
 		obj->cache_coherent = (I915_BO_CACHE_COHERENT_FOR_READ |
 				       I915_BO_CACHE_COHERENT_FOR_WRITE);
-	else if (HAS_LLC(to_i915(obj->base.dev)))
+	else if (i915_gem_object_use_llc(obj))
 		obj->cache_coherent = I915_BO_CACHE_COHERENT_FOR_READ;
 	else
 		obj->cache_coherent = 0;
-- 
2.26.2

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply related	[flat|nested] 208+ messages in thread

* [RFC PATCH 151/162] drm/i915: move eviction to prepare hook
  2020-11-27 12:04 [RFC PATCH 000/162] DG1 + LMEM enabling Matthew Auld
                   ` (149 preceding siblings ...)
  2020-11-27 12:07 ` [RFC PATCH 150/162] drm/i915: need consider system BO snoop for dgfx Matthew Auld
@ 2020-11-27 12:07 ` Matthew Auld
  2020-11-27 12:07 ` [RFC PATCH 152/162] drm/i915: Perform execbuffer object locking as a separate step Matthew Auld
                   ` (10 subsequent siblings)
  161 siblings, 0 replies; 208+ messages in thread
From: Matthew Auld @ 2020-11-27 12:07 UTC (permalink / raw)
  To: intel-gfx; +Cc: Lucas De Marchi, dri-devel

From: Lucas De Marchi <lucas.demarchi@intel.com>

Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
---
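For reviewers, the net effect on the PM flow (sketch):

  /* Eviction moves from i915_drm_suspend() to the earlier ->prepare()
   * hook, with the GT idled (or wedged) first:
   */
  i915_drm_prepare()
  	intel_gt_wait_for_idle(gt, I915_GEM_IDLE_TIMEOUT);
  	intel_dmem_evict_buffers(dev, true, false);
  	i915_teardown_blt_windows(i915);
  	i915_gem_suspend_ppgtt_mappings(i915);

  i915_drm_suspend()	/* no longer evicts */
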
 drivers/gpu/drm/i915/i915_drv.c | 40 ++++++++++++++++++++++-----------
 1 file changed, 27 insertions(+), 13 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
index c8af68227020..b7d40a9c00bf 100644
--- a/drivers/gpu/drm/i915/i915_drv.c
+++ b/drivers/gpu/drm/i915/i915_drv.c
@@ -68,6 +68,7 @@
 #include "gt/intel_gt.h"
 #include "gt/intel_gt_pm.h"
 #include "gt/intel_rc6.h"
+#include "gt/intel_gt_requests.h"
 
 #include "i915_debugfs.h"
 #include "i915_drv.h"
@@ -1088,10 +1089,36 @@ static bool suspend_to_idle(struct drm_i915_private *dev_priv)
 	return false;
 }
 
+static int i915_gem_suspend_ppgtt_mappings(struct drm_i915_private *i915);
+
+static int intel_dmem_evict_buffers(struct drm_device *dev, bool in_suspend,
+				    bool perma_pin);
+
 static int i915_drm_prepare(struct drm_device *dev)
 {
 	struct drm_i915_private *i915 = to_i915(dev);
 
+	if (HAS_LMEM(i915)) {
+		struct intel_gt *gt = &i915->gt;
+		long timeout = I915_GEM_IDLE_TIMEOUT;
+		int ret;
+
+		if (intel_gt_wait_for_idle(gt, timeout) == -ETIME) {
+			intel_gt_set_wedged(gt);
+			intel_gt_retire_requests(gt);
+		}
+
+		ret = intel_dmem_evict_buffers(dev, true, false);
+		if (ret)
+			return ret;
+
+		i915_teardown_blt_windows(i915);
+
+		ret = i915_gem_suspend_ppgtt_mappings(i915);
+		if (ret)
+			return ret;
+	}
+
 	/*
 	 * NB intel_display_suspend() may issue new requests after we've
 	 * ostensibly marked the GPU as ready-to-sleep here. We need to
@@ -1274,7 +1301,6 @@ static int i915_drm_suspend(struct drm_device *dev)
 	struct drm_i915_private *dev_priv = to_i915(dev);
 	struct pci_dev *pdev = dev_priv->drm.pdev;
 	pci_power_t opregion_target_state;
-	int ret = 0;
 
 	disable_rpm_wakeref_asserts(&dev_priv->runtime_pm);
 
@@ -1290,18 +1316,6 @@ static int i915_drm_suspend(struct drm_device *dev)
 
 	intel_dp_mst_suspend(dev_priv);
 
-	if (HAS_LMEM(dev_priv))	{
-		ret = intel_dmem_evict_buffers(dev, true, false);
-		if (ret)
-			return ret;
-
-		i915_teardown_blt_windows(dev_priv);
-
-		ret = i915_gem_suspend_ppgtt_mappings(dev_priv);
-		if (ret)
-			return ret;
-	}
-
 	intel_runtime_pm_disable_interrupts(dev_priv);
 	intel_hpd_cancel_work(dev_priv);
 
-- 
2.26.2

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply related	[flat|nested] 208+ messages in thread

* [RFC PATCH 152/162] drm/i915: Perform execbuffer object locking as a separate step
  2020-11-27 12:04 [RFC PATCH 000/162] DG1 + LMEM enabling Matthew Auld
                   ` (150 preceding siblings ...)
  2020-11-27 12:07 ` [RFC PATCH 151/162] drm/i915: move eviction to prepare hook Matthew Auld
@ 2020-11-27 12:07 ` Matthew Auld
  2020-11-27 12:07 ` [RFC PATCH 153/162] drm/i915: Implement eviction locking v2 Matthew Auld
                   ` (9 subsequent siblings)
  161 siblings, 0 replies; 208+ messages in thread
From: Matthew Auld @ 2020-11-27 12:07 UTC (permalink / raw)
  To: intel-gfx; +Cc: Thomas Hellström, dri-devel

From: Thomas Hellström <thomas.hellstrom@intel.com>

This is important to help avoid evicting already resident buffers
from the batch we're processing.

Signed-off-by: Thomas Hellström <thomas.hellstrom@intel.com>
Cc: Matthew Auld <matthew.auld@intel.com>
---
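For reviewers, the two-phase shape this gives eb_validate_vmas()
(sketch, error handling elided):

  /* Phase 1: take the ww lock on every batch object up front. */
  err = eb_lock_vmas(eb);
  if (err)
  	return err;

  /* Phase 2: pin/validate. Any eviction triggered from here sees the
   * whole batch already locked under eb->ww and so cannot pick those
   * objects as victims.
   */
  for (i = 0; i < eb->buffer_count; i++)
  	err = eb_pin_vma(eb, &eb->exec[i], &eb->vma[i]);
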
 .../gpu/drm/i915/gem/i915_gem_execbuffer.c    | 25 ++++++++++++++++---
 1 file changed, 21 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
index e73a761a7d1f..c988f8ffd39f 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
@@ -918,21 +918,38 @@ static int eb_lookup_vmas(struct i915_execbuffer *eb)
 	return err;
 }
 
-static int eb_validate_vmas(struct i915_execbuffer *eb)
+static int eb_lock_vmas(struct i915_execbuffer *eb)
 {
 	unsigned int i;
 	int err;
 
-	INIT_LIST_HEAD(&eb->unbound);
-
 	for (i = 0; i < eb->buffer_count; i++) {
-		struct drm_i915_gem_exec_object2 *entry = &eb->exec[i];
 		struct eb_vma *ev = &eb->vma[i];
 		struct i915_vma *vma = ev->vma;
 
 		err = i915_gem_object_lock(vma->obj, &eb->ww);
 		if (err)
 			return err;
+	}
+
+	return 0;
+}
+
+static int eb_validate_vmas(struct i915_execbuffer *eb)
+{
+	unsigned int i;
+	int err;
+
+	INIT_LIST_HEAD(&eb->unbound);
+
+	err = eb_lock_vmas(eb);
+	if (err)
+		return err;
+
+	for (i = 0; i < eb->buffer_count; i++) {
+		struct drm_i915_gem_exec_object2 *entry = &eb->exec[i];
+		struct eb_vma *ev = &eb->vma[i];
+		struct i915_vma *vma = ev->vma;
 
 		err = eb_pin_vma(eb, entry, ev);
 		if (err == -EDEADLK)
-- 
2.26.2

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply related	[flat|nested] 208+ messages in thread

* [RFC PATCH 153/162] drm/i915: Implement eviction locking v2
  2020-11-27 12:04 [RFC PATCH 000/162] DG1 + LMEM enabling Matthew Auld
                   ` (151 preceding siblings ...)
  2020-11-27 12:07 ` [RFC PATCH 152/162] drm/i915: Perform execbuffer object locking as a separate step Matthew Auld
@ 2020-11-27 12:07 ` Matthew Auld
  2020-11-27 12:07 ` [RFC PATCH 154/162] drm/i915: Support ww eviction Matthew Auld
                   ` (8 subsequent siblings)
  161 siblings, 0 replies; 208+ messages in thread
From: Matthew Auld @ 2020-11-27 12:07 UTC (permalink / raw)
  To: intel-gfx; +Cc: dri-devel, Thomas Hellström

From: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>

Use a separate acquire context list and a separate locking function
for objects that are locked for eviction. These objects are then
properly referenced while on the list and can be unlocked early in
the ww transaction.

Co-developed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Matthew Auld <matthew.auld@intel.com>
---
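For reviewers, how the two lists are meant to be used together
(sketch, error handling elided):

  /* Objects the batch needs stay locked for the whole transaction: */
  err = i915_gem_object_lock(obj, ww);		/* -> ww->obj_list */

  /* Eviction victims are referenced and tracked separately, and their
   * locks can be dropped as soon as their pages have been released:
   */
  err = i915_gem_object_lock_to_evict(victim, ww);	/* -> ww->eviction_list */
  __i915_gem_object_put_pages(victim);
  i915_gem_ww_ctx_unlock_evictions(ww);		/* early unlock + put */
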
 drivers/gpu/drm/i915/gem/i915_gem_object.h    | 67 +++++++++++++++++--
 .../gpu/drm/i915/gem/i915_gem_object_types.h  |  5 ++
 drivers/gpu/drm/i915/gem/i915_gem_shrinker.c  | 14 +++-
 drivers/gpu/drm/i915/i915_gem_ww.c            | 51 ++++++++++----
 drivers/gpu/drm/i915/i915_gem_ww.h            |  3 +
 5 files changed, 122 insertions(+), 18 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object.h b/drivers/gpu/drm/i915/gem/i915_gem_object.h
index 52a36b4052f0..e237b0fb0e79 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_object.h
+++ b/drivers/gpu/drm/i915/gem/i915_gem_object.h
@@ -158,6 +158,32 @@ static inline void assert_object_held_shared(struct drm_i915_gem_object *obj)
 		assert_object_held(obj);
 }
 
+static inline int
+i915_gem_object_lock_to_evict(struct drm_i915_gem_object *obj,
+			      struct i915_gem_ww_ctx *ww)
+{
+	int ret;
+
+	if (ww->intr)
+		ret = dma_resv_lock_interruptible(obj->base.resv, &ww->ctx);
+	else
+		ret = dma_resv_lock(obj->base.resv, &ww->ctx);
+
+	if (!ret) {
+		list_add_tail(&obj->obj_link, &ww->eviction_list);
+		i915_gem_object_get(obj);
+		obj->evict_locked = true;
+	}
+
+	GEM_WARN_ON(ret == -EALREADY);
+	if (ret == -EDEADLK) {
+		ww->contended_evict = true;
+		ww->contended = i915_gem_object_get(obj);
+	}
+
+	return ret;
+}
+
 static inline int __i915_gem_object_lock(struct drm_i915_gem_object *obj,
 					 struct i915_gem_ww_ctx *ww,
 					 bool intr)
@@ -169,13 +195,25 @@ static inline int __i915_gem_object_lock(struct drm_i915_gem_object *obj,
 	else
 		ret = dma_resv_lock(obj->base.resv, ww ? &ww->ctx : NULL);
 
-	if (!ret && ww)
+	if (!ret && ww) {
 		list_add_tail(&obj->obj_link, &ww->obj_list);
-	if (ret == -EALREADY)
-		ret = 0;
+		obj->evict_locked = false;
+	}
 
-	if (ret == -EDEADLK)
+	if (ret == -EALREADY) {
+		ret = 0;
+		/* We've already evicted an object needed for this batch. */
+		if (obj->evict_locked) {
+			list_move_tail(&obj->obj_link, &ww->obj_list);
+			i915_gem_object_put(obj);
+			obj->evict_locked = false;
+		}
+	}
+
+	if (ret == -EDEADLK) {
+		ww->contended_evict = false;
 		ww->contended = i915_gem_object_get(obj);
+	}
 
 	return ret;
 }
@@ -580,6 +618,27 @@ i915_gem_object_invalidate_frontbuffer(struct drm_i915_gem_object *obj,
 		__i915_gem_object_invalidate_frontbuffer(obj, origin);
 }
 
+/**
+ * i915_gem_get_locking_ctx - Get the locking context of a locked object
+ * if any.
+ *
+ * @obj: The object to get the locking ctx from
+ *
+ * RETURN: The locking context if the object was locked using a context.
+ * NULL otherwise.
+ */
+static inline struct i915_gem_ww_ctx *
+i915_gem_get_locking_ctx(const struct drm_i915_gem_object *obj)
+{
+	struct ww_acquire_ctx *ctx;
+
+	ctx = obj->base.resv->lock.ctx;
+	if (!ctx)
+		return NULL;
+
+	return container_of(ctx, struct i915_gem_ww_ctx, ctx);
+}
+
 #ifdef CONFIG_MMU_NOTIFIER
 static inline bool
 i915_gem_object_is_userptr(struct drm_i915_gem_object *obj)
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object_types.h b/drivers/gpu/drm/i915/gem/i915_gem_object_types.h
index 331d113f7d5b..c42c0d3d5d67 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_object_types.h
+++ b/drivers/gpu/drm/i915/gem/i915_gem_object_types.h
@@ -142,6 +142,11 @@ struct drm_i915_gem_object {
 	 */
 	struct list_head obj_link;
 
+	/**
+	 * @evict_locked: Whether @obj_link sits on the eviction_list
+	 */
+	bool evict_locked;
+
 	/** Stolen memory for this object, instead of being backed by shmem. */
 	struct drm_mm_node *stolen;
 	union {
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_shrinker.c b/drivers/gpu/drm/i915/gem/i915_gem_shrinker.c
index 27674048f17d..59d0f14b90ea 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_shrinker.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_shrinker.c
@@ -100,6 +100,7 @@ i915_gem_shrink(struct i915_gem_ww_ctx *ww,
 		unsigned long *nr_scanned,
 		unsigned int shrink)
 {
+	struct drm_i915_gem_object *obj;
 	const struct {
 		struct list_head *list;
 		unsigned int bit;
@@ -164,7 +165,6 @@ i915_gem_shrink(struct i915_gem_ww_ctx *ww,
 	 */
 	for (phase = phases; phase->list; phase++) {
 		struct list_head still_in_list;
-		struct drm_i915_gem_object *obj;
 		unsigned long flags;
 
 		if ((shrink & phase->bit) == 0)
@@ -197,6 +197,10 @@ i915_gem_shrink(struct i915_gem_ww_ctx *ww,
 			if (!can_release_pages(obj))
 				continue;
 
+			/* Already locked this object? */
+			if (ww && ww == i915_gem_get_locking_ctx(obj))
+				continue;
+
 			if (!kref_get_unless_zero(&obj->base.refcount))
 				continue;
 
@@ -209,7 +213,11 @@ i915_gem_shrink(struct i915_gem_ww_ctx *ww,
 					if (!i915_gem_object_trylock(obj))
 						goto skip;
 				} else {
-					err = i915_gem_object_lock(obj, ww);
+					err = i915_gem_object_lock_to_evict(obj, ww);
+					if (err == -EALREADY) {
+						err = 0;
+						goto skip;
+					}
 					if (err)
 						goto skip;
 				}
@@ -235,6 +243,8 @@ i915_gem_shrink(struct i915_gem_ww_ctx *ww,
 		if (err)
 			return err;
 	}
+	if (ww)
+		i915_gem_ww_ctx_unlock_evictions(ww);
 
 	if (shrink & I915_SHRINK_BOUND)
 		intel_runtime_pm_put(&i915->runtime_pm, wakeref);
diff --git a/drivers/gpu/drm/i915/i915_gem_ww.c b/drivers/gpu/drm/i915/i915_gem_ww.c
index 43960d8595eb..811bf7677d78 100644
--- a/drivers/gpu/drm/i915/i915_gem_ww.c
+++ b/drivers/gpu/drm/i915/i915_gem_ww.c
@@ -10,24 +10,45 @@ void i915_gem_ww_ctx_init(struct i915_gem_ww_ctx *ww, bool intr)
 {
 	ww_acquire_init(&ww->ctx, &reservation_ww_class);
 	INIT_LIST_HEAD(&ww->obj_list);
+	INIT_LIST_HEAD(&ww->eviction_list);
 	ww->intr = intr;
 	ww->contended = NULL;
+	ww->contended_evict = false;
+}
+
+void i915_gem_ww_ctx_unlock_evictions(struct i915_gem_ww_ctx *ww)
+{
+	struct drm_i915_gem_object *obj, *next;
+
+	list_for_each_entry_safe(obj, next, &ww->eviction_list, obj_link) {
+		list_del(&obj->obj_link);
+		GEM_WARN_ON(!obj->evict_locked);
+		i915_gem_object_unlock(obj);
+		i915_gem_object_put(obj);
+	}
 }
 
 static void i915_gem_ww_ctx_unlock_all(struct i915_gem_ww_ctx *ww)
 {
-	struct drm_i915_gem_object *obj;
+	struct drm_i915_gem_object *obj, *next;
 
-	while ((obj = list_first_entry_or_null(&ww->obj_list, struct drm_i915_gem_object, obj_link))) {
+	list_for_each_entry_safe(obj, next, &ww->obj_list, obj_link) {
 		list_del(&obj->obj_link);
+		GEM_WARN_ON(obj->evict_locked);
 		i915_gem_object_unlock(obj);
 	}
+
+	i915_gem_ww_ctx_unlock_evictions(ww);
 }
 
 void i915_gem_ww_unlock_single(struct drm_i915_gem_object *obj)
 {
+	bool evict_locked = obj->evict_locked;
+
 	list_del(&obj->obj_link);
 	i915_gem_object_unlock(obj);
+	if (evict_locked)
+		i915_gem_object_put(obj);
 }
 
 void i915_gem_ww_ctx_fini(struct i915_gem_ww_ctx *ww)
@@ -39,27 +60,33 @@ void i915_gem_ww_ctx_fini(struct i915_gem_ww_ctx *ww)
 
 int __must_check i915_gem_ww_ctx_backoff(struct i915_gem_ww_ctx *ww)
 {
+	struct drm_i915_gem_object *obj = ww->contended;
 	int ret = 0;
 
-	if (WARN_ON(!ww->contended))
+	if (WARN_ON(!obj))
 		return -EINVAL;
 
 	i915_gem_ww_ctx_unlock_all(ww);
 	if (ww->intr)
-		ret = dma_resv_lock_slow_interruptible(ww->contended->base.resv, &ww->ctx);
+		ret = dma_resv_lock_slow_interruptible(obj->base.resv, &ww->ctx);
 	else
-		dma_resv_lock_slow(ww->contended->base.resv, &ww->ctx);
+		dma_resv_lock_slow(obj->base.resv, &ww->ctx);
+	if (ret)
+		goto out;
 
 	/*
-	 * Unlocking the contended lock again, as might not need it in
-	 * the retried transaction. This does not increase starvation,
-	 * but it's opening up for a wakeup flood if there are many
-	 * transactions relaxing on this object.
+	 * Unlocking the contended lock again, if it was locked for eviction.
+	 * We will most likely not need it in the retried transaction.
 	 */
-	if (!ret)
-		dma_resv_unlock(ww->contended->base.resv);
+	if (ww->contended_evict) {
+		dma_resv_unlock(obj->base.resv);
+	} else {
+		obj->evict_locked = false;
+		list_add_tail(&obj->obj_link, &ww->obj_list);
+	}
 
-	i915_gem_object_put(ww->contended);
+out:
+	i915_gem_object_put(obj);
 	ww->contended = NULL;
 
 	return ret;
diff --git a/drivers/gpu/drm/i915/i915_gem_ww.h b/drivers/gpu/drm/i915/i915_gem_ww.h
index f6b1a796667b..11793b170cc2 100644
--- a/drivers/gpu/drm/i915/i915_gem_ww.h
+++ b/drivers/gpu/drm/i915/i915_gem_ww.h
@@ -10,15 +10,18 @@
 struct i915_gem_ww_ctx {
 	struct ww_acquire_ctx ctx;
 	struct list_head obj_list;
+	struct list_head eviction_list;
 	struct drm_i915_gem_object *contended;
 	unsigned short intr;
 	unsigned short loop;
+	unsigned short contended_evict;
 };
 
 void i915_gem_ww_ctx_init(struct i915_gem_ww_ctx *ctx, bool intr);
 void i915_gem_ww_ctx_fini(struct i915_gem_ww_ctx *ctx);
 int __must_check i915_gem_ww_ctx_backoff(struct i915_gem_ww_ctx *ctx);
 void i915_gem_ww_unlock_single(struct drm_i915_gem_object *obj);
+void i915_gem_ww_ctx_unlock_evictions(struct i915_gem_ww_ctx *ww);
 
 /* Internal functions used by the inlines! Don't use. */
 static inline int __i915_gem_ww_fini(struct i915_gem_ww_ctx *ww, int err)
-- 
2.26.2

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply related	[flat|nested] 208+ messages in thread

* [RFC PATCH 154/162] drm/i915: Support ww eviction
  2020-11-27 12:04 [RFC PATCH 000/162] DG1 + LMEM enabling Matthew Auld
                   ` (152 preceding siblings ...)
  2020-11-27 12:07 ` [RFC PATCH 153/162] drm/i915: Implement eviction locking v2 Matthew Auld
@ 2020-11-27 12:07 ` Matthew Auld
  2020-11-27 12:07 ` [RFC PATCH 155/162] drm/i915: Use a ww transaction in the fault handler Matthew Auld
                   ` (7 subsequent siblings)
  161 siblings, 0 replies; 208+ messages in thread
From: Matthew Auld @ 2020-11-27 12:07 UTC (permalink / raw)
  To: intel-gfx; +Cc: Thomas Hellström, dri-devel

From: Thomas Hellström <thomas.hellstrom@intel.com>

Use sleeping ww locks if we're in a ww transaction, trylock
otherwise. We unlock the evicted objects either when eviction fails
or when we've reached the target. The ww ticket locks then ensure we
will eventually succeed in reaching the target if there is evictable
space available. However, another process may still steal the evicted
memory before we have had a chance to allocate it. To ensure we
eventually succeed, the eviction unlock needs to be moved until after
get pages succeeds; that's considered a TODO for now.

Signed-off-by: Thomas Hellström <thomas.hellstrom@intel.com>
Cc: Matthew Auld <matthew.auld@intel.com>
---
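For reviewers, the caller-side shape once the ww context is threaded
through (sketch):

  for_i915_gem_ww(&ww, err, true) {
  	err = i915_gem_object_lock(obj, &ww);
  	if (err)
  		continue;
  	/* get_pages may reach __intel_memory_region_get_pages_buddy(mem,
  	 * &ww, ...) and evict with sleeping ww locks; -EDEADLK now
  	 * propagates out so the loop backs off and retries.
  	 */
  	err = i915_gem_object_pin_pages(obj);
  }
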
 drivers/gpu/drm/i915/gem/i915_gem_region.c |  7 ++-
 drivers/gpu/drm/i915/intel_memory_region.c | 57 ++++++++++++++++------
 drivers/gpu/drm/i915/intel_memory_region.h |  2 +
 3 files changed, 49 insertions(+), 17 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_region.c b/drivers/gpu/drm/i915/gem/i915_gem_region.c
index 1ec6528498c8..8ec59fbaa3e6 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_region.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_region.c
@@ -204,6 +204,7 @@ i915_gem_object_get_pages_buddy(struct drm_i915_gem_object *obj)
 	struct scatterlist *sg;
 	unsigned int sg_page_sizes;
 	int ret;
+	struct i915_gem_ww_ctx *ww = i915_gem_get_locking_ctx(obj);
 
 	/* XXX: Check if we have any post. This is nasty hack, see gem_create */
 	if (obj->mm.gem_create_posted_err)
@@ -222,7 +223,8 @@ i915_gem_object_get_pages_buddy(struct drm_i915_gem_object *obj)
 	if (obj->flags & I915_BO_ALLOC_CONTIGUOUS)
 		flags |= I915_ALLOC_CONTIGUOUS;
 
-	ret = __intel_memory_region_get_pages_buddy(mem, size, flags, blocks);
+	ret = __intel_memory_region_get_pages_buddy(mem, ww, size, flags,
+						    blocks);
 	if (ret)
 		goto err_free_sg;
 
@@ -277,7 +279,8 @@ i915_gem_object_get_pages_buddy(struct drm_i915_gem_object *obj)
 		if (ret) {
 			/* swapin failed, free the pages */
 			__intel_memory_region_put_pages_buddy(mem, blocks);
-			ret = -ENXIO;
+			if (ret != -EDEADLK && ret != -EINTR)
+				ret = -ENXIO;
 			goto err_free_sg;
 		}
 	} else if (obj->flags & I915_BO_ALLOC_CPU_CLEAR) {
diff --git a/drivers/gpu/drm/i915/intel_memory_region.c b/drivers/gpu/drm/i915/intel_memory_region.c
index 57f01ef16628..6b26b6cd5958 100644
--- a/drivers/gpu/drm/i915/intel_memory_region.c
+++ b/drivers/gpu/drm/i915/intel_memory_region.c
@@ -96,6 +96,7 @@ __intel_memory_region_put_block_buddy(struct i915_buddy_block *block)
 }
 
 static int intel_memory_region_evict(struct intel_memory_region *mem,
+				     struct i915_gem_ww_ctx *ww,
 				     resource_size_t target)
 {
 	struct drm_i915_private *i915 = mem->i915;
@@ -109,6 +110,7 @@ static int intel_memory_region_evict(struct intel_memory_region *mem,
 	struct list_head **phase;
 	resource_size_t found;
 	int pass;
+	int err = 0;
 
 	intel_gt_retire_requests(&i915->gt);
 
@@ -126,10 +128,11 @@ static int intel_memory_region_evict(struct intel_memory_region *mem,
 						mm.region_link))) {
 		list_move_tail(&obj->mm.region_link, &still_in_list);
 
-		if (!i915_gem_object_has_pages(obj))
+		if (i915_gem_object_is_framebuffer(obj))
 			continue;
 
-		if (i915_gem_object_is_framebuffer(obj))
+		/* Already locked this object? */
+		if (ww && ww == i915_gem_get_locking_ctx(obj))
 			continue;
 
 		/*
@@ -147,34 +150,51 @@ static int intel_memory_region_evict(struct intel_memory_region *mem,
 
 		mutex_unlock(&mem->objects.lock);
 
+		if (ww) {
+			err = i915_gem_object_lock_to_evict(obj, ww);
+			if (err)
+				goto put;
+		} else {
+			if (!i915_gem_object_trylock(obj))
+				goto put;
+		}
+
+		if (!i915_gem_object_has_pages(obj))
+			goto unlock;
+
 		/* tell callee to do swapping */
 		if (i915_gem_object_type_has(obj, I915_GEM_OBJECT_HAS_IOMEM)
 		    && pass == 1)
 			obj->do_swapping = true;
 
 		if (!i915_gem_object_unbind(obj, I915_GEM_OBJECT_UNBIND_ACTIVE)) {
-			if (i915_gem_object_trylock(obj)) {
-				__i915_gem_object_put_pages(obj);
-				/* May arrive from get_pages on another bo */
-				if (!i915_gem_object_has_pages(obj)) {
-					found += obj->base.size;
-					if (obj->mm.madv == I915_MADV_DONTNEED)
-						obj->mm.madv = __I915_MADV_PURGED;
-				}
-				i915_gem_object_unlock(obj);
+			__i915_gem_object_put_pages(obj);
+			/* May arrive from get_pages on another bo */
+
+			if (!i915_gem_object_has_pages(obj)) {
+				found += obj->base.size;
+				if (obj->mm.madv == I915_MADV_DONTNEED)
+					obj->mm.madv = __I915_MADV_PURGED;
 			}
 		}
 
 		obj->do_swapping = false;
+unlock:
+		if (!ww)
+			i915_gem_object_unlock(obj);
+put:
 		i915_gem_object_put(obj);
 		mutex_lock(&mem->objects.lock);
 
-		if (found >= target)
+		if (err == -EDEADLK || err == -EINTR || found >= target)
 			break;
 	}
 	list_splice_tail(&still_in_list, *phase);
 	mutex_unlock(&mem->objects.lock);
 
+	if (err == -EDEADLK || err == -EINTR)
+		return err;
+
 	if (found < target && i915->params.enable_eviction) {
 		pass++;
 		phase++;
@@ -182,11 +202,15 @@ static int intel_memory_region_evict(struct intel_memory_region *mem,
 			goto next;
 	}
 
+	if (ww)
+		i915_gem_ww_ctx_unlock_evictions(ww);
+
 	return (found < target) ? -ENOSPC : 0;
 }
 
 int
 __intel_memory_region_get_pages_buddy(struct intel_memory_region *mem,
+				      struct i915_gem_ww_ctx *ww,
 				      resource_size_t size,
 				      unsigned int flags,
 				      struct list_head *blocks)
@@ -194,6 +218,7 @@ __intel_memory_region_get_pages_buddy(struct intel_memory_region *mem,
 	unsigned int min_order = 0;
 	unsigned int max_order;
 	unsigned long n_pages;
+	int err;
 
 	GEM_BUG_ON(!IS_ALIGNED(size, mem->mm.chunk_size));
 	GEM_BUG_ON(!list_empty(blocks));
@@ -241,12 +266,11 @@ __intel_memory_region_get_pages_buddy(struct intel_memory_region *mem,
 
 			if (order-- == min_order) {
 				resource_size_t target;
-				int err;
 
 				target = n_pages * mem->mm.chunk_size;
 
 				mutex_unlock(&mem->mm_lock);
-				err = intel_memory_region_evict(mem,
+				err = intel_memory_region_evict(mem, ww,
 								target);
 				mutex_lock(&mem->mm_lock);
 				if (err)
@@ -272,6 +296,9 @@ __intel_memory_region_get_pages_buddy(struct intel_memory_region *mem,
 err_free_blocks:
 	intel_memory_region_free_pages(mem, blocks);
 	mutex_unlock(&mem->mm_lock);
+	if (err == -EDEADLK || err == -EINTR)
+		return err;
+
 	return -ENXIO;
 }
 
@@ -284,7 +311,7 @@ __intel_memory_region_get_block_buddy(struct intel_memory_region *mem,
 	LIST_HEAD(blocks);
 	int ret;
 
-	ret = __intel_memory_region_get_pages_buddy(mem, size, flags, &blocks);
+	ret = __intel_memory_region_get_pages_buddy(mem, NULL, size, flags, &blocks);
 	if (ret)
 		return ERR_PTR(ret);
 
diff --git a/drivers/gpu/drm/i915/intel_memory_region.h b/drivers/gpu/drm/i915/intel_memory_region.h
index 0bfc1fa36f74..ff1d97667618 100644
--- a/drivers/gpu/drm/i915/intel_memory_region.h
+++ b/drivers/gpu/drm/i915/intel_memory_region.h
@@ -16,6 +16,7 @@
 
 #include "i915_buddy.h"
 
+struct i915_gem_ww_ctx;
 struct drm_i915_private;
 struct drm_i915_gem_object;
 struct intel_memory_region;
@@ -116,6 +117,7 @@ int intel_memory_region_init_buddy(struct intel_memory_region *mem);
 void intel_memory_region_release_buddy(struct intel_memory_region *mem);
 
 int __intel_memory_region_get_pages_buddy(struct intel_memory_region *mem,
+					  struct i915_gem_ww_ctx *ww,
 					  resource_size_t size,
 					  unsigned int flags,
 					  struct list_head *blocks);
-- 
2.26.2

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply related	[flat|nested] 208+ messages in thread

* [RFC PATCH 155/162] drm/i915: Use a ww transaction in the fault handler
  2020-11-27 12:04 [RFC PATCH 000/162] DG1 + LMEM enabling Matthew Auld
                   ` (153 preceding siblings ...)
  2020-11-27 12:07 ` [RFC PATCH 154/162] drm/i915: Support ww eviction Matthew Auld
@ 2020-11-27 12:07 ` Matthew Auld
  2020-11-27 12:07 ` [RFC PATCH 156/162] drm/i915: Use a ww transaction in i915_gem_object_pin_map_unlocked() Matthew Auld
                   ` (6 subsequent siblings)
  161 siblings, 0 replies; 208+ messages in thread
From: Matthew Auld @ 2020-11-27 12:07 UTC (permalink / raw)
  To: intel-gfx; +Cc: Thomas Hellström, dri-devel

From: Thomas Hellström <thomas.hellstrom@intel.com>

Prefer a ww transaction to a single object lock, so that eviction
reached from the fault handler can use sleeping locks.

Signed-off-by: Thomas Hellström <thomas.hellstrom@intel.com>
Cc: Matthew Auld <matthew.auld@intel.com>
---
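For reviewers unfamiliar with the helper: for_i915_gem_ww() expands to
roughly the following retry loop (a sketch; the real macro lives in
i915_gem_ww.h):

  i915_gem_ww_ctx_init(&ww, intr);
  do {
  	err = <loop body>;	/* may return -EDEADLK on contention */
  	if (err == -EDEADLK)
  		err = i915_gem_ww_ctx_backoff(&ww);
  } while (err == -EDEADLK);
  i915_gem_ww_ctx_fini(&ww);

so a contended lock in the fault handler now sleeps and retries
instead of failing with VM_FAULT_NOPAGE.
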
 drivers/gpu/drm/i915/gem/i915_gem_mman.c | 45 +++++++++++++-----------
 1 file changed, 24 insertions(+), 21 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_mman.c b/drivers/gpu/drm/i915/gem/i915_gem_mman.c
index 33ccd4d665d4..a9526cc309d3 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_mman.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_mman.c
@@ -238,6 +238,7 @@ static vm_fault_t vm_fault_cpu(struct vm_fault *vmf)
 	struct vm_area_struct *area = vmf->vma;
 	struct i915_mmap_offset *mmo = area->vm_private_data;
 	struct drm_i915_gem_object *obj = mmo->obj;
+	struct i915_gem_ww_ctx ww;
 	resource_size_t iomap;
 	int err;
 
@@ -246,33 +247,35 @@ static vm_fault_t vm_fault_cpu(struct vm_fault *vmf)
 		     area->vm_flags & VM_WRITE))
 		return VM_FAULT_SIGBUS;
 
-	if (i915_gem_object_lock_interruptible(obj, NULL))
-		return VM_FAULT_NOPAGE;
+	for_i915_gem_ww(&ww, err, true) {
+		err = i915_gem_object_lock(obj, &ww);
+		if (err)
+			continue;
 
-	err = i915_gem_object_pin_pages(obj);
-	if (err)
-		goto out;
+		err = i915_gem_object_pin_pages(obj);
+		if (err)
+			continue;
 
-	iomap = -1;
-	if (!i915_gem_object_has_struct_page(obj)) {
-		iomap = obj->mm.region->iomap.base;
-		iomap -= obj->mm.region->region.start;
-	}
+		iomap = -1;
+		if (!i915_gem_object_has_struct_page(obj)) {
+			iomap = obj->mm.region->iomap.base;
+			iomap -= obj->mm.region->region.start;
+		}
 
-	/* PTEs are revoked in obj->ops->put_pages() */
-	err = remap_io_sg(area,
-			  area->vm_start, area->vm_end - area->vm_start,
-			  obj->mm.pages->sgl, iomap);
+		/* PTEs are revoked in obj->ops->put_pages() */
+		err = remap_io_sg(area,
+				  area->vm_start, area->vm_end - area->vm_start,
+				  obj->mm.pages->sgl, iomap);
 
-	if (area->vm_flags & VM_WRITE) {
-		GEM_BUG_ON(!i915_gem_object_has_pinned_pages(obj));
-		obj->mm.dirty = true;
-	}
+		if (area->vm_flags & VM_WRITE) {
+			GEM_BUG_ON(!i915_gem_object_has_pinned_pages(obj));
+			obj->mm.dirty = true;
+		}
 
-	i915_gem_object_unpin_pages(obj);
+		i915_gem_object_unpin_pages(obj);
+		/* Implicit unlock */
+	}
 
-out:
-	i915_gem_object_unlock(obj);
 	return i915_error_to_vmf_fault(err);
 }
 
-- 
2.26.2

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply related	[flat|nested] 208+ messages in thread

* [RFC PATCH 156/162] drm/i915: Use a ww transaction in i915_gem_object_pin_map_unlocked()
  2020-11-27 12:04 [RFC PATCH 000/162] DG1 + LMEM enabling Matthew Auld
                   ` (154 preceding siblings ...)
  2020-11-27 12:07 ` [RFC PATCH 155/162] drm/i915: Use a ww transaction in the fault handler Matthew Auld
@ 2020-11-27 12:07 ` Matthew Auld
  2020-11-27 12:07 ` [RFC PATCH 157/162] drm/i915: Improve accuracy of eviction stats Matthew Auld
                   ` (5 subsequent siblings)
  161 siblings, 0 replies; 208+ messages in thread
From: Matthew Auld @ 2020-11-27 12:07 UTC (permalink / raw)
  To: intel-gfx; +Cc: Thomas Hellström, dri-devel

From: Thomas Hellström <thomas.hellstrom@intel.com>

By using a ww transaction, anybody using this function who ends up
evicting objects can use sleeping waits when locking the objects to
be evicted.

Signed-off-by: Thomas Hellström <thomas.hellstrom@intel.com>
Cc: Matthew Auld <matthew.auld@intel.com>
---
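For reviewers, the calling convention is unchanged (sketch):

  void *vaddr;

  vaddr = i915_gem_object_pin_map_unlocked(obj, I915_MAP_WB);
  if (IS_ERR(vaddr))
  	return PTR_ERR(vaddr);

only now a contended eviction inside pin_map leads to a ww backoff
and retry rather than a trylock failure.
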
 drivers/gpu/drm/i915/gem/i915_gem_pages.c | 17 ++++++++++++++---
 1 file changed, 14 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_pages.c b/drivers/gpu/drm/i915/gem/i915_gem_pages.c
index d0f3da0925f5..0c20f9b18956 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_pages.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_pages.c
@@ -425,11 +425,22 @@ void *i915_gem_object_pin_map(struct drm_i915_gem_object *obj,
 void *i915_gem_object_pin_map_unlocked(struct drm_i915_gem_object *obj,
 				       enum i915_map_type type)
 {
+	struct i915_gem_ww_ctx ww;
 	void *ret;
+	int err;
 
-	i915_gem_object_lock(obj, NULL);
-	ret = i915_gem_object_pin_map(obj, type);
-	i915_gem_object_unlock(obj);
+	for_i915_gem_ww(&ww, err, false) {
+		err = i915_gem_object_lock(obj, &ww);
+		if (err)
+			continue;
+
+		ret = i915_gem_object_pin_map(obj, type);
+		if (IS_ERR(ret))
+			err = PTR_ERR(ret);
+		/* Implicit unlock */
+	}
+	if (err)
+		return ERR_PTR(err);
 
 	return ret;
 }
-- 
2.26.2

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply related	[flat|nested] 208+ messages in thread

* [RFC PATCH 157/162] drm/i915: Improve accuracy of eviction stats
  2020-11-27 12:04 [RFC PATCH 000/162] DG1 + LMEM enabling Matthew Auld
                   ` (155 preceding siblings ...)
  2020-11-27 12:07 ` [RFC PATCH 156/162] drm/i915: Use a ww transaction in i915_gem_object_pin_map_unlocked() Matthew Auld
@ 2020-11-27 12:07 ` Matthew Auld
  2020-11-27 14:40   ` [Intel-gfx] " Chris Wilson
  2020-11-27 12:07 ` [RFC PATCH 158/162] drm/i915: Support ww locks in suspend/resume Matthew Auld
                   ` (4 subsequent siblings)
  161 siblings, 1 reply; 208+ messages in thread
From: Matthew Auld @ 2020-11-27 12:07 UTC (permalink / raw)
  To: intel-gfx; +Cc: Tvrtko Ursulin, Mika Kuoppala, Sudeep Dutt, dri-devel, CQ Tang

From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

Current code uses jiffies to do the accounting and then does:

  diff = jiffies - start;
  msec = diff * 1000 / HZ;
  ...
  atomic_long_add(msec, &i915->time_swap_out_ms);

Since a jiffy can be as coarse as 10ms, the current accounting records
every eviction that completes in under one jiffy as infinitely fast,
so we can end up over-estimating the reported eviction throughput.

Fix this by accumulating ktime_t and only dividing to more user friendly
granularity at presentation time (debugfs read).

At the same time consolidate the code a bit and convert from multiple
atomics to a single seqlock per stat.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: CQ Tang <cq.tang@intel.com>
Cc: Sudeep Dutt <sudeep.dutt@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
---
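For reviewers, the truncation this fixes, in numbers (assuming HZ=100,
i.e. one jiffy is 10ms):

  /* old: every sub-jiffy copy is accounted as taking 0 ms ... */
  diff = jiffies - start;		/* 0 for any copy faster than 10 ms */
  msec = diff * 1000 / HZ;	/* 0 ms -> bytes / 0 ms == "infinite" */

  /* new: accumulate ktime_t (ns resolution), divide only in debugfs */
  stat->time = ktime_add(stat->time, ktime_get() - start);
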
 drivers/gpu/drm/i915/gem/i915_gem_region.c | 67 ++++++++++----------
 drivers/gpu/drm/i915/i915_debugfs.c        | 73 +++++++++++-----------
 drivers/gpu/drm/i915/i915_drv.h            | 25 +++++---
 drivers/gpu/drm/i915/i915_gem.c            |  5 ++
 4 files changed, 90 insertions(+), 80 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_region.c b/drivers/gpu/drm/i915/gem/i915_gem_region.c
index 8ec59fbaa3e6..1a390e502d5a 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_region.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_region.c
@@ -9,14 +9,29 @@
 #include "i915_trace.h"
 #include "i915_gem_mman.h"
 
+static void
+__update_stat(struct i915_mm_swap_stat *stat,
+	      unsigned long pages,
+	      ktime_t start)
+{
+	if (stat) {
+		start = ktime_get() - start;
+
+		write_seqlock(&stat->lock);
+		stat->time = ktime_add(stat->time, start);
+		stat->pages += pages;
+		write_sequnlock(&stat->lock);
+	}
+}
+
 static int
 i915_gem_object_swapout_pages(struct drm_i915_gem_object *obj,
 			      struct sg_table *pages, unsigned int sizes)
 {
 	struct drm_i915_private *i915 = to_i915(obj->base.dev);
+	struct i915_mm_swap_stat *stat = NULL;
 	struct drm_i915_gem_object *dst, *src;
-	unsigned long start, diff, msec;
-	bool blt_completed = false;
+	ktime_t start = ktime_get();
 	int err = -EINVAL;
 
 	GEM_BUG_ON(obj->swapto);
@@ -26,7 +41,6 @@ i915_gem_object_swapout_pages(struct drm_i915_gem_object *obj,
 	GEM_BUG_ON(!i915->params.enable_eviction);
 
 	assert_object_held(obj);
-	start = jiffies;
 
 	/* create a shadow object on smem region */
 	dst = i915_gem_object_create_shmem(i915, obj->base.size);
@@ -58,10 +72,14 @@ i915_gem_object_swapout_pages(struct drm_i915_gem_object *obj,
 	if (i915->params.enable_eviction >= 2) {
 		err = i915_window_blt_copy(dst, src);
 		if (!err)
-			blt_completed = true;
+			stat = &i915->mm.blt_swap_stats.out;
 	}
-	if (err && i915->params.enable_eviction != 2)
+
+	if (err && i915->params.enable_eviction != 2) {
 		err = i915_gem_object_memcpy(dst, src);
+		if (!err)
+			stat = &i915->mm.memcpy_swap_stats.out;
+	}
 
 	__i915_gem_object_unpin_pages(src);
 	__i915_gem_object_unset_pages(src);
@@ -73,18 +91,7 @@ i915_gem_object_swapout_pages(struct drm_i915_gem_object *obj,
 	else
 		i915_gem_object_put(dst);
 
-	if (!err) {
-		diff = jiffies - start;
-		msec = diff * 1000 / HZ;
-		if (blt_completed) {
-			atomic_long_add(sizes, &i915->num_bytes_swapped_out);
-			atomic_long_add(msec, &i915->time_swap_out_ms);
-		} else {
-			atomic_long_add(sizes,
-					&i915->num_bytes_swapped_out_memcpy);
-			atomic_long_add(msec, &i915->time_swap_out_ms_memcpy);
-		}
-	}
+	__update_stat(stat, sizes >> PAGE_SHIFT, start);
 
 	return err;
 }
@@ -94,9 +101,9 @@ i915_gem_object_swapin_pages(struct drm_i915_gem_object *obj,
 			     struct sg_table *pages, unsigned int sizes)
 {
 	struct drm_i915_private *i915 = to_i915(obj->base.dev);
+	struct i915_mm_swap_stat *stat = NULL;
 	struct drm_i915_gem_object *dst, *src;
-	unsigned long start, diff, msec;
-	bool blt_completed = false;
+	ktime_t start = ktime_get();
 	int err = -EINVAL;
 
 	GEM_BUG_ON(!obj->swapto);
@@ -106,7 +113,6 @@ i915_gem_object_swapin_pages(struct drm_i915_gem_object *obj,
 	GEM_BUG_ON(!i915->params.enable_eviction);
 
 	assert_object_held(obj);
-	start = jiffies;
 
 	src = obj->swapto;
 
@@ -134,10 +140,14 @@ i915_gem_object_swapin_pages(struct drm_i915_gem_object *obj,
 	if (i915->params.enable_eviction >= 2) {
 		err = i915_window_blt_copy(dst, src);
 		if (!err)
-			blt_completed = true;
+			stat = &i915->mm.blt_swap_stats.in;
 	}
-	if (err && i915->params.enable_eviction != 2)
+
+	if (err && i915->params.enable_eviction != 2) {
 		err = i915_gem_object_memcpy(dst, src);
+		if (!err)
+			stat = &i915->mm.memcpy_swap_stats.in;
+	}
 
 	__i915_gem_object_unpin_pages(dst);
 	__i915_gem_object_unset_pages(dst);
@@ -149,18 +159,7 @@ i915_gem_object_swapin_pages(struct drm_i915_gem_object *obj,
 		i915_gem_object_put(src);
 	}
 
-	if (!err) {
-		diff = jiffies - start;
-		msec = diff * 1000 / HZ;
-		if (blt_completed) {
-			atomic_long_add(sizes, &i915->num_bytes_swapped_in);
-			atomic_long_add(msec, &i915->time_swap_in_ms);
-		} else {
-			atomic_long_add(sizes,
-					&i915->num_bytes_swapped_in_memcpy);
-			atomic_long_add(msec, &i915->time_swap_in_ms_memcpy);
-		}
-	}
+	__update_stat(stat, sizes >> PAGE_SHIFT, start);
 
 	return err;
 }
diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
index 983030ac39e1..f06f900b598e 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -359,12 +359,46 @@ static void print_context_stats(struct seq_file *m,
 	print_file_stats(m, "[k]contexts", kstats);
 }
 
+static void
+evict_stat(struct seq_file *m,
+	   const char *name,
+	   const char *direction,
+	   struct i915_mm_swap_stat *stat)
+{
+	unsigned long pages;
+	unsigned int seq;
+	u64 time, rate;
+	ktime_t ktime;
+
+	do {
+		seq = read_seqbegin(&stat->lock);
+		pages = stat->pages;
+		ktime = stat->time;
+	} while (read_seqretry(&stat->lock, seq));
+
+	time = ktime_to_us(ktime);
+	rate = time ? div64_u64((u64)pages * PAGE_SIZE, time) : 0;
+	rate = div64_ul(rate * USEC_PER_SEC, 1024 * 1024);
+
+	seq_printf(m, "%s swap %s %lu MiB in %llums, %llu MiB/s.\n",
+		   name, direction, (pages * PAGE_SIZE) >> 20,
+		   ktime_to_ms(ktime), rate);
+}
+
+static void
+evict_stats(struct seq_file *m,
+	    const char *name,
+	    struct i915_mm_swap_stats *stats)
+{
+	evict_stat(m, name, "in", &stats->in);
+	evict_stat(m, name, "out", &stats->out);
+}
+
 static int i915_gem_object_info(struct seq_file *m, void *data)
 {
 	struct drm_i915_private *i915 = node_to_i915(m->private);
 	struct intel_memory_region *mr;
 	enum intel_region_id id;
-	u64 time, bytes, rate;
 
 	seq_printf(m, "%u shrinkable [%u free] objects, %llu bytes\n",
 		   i915->mm.shrink_count,
@@ -374,41 +408,8 @@ static int i915_gem_object_info(struct seq_file *m, void *data)
 		seq_printf(m, "%s: total:%pa, available:%pa bytes\n",
 			   mr->name, &mr->total, &mr->avail);
 
-	time = atomic_long_read(&i915->time_swap_out_ms);
-	bytes = atomic_long_read(&i915->num_bytes_swapped_out);
-	if (time)
-		rate = div64_u64(bytes * 1000, time * 1024 * 1024);
-	else
-		rate = 0;
-	seq_printf(m, "BLT: swapout %llu Bytes in %llu mSec(%llu MB/Sec)\n",
-		   bytes, time, rate);
-
-	time = atomic_long_read(&i915->time_swap_in_ms);
-	bytes = atomic_long_read(&i915->num_bytes_swapped_in);
-	if (time)
-		rate = div64_u64(bytes * 1000, time * 1024 * 1024);
-	else
-		rate = 0;
-	seq_printf(m, "BLT: swapin %llu Bytes in %llu mSec(%llu MB/Sec)\n",
-		   bytes, time, rate);
-
-	time = atomic_long_read(&i915->time_swap_out_ms_memcpy);
-	bytes = atomic_long_read(&i915->num_bytes_swapped_out_memcpy);
-	if (time)
-		rate = div64_u64(bytes * 1000, time * 1024 * 1024);
-	else
-		rate = 0;
-	seq_printf(m, "Memcpy: swapout %llu Bytes in %llu mSec(%llu MB/Sec)\n",
-		   bytes, time, rate);
-
-	time = atomic_long_read(&i915->time_swap_in_ms_memcpy);
-	bytes = atomic_long_read(&i915->num_bytes_swapped_in_memcpy);
-	if (time)
-		rate = div64_u64(bytes * 1000, time * 1024 * 1024);
-	else
-		rate = 0;
-	seq_printf(m, "Memcpy: swapin %llu Bytes in %llu mSec(%llu MB/Sec)\n",
-		   bytes, time, rate);
+	evict_stats(m, "Blitter", &i915->mm.blt_swap_stats);
+	evict_stats(m, "Memcpy", &i915->mm.memcpy_swap_stats);
 	seq_putc(m, '\n');
 
 	print_context_stats(m, i915);
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 6f0ab363bdee..45511f2d8da0 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -49,6 +49,7 @@
 #include <linux/shmem_fs.h>
 #include <linux/stackdepot.h>
 #include <linux/xarray.h>
+#include <linux/seqlock.h>
 
 #include <drm/intel-gtt.h>
 #include <drm/drm_legacy.h> /* for struct drm_dma_handle */
@@ -548,6 +549,17 @@ struct intel_l3_parity {
 	int which_slice;
 };
 
+struct i915_mm_swap_stat {
+	seqlock_t lock;
+	unsigned long pages;
+	ktime_t time;
+};
+
+struct i915_mm_swap_stats {
+	struct i915_mm_swap_stat in;
+	struct i915_mm_swap_stat out;
+};
+
 struct i915_gem_mm {
 	/* Protects bound_list/unbound_list and #drm_i915_gem_object.mm.link */
 	spinlock_t obj_lock;
@@ -601,6 +613,9 @@ struct i915_gem_mm {
 
 	/* To protect above two set of vmas */
 	wait_queue_head_t window_queue;
+
+	struct i915_mm_swap_stats blt_swap_stats;
+	struct i915_mm_swap_stats memcpy_swap_stats;
 };
 
 #define I915_IDLE_ENGINES_TIMEOUT (200) /* in ms */
@@ -1220,16 +1235,6 @@ struct drm_i915_private {
 	 * NOTE: This is the dri1/ums dungeon, don't add stuff here. Your patch
 	 * will be rejected. Instead look for a better place.
 	 */
-
-	atomic_long_t num_bytes_swapped_out;
-	atomic_long_t num_bytes_swapped_in;
-	atomic_long_t time_swap_out_ms;
-	atomic_long_t time_swap_in_ms;
-
-	atomic_long_t num_bytes_swapped_out_memcpy;
-	atomic_long_t num_bytes_swapped_in_memcpy;
-	atomic_long_t time_swap_out_ms_memcpy;
-	atomic_long_t time_swap_in_ms_memcpy;
 };
 
 static inline struct drm_i915_private *to_i915(const struct drm_device *dev)
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 85cbdb8e2bb8..e94f3f689b30 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -1151,6 +1151,11 @@ static void i915_gem_init__mm(struct drm_i915_private *i915)
 	INIT_LIST_HEAD(&i915->mm.purge_list);
 	INIT_LIST_HEAD(&i915->mm.shrink_list);
 
+	seqlock_init(&i915->mm.blt_swap_stats.in.lock);
+	seqlock_init(&i915->mm.blt_swap_stats.out.lock);
+	seqlock_init(&i915->mm.memcpy_swap_stats.in.lock);
+	seqlock_init(&i915->mm.memcpy_swap_stats.out.lock);
+
 	i915_gem_init__objects(i915);
 }
 
-- 
2.26.2

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply related	[flat|nested] 208+ messages in thread

* [RFC PATCH 158/162] drm/i915: Support ww locks in suspend/resume
  2020-11-27 12:04 [RFC PATCH 000/162] DG1 + LMEM enabling Matthew Auld
                   ` (156 preceding siblings ...)
  2020-11-27 12:07 ` [RFC PATCH 157/162] drm/i915: Improve accuracy of eviction stats Matthew Auld
@ 2020-11-27 12:07 ` Matthew Auld
  2020-11-27 12:07 ` [RFC PATCH 159/162] drm/i915/dg1: Fix mapping type for default state object Matthew Auld
                   ` (3 subsequent siblings)
  161 siblings, 0 replies; 208+ messages in thread
From: Matthew Auld @ 2020-11-27 12:07 UTC (permalink / raw)
  To: intel-gfx; +Cc: Venkata Ramana Nayana, dri-devel

From: Venkata Ramana Nayana <venkata.ramana.nayana@intel.com>

Take the object ww lock around suspend/resume eviction, replacing the
trylock-and-bail approach with a proper backoff/retry loop.

Signed-off-by: Venkata Ramana Nayana <venkata.ramana.nayana@intel.com>
---
 drivers/gpu/drm/i915/i915_drv.c | 33 ++++++++++++++++++---------------
 1 file changed, 18 insertions(+), 15 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
index b7d40a9c00bf..c41865d5bf1e 100644
--- a/drivers/gpu/drm/i915/i915_drv.c
+++ b/drivers/gpu/drm/i915/i915_drv.c
@@ -1099,7 +1099,7 @@ static int i915_drm_prepare(struct drm_device *dev)
 	struct drm_i915_private *i915 = to_i915(dev);
 
 	if (HAS_LMEM(i915))     {
-		struct intel_gt *gt= &i915->gt;
+		struct intel_gt *gt = &i915->gt;
 		long timeout = I915_GEM_IDLE_TIMEOUT;
 		int ret;
 
@@ -1182,7 +1182,8 @@ static int intel_dmem_evict_buffers(struct drm_device *dev, bool in_suspend,
 	struct drm_i915_private *i915 = to_i915(dev);
 	struct drm_i915_gem_object *obj;
 	struct intel_memory_region *mem;
-	int id, ret = 0;
+	struct i915_gem_ww_ctx ww;
+	int id, ret = 0, err = 0;
 
 	for_each_memory_region(mem, i915, id) {
 		struct list_head still_in_list;
@@ -1204,19 +1205,20 @@ static int intel_dmem_evict_buffers(struct drm_device *dev, bool in_suspend,
 
 				mutex_unlock(&mem->objects.lock);
 
+				i915_gem_ww_ctx_init(&ww, true);
+retry:
+				err = i915_gem_object_lock(obj, &ww);
+				if (err)
+					goto out_err;
+
 				if (in_suspend) {
 					obj->swapto = NULL;
 					obj->evicted = false;
 
 					ret = i915_gem_object_unbind(obj, 0);
 					if (ret || i915_gem_object_has_pinned_pages(obj)) {
-						if (!i915_gem_object_trylock(obj)) {
-							ret = -EBUSY;
-							goto next;
-						}
 						ret = i915_gem_perma_pinned_object_swapout(obj);
-						i915_gem_object_unlock(obj);
-						goto next;
+						goto out_err;
 					}
 
 					obj->do_swapping = true;
@@ -1228,13 +1230,7 @@ static int intel_dmem_evict_buffers(struct drm_device *dev, bool in_suspend,
 						obj->evicted = true;
 				} else {
 					if (i915_gem_object_has_pinned_pages(obj) && perma_pin) {
-						if (!i915_gem_object_trylock(obj)) {
-							ret = -EBUSY;
-							goto next;
-						}
 						ret = i915_gem_perma_pinned_object_swapin(obj);
-						/* FIXME: Where is this error message taken care of? */
-						i915_gem_object_unlock(obj);
 					}
 
 					if (obj->swapto && obj->evicted && !perma_pin) {
@@ -1247,7 +1243,14 @@ static int intel_dmem_evict_buffers(struct drm_device *dev, bool in_suspend,
 						}
 					}
 				}
-next:
+out_err:
+				if (err == -EDEADLK) {
+					err = i915_gem_ww_ctx_backoff(&ww);
+					if (!err)
+						goto retry;
+				}
+				i915_gem_ww_ctx_fini(&ww);
+
 				mutex_lock(&mem->objects.lock);
 				if (ret)
 					break;
-- 
2.26.2

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply related	[flat|nested] 208+ messages in thread

* [RFC PATCH 159/162] drm/i915/dg1: Fix mapping type for default state object
  2020-11-27 12:04 [RFC PATCH 000/162] DG1 + LMEM enabling Matthew Auld
                   ` (157 preceding siblings ...)
  2020-11-27 12:07 ` [RFC PATCH 158/162] drm/i915: Support ww locks in suspend/resume Matthew Auld
@ 2020-11-27 12:07 ` Matthew Auld
  2020-11-27 12:07 ` [RFC PATCH 160/162] drm/i915/dg1: Fix GPU hang due to shmemfs page drop Matthew Auld
                   ` (2 subsequent siblings)
  161 siblings, 0 replies; 208+ messages in thread
From: Matthew Auld @ 2020-11-27 12:07 UTC (permalink / raw)
  To: intel-gfx; +Cc: Venkata Ramana Nayana, dri-devel

From: Venkata Ramana Nayana <venkata.ramana.nayana@intel.com>

Use I915_MAP_WC when the default state object is allocated in LMEM.

Signed-off-by: Venkata Ramana Nayana <venkata.ramana.nayana@intel.com>
---
 drivers/gpu/drm/i915/gt/shmem_utils.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/gt/shmem_utils.c b/drivers/gpu/drm/i915/gt/shmem_utils.c
index 041e2a50160d..1fbc070a4651 100644
--- a/drivers/gpu/drm/i915/gt/shmem_utils.c
+++ b/drivers/gpu/drm/i915/gt/shmem_utils.c
@@ -8,6 +8,7 @@
 #include <linux/shmem_fs.h>
 
 #include "gem/i915_gem_object.h"
+#include "gem/i915_gem_lmem.h"
 #include "shmem_utils.h"
 
 struct file *shmem_create_from_data(const char *name, void *data, size_t len)
@@ -39,7 +40,8 @@ struct file *shmem_create_from_object(struct drm_i915_gem_object *obj)
 		return file;
 	}
 
-	ptr = i915_gem_object_pin_map_unlocked(obj, I915_MAP_WB);
+	ptr = i915_gem_object_pin_map_unlocked(obj, i915_gem_object_is_lmem(obj) ?
+						I915_MAP_WC : I915_MAP_WB);
 	if (IS_ERR(ptr))
 		return ERR_CAST(ptr);
 
-- 
2.26.2

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply related	[flat|nested] 208+ messages in thread

* [RFC PATCH 160/162] drm/i915/dg1: Fix GPU hang due to shmemfs page drop
  2020-11-27 12:04 [RFC PATCH 000/162] DG1 + LMEM enabling Matthew Auld
                   ` (158 preceding siblings ...)
  2020-11-27 12:07 ` [RFC PATCH 159/162] drm/i915/dg1: Fix mapping type for default state object Matthew Auld
@ 2020-11-27 12:07 ` Matthew Auld
  2020-11-27 14:44   ` [Intel-gfx] " Chris Wilson
  2020-11-27 12:07 ` [RFC PATCH 161/162] drm/i915/dg1: allow pci to auto probe Matthew Auld
  2020-11-27 12:07 ` [RFC PATCH 162/162] drm/i915: drop fake lmem Matthew Auld
  161 siblings, 1 reply; 208+ messages in thread
From: Matthew Auld @ 2020-11-27 12:07 UTC (permalink / raw)
  To: intel-gfx
  Cc: Tvrtko Ursulin, Chris Wilson, Venkata Ramana Nayana, Sudeep Dutt,
	dri-devel, CQ Tang

From: Venkata Ramana Nayana <venkata.ramana.nayana@intel.com>

This fixes a bug introduced by upstream
commit a6326a4f8ffb ("drm/i915/gt: Keep a no-frills swappable copy of the default context state")

We allocate the context state object ce->state from lmem, so in
__engines_record_defaults() we call shmem_create_from_object(). Because it is
an lmem object, this call creates a new shmemfs file, copies the contents into
it, and returns the file pointer, which is assigned to engine->default_state.
The ce->state lmem object itself is freed at the end of
__engines_record_defaults().

Because a new shmemfs file is created for engine->default_state, and more
importantly, we DON'T mark its pages dirty after writing into them, page
cache eviction is free to drop these pages.

Later, when a new request/context is created, the saved engine->default_state
is copied into ce->state. If the default_state pages were dropped by page
cache eviction, the copy reads freshly allocated pages and so fills ce->state
with garbage. ce->state then contains bogus instructions and the GPU hangs.

The fix is simple: mark the shmemfs pages dirty when writing into them, and
also mark them accessed on both read and write.

Fixes: a6326a4f8ffb ("drm/i915/gt: Keep a no-frills swappable copy of the default context state")
Cc: Sudeep Dutt <sudeep.dutt@intel.com>
Cc: Matthew Auld <matthew.auld@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Ramalingam C <ramalingam.c@intel.com>
Cc: Chris Wilson <chris@intel.com>
Signed-off-by: CQ Tang <cq.tang@intel.com>
Signed-off-by: Venkata Ramana Nayana <venkata.ramana.nayana@intel.com>
---
 drivers/gpu/drm/i915/gt/shmem_utils.c | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/shmem_utils.c b/drivers/gpu/drm/i915/gt/shmem_utils.c
index 1fbc070a4651..e24c2c2342bb 100644
--- a/drivers/gpu/drm/i915/gt/shmem_utils.c
+++ b/drivers/gpu/drm/i915/gt/shmem_utils.c
@@ -105,10 +105,13 @@ static int __shmem_rw(struct file *file, loff_t off,
 			return PTR_ERR(page);
 
 		vaddr = kmap(page);
-		if (write)
+		if (write) {
 			memcpy(vaddr + offset_in_page(off), ptr, this);
-		else
+			set_page_dirty(page);
+		} else {
 			memcpy(ptr, vaddr + offset_in_page(off), this);
+		}
+		mark_page_accessed(page);
 		kunmap(page);
 		put_page(page);
 
-- 
2.26.2

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply related	[flat|nested] 208+ messages in thread

* [RFC PATCH 161/162] drm/i915/dg1: allow pci to auto probe
  2020-11-27 12:04 [RFC PATCH 000/162] DG1 + LMEM enabling Matthew Auld
                   ` (159 preceding siblings ...)
  2020-11-27 12:07 ` [RFC PATCH 160/162] drm/i915/dg1: Fix GPU hang due to shmemfs page drop Matthew Auld
@ 2020-11-27 12:07 ` Matthew Auld
  2020-11-27 12:07 ` [RFC PATCH 162/162] drm/i915: drop fake lmem Matthew Auld
  161 siblings, 0 replies; 208+ messages in thread
From: Matthew Auld @ 2020-11-27 12:07 UTC (permalink / raw)
  To: intel-gfx; +Cc: Lucas De Marchi, dri-devel

From: Lucas De Marchi <lucas.demarchi@intel.com>

Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
---
 drivers/gpu/drm/i915/i915_pci.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/gpu/drm/i915/i915_pci.c b/drivers/gpu/drm/i915/i915_pci.c
index c3d9b36ef651..603976b9a973 100644
--- a/drivers/gpu/drm/i915/i915_pci.c
+++ b/drivers/gpu/drm/i915/i915_pci.c
@@ -1001,6 +1001,7 @@ static const struct pci_device_id pciidlist[] = {
 	INTEL_JSL_IDS(&jsl_info),
 	INTEL_TGL_12_IDS(&tgl_info),
 	INTEL_RKL_IDS(&rkl_info),
+	INTEL_DG1_IDS(&dg1_info),
 	{0, 0, 0}
 };
 MODULE_DEVICE_TABLE(pci, pciidlist);
-- 
2.26.2

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply related	[flat|nested] 208+ messages in thread

* [RFC PATCH 162/162] drm/i915: drop fake lmem
  2020-11-27 12:04 [RFC PATCH 000/162] DG1 + LMEM enabling Matthew Auld
                   ` (160 preceding siblings ...)
  2020-11-27 12:07 ` [RFC PATCH 161/162] drm/i915/dg1: allow pci to auto probe Matthew Auld
@ 2020-11-27 12:07 ` Matthew Auld
  161 siblings, 0 replies; 208+ messages in thread
From: Matthew Auld @ 2020-11-27 12:07 UTC (permalink / raw)
  To: intel-gfx; +Cc: dri-devel

Signed-off-by: Matthew Auld <matthew.auld@intel.com>
---
 drivers/gpu/drm/i915/i915_drv.c            | 15 ----
 drivers/gpu/drm/i915/i915_params.c         |  5 --
 drivers/gpu/drm/i915/i915_params.h         |  1 -
 drivers/gpu/drm/i915/intel_memory_region.c | 11 +--
 drivers/gpu/drm/i915/intel_region_lmem.c   | 96 ----------------------
 drivers/gpu/drm/i915/intel_region_lmem.h   |  3 -
 6 files changed, 1 insertion(+), 130 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
index c41865d5bf1e..ee7272abc2b4 100644
--- a/drivers/gpu/drm/i915/i915_drv.c
+++ b/drivers/gpu/drm/i915/i915_drv.c
@@ -836,21 +836,6 @@ int i915_driver_probe(struct pci_dev *pdev, const struct pci_device_id *ent)
 	if (!i915->params.nuclear_pageflip && match_info->gen < 5)
 		i915->drm.driver_features &= ~DRIVER_ATOMIC;
 
-	/*
-	 * Check if we support fake LMEM -- for now we only unleash this for
-	 * the live selftests(test-and-exit).
-	 */
-#if IS_ENABLED(CONFIG_DRM_I915_SELFTEST)
-	if (IS_ENABLED(CONFIG_DRM_I915_UNSTABLE_FAKE_LMEM)) {
-		if (INTEL_GEN(i915) >= 9 && i915_selftest.live < 0 &&
-		    i915->params.fake_lmem_start) {
-			mkwrite_device_info(i915)->memory_regions =
-				REGION_SMEM | REGION_LMEM | REGION_STOLEN_SMEM;
-			GEM_BUG_ON(!HAS_LMEM(i915));
-		}
-	}
-#endif
-
 	ret = pci_enable_device(pdev);
 	if (ret)
 		goto out_fini;
diff --git a/drivers/gpu/drm/i915/i915_params.c b/drivers/gpu/drm/i915/i915_params.c
index 9fa58ed76614..819341f77488 100644
--- a/drivers/gpu/drm/i915/i915_params.c
+++ b/drivers/gpu/drm/i915/i915_params.c
@@ -192,11 +192,6 @@ i915_param_named(enable_gvt, bool, 0400,
 	"Enable support for Intel GVT-g graphics virtualization host support(default:false)");
 #endif
 
-#if IS_ENABLED(CONFIG_DRM_I915_UNSTABLE_FAKE_LMEM)
-i915_param_named_unsafe(fake_lmem_start, ulong, 0400,
-	"Fake LMEM start offset (default: 0)");
-#endif
-
 i915_param_named_unsafe(enable_eviction, uint, 0600,
 	"Enable eviction which does not rely on DMA resv refactoring "
 	"0=disabled, 1=memcpy based only, 2=blt based only, "
diff --git a/drivers/gpu/drm/i915/i915_params.h b/drivers/gpu/drm/i915/i915_params.h
index c835e592ee5f..ea6e99735ff2 100644
--- a/drivers/gpu/drm/i915/i915_params.h
+++ b/drivers/gpu/drm/i915/i915_params.h
@@ -70,7 +70,6 @@ struct drm_printer;
 	param(int, fastboot, -1, 0600) \
 	param(int, enable_dpcd_backlight, -1, 0600) \
 	param(char *, force_probe, CONFIG_DRM_I915_FORCE_PROBE, 0400) \
-	param(unsigned long, fake_lmem_start, 0, 0400) \
 	param(unsigned int, lmem_size, 0, 0400) \
 	param(unsigned int, enable_eviction, 3, 0600) \
 	/* leave bools at the end to not create holes */ \
diff --git a/drivers/gpu/drm/i915/intel_memory_region.c b/drivers/gpu/drm/i915/intel_memory_region.c
index 6b26b6cd5958..045efb9b01d9 100644
--- a/drivers/gpu/drm/i915/intel_memory_region.c
+++ b/drivers/gpu/drm/i915/intel_memory_region.c
@@ -447,16 +447,7 @@ int intel_memory_regions_hw_probe(struct drm_i915_private *i915)
 			mem = i915_gem_stolen_setup(i915);
 			break;
 		case INTEL_MEMORY_LOCAL:
-#if IS_ENABLED(CONFIG_DRM_I915_SELFTEST)
-			if (IS_ENABLED(CONFIG_DRM_I915_UNSTABLE_FAKE_LMEM)) {
-				if (INTEL_GEN(i915) >= 9 && i915_selftest.live < 0 &&
-				    i915->params.fake_lmem_start)
-					mem = intel_setup_fake_lmem(i915);
-			}
-#endif
-
-			if (IS_ERR(mem))
-				mem = i915_gem_setup_lmem(i915);
+			mem = i915_gem_setup_lmem(i915);
 			break;
 		}
 
diff --git a/drivers/gpu/drm/i915/intel_region_lmem.c b/drivers/gpu/drm/i915/intel_region_lmem.c
index 1cdb6354b968..e3f5ca619318 100644
--- a/drivers/gpu/drm/i915/intel_region_lmem.c
+++ b/drivers/gpu/drm/i915/intel_region_lmem.c
@@ -9,64 +9,9 @@
 #include "gem/i915_gem_region.h"
 #include "intel_region_lmem.h"
 
-static int init_fake_lmem_bar(struct intel_memory_region *mem)
-{
-	struct drm_i915_private *i915 = mem->i915;
-	struct i915_ggtt *ggtt = &i915->ggtt;
-	unsigned long n;
-	int ret;
-
-	/* We want to 1:1 map the mappable aperture to our reserved region */
-
-	mem->fake_mappable.start = 0;
-	mem->fake_mappable.size = resource_size(&mem->region);
-	mem->fake_mappable.color = I915_COLOR_UNEVICTABLE;
-
-	ret = drm_mm_reserve_node(&ggtt->vm.mm, &mem->fake_mappable);
-	if (ret)
-		return ret;
-
-	mem->remap_addr = dma_map_resource(&i915->drm.pdev->dev,
-					   mem->region.start,
-					   mem->fake_mappable.size,
-					   PCI_DMA_BIDIRECTIONAL,
-					   DMA_ATTR_FORCE_CONTIGUOUS);
-	if (dma_mapping_error(&i915->drm.pdev->dev, mem->remap_addr)) {
-		drm_mm_remove_node(&mem->fake_mappable);
-		return -EINVAL;
-	}
-
-	for (n = 0; n < mem->fake_mappable.size >> PAGE_SHIFT; ++n) {
-		ggtt->vm.insert_page(&ggtt->vm,
-				     mem->remap_addr + (n << PAGE_SHIFT),
-				     n << PAGE_SHIFT,
-				     I915_CACHE_NONE, 0);
-	}
-
-	mem->region = (struct resource)DEFINE_RES_MEM(mem->remap_addr,
-						      mem->fake_mappable.size);
-
-	return 0;
-}
-
-static void release_fake_lmem_bar(struct intel_memory_region *mem)
-{
-	if (!drm_mm_node_allocated(&mem->fake_mappable))
-		return;
-
-	drm_mm_remove_node(&mem->fake_mappable);
-
-	dma_unmap_resource(&mem->i915->drm.pdev->dev,
-			   mem->remap_addr,
-			   mem->fake_mappable.size,
-			   PCI_DMA_BIDIRECTIONAL,
-			   DMA_ATTR_FORCE_CONTIGUOUS);
-}
-
 static void
 region_lmem_release(struct intel_memory_region *mem)
 {
-	release_fake_lmem_bar(mem);
 	io_mapping_fini(&mem->iomap);
 	intel_memory_region_release_buddy(mem);
 }
@@ -76,11 +21,6 @@ region_lmem_init(struct intel_memory_region *mem)
 {
 	int ret;
 
-	if (mem->i915->params.fake_lmem_start) {
-		ret = init_fake_lmem_bar(mem);
-		GEM_BUG_ON(ret);
-	}
-
 	if (!io_mapping_init_wc(&mem->iomap,
 				mem->io_start,
 				resource_size(&mem->region)))
@@ -101,42 +41,6 @@ const struct intel_memory_region_ops intel_region_lmem_ops = {
 	.create_object = __i915_gem_lmem_object_create,
 };
 
-struct intel_memory_region *
-intel_setup_fake_lmem(struct drm_i915_private *i915)
-{
-	struct pci_dev *pdev = i915->drm.pdev;
-	struct intel_memory_region *mem;
-	resource_size_t mappable_end;
-	resource_size_t io_start;
-	resource_size_t start;
-
-	GEM_BUG_ON(i915_ggtt_has_aperture(&i915->ggtt));
-	GEM_BUG_ON(!i915->params.fake_lmem_start);
-
-	/* Your mappable aperture belongs to me now! */
-	mappable_end = pci_resource_len(pdev, 2);
-	io_start = pci_resource_start(pdev, 2),
-	start = i915->params.fake_lmem_start;
-
-	mem = intel_memory_region_create(i915,
-					 start,
-					 mappable_end,
-					 PAGE_SIZE,
-					 io_start,
-					 &intel_region_lmem_ops);
-	if (!IS_ERR(mem)) {
-		drm_info(&i915->drm, "Intel graphics fake LMEM: %pR\n",
-			 &mem->region);
-		drm_info(&i915->drm,
-			 "Intel graphics fake LMEM IO start: %llx\n",
-			(u64)mem->io_start);
-		drm_info(&i915->drm, "Intel graphics fake LMEM size: %llx\n",
-			 (u64)resource_size(&mem->region));
-	}
-
-	return mem;
-}
-
 static void get_legacy_lowmem_region(struct intel_uncore *uncore,
 				     u64 *start, u32 *size)
 {
diff --git a/drivers/gpu/drm/i915/intel_region_lmem.h b/drivers/gpu/drm/i915/intel_region_lmem.h
index 054e729035c1..6dbed8de3ce3 100644
--- a/drivers/gpu/drm/i915/intel_region_lmem.h
+++ b/drivers/gpu/drm/i915/intel_region_lmem.h
@@ -12,7 +12,4 @@ extern const struct intel_memory_region_ops intel_region_lmem_ops;
 
 struct intel_memory_region *i915_gem_setup_lmem(struct drm_i915_private *i915);
 
-struct intel_memory_region *
-intel_setup_fake_lmem(struct drm_i915_private *i915);
-
 #endif /* !__INTEL_REGION_LMEM_H */
-- 
2.26.2

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply related	[flat|nested] 208+ messages in thread

* Re: [Intel-gfx] [RFC PATCH 092/162] drm/i915/uapi: introduce drm_i915_gem_create_ext
  2020-11-27 12:06 ` [RFC PATCH 092/162] drm/i915/uapi: introduce drm_i915_gem_create_ext Matthew Auld
@ 2020-11-27 13:25   ` Chris Wilson
  2020-12-01 15:06     ` Thomas Hellström (Intel)
  2020-11-27 19:21   ` Chris Wilson
  2020-12-01 12:55   ` Chris Wilson
  2 siblings, 1 reply; 208+ messages in thread
From: Chris Wilson @ 2020-11-27 13:25 UTC (permalink / raw)
  To: Matthew Auld, intel-gfx; +Cc: dri-devel

Quoting Matthew Auld (2020-11-27 12:06:08)
> Same old gem_create but now with extensions support. This is needed
> to support various upcoming usecases. For now we use the extensions
> mechanism to support setting an immutable-priority-list of potential
> placements, at creation time.
> 
> If we wish to set the placements/regions we can simply do:
> 
> struct drm_i915_gem_object_param region_param = { … }; /* Unchanged */
> struct drm_i915_gem_create_ext_setparam setparam_region = {
>     .base = { .name = I915_GEM_CREATE_EXT_SETPARAM },
>     .param = region_param,
> }
> 
> struct drm_i915_gem_create_ext create_ext = {
>         .size = 16 * PAGE_SIZE,
>         .extensions = (uintptr_t)&setparam_region,
> };
> int err = ioctl(fd, DRM_IOCTL_I915_GEM_CREATE_EXT, &create_ext);
> if (err) ...
> 
> If we use the normal gem_create or gem_create_ext without the
> extensions/placements then we still get the old behaviour with only
> placing the object in system memory.
> 
> One important change here is the returned size will now be rounded up to
> the correct size, depending on the list of placements, where we might
> have minimum page-size restrictions on some platforms when dealing with
> device local-memory.
> 
> Also, we still keep around the i915_gem_object_setparam ioctl, although
> that is now restricted by the placement list (i.e. we are not allowed to
> add new placements), and longer term that will be going away wrt setting
> placements, since it was deemed that the kernel doesn't need to support
> a dynamic list of placements, which is now solidified by this uapi
> change.
> 
> Testcase: igt/gem_create/create-ext-placement-sanity-check
> Testcase: igt/gem_create/create-ext-placement-each
> Testcase: igt/gem_create/create-ext-placement-all
> Signed-off-by: Matthew Auld <matthew.auld@intel.com>
> Signed-off-by: CQ Tang <cq.tang@intel.com>
> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
> ---
>  drivers/gpu/drm/i915/Makefile                 |   1 +
>  drivers/gpu/drm/i915/gem/i915_gem_create.c    | 398 ++++++++++++++++++
>  drivers/gpu/drm/i915/gem/i915_gem_object.c    |   2 +
>  .../gpu/drm/i915/gem/i915_gem_object_types.h  |   9 +
>  drivers/gpu/drm/i915/gem/i915_gem_region.c    |   4 +
>  drivers/gpu/drm/i915/i915_drv.c               |   2 +-
>  drivers/gpu/drm/i915/i915_gem.c               | 103 +----
>  drivers/gpu/drm/i915/intel_memory_region.c    |  20 +
>  drivers/gpu/drm/i915/intel_memory_region.h    |   4 +
>  include/uapi/drm/i915_drm.h                   |  60 +++
>  10 files changed, 500 insertions(+), 103 deletions(-)
>  create mode 100644 drivers/gpu/drm/i915/gem/i915_gem_create.c
> 
> diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile
> index ec361d61230b..3955134feca7 100644
> --- a/drivers/gpu/drm/i915/Makefile
> +++ b/drivers/gpu/drm/i915/Makefile
> @@ -134,6 +134,7 @@ gem-y += \
>         gem/i915_gem_clflush.o \
>         gem/i915_gem_client_blt.o \
>         gem/i915_gem_context.o \
> +       gem/i915_gem_create.o \
>         gem/i915_gem_dmabuf.o \
>         gem/i915_gem_domain.o \
>         gem/i915_gem_execbuffer.o \
> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_create.c b/drivers/gpu/drm/i915/gem/i915_gem_create.c
> new file mode 100644
> index 000000000000..6f6dd4f1ce7e
> --- /dev/null
> +++ b/drivers/gpu/drm/i915/gem/i915_gem_create.c
> @@ -0,0 +1,398 @@
> +// SPDX-License-Identifier: MIT
> +/*
> + * Copyright © 2020 Intel Corporation
> + */
> +
> +#include "gem/i915_gem_ioctls.h"
> +#include "gem/i915_gem_lmem.h"
> +#include "gem/i915_gem_object_blt.h"
> +#include "gem/i915_gem_region.h"
> +
> +#include "i915_drv.h"
> +#include "i915_user_extensions.h"
> +
> +static u32 max_page_size(struct intel_memory_region **placements,
> +                        int n_placements)
> +{
> +       u32 max_page_size = 0;
> +       int i;
> +
> +       for (i = 0; i < n_placements; ++i) {
> +               max_page_size = max_t(u32, max_page_size,
> +                                     placements[i]->min_page_size);
> +       }
> +
> +       GEM_BUG_ON(!max_page_size);
> +       return max_page_size;
> +}
> +
> +static int
> +i915_gem_create(struct drm_file *file,
> +               struct intel_memory_region **placements,
> +               int n_placements,
> +               u64 *size_p,
> +               u32 *handle_p)
> +{
> +       struct drm_i915_gem_object *obj;
> +       u32 handle;
> +       u64 size;
> +       int ret;
> +
> +       size = round_up(*size_p, max_page_size(placements, n_placements));
> +       if (size == 0)
> +               return -EINVAL;
> +
> +       /* For most of the ABI (e.g. mmap) we think in system pages */
> +       GEM_BUG_ON(!IS_ALIGNED(size, PAGE_SIZE));
> +
> +       /* Allocate the new object */
> +       obj = i915_gem_object_create_region(placements[0], size, 0);
> +       if (IS_ERR(obj))
> +               return PTR_ERR(obj);
> +
> +       if (i915_gem_object_is_lmem(obj)) {
> +               struct intel_gt *gt = obj->mm.region->gt;
> +               struct intel_context *ce = gt->engine[BCS0]->blitter_context;
> +
> +               /*
> +                * XXX: We really want to move this to get_pages(), but we
> +                * require grabbing the BKL for the blitting operation which is
> +                * annoying. In the pipeline is support for async get_pages()
> +                * which should fit nicely for this. Also note that the actual
> +                * clear should be done async(we currently do an object_wait
> +                * which is pure garbage), we just need to take care if
> +                * userspace opts of implicit sync for the execbuf, to avoid any
> +                * potential info leak.
> +                */

Not just XXX, but the design should be completed first.
-Chris
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 208+ messages in thread

* Re: [RFC PATCH 093/162] drm/i915/lmem: allocate cmd ring in lmem
  2020-11-27 12:06 ` [RFC PATCH 093/162] drm/i915/lmem: allocate cmd ring in lmem Matthew Auld
@ 2020-11-27 13:27   ` Chris Wilson
  0 siblings, 0 replies; 208+ messages in thread
From: Chris Wilson @ 2020-11-27 13:27 UTC (permalink / raw)
  To: Matthew Auld, intel-gfx; +Cc: Michel Thierry, Abdiel Janulgue, dri-devel

Quoting Matthew Auld (2020-11-27 12:06:09)
> From: Michel Thierry <michel.thierry@intel.com>
> 
> Signed-off-by: Michel Thierry <michel.thierry@intel.com>
> Signed-off-by: Matthew Auld <matthew.auld@intel.com>
> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
> Cc: Abdiel Janulgue <abdiel.janulgue@linux.intel.com>
> ---
>  drivers/gpu/drm/i915/gt/intel_ring.c | 15 +++++++++++----
>  1 file changed, 11 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/gt/intel_ring.c b/drivers/gpu/drm/i915/gt/intel_ring.c
> index d636c6ed88b7..aa75e644f3f2 100644
> --- a/drivers/gpu/drm/i915/gt/intel_ring.c
> +++ b/drivers/gpu/drm/i915/gt/intel_ring.c
> @@ -4,6 +4,7 @@
>   * Copyright © 2019 Intel Corporation
>   */
>  
> +#include "gem/i915_gem_lmem.h"
>  #include "gem/i915_gem_object.h"
>  #include "i915_drv.h"
>  #include "i915_vma.h"
> @@ -111,10 +112,16 @@ static struct i915_vma *create_ring_vma(struct i915_ggtt *ggtt, int size)
>         struct i915_vma *vma;
>  
>         obj = ERR_PTR(-ENODEV);
> -       if (i915_ggtt_has_aperture(ggtt))
> -               obj = i915_gem_object_create_stolen(i915, size);
> -       if (IS_ERR(obj))
> -               obj = i915_gem_object_create_internal(i915, size);
> +       if (HAS_LMEM(i915)) {
> +               obj = i915_gem_object_create_lmem(i915, size,
> +                                                 I915_BO_ALLOC_CONTIGUOUS |
> +                                                 I915_BO_ALLOC_VOLATILE);

Just create, and keep trying when !lmem returns an error.

Why contiguous, it's vmapped anyway?
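
i.e. something like this (rough sketch, untested; same helpers as in the
patch, just reordered as a fallback chain and without ALLOC_CONTIGUOUS):

	obj = ERR_PTR(-ENODEV);
	if (HAS_LMEM(i915))
		obj = i915_gem_object_create_lmem(i915, size,
						  I915_BO_ALLOC_VOLATILE);
	if (IS_ERR(obj) && i915_ggtt_has_aperture(ggtt))
		obj = i915_gem_object_create_stolen(i915, size);
	if (IS_ERR(obj))
		obj = i915_gem_object_create_internal(i915, size);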

> +       } else {
> +               if (i915_ggtt_has_aperture(ggtt))
> +                       obj = i915_gem_object_create_stolen(i915, size);
> +               if (IS_ERR(obj))
> +                       obj = i915_gem_object_create_internal(i915, size);
> +       }
>         if (IS_ERR(obj))
>                 return ERR_CAST(obj);
>  
> -- 
> 2.26.2
> 
> _______________________________________________
> dri-devel mailing list
> dri-devel@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/dri-devel
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 208+ messages in thread

* Re: [Intel-gfx] [RFC PATCH 097/162] drm/i915: Distinction of memory regions
  2020-11-27 12:06 ` [RFC PATCH 097/162] drm/i915: Distinction of memory regions Matthew Auld
@ 2020-11-27 13:30   ` Chris Wilson
  0 siblings, 0 replies; 208+ messages in thread
From: Chris Wilson @ 2020-11-27 13:30 UTC (permalink / raw)
  To: Matthew Auld, intel-gfx; +Cc: Adam Miszczak, dri-devel

Quoting Matthew Auld (2020-11-27 12:06:13)
> From: Zbigniew Kempczyński <zbigniew.kempczynski@intel.com>
> 
> IGTs should be able to choose a testing strategy depending on the memory
> regions and their sizes. Add the region instance number to make this
> easier and more descriptive.
> 
> Cc: Matthew Auld <matthew.auld@intel.com>
> Cc: Ramalingam C <ramalingam.c@intel.com>
> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> Cc: Adam Miszczak <adam.miszczak@intel.com>
> Signed-off-by: Zbigniew Kempczyński <zbigniew.kempczynski@intel.com>
> ---
>  drivers/gpu/drm/i915/intel_memory_region.c | 4 ++++
>  1 file changed, 4 insertions(+)
> 
> diff --git a/drivers/gpu/drm/i915/intel_memory_region.c b/drivers/gpu/drm/i915/intel_memory_region.c
> index 1f26bc06ec20..cea44ddebe46 100644
> --- a/drivers/gpu/drm/i915/intel_memory_region.c
> +++ b/drivers/gpu/drm/i915/intel_memory_region.c
> @@ -329,6 +329,10 @@ int intel_memory_regions_hw_probe(struct drm_i915_private *i915)
>                 mem->instance = instance;
>                 mem->gt = &i915->gt;
>  
> +               if (HAS_LMEM(mem->i915) && type != INTEL_MEMORY_SYSTEM)
> +                       intel_memory_region_set_name(mem, "%s%u",
> +                                                    mem->name, mem->instance);

sprintf(mem->name, "%s", mem->name)

is that even defined behaviour?
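
Something along these lines would avoid the overlapping copy (rough
sketch, untested; assumes mem->name is a fixed-size array):

	if (HAS_LMEM(mem->i915) && type != INTEL_MEMORY_SYSTEM) {
		char tmp[sizeof(mem->name)];

		snprintf(tmp, sizeof(tmp), "%s%u",
			 mem->name, mem->instance);
		memcpy(mem->name, tmp, sizeof(tmp));
	}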
-Chris
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 208+ messages in thread

* Re: [RFC PATCH 098/162] drm/i915/gtt: map the PD up front
  2020-11-27 12:06 ` [RFC PATCH 098/162] drm/i915/gtt: map the PD up front Matthew Auld
@ 2020-11-27 13:31   ` Chris Wilson
  2021-01-12 10:47     ` [Intel-gfx] " Matthew Auld
  0 siblings, 1 reply; 208+ messages in thread
From: Chris Wilson @ 2020-11-27 13:31 UTC (permalink / raw)
  To: Matthew Auld, intel-gfx; +Cc: dri-devel

Quoting Matthew Auld (2020-11-27 12:06:14)
> We need to generalise our accessor for the page directories and tables from
> using the simple kmap_atomic to support local memory, and this setup
> must be done on acquisition of the backing storage prior to entering
> fence execution contexts. Here we replace the kmap with the object
> mapping code, which for a simple single-page shmemfs object will return a
> plain kmap that is then kept for the lifetime of the page directory.
> 
> Signed-off-by: Matthew Auld <matthew.auld@intel.com>
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>

We are going to really struggle with this on 32b :(
-Chris
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 208+ messages in thread

* Re: [Intel-gfx] [RFC PATCH 101/162] drm/i915/gtt/dg1: add PTE_LM plumbing for PPGTT
  2020-11-27 12:06 ` [RFC PATCH 101/162] drm/i915/gtt/dg1: add PTE_LM plumbing for PPGTT Matthew Auld
@ 2020-11-27 13:35   ` Chris Wilson
  0 siblings, 0 replies; 208+ messages in thread
From: Chris Wilson @ 2020-11-27 13:35 UTC (permalink / raw)
  To: Matthew Auld, intel-gfx; +Cc: Abdiel Janulgue, dri-devel

Quoting Matthew Auld (2020-11-27 12:06:17)
> For the PTEs we get an LM bit, to signal whether the page resides in
> SMEM or LMEM.
> 
> Signed-off-by: Matthew Auld <matthew.auld@intel.com>
> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
> Cc: Abdiel Janulgue <abdiel.janulgue@linux.intel.com>
> Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
> Signed-off-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
> Signed-off-by: Venkata Sandeep Dhanalakota <venkata.s.dhanalakota@intel.com>
> ---
>  drivers/gpu/drm/i915/gt/gen8_ppgtt.c  | 35 ++++++++++++++++++++++-----
>  drivers/gpu/drm/i915/gt/intel_gtt.h   |  3 +++
>  drivers/gpu/drm/i915/gt/intel_ppgtt.c |  4 +++
>  3 files changed, 36 insertions(+), 6 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/gt/gen8_ppgtt.c b/drivers/gpu/drm/i915/gt/gen8_ppgtt.c
> index e2f1dfc48d43..b6fcebeef02a 100644
> --- a/drivers/gpu/drm/i915/gt/gen8_ppgtt.c
> +++ b/drivers/gpu/drm/i915/gt/gen8_ppgtt.c
> @@ -5,6 +5,7 @@
>  
>  #include <linux/log2.h>
>  
> +#include "gem/i915_gem_lmem.h"
>  #include "gen8_ppgtt.h"
>  #include "i915_scatterlist.h"
>  #include "i915_trace.h"
> @@ -50,6 +51,21 @@ static u64 gen8_pte_encode(dma_addr_t addr,
>         return pte;
>  }
>  
> +static u64 gen12_pte_encode(dma_addr_t addr,
> +                           enum i915_cache_level level,
> +                           u32 flags)
> +{
> +       gen8_pte_t pte = addr | _PAGE_PRESENT | _PAGE_RW;
> +
> +       if (unlikely(flags & PTE_READ_ONLY))
> +               pte &= ~_PAGE_RW;
> +
> +       if (flags & PTE_LM)
> +               pte |= GEN12_PPGTT_PTE_LM;
> +
> +       return pte;
> +}
> +
>  static void gen8_ppgtt_notify_vgt(struct i915_ppgtt *ppgtt, bool create)
>  {
>         struct drm_i915_private *i915 = ppgtt->vm.i915;
> @@ -365,7 +381,7 @@ gen8_ppgtt_insert_pte(struct i915_ppgtt *ppgtt,
>                       u32 flags)
>  {
>         struct i915_page_directory *pd;
> -       const gen8_pte_t pte_encode = gen8_pte_encode(0, cache_level, flags);
> +       const gen8_pte_t pte_encode = ppgtt->vm.pte_encode(0, cache_level, flags);

We don't need the vfunc, since that flag will not be sent for gen8.

That bit test will be cheaper than the retpoline.
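
Rough sketch of folding the test into the existing encoder instead
(untested; the gen8 cache-level handling is elided here, not changed):

	static u64 gen8_pte_encode(dma_addr_t addr,
				   enum i915_cache_level level,
				   u32 flags)
	{
		gen8_pte_t pte = addr | _PAGE_PRESENT | _PAGE_RW;

		if (unlikely(flags & PTE_READ_ONLY))
			pte &= ~_PAGE_RW;

		/* never set by gen8 callers, so the test is free there */
		if (flags & PTE_LM)
			pte |= GEN12_PPGTT_PTE_LM;

		/* ... existing cache-level handling ... */

		return pte;
	}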
-Chris
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 208+ messages in thread

* Re: [Intel-gfx] [RFC PATCH 103/162] drm/i915: allocate context from LMEM
  2020-11-27 12:06 ` [RFC PATCH 103/162] drm/i915: allocate context from LMEM Matthew Auld
@ 2020-11-27 13:37   ` Chris Wilson
  0 siblings, 0 replies; 208+ messages in thread
From: Chris Wilson @ 2020-11-27 13:37 UTC (permalink / raw)
  To: Matthew Auld, intel-gfx; +Cc: Abdiel Janulgue, dri-devel

Quoting Matthew Auld (2020-11-27 12:06:19)
> Based on a patch from Michel Thierry.
> 
> Signed-off-by: Matthew Auld <matthew.auld@intel.com>
> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
> Cc: Abdiel Janulgue <abdiel.janulgue@linux.intel.com>
> ---
>  .../drm/i915/gt/intel_execlists_submission.c  | 31 ++++++++++++++++++-
>  1 file changed, 30 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/i915/gt/intel_execlists_submission.c b/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
> index 582a9044727e..c640b90711fd 100644
> --- a/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
> +++ b/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
> @@ -108,6 +108,8 @@
>   */
>  #include <linux/interrupt.h>
>  
> +#include "gem/i915_gem_lmem.h"
> +
>  #include "i915_drv.h"
>  #include "i915_perf.h"
>  #include "i915_trace.h"
> @@ -4660,6 +4662,21 @@ static struct intel_timeline *pinned_timeline(struct intel_context *ce)
>                                                  page_unmask_bits(tl));
>  }
>  
> +static int context_clear_lmem(struct drm_i915_gem_object *ctx_obj)
> +{
> +       void *vaddr;
> +
> +       vaddr = i915_gem_object_pin_map(ctx_obj, I915_MAP_WC);
> +       if (IS_ERR(vaddr))
> +               return PTR_ERR(vaddr);
> +
> +       memset64(vaddr, 0, ctx_obj->base.size / sizeof(u64));
> +
> +       i915_gem_object_unpin_map(ctx_obj);

What? We copy over the entire object with the default state.
-Chris
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 208+ messages in thread

* Re: [Intel-gfx] [RFC PATCH 118/162] drm/i915/dg1: Reserve first 1MB of local memory
  2020-11-27 12:06 ` [RFC PATCH 118/162] drm/i915/dg1: Reserve first 1MB of local memory Matthew Auld
@ 2020-11-27 13:52   ` Chris Wilson
  2020-11-30 11:09     ` Matthew Auld
  0 siblings, 1 reply; 208+ messages in thread
From: Chris Wilson @ 2020-11-27 13:52 UTC (permalink / raw)
  To: Matthew Auld, intel-gfx; +Cc: dri-devel

Quoting Matthew Auld (2020-11-27 12:06:34)
> From: Imre Deak <imre.deak@intel.com>
> 
> On DG1 A0/B0 steppings the first 1MB of local memory must be reserved.
> One reason for this is that the 0xA0000-0xB0000 range is not accessible
> by the display, probably since this region is redirected to another
> memory location for legacy VGA compatibility.
> 
> BSpec: 50586
> Testcase: igt/kms_big_fb/linear-64bpp-rotate-0
> Signed-off-by: Imre Deak <imre.deak@intel.com>
> ---
>  drivers/gpu/drm/i915/intel_region_lmem.c | 52 ++++++++++++++++++++++++
>  1 file changed, 52 insertions(+)
> 
> diff --git a/drivers/gpu/drm/i915/intel_region_lmem.c b/drivers/gpu/drm/i915/intel_region_lmem.c
> index 939cf0d195a5..eafef7034680 100644
> --- a/drivers/gpu/drm/i915/intel_region_lmem.c
> +++ b/drivers/gpu/drm/i915/intel_region_lmem.c
> @@ -137,6 +137,48 @@ intel_setup_fake_lmem(struct drm_i915_private *i915)
>         return mem;
>  }
>  
> +static void get_legacy_lowmem_region(struct intel_uncore *uncore,
> +                                    u64 *start, u32 *size)
> +{
> +       *start = 0;
> +       *size = 0;
> +
> +       if (!IS_DG1_REVID(uncore->i915, DG1_REVID_A0, DG1_REVID_B0))
> +               return;
> +
> +       *size = SZ_1M;
> +
> +       DRM_DEBUG_DRIVER("LMEM: reserved legacy low-memory [0x%llx-0x%llx]\n",
> +                        *start, *start + *size);
> +}
> +
> +static int reserve_lowmem_region(struct intel_uncore *uncore,
> +                                struct intel_memory_region *mem)
> +{
> +       u64 reserve_start;
> +       u64 reserve_end;
> +       u64 region_start;
> +       u32 region_size;
> +       int ret;
> +
> +       get_legacy_lowmem_region(uncore, &region_start, &region_size);
> +       reserve_start = region_start;
> +       reserve_end = region_start + region_size;
> +
> +       if (!reserve_end)
> +               return 0;
> +
> +       DRM_INFO("LMEM: reserving low-memory region [0x%llx-0x%llx]\n",
> +                reserve_start, reserve_end);
> +       ret = i915_buddy_alloc_range(&mem->mm, &mem->reserved,
> +                                    reserve_start,
> +                                    reserve_end - reserve_start);

Isn't this now relative to the stolen offset? Should this be reserved,
or excluded like stolen?
-Chris
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 208+ messages in thread

* Re: [Intel-gfx] [RFC PATCH 124/162] drm/i915/lmem: allocate HWSP in lmem
  2020-11-27 12:06 ` [RFC PATCH 124/162] drm/i915/lmem: allocate HWSP in lmem Matthew Auld
@ 2020-11-27 13:55   ` Chris Wilson
  2020-11-30 17:17     ` Matthew Auld
  0 siblings, 1 reply; 208+ messages in thread
From: Chris Wilson @ 2020-11-27 13:55 UTC (permalink / raw)
  To: Matthew Auld, intel-gfx; +Cc: Abdiel Janulgue, Michel Thierry, dri-devel

Quoting Matthew Auld (2020-11-27 12:06:40)
> From: Michel Thierry <michel.thierry@intel.com>

Rationale goes here.

Is this wise? HWSP is very frequently read by the CPU, and expected to
be cached on the CPU.

What do the performance profiles indicate?
-Chris
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 208+ messages in thread

* Re: [Intel-gfx] [RFC PATCH 125/162] drm/i915/lmem: Limit block size to 4G
  2020-11-27 12:06 ` [RFC PATCH 125/162] drm/i915/lmem: Limit block size to 4G Matthew Auld
@ 2020-11-27 14:02   ` Chris Wilson
  0 siblings, 0 replies; 208+ messages in thread
From: Chris Wilson @ 2020-11-27 14:02 UTC (permalink / raw)
  To: Matthew Auld, intel-gfx; +Cc: dri-devel

Quoting Matthew Auld (2020-11-27 12:06:41)
> From: Venkata Sandeep Dhanalakota <venkata.s.dhanalakota@intel.com>
> 
> When allocating pages for an lmem object of size 4G or greater,
> we allocate memory blocks from the buddy system.

Any lmem object is from the buddy system.

> In this scenario the
> buddy system can allocate blocks of size >= 4G, and such blocks
> need more than 32 bits to represent their size. With these blocks
> we run into an issue with sg list construction, because the
> sg->length field is only 32 bits wide.

Just say that when using a scatterlist, the maximum segment size is 4G. In
fact, we can ask sg what the backend maximum is, and use that as our max
order.

The only question is whether this merits a flag, or we just assume that
the buddy allocator is only used for objects and so always presented via
sg?
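
Roughly (untested; max_segment here is just a stand-in for whatever
limit the sg backend actually reports):

	/* cap the buddy order by what one sg segment can hold */
	unsigned int max_segment = round_down(UINT_MAX, PAGE_SIZE);
	unsigned int max_order = ilog2(max_segment >> PAGE_SHIFT);

	order = min_t(unsigned int, order, max_order);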
-Chris
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 208+ messages in thread

* Re: [RFC PATCH 126/162] drm/i915/gem: Update shmem available memory
  2020-11-27 12:06 ` [RFC PATCH 126/162] drm/i915/gem: Update shmem available memory Matthew Auld
@ 2020-11-27 14:04   ` Chris Wilson
  0 siblings, 0 replies; 208+ messages in thread
From: Chris Wilson @ 2020-11-27 14:04 UTC (permalink / raw)
  To: Matthew Auld, intel-gfx
  Cc: Bommu Krishnaiah, Zbigniew Kempczyński, CQ Tang, dri-devel

Quoting Matthew Auld (2020-11-27 12:06:42)
> From: Bommu Krishnaiah <krishnaiah.bommu@intel.com>
> 
> Update shmem available memory in “intel_memory_region”

Was avail ever set?
-Chris
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 208+ messages in thread

* Re: [Intel-gfx] [RFC PATCH 128/162] drm/i915/dg1: intel_memory_region_evict() changes for eviction
  2020-11-27 12:06 ` [RFC PATCH 128/162] drm/i915/dg1: intel_memory_region_evict() changes for eviction Matthew Auld
@ 2020-11-27 14:07   ` Chris Wilson
  0 siblings, 0 replies; 208+ messages in thread
From: Chris Wilson @ 2020-11-27 14:07 UTC (permalink / raw)
  To: Matthew Auld, intel-gfx; +Cc: dri-devel

Quoting Matthew Auld (2020-11-27 12:06:44)
> From: CQ Tang <cq.tang@intel.com>
> 
> Function i915_gem_shrink_memory_region() is changed to
> intel_memory_region_evict() and moved from i915_gem_shrinker.c
> to intel_memory_region.c. This function is used to handle local
> memory swapping, in addition to evicting purgeable objects.

We really do not want to conflate the system shrinker with eviction.
Reservation based eviction looks nothing like the shrinker.
-Chris
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 208+ messages in thread

* Re: [Intel-gfx] [RFC PATCH 133/162] drm/i915/dg1: Track swap in/out stats via debugfs
  2020-11-27 12:06 ` [RFC PATCH 133/162] drm/i915/dg1: Track swap in/out stats via debugfs Matthew Auld
@ 2020-11-27 14:09   ` Chris Wilson
  0 siblings, 0 replies; 208+ messages in thread
From: Chris Wilson @ 2020-11-27 14:09 UTC (permalink / raw)
  To: Matthew Auld, intel-gfx; +Cc: dri-devel

Quoting Matthew Auld (2020-11-27 12:06:49)
> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> index 1366b53ac8c9..7b1e95d494e6 100644
> --- a/drivers/gpu/drm/i915/i915_drv.h
> +++ b/drivers/gpu/drm/i915/i915_drv.h
> @@ -1214,6 +1214,9 @@ struct drm_i915_private {
>          * NOTE: This is the dri1/ums dungeon, don't add stuff here. Your patch
>          * will be rejected. Instead look for a better place.
>          */
> +
> +       atomic_long_t num_bytes_swapped_out;
> +       atomic_long_t num_bytes_swapped_in;

Enough said. Don't mindlessly add fields.
-Chris
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 208+ messages in thread

* Re: [Intel-gfx] [RFC PATCH 134/162] drm/i915/dg1: Measure swap in/out timing stats
  2020-11-27 12:06 ` [RFC PATCH 134/162] drm/i915/dg1: Measure swap in/out timing stats Matthew Auld
@ 2020-11-27 14:11   ` Chris Wilson
  0 siblings, 0 replies; 208+ messages in thread
From: Chris Wilson @ 2020-11-27 14:11 UTC (permalink / raw)
  To: Matthew Auld, intel-gfx; +Cc: dri-devel

Quoting Matthew Auld (2020-11-27 12:06:50)
> From: Sudeep Dutt <sudeep.dutt@intel.com>
> 
> Signed-off-by: Sudeep Dutt <sudeep.dutt@intel.com>
> ---
>  drivers/gpu/drm/i915/gem/i915_gem_region.c | 16 ++++++++++++++--
>  drivers/gpu/drm/i915/i915_debugfs.c        |  3 +++
>  drivers/gpu/drm/i915/i915_drv.h            |  2 ++
>  3 files changed, 19 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_region.c b/drivers/gpu/drm/i915/gem/i915_gem_region.c
> index ed108dbcb34e..4fab9f6b4bee 100644
> --- a/drivers/gpu/drm/i915/gem/i915_gem_region.c
> +++ b/drivers/gpu/drm/i915/gem/i915_gem_region.c
> @@ -15,6 +15,7 @@ i915_gem_object_swapout_pages(struct drm_i915_gem_object *obj,
>  {
>         struct drm_i915_private *i915 = to_i915(obj->base.dev);
>         struct drm_i915_gem_object *dst, *src;
> +       unsigned long start, diff, msec;
>         int err;
>  
>         GEM_BUG_ON(obj->swapto);
> @@ -24,6 +25,7 @@ i915_gem_object_swapout_pages(struct drm_i915_gem_object *obj,
>         GEM_BUG_ON(!i915->params.enable_eviction);
>  
>         assert_object_held(obj);
> +       start = jiffies;
>  
>         /* create a shadow object on smem region */
>         dst = i915_gem_object_create_shmem(i915, obj->base.size);
> @@ -64,8 +66,12 @@ i915_gem_object_swapout_pages(struct drm_i915_gem_object *obj,
>         else
>                 i915_gem_object_put(dst);
>  
> -       if (!err)
> +       if (!err) {
> +               diff = jiffies - start;
> +               msec = diff * 1000 / HZ;
> +               atomic_long_add(msec, &i915->time_swap_out_ms);
>                 atomic_long_add(sizes, &i915->num_bytes_swapped_out);
> +       }

This can be done using a kprobe, and with prettier statistics as builtin
functionality.
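
E.g. a minimal out-of-tree kretprobe module along these lines (untested
sketch; the symbol name is taken from this patch and must be visible to
kallsyms for this to work):

	#include <linux/kprobes.h>
	#include <linux/ktime.h>
	#include <linux/module.h>

	struct swap_ts { ktime_t start; };

	static int swapout_entry(struct kretprobe_instance *ri,
				 struct pt_regs *regs)
	{
		/* record entry time in per-instance scratch space */
		((struct swap_ts *)ri->data)->start = ktime_get();
		return 0;
	}

	static int swapout_ret(struct kretprobe_instance *ri,
			       struct pt_regs *regs)
	{
		struct swap_ts *ts = (struct swap_ts *)ri->data;

		pr_info("swapout took %lldus\n",
			ktime_us_delta(ktime_get(), ts->start));
		return 0;
	}

	static struct kretprobe swapout_probe = {
		.kp.symbol_name	= "i915_gem_object_swapout_pages",
		.entry_handler	= swapout_entry,
		.handler	= swapout_ret,
		.data_size	= sizeof(struct swap_ts),
	};

	static int __init probe_init(void)
	{
		return register_kretprobe(&swapout_probe);
	}
	module_init(probe_init);

	static void __exit probe_exit(void)
	{
		unregister_kretprobe(&swapout_probe);
	}
	module_exit(probe_exit);
	MODULE_LICENSE("GPL");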
-Chris
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 208+ messages in thread

* Re: [Intel-gfx] [RFC PATCH 137/162] drm/i915: blt copy between objs using pre-created vma windows
  2020-11-27 12:06 ` [RFC PATCH 137/162] drm/i915: blt copy between objs using pre-created vma windows Matthew Auld
@ 2020-11-27 14:19   ` Chris Wilson
  0 siblings, 0 replies; 208+ messages in thread
From: Chris Wilson @ 2020-11-27 14:19 UTC (permalink / raw)
  To: Matthew Auld, intel-gfx; +Cc: Jani Nikula, Daniel Vetter, dri-devel

Quoting Matthew Auld (2020-11-27 12:06:53)
> +int i915_window_blt_copy(struct drm_i915_gem_object *dst,
> +                        struct drm_i915_gem_object *src)
> +{
> +       struct drm_i915_private *i915 = to_i915(src->base.dev);
> +       struct intel_context *ce = i915->gt.engine[BCS0]->blitter_context;
> +       bool src_is_lmem = i915_gem_object_is_lmem(src);
> +       bool dst_is_lmem = i915_gem_object_is_lmem(dst);
> +       struct scatterlist *last_sgl;
> +       struct i915_vma *src_vma, *dst_vma;
> +       struct i915_request *rq;
> +       u64 cur_win_sz, blt_copied, offset;
> +       long timeout;
> +       u32 size;
> +       int err;
> +
> +       src_vma = src_is_lmem ? i915->mm.lmem_window[0] :
> +                               i915->mm.smem_window[0];
> +       dst_vma = dst_is_lmem ? i915->mm.lmem_window[1] :
> +                               i915->mm.smem_window[1];
> +
> +       if (!src_vma || !dst_vma)
> +               return -ENODEV;
> +
> +       blt_copied = 0;
> +
> +       err = i915_window_blt_copy_prepare_obj(src);
> +       if (err)
> +               return err;
> +
> +       err = i915_window_blt_copy_prepare_obj(dst);
> +       if (err) {
> +               i915_gem_object_unpin_pages(src);
> +               return err;
> +       }
> +
> +       mutex_lock(&i915->mm.window_mutex);
> +       src_vma->obj = src;
> +       dst_vma->obj = dst;
> +       do {
> +               cur_win_sz = min_t(u64, BLT_WINDOW_SZ,
> +                                  (src->base.size - blt_copied));
> +               offset = blt_copied >> PAGE_SHIFT;
> +               size = ALIGN(cur_win_sz, src->mm.region->min_page_size) >>
> +                      PAGE_SHIFT;
> +               intel_partial_pages_for_sg_table(src, src_vma->pages, offset,
> +                                                size, &last_sgl);
> +
> +               /*
> +                * Inserting pages into the vm expects the pages to cover
> +                * the full length of the VMA, but we may have pages for
> +                * less than vma_size. Hence we alter the vma size to match
> +                * the total size of the pages attached.
> +                */
> +               src_vma->size = size << PAGE_SHIFT;
> +               i915_insert_vma_pages(src_vma, src_is_lmem);
> +               sg_unmark_end(last_sgl);
> +
> +               /*
> +                * Source obj size could be smaller than the dst obj size,
> +                * due to the varying min_page_size of the mem regions the
> +                * obj belongs to. But when we insert the pages into the vm,
> +                * the total size of the pages is supposed to be a multiple
> +                * of the min page size of that mem region.
> +                */
> +               size = ALIGN(cur_win_sz, dst->mm.region->min_page_size) >>
> +                      PAGE_SHIFT;
> +               intel_partial_pages_for_sg_table(dst, dst_vma->pages, offset,
> +                                                size, &last_sgl);
> +
> +               dst_vma->size = size << PAGE_SHIFT;
> +               i915_insert_vma_pages(dst_vma, dst_is_lmem);
> +               sg_unmark_end(last_sgl);
> +
> +               rq = i915_request_create(ce);
> +               if (IS_ERR(rq)) {
> +                       err = PTR_ERR(rq);
> +                       break;
> +               }
> +               if (rq->engine->emit_init_breadcrumb) {
> +                       err = rq->engine->emit_init_breadcrumb(rq);
> +                       if (unlikely(err)) {
> +                               DRM_ERROR("init_breadcrumb failed. %d\n", err);
> +                               break;
> +                       }
> +               }
> +               err = i915_window_blt_copy_batch_prepare(rq, src_vma, dst_vma,
> +                                                        cur_win_sz);
> +               if (err) {
> +                       DRM_ERROR("Batch preparation failed. %d\n", err);
> +                       i915_request_set_error_once(rq, -EIO);
> +               }
> +
> +               i915_request_get(rq);
> +               i915_request_add(rq);
> +
> +               timeout = i915_request_wait(rq, 0, MAX_SCHEDULE_TIMEOUT);

Locked waits.

> +               if (timeout < 0) {
> +                       DRM_ERROR("BLT Request is not completed. %ld\n",
> +                                 timeout);
> +                       err = timeout;
> +                       i915_request_put(rq);
> +                       break;
> +               }
> +
> +               blt_copied += cur_win_sz;
> +               err = 0;
> +               i915_request_put(rq);
> +               flush_work(&i915->gt.engine[BCS0]->retire_work);

Papering (doubtful the paper is successful) over bugs by introducing a
whole load more.

This fails the basic premise that eviction must be pipelined. The PTE
are transient and can be written prior to the copy and kept within the
non-preemptible window of the blt. Thus allowing many evictions to
scheduled in parallel (by either allocating separate contexts, or more
preferably picking a user context).
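
Roughly (pseudocode sketch; emit_window_ptes()/emit_window_copy() are
hypothetical stand-ins for the PTE write and the copy emission):

	for (offset = 0; offset < size; offset += BLT_WINDOW_SZ) {
		rq = i915_request_create(ce);
		if (IS_ERR(rq))
			return PTR_ERR(rq);

		/* PTEs written ahead of the copy, inside the same
		 * non-preemptible window as the blt */
		err = emit_window_ptes(rq, src_vma, dst_vma, offset);
		if (!err)
			err = emit_window_copy(rq, src_vma, dst_vma, offset);

		i915_request_add(rq); /* same timeline, runs in order */
		if (err)
			return err;
	}
	/* no i915_request_wait() per chunk; the final fence is enough */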
-Chris
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 208+ messages in thread

* Re: [Intel-gfx] [RFC PATCH 140/162] drm/i915: window_blt_copy is used for swapin and swapout
  2020-11-27 12:06 ` [RFC PATCH 140/162] drm/i915: window_blt_copy is used for swapin and swapout Matthew Auld
@ 2020-11-27 14:20   ` Chris Wilson
  0 siblings, 0 replies; 208+ messages in thread
From: Chris Wilson @ 2020-11-27 14:20 UTC (permalink / raw)
  To: Matthew Auld, intel-gfx; +Cc: dri-devel

Quoting Matthew Auld (2020-11-27 12:06:56)
> From: Ramalingam C <ramalingam.c@intel.com>
> 
> The window_blt_copy feature is used for swapin and swapout, based on the
> i915 module parameter called enable_eviction.

A module parameter?
-Chris
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 208+ messages in thread

* Re: [Intel-gfx] [RFC PATCH 141/162] drm/i915: Lmem eviction statistics by category
  2020-11-27 12:06 ` [RFC PATCH 141/162] drm/i915: Lmem eviction statistics by category Matthew Auld
@ 2020-11-27 14:21   ` Chris Wilson
  0 siblings, 0 replies; 208+ messages in thread
From: Chris Wilson @ 2020-11-27 14:21 UTC (permalink / raw)
  To: Matthew Auld, intel-gfx; +Cc: dri-devel

Quoting Matthew Auld (2020-11-27 12:06:57)
> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> index 82f431cc38cd..6f0ab363bdee 100644
> --- a/drivers/gpu/drm/i915/i915_drv.h
> +++ b/drivers/gpu/drm/i915/i915_drv.h
> @@ -1225,6 +1225,11 @@ struct drm_i915_private {
>         atomic_long_t num_bytes_swapped_in;
>         atomic_long_t time_swap_out_ms;
>         atomic_long_t time_swap_in_ms;
> +
> +       atomic_long_t num_bytes_swapped_out_memcpy;
> +       atomic_long_t num_bytes_swapped_in_memcpy;
> +       atomic_long_t time_swap_out_ms_memcpy;
> +       atomic_long_t time_swap_in_ms_memcpy;

See earlier comments about why this will be rejected.
-Chris
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 208+ messages in thread

* Re: [RFC PATCH 143/162] drm/i915: suspend/resume eviction
  2020-11-27 12:06 ` [RFC PATCH 143/162] drm/i915: suspend/resume eviction Matthew Auld
@ 2020-11-27 14:22   ` Chris Wilson
  0 siblings, 0 replies; 208+ messages in thread
From: Chris Wilson @ 2020-11-27 14:22 UTC (permalink / raw)
  To: Matthew Auld, intel-gfx; +Cc: CQ Tang, Venkata Ramana Nayana, dri-devel

Quoting Matthew Auld (2020-11-27 12:06:59)
> +static int intel_dmem_evict_buffers(struct drm_device *dev, bool in_suspend)
> +{
> +       struct drm_i915_private *i915 = to_i915(dev);
> +       struct drm_i915_gem_object *obj;
> +       struct intel_memory_region *mem;
> +       int id, ret = 0;
> +
> +       /*
> +        * FIXME: Presently using memcpy,
> +        * will replace with blitter once
> +        * fix the issues.
> +        */

Why hasn't it been fixed then?
-Chris
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 208+ messages in thread

* Re: [RFC PATCH 144/162] drm/i915: Reset blitter context when unpark engine
  2020-11-27 12:07 ` [RFC PATCH 144/162] drm/i915: Reset blitter context when unpark engine Matthew Auld
@ 2020-11-27 14:26   ` Chris Wilson
  0 siblings, 0 replies; 208+ messages in thread
From: Chris Wilson @ 2020-11-27 14:26 UTC (permalink / raw)
  To: Matthew Auld, intel-gfx; +Cc: Venkata Ramana Nayana, dri-devel

Quoting Matthew Auld (2020-11-27 12:07:00)
> From: Venkata Ramana Nayana <venkata.ramana.nayana@intel.com>
> 
> We are only doing it now for the kernel_context. We also need to do it
> for the copy engine's blitter context.
> 
> Signed-off-by: Venkata Ramana Nayana <venkata.ramana.nayana@intel.com>
> ---
>  drivers/gpu/drm/i915/gt/intel_engine_pm.c | 5 +++++
>  1 file changed, 5 insertions(+)
> 
> diff --git a/drivers/gpu/drm/i915/gt/intel_engine_pm.c b/drivers/gpu/drm/i915/gt/intel_engine_pm.c
> index 1b2009b4dcb7..69c8ea70d1e8 100644
> --- a/drivers/gpu/drm/i915/gt/intel_engine_pm.c
> +++ b/drivers/gpu/drm/i915/gt/intel_engine_pm.c
> @@ -66,6 +66,11 @@ static int __engine_unpark(struct intel_wakeref *wf)
>                 ce->ops->reset(ce);
>         }

Add a list of pinned volatile contexts to the engine that must be
restored across resume.
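
Something along these lines (a sketch; the list and link names are
assumed, not taken from the series):

/* engine tracks pinned contexts that must be reset on unpark/resume */
struct intel_engine_cs {
        /* ... */
        struct list_head pinned_contexts;
};

static int __engine_unpark(struct intel_wakeref *wf)
{
        /* ... */
        list_for_each_entry(ce, &engine->pinned_contexts, pinned_link)
                if (ce->state)
                        ce->ops->reset(ce);
        /* ... */
}
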
-Chris
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 208+ messages in thread

* Re: [Intel-gfx] [RFC PATCH 146/162] drm/i915/pm: suspend and restore ppgtt mapping
  2020-11-27 12:07 ` [RFC PATCH 146/162] drm/i915/pm: suspend and restore ppgtt mapping Matthew Auld
@ 2020-11-27 14:29   ` Chris Wilson
  0 siblings, 0 replies; 208+ messages in thread
From: Chris Wilson @ 2020-11-27 14:29 UTC (permalink / raw)
  To: Matthew Auld, intel-gfx; +Cc: Venkata Ramana Nayana, dri-devel

Quoting Matthew Auld (2020-11-27 12:07:02)
> From: Prathap Kumar Valsan <prathap.kumar.valsan@intel.com>
> 
> During suspend we will lose all page tables, as they are allocated in
> LMEM. In order to make sure that the contexts do not access the
> corrupted page tables after we restore, we evict all VMAs that are
> bound to VMs. This includes the kernel VM.
> 
> During resume, we are restoring the page tables back to scratch page.
> 
> Signed-off-by: Prathap Kumar Valsan <prathap.kumar.valsan@intel.com>
> Signed-off-by: Venkata Ramana Nayana <venkata.ramana.nayana@intel.com>
> Cc: CQ Tang <cq.tang@intel.com>
> ---
>  drivers/gpu/drm/i915/gt/gen8_ppgtt.c  |  13 ++++
>  drivers/gpu/drm/i915/gt/gen8_ppgtt.h  |   2 +
>  drivers/gpu/drm/i915/gt/intel_ppgtt.c |   4 +
>  drivers/gpu/drm/i915/i915_drv.c       | 102 +++++++++++++++++++++++---
>  4 files changed, 112 insertions(+), 9 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/gt/gen8_ppgtt.c b/drivers/gpu/drm/i915/gt/gen8_ppgtt.c
> index b6fcebeef02a..704cab807e0b 100644
> --- a/drivers/gpu/drm/i915/gt/gen8_ppgtt.c
> +++ b/drivers/gpu/drm/i915/gt/gen8_ppgtt.c
> @@ -775,3 +775,16 @@ struct i915_ppgtt *gen8_ppgtt_create(struct intel_gt *gt)
>         kfree(ppgtt);
>         return ERR_PTR(err);
>  }
> +
> +void gen8_restore_ppgtt_mappings(struct i915_address_space *vm)
> +{
> +       const unsigned int count = gen8_pd_top_count(vm);
> +       int i;
> +
> +       for (i = 1; i <= vm->top; i++)
> +               fill_px(vm->scratch[i], vm->scratch[i - 1]->encode);
> +
> +       fill_page_dma(px_base(i915_vm_to_ppgtt(vm)->pd),
> +                     vm->scratch[vm->top]->encode, count);
> +}
> +
> diff --git a/drivers/gpu/drm/i915/gt/gen8_ppgtt.h b/drivers/gpu/drm/i915/gt/gen8_ppgtt.h
> index 76a08b9c1f5c..3fa4b95aaabd 100644
> --- a/drivers/gpu/drm/i915/gt/gen8_ppgtt.h
> +++ b/drivers/gpu/drm/i915/gt/gen8_ppgtt.h
> @@ -6,8 +6,10 @@
>  #ifndef __GEN8_PPGTT_H__
>  #define __GEN8_PPGTT_H__
>  
> +struct i915_address_space;
>  struct intel_gt;
>  
> +void gen8_restore_ppgtt_mappings(struct i915_address_space *vm);
>  struct i915_ppgtt *gen8_ppgtt_create(struct intel_gt *gt);
>  
>  #endif
> diff --git a/drivers/gpu/drm/i915/gt/intel_ppgtt.c b/drivers/gpu/drm/i915/gt/intel_ppgtt.c
> index 34a02643bb75..9b3eacd12a7e 100644
> --- a/drivers/gpu/drm/i915/gt/intel_ppgtt.c
> +++ b/drivers/gpu/drm/i915/gt/intel_ppgtt.c
> @@ -9,6 +9,8 @@
>  #include "intel_gtt.h"
>  #include "gem/i915_gem_lmem.h"
>  #include "gem/i915_gem_region.h"
> +#include "gem/i915_gem_context.h"
> +#include "gem/i915_gem_region.h"
>  #include "gen6_ppgtt.h"
>  #include "gen8_ppgtt.h"
>  
> @@ -317,3 +319,5 @@ void ppgtt_init(struct i915_ppgtt *ppgtt, struct intel_gt *gt)
>         ppgtt->vm.vma_ops.set_pages   = ppgtt_set_pages;
>         ppgtt->vm.vma_ops.clear_pages = clear_pages;
>  }
> +
> +
> diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
> index e8c4931fc818..7115f4db5043 100644
> --- a/drivers/gpu/drm/i915/i915_drv.c
> +++ b/drivers/gpu/drm/i915/i915_drv.c
> @@ -64,6 +64,7 @@
>  #include "gem/i915_gem_context.h"
>  #include "gem/i915_gem_ioctls.h"
>  #include "gem/i915_gem_mman.h"
> +#include "gt/gen8_ppgtt.h"
>  #include "gt/intel_gt.h"
>  #include "gt/intel_gt_pm.h"
>  #include "gt/intel_rc6.h"
> @@ -1136,13 +1137,13 @@ static int intel_dmem_evict_buffers(struct drm_device *dev, bool in_suspend)
>  
>                                 mutex_unlock(&mem->objects.lock);
>  
> -                               if (in_suspend)
> -                                       i915_gem_object_unbind(obj, 0);
> -
>                                 if (in_suspend) {
>                                         obj->swapto = NULL;
>                                         obj->evicted = false;
>                                         obj->do_swapping = true;
> +
> +                                       i915_gem_object_unbind(obj, 0);
> +
>                                         ret = __i915_gem_object_put_pages(obj);
>                                         obj->do_swapping = false;
>                                         if (ret) {
> @@ -1176,6 +1177,43 @@ static int intel_dmem_evict_buffers(struct drm_device *dev, bool in_suspend)
>         return ret;
>  }
>  
> +static int i915_gem_suspend_ppgtt_mappings(struct drm_i915_private *i915)
> +{
> +       struct i915_gem_context *ctx, *cn;
> +       int ret;
> +
> +       spin_lock(&i915->gem.contexts.lock);
> +       list_for_each_entry_safe(ctx, cn, &i915->gem.contexts.list, link) {

Wrong list. Bad starting point from GEM.
-Chris
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 208+ messages in thread

* Re: [RFC PATCH 147/162] drm/i915/gt: Allocate default ctx objects in SMEM
  2020-11-27 12:07 ` [RFC PATCH 147/162] drm/i915/gt: Allocate default ctx objects in SMEM Matthew Auld
@ 2020-11-27 14:30   ` Chris Wilson
  0 siblings, 0 replies; 208+ messages in thread
From: Chris Wilson @ 2020-11-27 14:30 UTC (permalink / raw)
  To: Matthew Auld, intel-gfx
  Cc: Venkata Ramana Nayana, dri-devel, Prathap Kumar Valsan

Quoting Matthew Auld (2020-11-27 12:07:03)
> From: Venkata Ramana Nayana <venkata.ramana.nayana@intel.com>
> 
> If the record-default context objects are created in LMEM, then during
> suspend we pin the pages of the (source) object and use the blitter
> for eviction. But request creation then uses the blitter context and
> tries to pin the same default object, in order to restore the context
> with default HW values, which leads to a deadlock. To avoid this, it
> is safe to keep these objects in SMEM.

Dead patch. Default object state should be recorded as shmemfs.
-Chris
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 208+ messages in thread

* Re: [Intel-gfx] [RFC PATCH 148/162] drm/i915: suspend/resume enable blitter eviction
  2020-11-27 12:07 ` [RFC PATCH 148/162] drm/i915: suspend/resume enable blitter eviction Matthew Auld
@ 2020-11-27 14:32   ` Chris Wilson
  0 siblings, 0 replies; 208+ messages in thread
From: Chris Wilson @ 2020-11-27 14:32 UTC (permalink / raw)
  To: Matthew Auld, intel-gfx; +Cc: Venkata Ramana Nayana, dri-devel

Quoting Matthew Auld (2020-11-27 12:07:04)
> From: Venkata Ramana Nayana <venkata.ramana.nayana@intel.com>
> 
> During suspend, use blitter eviction before disabling the runtime
> interrupts; during resume, use the blitter after GEM resume happens.

Consider adding it to the suspend prepare function.
-Chris
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 208+ messages in thread

* Re: [RFC PATCH 150/162] drm/i915: need consider system BO snoop for dgfx
  2020-11-27 12:07 ` [RFC PATCH 150/162] drm/i915: need consider system BO snoop for dgfx Matthew Auld
@ 2020-11-27 14:36   ` Chris Wilson
  0 siblings, 0 replies; 208+ messages in thread
From: Chris Wilson @ 2020-11-27 14:36 UTC (permalink / raw)
  To: Matthew Auld, intel-gfx; +Cc: CQ Tang, Sudeep Dutt, dri-devel

Quoting Matthew Auld (2020-11-27 12:07:06)
> From: CQ Tang <cq.tang@intel.com>
> 
> When cache_level is NONE, we check HAS_LLC(i915). But for DGFX we
> additionally need to check HAS_SNOOP(i915) on system memory objects in
> order to use I915_BO_CACHE_COHERENT_FOR_READ. On DG1, has_llc=0 and
> has_snoop=1. Otherwise we set obj->cache_coherent=0, which has a
> performance impact.
> 
> Cc: Chris P Wilson <chris.p.wilson@intel.com>
> Cc: Ramalingam C <ramalingam.c@intel.com>
> Cc: Sudeep Dutt <sudeep.dutt@intel.com>
> Cc: Matthew Auld <matthew.auld@intel.com>
> Signed-off-by: CQ Tang <cq.tang@intel.com>
> ---
>  drivers/gpu/drm/i915/gem/i915_gem_object.c | 16 +++++++++++++++-
>  1 file changed, 15 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object.c b/drivers/gpu/drm/i915/gem/i915_gem_object.c
> index ddb448f275eb..be603171c444 100644
> --- a/drivers/gpu/drm/i915/gem/i915_gem_object.c
> +++ b/drivers/gpu/drm/i915/gem/i915_gem_object.c
> @@ -95,6 +95,20 @@ void i915_gem_object_init(struct drm_i915_gem_object *obj,
>         mutex_init(&obj->mm.get_dma_page.lock);
>  }
>  
> +static bool i915_gem_object_use_llc(struct drm_i915_gem_object *obj)
> +{
> +       struct drm_i915_private *i915 = to_i915(obj->base.dev);
> +
> +       if (HAS_LLC(i915))
> +               return true;
> +
> +       if (IS_DGFX(i915) && HAS_SNOOP(i915) &&
> +           !i915_gem_object_is_lmem(obj))
> +               return true;
> +
> +       return false;
> +}
> +
>  /**
>   * Mark up the object's coherency levels for a given cache_level
>   * @obj: #drm_i915_gem_object
> @@ -108,7 +122,7 @@ void i915_gem_object_set_cache_coherency(struct drm_i915_gem_object *obj,
>         if (cache_level != I915_CACHE_NONE)
>                 obj->cache_coherent = (I915_BO_CACHE_COHERENT_FOR_READ |
>                                        I915_BO_CACHE_COHERENT_FOR_WRITE);
> -       else if (HAS_LLC(to_i915(obj->base.dev)))
> +       else if (i915_gem_object_use_llc(obj))
>                 obj->cache_coherent = I915_BO_CACHE_COHERENT_FOR_READ;
>         else
>                 obj->cache_coherent = 0;

You must also define obj->cache_level correctly. You cannot just assume
the object will be snooped.
-Chris
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 208+ messages in thread

* Re: [Intel-gfx] [RFC PATCH 157/162] drm/i915: Improve accuracy of eviction stats
  2020-11-27 12:07 ` [RFC PATCH 157/162] drm/i915: Improve accuracy of eviction stats Matthew Auld
@ 2020-11-27 14:40   ` Chris Wilson
  2020-11-30 10:36     ` Tvrtko Ursulin
  0 siblings, 1 reply; 208+ messages in thread
From: Chris Wilson @ 2020-11-27 14:40 UTC (permalink / raw)
  To: Matthew Auld, intel-gfx; +Cc: dri-devel

Quoting Matthew Auld (2020-11-27 12:07:13)
> From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> 
> Current code uses jiffie time to do the accounting and then does:
> 
>   diff = jiffies - start;
>   msec = diff * 1000 / HZ;
>   ...
>   atomic_long_add(msec, &i915->time_swap_out_ms);
> 
> If we assume jiffie can be as non-granular as 10ms and that the current
> accounting records all evictions faster than one jiffie as infinite speed,
> we can end up over-estimating the reported eviction throughput.
> 
> Fix this by accumulating ktime_t and only dividing to more user friendly
> granularity at presentation time (debugfs read).
> 
> At the same time consolidate the code a bit and convert from multiple
> atomics to single seqlock per stat.
> 
> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> Cc: CQ Tang <cq.tang@intel.com>
> Cc: Sudeep Dutt <sudeep.dutt@intel.com>
> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
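
The consolidated stat described above would look roughly like this
(sketch; the struct and field names are assumed):

struct i915_mm_swap_stat {
        seqlock_t lock;         /* one seqlock per stat */
        unsigned long pages;
        ktime_t time;           /* accumulated as ktime_t; divided into
                                 * ms only at debugfs read time */
};

/* writer side, once per eviction */
write_seqlock(&stat->lock);
stat->pages += npages;
stat->time = ktime_add(stat->time, ktime_sub(ktime_get(), start));
write_sequnlock(&stat->lock);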

A lot of effort to fix up patches after the fact, might as well make it
a real PMU interface.
-Chris
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 208+ messages in thread

* Re: [Intel-gfx] [RFC PATCH 160/162] drm/i915/dg1: Fix GPU hang due to shmemfs page drop
  2020-11-27 12:07 ` [RFC PATCH 160/162] drm/i915/dg1: Fix GPU hang due to shmemfs page drop Matthew Auld
@ 2020-11-27 14:44   ` Chris Wilson
  0 siblings, 0 replies; 208+ messages in thread
From: Chris Wilson @ 2020-11-27 14:44 UTC (permalink / raw)
  To: Matthew Auld, intel-gfx; +Cc: Venkata Ramana Nayana, dri-devel, Chris Wilson

Quoting Matthew Auld (2020-11-27 12:07:16)
> From: Venkata Ramana Nayana <venkata.ramana.nayana@intel.com>
> 
> This is to fix a bug in upstream
> commit a6326a4f8ffb ("drm/i915/gt: Keep a no-frills swappable copy of the default context state")
> 
> We allocate the context state object ce->state from lmem, so in
> __engines_record_defaults() we call shmem_create_from_object(). Because
> it is an lmem object, this call will create a new shmemfs file, copy
> the contents into it, and return the file pointer, which is assigned
> to engine->default_state. The ce->state lmem object is then freed at
> the end of __engines_record_defaults().
> 
> Because a new shmemfs file is created for engine->default_state, and,
> more importantly, we DON'T mark the pages dirty after we write into it,
> OS page cache eviction will drop these pages.
> 
> As the test moves forward, it will create a new request/context and
> copy the saved engine->default_state into ce->state. If the
> default_state pages were dropped during page cache eviction, the copy
> will be given fresh pages and read garbage from them. ce->state will
> then contain wrong instructions and cause the GPU to hang.
> 
> The fix is simple: mark the shmemfs pages dirty when writing into
> them, and also mark the pages accessed when reading/writing them.
> 
> Fixes: a6326a4f8ffb ("drm/i915/gt: Keep a no-frills swappable copy of the default context state")

A bug fix, send it. But please write a concise changelog first.

I missed setting the dirty bit, and so the contents were not being saved
on swap out as expected. Impact is severe; any context created after
resume may be gibberish.
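
The described fix boils down to two calls in the shmemfs write loop
(sketch; the surrounding loop is elided):

/* after writing each page of the shmemfs file */
set_page_dirty(page);           /* contents now survive swap-out */
mark_page_accessed(page);
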
-Chris
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 208+ messages in thread

* Re: [Intel-gfx] [RFC PATCH 092/162] drm/i915/uapi: introduce drm_i915_gem_create_ext
  2020-11-27 12:06 ` [RFC PATCH 092/162] drm/i915/uapi: introduce drm_i915_gem_create_ext Matthew Auld
  2020-11-27 13:25   ` [Intel-gfx] " Chris Wilson
@ 2020-11-27 19:21   ` Chris Wilson
  2020-12-01 12:55   ` Chris Wilson
  2 siblings, 0 replies; 208+ messages in thread
From: Chris Wilson @ 2020-11-27 19:21 UTC (permalink / raw)
  To: Matthew Auld, intel-gfx; +Cc: dri-devel

Quoting Matthew Auld (2020-11-27 12:06:08)
> +int
> +i915_gem_create_ioctl(struct drm_device *dev, void *data,
> +                     struct drm_file *file)
> +{
> +       struct drm_i915_private *i915 = to_i915(dev);
> +       struct create_ext ext_data = { .i915 = i915 };
> +       struct drm_i915_gem_create_ext *args = data;
> +       int ret;
> +
> +       i915_gem_flush_free_objects(i915);
> +
> +       ret = i915_user_extensions(u64_to_user_ptr(args->extensions),
> +                                  create_extensions,
> +                                  ARRAY_SIZE(create_extensions),
> +                                  &ext_data);
> +       if (ret)
> +               goto err_free;
> +
> +       if (!ext_data.placements) {
> +               struct intel_memory_region **placements;
> +               enum intel_memory_type mem_type = INTEL_MEMORY_SYSTEM;
> +
> +               placements = kmalloc(sizeof(struct intel_memory_region *),
> +                                    GFP_KERNEL);
> +               if (!placements)
> +                       return -ENOMEM;
> +
> +               placements[0] = intel_memory_region_by_type(i915, mem_type);
> +
> +               ext_data.placements = placements;
> +               ext_data.n_placements = 1;
> +       }
> +
> +       ret = i915_gem_create(file,
> +                             ext_data.placements,
> +                             ext_data.n_placements,
> +                             &args->size, &args->handle);
> +       if (!ret)
> +               return 0;

Applying the extensions has to happen after creating the vanilla object.

It literally is the equivalent of applying the setparam ioctl to a fresh
object.

Look at the PXP series for how badly wrong this goes if you try it this
way around.
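
I.e. roughly this ordering (a sketch assembled from the quoted code):

/* create the vanilla object first ... */
ret = i915_gem_create(file, ext_data.placements,
                      ext_data.n_placements,
                      &args->size, &args->handle);
if (ret)
        return ret;

/* ... then apply the extensions to the fresh object, setparam-style */
ret = i915_user_extensions(u64_to_user_ptr(args->extensions),
                           create_extensions,
                           ARRAY_SIZE(create_extensions),
                           &ext_data);
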
-Chris
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 208+ messages in thread

* Re: [RFC PATCH 001/162] drm/i915/selftest: also consider non-contiguous objects
  2020-11-27 12:04 ` [RFC PATCH 001/162] drm/i915/selftest: also consider non-contiguous objects Matthew Auld
@ 2020-11-27 19:44   ` Chris Wilson
  0 siblings, 0 replies; 208+ messages in thread
From: Chris Wilson @ 2020-11-27 19:44 UTC (permalink / raw)
  To: Matthew Auld, intel-gfx; +Cc: dri-devel

Quoting Matthew Auld (2020-11-27 12:04:37)
> In igt_ppgtt_sanity_check we should also exercise the non-contiguous
> option for LMEM, since this will give us slightly different sg layouts
> and alignment.
> 
> Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
-Chris
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 208+ messages in thread

* Re: [Intel-gfx] [RFC PATCH 004/162] drm/i915/gt: Move move context layout registers and offsets to lrc_reg.h
  2020-11-27 12:04 ` [RFC PATCH 004/162] drm/i915/gt: Move move context layout registers and offsets to lrc_reg.h Matthew Auld
@ 2020-11-27 19:55   ` Chris Wilson
  0 siblings, 0 replies; 208+ messages in thread
From: Chris Wilson @ 2020-11-27 19:55 UTC (permalink / raw)
  To: Matthew Auld, intel-gfx; +Cc: dri-devel

Quoting Matthew Auld (2020-11-27 12:04:40)
> From: Chris Wilson <chris@chris-wilson.co.uk>
> 
> Cleanup intel_lrc.h by moving some of the residual common register
> definitions into intel_lrc_reg.h, prior to rebranding and splitting off
> the submission backends.
> 
> v2: keep the SCHEDULE enum in the old file, since it is specific to the
> gvt usage of the execlists submission backend (John)
> 
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> #v2
> Cc: John Harrison <John.C.Harrison@Intel.com>
> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> ---
>  drivers/gpu/drm/i915/gt/intel_engine_cs.c |  2 +-
>  drivers/gpu/drm/i915/gt/intel_gt_irq.c    |  1 +
>  drivers/gpu/drm/i915/gt/intel_lrc.h       | 39 -----------------------
>  drivers/gpu/drm/i915/gt/intel_lrc_reg.h   | 39 +++++++++++++++++++++++
>  drivers/gpu/drm/i915/gvt/mmio_context.h   |  2 ++
>  5 files changed, 43 insertions(+), 40 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/gt/intel_engine_cs.c b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
> index d4e988b2816a..02ea16b29c9f 100644
> --- a/drivers/gpu/drm/i915/gt/intel_engine_cs.c
> +++ b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
> @@ -36,7 +36,7 @@
>  #include "intel_gt.h"
>  #include "intel_gt_requests.h"
>  #include "intel_gt_pm.h"
> -#include "intel_lrc.h"
> +#include "intel_lrc_reg.h"
>  #include "intel_reset.h"
>  #include "intel_ring.h"
>  
> diff --git a/drivers/gpu/drm/i915/gt/intel_gt_irq.c b/drivers/gpu/drm/i915/gt/intel_gt_irq.c
> index 257063a57101..9830342aa6f4 100644
> --- a/drivers/gpu/drm/i915/gt/intel_gt_irq.c
> +++ b/drivers/gpu/drm/i915/gt/intel_gt_irq.c
> @@ -11,6 +11,7 @@
>  #include "intel_breadcrumbs.h"
>  #include "intel_gt.h"
>  #include "intel_gt_irq.h"
> +#include "intel_lrc_reg.h"
>  #include "intel_uncore.h"
>  #include "intel_rps.h"
>  
> diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.h b/drivers/gpu/drm/i915/gt/intel_lrc.h
> index 802585a308e9..9116b46844a2 100644
> --- a/drivers/gpu/drm/i915/gt/intel_lrc.h
> +++ b/drivers/gpu/drm/i915/gt/intel_lrc.h
> @@ -34,45 +34,6 @@ struct i915_request;
>  struct intel_context;
>  struct intel_engine_cs;
>  
> -/* Execlists regs */
> -#define RING_ELSP(base)                                _MMIO((base) + 0x230)
> -#define RING_EXECLIST_STATUS_LO(base)          _MMIO((base) + 0x234)
> -#define RING_EXECLIST_STATUS_HI(base)          _MMIO((base) + 0x234 + 4)
> -#define RING_CONTEXT_CONTROL(base)             _MMIO((base) + 0x244)
> -#define          CTX_CTRL_INHIBIT_SYN_CTX_SWITCH       (1 << 3)
> -#define          CTX_CTRL_ENGINE_CTX_RESTORE_INHIBIT   (1 << 0)
> -#define   CTX_CTRL_RS_CTX_ENABLE               (1 << 1)
> -#define          CTX_CTRL_ENGINE_CTX_SAVE_INHIBIT      (1 << 2)
> -#define          GEN12_CTX_CTRL_OAR_CONTEXT_ENABLE     (1 << 8)
> -#define RING_CONTEXT_STATUS_PTR(base)          _MMIO((base) + 0x3a0)
> -#define RING_EXECLIST_SQ_CONTENTS(base)                _MMIO((base) + 0x510)
> -#define RING_EXECLIST_CONTROL(base)            _MMIO((base) + 0x550)
> -
> -#define          EL_CTRL_LOAD                          (1 << 0)
> -
> -/* The docs specify that the write pointer wraps around after 5h, "After status
> - * is written out to the last available status QW at offset 5h, this pointer
> - * wraps to 0."
> - *
> - * Therefore, one must infer than even though there are 3 bits available, 6 and
> - * 7 appear to be * reserved.
> - */
> -#define GEN8_CSB_ENTRIES 6
> -#define GEN8_CSB_PTR_MASK 0x7
> -#define GEN8_CSB_READ_PTR_MASK (GEN8_CSB_PTR_MASK << 8)
> -#define GEN8_CSB_WRITE_PTR_MASK (GEN8_CSB_PTR_MASK << 0)
> -
> -#define GEN11_CSB_ENTRIES 12
> -#define GEN11_CSB_PTR_MASK 0xf
> -#define GEN11_CSB_READ_PTR_MASK (GEN11_CSB_PTR_MASK << 8)
> -#define GEN11_CSB_WRITE_PTR_MASK (GEN11_CSB_PTR_MASK << 0)
> -
> -#define MAX_CONTEXT_HW_ID (1<<21) /* exclusive */
> -#define MAX_GUC_CONTEXT_HW_ID (1 << 20) /* exclusive */
> -#define GEN11_MAX_CONTEXT_HW_ID (1<<11) /* exclusive */
> -/* in Gen12 ID 0x7FF is reserved to indicate idle */
> -#define GEN12_MAX_CONTEXT_HW_ID        (GEN11_MAX_CONTEXT_HW_ID - 1)
> -
>  enum {
>         INTEL_CONTEXT_SCHEDULE_IN = 0,
>         INTEL_CONTEXT_SCHEDULE_OUT,
> diff --git a/drivers/gpu/drm/i915/gt/intel_lrc_reg.h b/drivers/gpu/drm/i915/gt/intel_lrc_reg.h
> index 1b51f7b9a5c3..b2e03ce35599 100644
> --- a/drivers/gpu/drm/i915/gt/intel_lrc_reg.h
> +++ b/drivers/gpu/drm/i915/gt/intel_lrc_reg.h
> @@ -52,4 +52,43 @@
>  #define GEN8_EXECLISTS_STATUS_BUF 0x370
>  #define GEN11_EXECLISTS_STATUS_BUF2 0x3c0
>  
> +/* Execlists regs */
> +#define RING_ELSP(base)                                _MMIO((base) + 0x230)
> +#define RING_EXECLIST_STATUS_LO(base)          _MMIO((base) + 0x234)
> +#define RING_EXECLIST_STATUS_HI(base)          _MMIO((base) + 0x234 + 4)
> +#define RING_CONTEXT_CONTROL(base)             _MMIO((base) + 0x244)
> +#define          CTX_CTRL_ENGINE_CTX_RESTORE_INHIBIT   REG_BIT(0)
> +#define   CTX_CTRL_RS_CTX_ENABLE               REG_BIT(1)
> +#define          CTX_CTRL_ENGINE_CTX_SAVE_INHIBIT      REG_BIT(2)
> +#define          CTX_CTRL_INHIBIT_SYN_CTX_SWITCH       REG_BIT(3)
> +#define          GEN12_CTX_CTRL_OAR_CONTEXT_ENABLE     REG_BIT(8)
> +#define RING_CONTEXT_STATUS_PTR(base)          _MMIO((base) + 0x3a0)
> +#define RING_EXECLIST_SQ_CONTENTS(base)                _MMIO((base) + 0x510)
> +#define RING_EXECLIST_CONTROL(base)            _MMIO((base) + 0x550)
> +#define          EL_CTRL_LOAD                          REG_BIT(0)
> +
> +/*
> + * The docs specify that the write pointer wraps around after 5h, "After status
> + * is written out to the last available status QW at offset 5h, this pointer
> + * wraps to 0."
> + *
> + * Therefore, one must infer than even though there are 3 bits available, 6 and
> + * 7 appear to be * reserved.

Stray '*'

That's a very weird statement. 6/7 simply do not exist, since the
ringbuffer doesn't have that many elements.
-Chris
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 208+ messages in thread

* Re: [RFC PATCH 005/162] drm/i915/gt: Rename lrc.c to execlists_submission.c
  2020-11-27 12:04 ` [RFC PATCH 005/162] drm/i915/gt: Rename lrc.c to execlists_submission.c Matthew Auld
@ 2020-11-27 19:56   ` Chris Wilson
  0 siblings, 0 replies; 208+ messages in thread
From: Chris Wilson @ 2020-11-27 19:56 UTC (permalink / raw)
  To: Matthew Auld, intel-gfx; +Cc: Daniele Ceraolo Spurio, dri-devel, Tvrtko Ursulin

Quoting Matthew Auld (2020-11-27 12:04:41)
> From: Chris Wilson <chris@chris-wilson.co.uk>
> 
> We want to separate the utility functions for controlling the logical
> ring context from the execlists submission mechanism (which is an
> overgrown scheduler).
> 
> This is similar to Daniele's work to split up the files, but being
> selfish I wanted to base it after my own changes to intel_lrc.c petered
> out.

Note that, in keeping with the recent intel_ring_submission.c vs
intel_ring_scheduler.c split, this would be intel_execlists_scheduler.c.
-Chris
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 208+ messages in thread

* Re: [RFC PATCH 006/162] drm/i915: split gen8+ flush and bb_start emission functions to their own file
  2020-11-27 12:04 ` [RFC PATCH 006/162] drm/i915: split gen8+ flush and bb_start emission functions to their own file Matthew Auld
@ 2020-11-27 19:58   ` Chris Wilson
  0 siblings, 0 replies; 208+ messages in thread
From: Chris Wilson @ 2020-11-27 19:58 UTC (permalink / raw)
  To: Matthew Auld, intel-gfx
  Cc: Tvrtko Ursulin, Daniele Ceraolo Spurio, John Harrison, dri-devel

Quoting Matthew Auld (2020-11-27 12:04:42)
> From: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
> 
> These functions are independent of the backend used and can therefore
> be split out of the execlists submission file, so they can be re-used
> by the upcoming GuC submission backend.
> 
> Based on a patch by Chris Wilson.
> 
> Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
> Cc: Chris P Wilson <chris.p.wilson@intel.com>
> Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
> Reviewed-by: John Harrison <John.C.Harrison@Intel.com>
> ---
>  drivers/gpu/drm/i915/Makefile                 |   1 +
>  drivers/gpu/drm/i915/gt/gen8_engine_cs.c      | 393 ++++++++++++++++++
>  drivers/gpu/drm/i915/gt/gen8_engine_cs.h      |  26 ++
>  .../drm/i915/gt/intel_execlists_submission.c  | 385 +----------------
>  4 files changed, 421 insertions(+), 384 deletions(-)
>  create mode 100644 drivers/gpu/drm/i915/gt/gen8_engine_cs.c
>  create mode 100644 drivers/gpu/drm/i915/gt/gen8_engine_cs.h
> 
> diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile
> index aedbd8f52be8..f9ef5199b124 100644
> --- a/drivers/gpu/drm/i915/Makefile
> +++ b/drivers/gpu/drm/i915/Makefile
> @@ -82,6 +82,7 @@ gt-y += \
>         gt/gen6_engine_cs.o \
>         gt/gen6_ppgtt.o \
>         gt/gen7_renderclear.o \
> +       gt/gen8_engine_cs.o \
>         gt/gen8_ppgtt.o \
>         gt/intel_breadcrumbs.o \
>         gt/intel_context.o \
> diff --git a/drivers/gpu/drm/i915/gt/gen8_engine_cs.c b/drivers/gpu/drm/i915/gt/gen8_engine_cs.c
> new file mode 100644
> index 000000000000..a96fe108685e
> --- /dev/null
> +++ b/drivers/gpu/drm/i915/gt/gen8_engine_cs.c
> @@ -0,0 +1,393 @@
> +// SPDX-License-Identifier: MIT
> +/*
> + * Copyright © 2014 Intel Corporation
> + */
> +
> +#include "i915_drv.h"
> +#include "intel_execlists_submission.h" /* XXX */
> +#include "intel_gpu_commands.h"
> +#include "intel_ring.h"
> +
> +int gen8_emit_flush_render(struct i915_request *request, u32 mode)

Refresh the names to match the recent scheme
(rcs when specific, xcs when not).
-Chris
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 208+ messages in thread

* Re: [RFC PATCH 096/162] drm/i915: setup the LMEM region
  2020-11-27 12:06 ` [RFC PATCH 096/162] drm/i915: setup the LMEM region Matthew Auld
@ 2020-11-30 10:14   ` Jani Nikula
  0 siblings, 0 replies; 208+ messages in thread
From: Jani Nikula @ 2020-11-30 10:14 UTC (permalink / raw)
  To: Matthew Auld, intel-gfx
  Cc: Abdiel Janulgue, Lucas De Marchi, dri-devel, Rodrigo Vivi

On Fri, 27 Nov 2020, Matthew Auld <matthew.auld@intel.com> wrote:
> +	/* Enables Local Memory functionality in GAM */
> +	I915_WRITE(GEN12_LMEM_CFG_ADDR, I915_READ(GEN12_LMEM_CFG_ADDR) | LMEM_ENABLE);

Please use intel_uncore_read/write and intel_de_read/write throughout
the series. We don't want any new users of I915_READ/I915_WRITE in the
driver.
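
For the quoted line that would be, e.g.:

intel_uncore_rmw(&dev_priv->uncore, GEN12_LMEM_CFG_ADDR, 0, LMEM_ENABLE);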

BR,
Jani.


-- 
Jani Nikula, Intel Open Source Graphics Center
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 208+ messages in thread

* Re: [Intel-gfx] [RFC PATCH 119/162] drm/i915/dg1: Read OPROM via SPI controller
  2020-11-27 12:06 ` [RFC PATCH 119/162] drm/i915/dg1: Read OPROM via SPI controller Matthew Auld
@ 2020-11-30 10:16   ` Jani Nikula
  0 siblings, 0 replies; 208+ messages in thread
From: Jani Nikula @ 2020-11-30 10:16 UTC (permalink / raw)
  To: Matthew Auld, intel-gfx; +Cc: Tomas Winkler, Lucas De Marchi, dri-devel

On Fri, 27 Nov 2020, Matthew Auld <matthew.auld@intel.com> wrote:
> +	DRM_DEBUG_KMS("Found valid VBT in SPI flash\n");

Please use drm_dbg_kms() and friends throughout the series. We don't
want new users of DRM_DEBUG* in the driver.
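
For the quoted line:

drm_dbg_kms(&dev_priv->drm, "Found valid VBT in SPI flash\n");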

BR,
Jani.

-- 
Jani Nikula, Intel Open Source Graphics Center
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 208+ messages in thread

* Re: [Intel-gfx] [RFC PATCH 120/162] drm/i915/oprom: Basic sanitization
  2020-11-27 12:06 ` [RFC PATCH 120/162] drm/i915/oprom: Basic sanitization Matthew Auld
@ 2020-11-30 10:24   ` Jani Nikula
  0 siblings, 0 replies; 208+ messages in thread
From: Jani Nikula @ 2020-11-30 10:24 UTC (permalink / raw)
  To: Matthew Auld, intel-gfx; +Cc: dri-devel

On Fri, 27 Nov 2020, Matthew Auld <matthew.auld@intel.com> wrote:
> From: Anshuman Gupta <anshuman.gupta@intel.com>
>
> Sanitize the OPROM header, CPD signature and OPROM PCI version. The
> OPROM_HEADER, EXPANSION_ROM_HEADER and OPROM_MEU_BLOB structures and
> PCI struct offsets are provided by GSC counterparts; they are yet to
> be documented in BSpec. After successful sanitization, extract the VBT
> from the opregion image.

Comments inline.

BR,
Jani.

>
> Cc: Jani Nikula <jani.nikula@intel.com>
> Cc: Uma Shankar <uma.shankar@intel.com>
> Cc: Uma Shankar <uma.shankar@intel.com>
> Signed-off-by: Anshuman Gupta <anshuman.gupta@intel.com>
> ---
>  drivers/gpu/drm/i915/display/intel_bios.c     |  49 +++--
>  drivers/gpu/drm/i915/display/intel_opregion.c | 169 ++++++++++++++++++
>  drivers/gpu/drm/i915/display/intel_opregion.h |  31 +++-
>  3 files changed, 221 insertions(+), 28 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/display/intel_bios.c b/drivers/gpu/drm/i915/display/intel_bios.c
> index 91044fc52acb..358576bc0be2 100644
> --- a/drivers/gpu/drm/i915/display/intel_bios.c
> +++ b/drivers/gpu/drm/i915/display/intel_bios.c
> @@ -2088,37 +2088,36 @@ bool intel_bios_is_valid_vbt(const void *buf, size_t size)
>  
>  static struct vbt_header *spi_oprom_get_vbt(struct drm_i915_private *dev_priv)
>  {
> -	u32 count, data, found, store = 0;
> -	u32 static_region, oprom_offset;
> -	u32 oprom_size = 0x200000;
> -	u16 vbt_size;
> -	u32 *vbt;
> -
> -	static_region = I915_READ(SPI_STATIC_REGIONS);
> -	static_region &= OPTIONROM_SPI_REGIONID_MASK;
> -	I915_WRITE(PRIMARY_SPI_REGIONID, static_region);
> +	u32 count, found;
> +	u32 *vbt, *oprom_opreg = NULL;
> +	u16 vbt_size, opreg_size;
> +	u8 *parse_ptr;
>  
> -	oprom_offset = I915_READ(OROM_OFFSET);
> -	oprom_offset &= OROM_OFFSET_MASK;
> +	if (intel_oprom_verify_signature(&oprom_opreg, &opreg_size, dev_priv)) {
> +		drm_err(&dev_priv->drm, "oprom signature verification failed\n");
> +		goto err_not_found;
> +	}

Kind of silly that the previous patch adds all the reading here, and
then it gets moved into a function called "verify signature", which
looks like it verifies the signature but actually reads the SPI. Very
confusing.
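
I.e. split the two concerns apart, something like (sketch; the helper
name is hypothetical):

/* one helper that only reads the OPROM over SPI ... */
oprom = intel_spi_read_oprom(dev_priv, &oprom_size);

/* ... and one that only verifies an already-read buffer */
err = intel_oprom_verify_signature(oprom, oprom_size);
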

>  
> -	for (count = 0; count < oprom_size; count += 4) {
> -		I915_WRITE(PRIMARY_SPI_ADDRESS, oprom_offset + count);
> -		data = I915_READ(PRIMARY_SPI_TRIGGER);
> +	if (!oprom_opreg) {
> +		drm_err(&dev_priv->drm, "opregion not found\n");
> +		goto err_not_found;
> +	}
>  
> -		if (data == *((const u32 *)"$VBT")) {
> -			found = oprom_offset + count;
> +	for (count = 0; count < opreg_size; count += 4) {
> +		if (oprom_opreg[count / 4] == *((const u32 *)"$VBT")) {
> +			found = count;
>  			break;
>  		}
>  	}
>  
> -	if (count >= oprom_size)
> +	if (count >= opreg_size) {
> +		drm_err(&dev_priv->drm, "VBT not found in opregion\n");
>  		goto err_not_found;
> +	}
>  
>  	/* Get VBT size and allocate space for the VBT */
> -	I915_WRITE(PRIMARY_SPI_ADDRESS, found +
> -		   offsetof(struct vbt_header, vbt_size));
> -	vbt_size = I915_READ(PRIMARY_SPI_TRIGGER);
> -	vbt_size &= 0xffff;
> +	parse_ptr = (u8 *)oprom_opreg + found;
> +	vbt_size = ((struct vbt_header *)parse_ptr)->vbt_size;
>  
>  	vbt = kzalloc(vbt_size, GFP_KERNEL);
>  	if (!vbt) {
> @@ -2127,16 +2126,12 @@ static struct vbt_header *spi_oprom_get_vbt(struct drm_i915_private *dev_priv)
>  		goto err_not_found;
>  	}
>  
> -	for (count = 0; count < vbt_size; count += 4) {
> -		I915_WRITE(PRIMARY_SPI_ADDRESS, found + count);
> -		data = I915_READ(PRIMARY_SPI_TRIGGER);
> -		*(vbt + store++) = data;
> -	}
> -
> +	memcpy(vbt, parse_ptr, vbt_size);
>  	if (!intel_bios_is_valid_vbt(vbt, vbt_size))
>  		goto err_free_vbt;
>  
>  	DRM_DEBUG_KMS("Found valid VBT in SPI flash\n");
> +	kfree(oprom_opreg);
>  
>  	return (struct vbt_header *)vbt;
>  
> diff --git a/drivers/gpu/drm/i915/display/intel_opregion.c b/drivers/gpu/drm/i915/display/intel_opregion.c
> index 4f77cf849171..81e5946393dd 100644
> --- a/drivers/gpu/drm/i915/display/intel_opregion.c
> +++ b/drivers/gpu/drm/i915/display/intel_opregion.c
> @@ -983,6 +983,175 @@ int intel_opregion_setup(struct drm_i915_private *dev_priv)
>  	return err;
>  }
>  
> +static int oprom_image_parse_helper(u8 *parse_ptr, u8 *last_img, u8 *code_type,
> +				    struct drm_i915_private *i915)
> +{
> +	u8 size_512_bytes;
> +
> +	if (((union oprom_header *)parse_ptr)->signature != OPROM_IMAGE_MAGIC) {
> +		drm_err(&i915->drm, "Wrong OPROM header signature.\n");
> +		return -EINVAL;
> +	}
> +
> +	size_512_bytes = parse_ptr[((struct expansion_rom_header *)parse_ptr)->pcistructoffset + PCI_IMAGE_LENGTH_OFFSET];
> +	*code_type = parse_ptr[((struct expansion_rom_header *)parse_ptr)->pcistructoffset + PCI_CODE_TYPE_OFFSET];
> +	*last_img = parse_ptr[((struct expansion_rom_header *)parse_ptr)->pcistructoffset + PCI_LAST_IMAGE_INDICATOR_OFFSET];
> +
> +	return size_512_bytes;
> +}
> +
> +static void spi_read_oprom_helper(size_t len, u32 offset, u32 *buf,
> +				  struct drm_i915_private *dev_priv)
> +{
> +	u32 count, data;
> +
> +	for (count = 0; count < len; count += 4) {
> +		I915_WRITE(PRIMARY_SPI_ADDRESS, offset + count);
> +		data = I915_READ(PRIMARY_SPI_TRIGGER);
> +		buf[count / 4] = data;
> +	}
> +}
> +
> +/**
> + *	+        DASH+G OPROM IMAGE LAYOUT           +
> + *	+--------+-------+---------------------------+
> + *	| Offset | Value |   ROM Header Fields       +-----> Image 1 (CSS)
> + *	+--------------------------------------------+
> + *	|    0h  |  55h  |   ROM Signature Byte1     |
> + *	|    1h  |  AAh  |   ROM Signature Byte2     |
> + *	|    2h  |  xx   |        Reserved           |
> + *	|  18+19h|  xx   |  Ptr to PCI DataStructure |
> + *	+----------------+---------------------------+
> + *	|           PCI Data Structure               |
> + *	+--------------------------------------------+
> + *	|    .       .             .                 |
> + *	|    .       .             .                 |
> + *	|    10  +  xx   +     Image Length          |
> + *	|    14  +  xx   +     Code Type             |
> + *	|    15  +  xx   +  Last Image Indicator     |
> + *	|    .       .             .                 |
> + *	+--------------------------------------------+
> + *	|               MEU BLOB                     |
> + *	+--------------------------------------------+
> + *	|              CPD Header                    |
> + *	|              CPD Entry                     |
> + *	|              Reserved                      |
> + *	|           SignedDataPart1                  |
> + *	|              PublicKey                     |
> + *	|            RSA Signature                   |
> + *	|           SignedDataPart2                  |
> + *	|            IFWI Metadata                   |
> + *	+--------+-------+---------------------------+
> + *	|    .   |   .   |         .                 |
> + *	|    .   |   .   |         .                 |
> + *	+--------------------------------------------+
> + *	| Offset | Value |   ROM Header Fields       +-----> Image 2 (Config Data) (Offset: 0x800)
> + *	+--------------------------------------------+
> + *	|    0h  |  55h  |   ROM Signature Byte1     |
> + *	|    1h  |  AAh  |   ROM Signature Byte2     |
> + *	|    2h  |  xx   |        Reserved           |
> + *	|  18+19h|  xx   |  Ptr to PCI DataStructure |
> + *	+----------------+---------------------------+
> + *	|           PCI Data Structure               |
> + *	+--------------------------------------------+
> + *	|    .       .             .                 |
> + *	|    .       .             .                 |
> + *	|    10  +  xx   +     Image Length          |
> + *	|    14  +  xx   +      Code Type            |
> + *	|    15  +  xx   +   Last Image Indicator    |
> + *	|    .       .             .                 |
> + *	|    1A  +  3C   + Ptr to Opregion Signature |
> + *	|    .       .             .                 |
> + *	|    .       .             .                 |
> + *	|   83Ch + IntelGraphicsMem                  | <---+ Opregion Signature
> + *	+--------+-----------------------------------+
> + *
> + * intel_oprom_verify_signature() verify OPROM signature.
> + * @opreg: pointer to opregion buffer output.
> + * @opreg_size: pointer to opregion size output.
> + * @dev_priv: i915 device.
> + */
> +int
> +intel_oprom_verify_signature(u32 **opreg, u16 *opreg_size,
> +			     struct drm_i915_private *dev_priv)
> +{
> +	u8 img_sig[sizeof(OPREGION_SIGNATURE)];
> +	u8 code_type, last_img;
> +	u32 static_region, offset;
> +	u32 *oprom_img, *oprom_img_hdr;
> +	u16 opreg_base, img_len;
> +	u8 *parse_ptr;
> +	int img_size;
> +	int ret = -EINVAL;
> +
> +	/* initialize SPI to read the OPROM */
> +	static_region = I915_READ(SPI_STATIC_REGIONS);
> +	static_region &= OPTIONROM_SPI_REGIONID_MASK;
> +	I915_WRITE(PRIMARY_SPI_REGIONID, static_region);
> +	/* read OPROM offset in SPI flash */
> +	offset = I915_READ(OROM_OFFSET);
> +	offset &= OROM_OFFSET_MASK;
> +
> +	oprom_img_hdr = kzalloc(OPROM_INITIAL_READ_SIZE, GFP_KERNEL);
> +	if (!oprom_img_hdr)
> +		return -ENOMEM;
> +
> +	do {
> +		spi_read_oprom_helper(OPROM_INITIAL_READ_SIZE, offset,
> +				      oprom_img_hdr, dev_priv);
> +		img_size = oprom_image_parse_helper((u8 *)oprom_img_hdr, &last_img,
> +						    &code_type, dev_priv);
> +		if (img_size <= 0) {
> +			ret = -EINVAL;
> +			goto err_free_hdr;
> +		}
> +
> +		img_len = img_size * OPROM_BYTE_BOUNDARY;
> +		oprom_img = kzalloc(img_len, GFP_KERNEL);
> +		if (!oprom_img) {
> +			ret = -ENOMEM;
> +			goto err_free_hdr;
> +		}
> +
> +		spi_read_oprom_helper(img_len, offset, oprom_img, dev_priv);
> +		parse_ptr = (u8 *)oprom_img;
> +		offset = offset + img_len;
> +
> +		/* opregion base offset */
> +		opreg_base = ((struct expansion_rom_header *)parse_ptr)->opregion_base;
> +		/* CPD or opreg signature is present at opregion_base offset */
> +		memcpy(img_sig, parse_ptr + opreg_base, sizeof(OPREGION_SIGNATURE));
> +
> +		if (!memcmp(img_sig, OPREGION_SIGNATURE, sizeof(OPREGION_SIGNATURE) - 1)) {
> +			*opreg = oprom_img;
> +			*opreg_size = img_len;
> +			drm_dbg_kms(&dev_priv->drm, "Found opregion image\n");
> +			ret = 0;
> +			break;
> +		} else if (!memcmp(img_sig, CPD_SIGNATURE, NUM_CPD_BYTES)) {
> +			if (code_type != OPROM_CSS_CODE_TYPE) {
> +				drm_err(&dev_priv->drm, "Invalid OPROM\n");
> +				ret = -EINVAL;
> +				goto err_free_img;
> +			}
> +			drm_dbg_kms(&dev_priv->drm, "Found CSS image\n");
> +			/* proceed here onwards for signature authentication */
> +			kfree(oprom_img);
> +			continue;
> +		}
> +
> +	} while (last_img != LAST_IMG_INDICATOR);
> +
> +	return ret;
> +
> +err_free_img:
> +	kfree(oprom_img);
> +err_free_hdr:
> +	kfree(oprom_img_hdr);
> +
> +	return ret;
> +}
> +
>  static int intel_use_opregion_panel_type_callback(const struct dmi_system_id *id)
>  {
>  	DRM_INFO("Using panel type from OpRegion on %s\n", id->ident);
> diff --git a/drivers/gpu/drm/i915/display/intel_opregion.h b/drivers/gpu/drm/i915/display/intel_opregion.h
> index 4aa68ffbd30e..4e2eeadf101e 100644
> --- a/drivers/gpu/drm/i915/display/intel_opregion.h
> +++ b/drivers/gpu/drm/i915/display/intel_opregion.h
> @@ -54,6 +54,34 @@ struct intel_opregion {
>  
>  #define OPREGION_SIZE            (8 * 1024)
>  
> +#define CPD_SIGNATURE "$CPD"                  /* CPD Signature */
> +#define NUM_CPD_BYTES 4
> +#define PCI_IMAGE_LENGTH_OFFSET 0x10
> +#define PCI_CODE_TYPE_OFFSET 0x14
> +#define PCI_LAST_IMAGE_INDICATOR_OFFSET 0x15
> +#define LAST_IMG_INDICATOR 0x80
> +#define OPROM_IMAGE_MAGIC 0xAA55       /* Little Endian */
> +#define OPROM_CSS_CODE_TYPE 0xF0
> +#define OPROM_BYTE_BOUNDARY 512        /* OPROM image sizes are indicated in 512 byte boundaries */
> +#define OPROM_INITIAL_READ_SIZE 60     /* Read 60 bytes to compute the Img Len from PCI structure */
> +
> +union oprom_header {
> +	u32 data;
> +	struct {
> +		u16 signature;  /* Offset[0x0]: Header 0x55 0xAA */
> +		u8 sizein512bytes;
> +		u8 reserved;
> +	};
> +};

What's the point of the union?

> +
> +struct expansion_rom_header {
> +	union oprom_header header;      /* Offset[0x0]: Oprom Header */
> +	u16 vbiospostoffset;    /* Offset[0x4]: pointer to VBIOS entry point */
> +	u8 resvd[0x12];
> +	u16 pcistructoffset;    /* Offset[0x18]: Contains pointer PCI Data Structure */
> +	u16 opregion_base;      /* Offset[0x1A]: Offset to Opregion Base start */
> +};

AFAICT both of these should be hidden in the .c file instead of exposed
to the rest of the driver, and they should be __packed as they're used
for serialisation.
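
E.g.:

/* in intel_opregion.c rather than the header, packed for serialisation */
struct expansion_rom_header {
        union oprom_header header;      /* Offset[0x0]: Oprom Header */
        u16 vbiospostoffset;            /* Offset[0x4]: VBIOS entry point */
        u8 resvd[0x12];
        u16 pcistructoffset;            /* Offset[0x18]: PCI Data Structure */
        u16 opregion_base;              /* Offset[0x1A]: Opregion Base start */
} __packed;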

> +
>  #ifdef CONFIG_ACPI
>  
>  int intel_opregion_setup(struct drm_i915_private *dev_priv);
> @@ -118,5 +146,6 @@ static inline int intel_opregion_get_panel_type(struct drm_i915_private *dev)
>  }
>  
>  #endif /* CONFIG_ACPI */
> -
> +int intel_oprom_verify_signature(u32 **opreg, u16 *opreg_size,
> +				 struct drm_i915_private *i915);

This breaks the build for CONFIG_ACPI=n.



>  #endif
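
One way to keep CONFIG_ACPI=n building (sketch): move the declaration
inside the #ifdef and provide a static inline stub, matching the pattern
used just above:

#ifdef CONFIG_ACPI
int intel_oprom_verify_signature(u32 **opreg, u16 *opreg_size,
                                 struct drm_i915_private *i915);
#else
static inline int
intel_oprom_verify_signature(u32 **opreg, u16 *opreg_size,
                             struct drm_i915_private *i915)
{
        return -ENODEV;
}
#endif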

-- 
Jani Nikula, Intel Open Source Graphics Center
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 208+ messages in thread

* Re: [Intel-gfx] [RFC PATCH 157/162] drm/i915: Improve accuracy of eviction stats
  2020-11-27 14:40   ` [Intel-gfx] " Chris Wilson
@ 2020-11-30 10:36     ` Tvrtko Ursulin
  0 siblings, 0 replies; 208+ messages in thread
From: Tvrtko Ursulin @ 2020-11-30 10:36 UTC (permalink / raw)
  To: Chris Wilson, Matthew Auld, intel-gfx; +Cc: dri-devel


On 27/11/2020 14:40, Chris Wilson wrote:
> Quoting Matthew Auld (2020-11-27 12:07:13)
>> From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>>
>> Current code uses jiffie time to do the accounting and then does:
>>
>>    diff = jiffies - start;
>>    msec = diff * 1000 / HZ;
>>    ...
>>    atomic_long_add(msec, &i915->time_swap_out_ms);
>>
>> If we assume jiffie can be as non-granular as 10ms and that the current
>> accounting records all evictions faster than one jiffie as infinite speed,
>> we can end up over-estimating the reported eviction throughput.
>>
>> Fix this by accumulating ktime_t and only dividing to more user friendly
>> granularity at presentation time (debugfs read).
>>
>> At the same time consolidate the code a bit and convert from multiple
>> atomics to single seqlock per stat.
>>
>> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>> Cc: CQ Tang <cq.tang@intel.com>
>> Cc: Sudeep Dutt <sudeep.dutt@intel.com>
>> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
> 
> A lot of effort to fix up patches after the fact, might as well make it
> a real PMU interface.

It did cross my mind and should be easy to add on top if deemed useful 
or interesting.

More importantly, it is okay with me to incorporate this patch into the 
earlier one(s) which first added statistics.

Regards,

Tvrtko
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 208+ messages in thread

* Re: [Intel-gfx] [RFC PATCH 118/162] drm/i915/dg1: Reserve first 1MB of local memory
  2020-11-27 13:52   ` [Intel-gfx] " Chris Wilson
@ 2020-11-30 11:09     ` Matthew Auld
  2020-11-30 11:22       ` Chris Wilson
  0 siblings, 1 reply; 208+ messages in thread
From: Matthew Auld @ 2020-11-30 11:09 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx; +Cc: dri-devel

On 27/11/2020 13:52, Chris Wilson wrote:
> Quoting Matthew Auld (2020-11-27 12:06:34)
>> From: Imre Deak <imre.deak@intel.com>
>>
>> On DG1 A0/B0 steppings the first 1MB of local memory must be reserved.
>> One reason for this is that the 0xA0000-0xB0000 range is not accessible
>> by the display, probably since this region is redirected to another
>> memory location for legacy VGA compatibility.
>>
>> BSpec: 50586
>> Testcase: igt/kms_big_fb/linear-64bpp-rotate-0
>> Signed-off-by: Imre Deak <imre.deak@intel.com>
>> ---
>>   drivers/gpu/drm/i915/intel_region_lmem.c | 52 ++++++++++++++++++++++++
>>   1 file changed, 52 insertions(+)
>>
>> diff --git a/drivers/gpu/drm/i915/intel_region_lmem.c b/drivers/gpu/drm/i915/intel_region_lmem.c
>> index 939cf0d195a5..eafef7034680 100644
>> --- a/drivers/gpu/drm/i915/intel_region_lmem.c
>> +++ b/drivers/gpu/drm/i915/intel_region_lmem.c
>> @@ -137,6 +137,48 @@ intel_setup_fake_lmem(struct drm_i915_private *i915)
>>          return mem;
>>   }
>>   
>> +static void get_legacy_lowmem_region(struct intel_uncore *uncore,
>> +                                    u64 *start, u32 *size)
>> +{
>> +       *start = 0;
>> +       *size = 0;
>> +
>> +       if (!IS_DG1_REVID(uncore->i915, DG1_REVID_A0, DG1_REVID_B0))
>> +               return;
>> +
>> +       *size = SZ_1M;
>> +
>> +       DRM_DEBUG_DRIVER("LMEM: reserved legacy low-memory [0x%llx-0x%llx]\n",
>> +                        *start, *start + *size);
>> +}
>> +
>> +static int reserve_lowmem_region(struct intel_uncore *uncore,
>> +                                struct intel_memory_region *mem)
>> +{
>> +       u64 reserve_start;
>> +       u64 reserve_end;
>> +       u64 region_start;
>> +       u32 region_size;
>> +       int ret;
>> +
>> +       get_legacy_lowmem_region(uncore, &region_start, &region_size);
>> +       reserve_start = region_start;
>> +       reserve_end = region_start + region_size;
>> +
>> +       if (!reserve_end)
>> +               return 0;
>> +
>> +       DRM_INFO("LMEM: reserving low-memory region [0x%llx-0x%llx]\n",
>> +                reserve_start, reserve_end);
>> +       ret = i915_buddy_alloc_range(&mem->mm, &mem->reserved,
>> +                                    reserve_start,
>> +                                    reserve_end - reserve_start);
> 
> Isn't this now relative to the stolen offset? Should this be reserved,
> or excluded like stolen?

AFAIK stolen is just snipped off at the end of lmem, so I don't think it
really matters whether we exclude or reserve. But here, if we exclude,
then region.start might have "strange" alignment, which is annoying
since alloc(some_power_of_two) might not give us the expected alignment,
whereas if we reserve, the allocator is aware of the hole and so we
should get the proper alignment. Maybe you have better ideas for how to
handle this, but I think keeping the alignment property is nice.

> -Chris
> 
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 208+ messages in thread

* Re: [RFC PATCH 107/162] drm/i915: setup GPU device lmem region
  2020-11-27 12:06 ` [RFC PATCH 107/162] drm/i915: setup GPU device lmem region Matthew Auld
@ 2020-11-30 11:18   ` Chris Wilson
  0 siblings, 0 replies; 208+ messages in thread
From: Chris Wilson @ 2020-11-30 11:18 UTC (permalink / raw)
  To: Matthew Auld, intel-gfx
  Cc: Abdiel Janulgue, Matthew Brost, Sudeep Dutt, dri-devel, CQ Tang,
	Venkata S Dhanalakota, Neel Desai, Francesco, Balestrieri,
	Niranjana Vishwanathapura

Quoting Matthew Auld (2020-11-27 12:06:23)
> From: CQ Tang <cq.tang@intel.com>
> 
> The lmem region needs to exclude the stolen part.
> 
> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
> Cc: Matthew Auld <matthew.auld@intel.com>
> Cc: Abdiel Janulgue <abdiel.janulgue@linux.intel.com>
> Cc: Chris P Wilson <chris.p.wilson@intel.com>
> Cc: Balestrieri, Francesco <francesco.balestrieri@intel.com>
> Cc: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
> Cc: Venkata S Dhanalakota <venkata.s.dhanalakota@intel.com>
> Cc: Neel Desai <neel.desai@intel.com>
> Cc: Matthew Brost <matthew.brost@intel.com>
> Cc: Sudeep Dutt <sudeep.dutt@intel.com>
> Signed-off-by: CQ Tang <cq.tang@intel.com>
> ---
>  drivers/gpu/drm/i915/i915_reg.h          |  2 ++
>  drivers/gpu/drm/i915/intel_region_lmem.c | 11 +++++++----
>  2 files changed, 9 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
> index 1af1966ac461..0e01ea0cb0a4 100644
> --- a/drivers/gpu/drm/i915/i915_reg.h
> +++ b/drivers/gpu/drm/i915/i915_reg.h
> @@ -12066,6 +12066,8 @@ enum skl_power_gate {
>  #define GEN12_LMEM_CFG_ADDR            _MMIO(0xcf58)
>  #define   LMEM_ENABLE                  (1 << 31)
>  
> +#define GEN12_GSMBASE                  _MMIO(0x108100)
> +
>  /* gamt regs */
>  #define GEN8_L3_LRA_1_GPGPU _MMIO(0x4dd4)
>  #define   GEN8_L3_LRA_1_GPGPU_DEFAULT_VALUE_BDW  0x67F1427F /* max/min for LRA1/2 */
> diff --git a/drivers/gpu/drm/i915/intel_region_lmem.c b/drivers/gpu/drm/i915/intel_region_lmem.c
> index e98582c76de1..7f2b31d469b0 100644
> --- a/drivers/gpu/drm/i915/intel_region_lmem.c
> +++ b/drivers/gpu/drm/i915/intel_region_lmem.c
> @@ -140,20 +140,23 @@ intel_setup_fake_lmem(struct drm_i915_private *i915)
>  static struct intel_memory_region *
>  setup_lmem(struct drm_i915_private *dev_priv)

Am I wrong in thinking lmem should be under gt?

>  {
> +       struct intel_uncore *uncore = &dev_priv->uncore;
>         struct pci_dev *pdev = dev_priv->drm.pdev;
>         struct intel_memory_region *mem;
>         resource_size_t io_start;
> -       resource_size_t size;
> +       resource_size_t lmem_size;
>  
>         /* Enables Local Memory functionality in GAM */
>         I915_WRITE(GEN12_LMEM_CFG_ADDR, I915_READ(GEN12_LMEM_CFG_ADDR) | LMEM_ENABLE);
>  
> +       /* Stolen starts from GSMBASE on DG1 */
> +       lmem_size = intel_uncore_read64(uncore, GEN12_GSMBASE);
> +
>         io_start = pci_resource_start(pdev, 2);
> -       size = pci_resource_len(pdev, 2);

Sanitycheck the two.

size = min(size, lmem_size);

>  
>         mem = intel_memory_region_create(dev_priv,
>                                          0,
> -                                        size,
> +                                        lmem_size,

Ok, stolen is at tail not start. 

>                                          I915_GTT_PAGE_SIZE_4K,
>                                          io_start,
>                                          &intel_region_lmem_ops);
> @@ -162,7 +165,7 @@ setup_lmem(struct drm_i915_private *dev_priv)
>                 DRM_INFO("Intel graphics LMEM IO start: %llx\n",
>                          (u64)mem->io_start);
>                 DRM_INFO("Intel graphics LMEM size: %llx\n",
> -                        (u64)size);
> +                        (u64)lmem_size);

Use the correct printf format, %pa, for resource_size_t.

>         }
>  
>         return mem;
> -- 
> 2.26.2
>

^ permalink raw reply	[flat|nested] 208+ messages in thread

* Re: [Intel-gfx] [RFC PATCH 118/162] drm/i915/dg1: Reserve first 1MB of local memory
  2020-11-30 11:09     ` Matthew Auld
@ 2020-11-30 11:22       ` Chris Wilson
  0 siblings, 0 replies; 208+ messages in thread
From: Chris Wilson @ 2020-11-30 11:22 UTC (permalink / raw)
  To: Matthew Auld, intel-gfx; +Cc: dri-devel

Quoting Matthew Auld (2020-11-30 11:09:57)
> On 27/11/2020 13:52, Chris Wilson wrote:
> > Quoting Matthew Auld (2020-11-27 12:06:34)
> >> From: Imre Deak <imre.deak@intel.com>
> >>
> >> On DG1 A0/B0 steppings the first 1MB of local memory must be reserved.
> >> One reason for this is that the 0xA0000-0xB0000 range is not accessible
> >> by the display, probably since this region is redirected to another
> >> memory location for legacy VGA compatibility.
> >>
> >> BSpec: 50586
> >> Testcase: igt/kms_big_fb/linear-64bpp-rotate-0
> >> Signed-off-by: Imre Deak <imre.deak@intel.com>
> >> ---
> >>   drivers/gpu/drm/i915/intel_region_lmem.c | 52 ++++++++++++++++++++++++
> >>   1 file changed, 52 insertions(+)
> >>
> >> diff --git a/drivers/gpu/drm/i915/intel_region_lmem.c b/drivers/gpu/drm/i915/intel_region_lmem.c
> >> index 939cf0d195a5..eafef7034680 100644
> >> --- a/drivers/gpu/drm/i915/intel_region_lmem.c
> >> +++ b/drivers/gpu/drm/i915/intel_region_lmem.c
> >> @@ -137,6 +137,48 @@ intel_setup_fake_lmem(struct drm_i915_private *i915)
> >>          return mem;
> >>   }
> >>   
> >> +static void get_legacy_lowmem_region(struct intel_uncore *uncore,
> >> +                                    u64 *start, u32 *size)
> >> +{
> >> +       *start = 0;
> >> +       *size = 0;
> >> +
> >> +       if (!IS_DG1_REVID(uncore->i915, DG1_REVID_A0, DG1_REVID_B0))
> >> +               return;
> >> +
> >> +       *size = SZ_1M;
> >> +
> >> +       DRM_DEBUG_DRIVER("LMEM: reserved legacy low-memory [0x%llx-0x%llx]\n",
> >> +                        *start, *start + *size);
> >> +}
> >> +
> >> +static int reserve_lowmem_region(struct intel_uncore *uncore,
> >> +                                struct intel_memory_region *mem)
> >> +{
> >> +       u64 reserve_start;
> >> +       u64 reserve_end;
> >> +       u64 region_start;
> >> +       u32 region_size;
> >> +       int ret;
> >> +
> >> +       get_legacy_lowmem_region(uncore, &region_start, &region_size);
> >> +       reserve_start = region_start;
> >> +       reserve_end = region_start + region_size;
> >> +
> >> +       if (!reserve_end)
> >> +               return 0;
> >> +
> >> +       DRM_INFO("LMEM: reserving low-memory region [0x%llx-0x%llx]\n",
> >> +                reserve_start, reserve_end);
> >> +       ret = i915_buddy_alloc_range(&mem->mm, &mem->reserved,
> >> +                                    reserve_start,
> >> +                                    reserve_end - reserve_start);
> > 
> > Isn't this now relative to the stolen offset? Should this be reserved,
> > or excluded like stolen?
> 
> AFAIK stolen is just snipped off at the end of lmem, so I don't think it 
> really matters if we exclude or reserve.

Right, misread, thought it was moving the start point.

> But in this case, if we exclude then region.start might have a 
> "strange" alignment, which is annoying since alloc(some_power_of_two) 
> might not give us the expected alignment, whereas if we reserve then 
> the allocator is aware of the hole, and so we keep the proper 
> alignment. Maybe you have better ideas for how to handle this, but I 
> think keeping the alignment property is nice.

The only tweak I would look at is making this reservation a property
of the VGA decode. But if this promises not to live into production,
KISS.
-Chris
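
For reference, the two options in allocator terms (a sketch using the
helpers quoted above, trailing arguments elided; the exclude variant is
hypothetical):

	/* reserve: region spans [0, lmem_size), allocator knows the hole */
	mem = intel_memory_region_create(i915, 0, lmem_size, ...);
	i915_buddy_alloc_range(&mem->mm, &mem->reserved, 0, SZ_1M);

	/*
	 * exclude: region spans [SZ_1M, lmem_size); power-of-two blocks
	 * are now aligned relative to SZ_1M rather than the true start
	 * of lmem, which is the alignment concern above.
	 */
	mem = intel_memory_region_create(i915, SZ_1M, lmem_size - SZ_1M, ...);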

^ permalink raw reply	[flat|nested] 208+ messages in thread

* Re: [RFC PATCH 131/162] drm/i915/dg1: Add enable_eviction modparam
  2020-11-27 12:06 ` [RFC PATCH 131/162] drm/i915/dg1: Add enable_eviction modparam Matthew Auld
@ 2020-11-30 12:20   ` Jani Nikula
  0 siblings, 0 replies; 208+ messages in thread
From: Jani Nikula @ 2020-11-30 12:20 UTC (permalink / raw)
  To: Matthew Auld, intel-gfx; +Cc: CQ Tang, Sudeep Dutt, dri-devel

On Fri, 27 Nov 2020, Matthew Auld <matthew.auld@intel.com> wrote:
> From: CQ Tang <cq.tang@intel.com>
>
> enable_eviction controls whether eviction is enabled (default) or not.
>
> Signed-off-by: Sudeep Dutt <sudeep.dutt@intel.com>
> Signed-off-by: CQ Tang <cq.tang@intel.com>
> ---
>  drivers/gpu/drm/i915/gem/i915_gem_object.c | 1 +
>  drivers/gpu/drm/i915/gem/i915_gem_region.c | 5 +++++
>  drivers/gpu/drm/i915/i915_params.c         | 3 +++
>  drivers/gpu/drm/i915/i915_params.h         | 1 +
>  drivers/gpu/drm/i915/intel_memory_region.c | 2 +-
>  5 files changed, 11 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object.c b/drivers/gpu/drm/i915/gem/i915_gem_object.c
> index 7cb5f137522f..46d0f8731db0 100644
> --- a/drivers/gpu/drm/i915/gem/i915_gem_object.c
> +++ b/drivers/gpu/drm/i915/gem/i915_gem_object.c
> @@ -293,6 +293,7 @@ static void i915_gem_free_object(struct drm_gem_object *gem_obj)
>  	 * If object had been swapped out, free the hidden object.
>  	 */
>  	if (obj->swapto) {
> +		GEM_BUG_ON(!i915->params.enable_eviction);
>  		i915_gem_object_put(obj->swapto);
>  		obj->swapto = NULL;
>  	}
> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_region.c b/drivers/gpu/drm/i915/gem/i915_gem_region.c
> index a437538cd872..e1793c5f8d8c 100644
> --- a/drivers/gpu/drm/i915/gem/i915_gem_region.c
> +++ b/drivers/gpu/drm/i915/gem/i915_gem_region.c
> @@ -21,6 +21,7 @@ i915_gem_object_swapout_pages(struct drm_i915_gem_object *obj,
>  	GEM_BUG_ON(i915_gem_object_has_pages(obj));
>  	GEM_BUG_ON(obj->mm.madv != I915_MADV_WILLNEED);
>  	GEM_BUG_ON(obj->mm.region->type != INTEL_MEMORY_LOCAL);
> +	GEM_BUG_ON(!i915->params.enable_eviction);
>  
>  	assert_object_held(obj);
>  
> @@ -70,6 +71,7 @@ static int
>  i915_gem_object_swapin_pages(struct drm_i915_gem_object *obj,
>  			     struct sg_table *pages, unsigned int sizes)
>  {
> +	struct drm_i915_private *i915 = to_i915(obj->base.dev);
>  	struct drm_i915_gem_object *dst, *src;
>  	int err;
>  
> @@ -77,6 +79,7 @@ i915_gem_object_swapin_pages(struct drm_i915_gem_object *obj,
>  	GEM_BUG_ON(i915_gem_object_has_pages(obj));
>  	GEM_BUG_ON(obj->mm.madv != I915_MADV_WILLNEED);
>  	GEM_BUG_ON(obj->mm.region->type != INTEL_MEMORY_LOCAL);
> +	GEM_BUG_ON(!i915->params.enable_eviction);
>  
>  	assert_object_held(obj);
>  
> @@ -146,6 +149,7 @@ i915_gem_object_put_pages_buddy(struct drm_i915_gem_object *obj,
>  int
>  i915_gem_object_get_pages_buddy(struct drm_i915_gem_object *obj)
>  {
> +	struct drm_i915_private *i915 = to_i915(obj->base.dev);
>  	struct intel_memory_region *mem = obj->mm.region;
>  	struct list_head *blocks = &obj->mm.blocks;
>  	resource_size_t size = obj->base.size;
> @@ -222,6 +226,7 @@ i915_gem_object_get_pages_buddy(struct drm_i915_gem_object *obj)
>  	/* if we saved the page contents, swap them in */
>  	if (obj->swapto) {
>  		GEM_BUG_ON(i915_gem_object_is_volatile(obj));
> +		GEM_BUG_ON(!i915->params.enable_eviction);
>  
>  		ret = i915_gem_object_swapin_pages(obj, st,
>  						   sg_page_sizes);
> diff --git a/drivers/gpu/drm/i915/i915_params.c b/drivers/gpu/drm/i915/i915_params.c
> index 7f139ea4a90b..bb1ebb6ece95 100644
> --- a/drivers/gpu/drm/i915/i915_params.c
> +++ b/drivers/gpu/drm/i915/i915_params.c
> @@ -197,6 +197,9 @@ i915_param_named_unsafe(fake_lmem_start, ulong, 0400,
>  	"Fake LMEM start offset (default: 0)");
>  #endif
>  
> +i915_param_named_unsafe(enable_eviction, bool, 0600,
> +	"Enable memcpy based eviction which does not rely on DMA resv refactoring)");

Does the module parameter actually need to be writable? Should it
instead be modified via debugfs as a device-specific parameter?

BR,
Jani.
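
A sketch of that suggestion (the debugfs wiring is hypothetical, not
part of this series): keep the modparam load-time only, and expose a
per-device override:

	i915_param_named_unsafe(enable_eviction, bool, 0400,
		"Enable memcpy based eviction which does not rely on DMA resv refactoring");

	/* e.g. in i915_debugfs_register(): */
	debugfs_create_bool("enable_eviction", 0600, minor->debugfs_root,
			    &i915->params.enable_eviction);

modprobe i915 enable_eviction=0 would still work at load time, while
runtime tweaks would go through debugfs on a per-device basis.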

> +
>  static __always_inline void _print_param(struct drm_printer *p,
>  					 const char *name,
>  					 const char *type,
> diff --git a/drivers/gpu/drm/i915/i915_params.h b/drivers/gpu/drm/i915/i915_params.h
> index 330c03e2b4f7..87df407d9afb 100644
> --- a/drivers/gpu/drm/i915/i915_params.h
> +++ b/drivers/gpu/drm/i915/i915_params.h
> @@ -72,6 +72,7 @@ struct drm_printer;
>  	param(char *, force_probe, CONFIG_DRM_I915_FORCE_PROBE, 0400) \
>  	param(unsigned long, fake_lmem_start, 0, 0400) \
>  	/* leave bools at the end to not create holes */ \
> +	param(bool, enable_eviction, true, 0600) \
>  	param(bool, enable_hangcheck, true, 0600) \
>  	param(bool, load_detect_test, false, 0600) \
>  	param(bool, force_reset_modeset_test, false, 0600) \
> diff --git a/drivers/gpu/drm/i915/intel_memory_region.c b/drivers/gpu/drm/i915/intel_memory_region.c
> index afcd6fe6eaff..57f01ef16628 100644
> --- a/drivers/gpu/drm/i915/intel_memory_region.c
> +++ b/drivers/gpu/drm/i915/intel_memory_region.c
> @@ -175,7 +175,7 @@ static int intel_memory_region_evict(struct intel_memory_region *mem,
>  	list_splice_tail(&still_in_list, *phase);
>  	mutex_unlock(&mem->objects.lock);
>  
> -	if (found < target) {
> +	if (found < target && i915->params.enable_eviction) {
>  		pass++;
>  		phase++;
>  		if (*phase)

-- 
Jani Nikula, Intel Open Source Graphics Center

^ permalink raw reply	[flat|nested] 208+ messages in thread

* Re: [Intel-gfx] [RFC PATCH 124/162] drm/i915/lmem: allocate HWSP in lmem
  2020-11-27 13:55   ` [Intel-gfx] " Chris Wilson
@ 2020-11-30 17:17     ` Matthew Auld
  2020-11-30 17:35       ` Chris Wilson
  0 siblings, 1 reply; 208+ messages in thread
From: Matthew Auld @ 2020-11-30 17:17 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx; +Cc: Abdiel Janulgue, Michel Thierry, dri-devel

On 27/11/2020 13:55, Chris Wilson wrote:
> Quoting Matthew Auld (2020-11-27 12:06:40)
>> From: Michel Thierry <michel.thierry@intel.com>
> 
> Rationale goes here.
> 
> Is this wise? HWSP is very frequently read by the CPU, and expected to
> be cached on the CPU.
> 
> What do the performance profiles indicate?

Do you have a recommendation for an existing selftest or IGT to help 
measure this?

Also are you suggesting moving this to system memory, or just using a 
different mapping type, if it's placed in local memory? Or maybe try 
both? Although I'm pretty sceptical about !wc for local memory.

> -Chris
> 

^ permalink raw reply	[flat|nested] 208+ messages in thread

* Re: [Intel-gfx] [RFC PATCH 124/162] drm/i915/lmem: allocate HWSP in lmem
  2020-11-30 17:17     ` Matthew Auld
@ 2020-11-30 17:35       ` Chris Wilson
  0 siblings, 0 replies; 208+ messages in thread
From: Chris Wilson @ 2020-11-30 17:35 UTC (permalink / raw)
  To: Matthew Auld, intel-gfx; +Cc: Abdiel Janulgue, Michel Thierry, dri-devel

Quoting Matthew Auld (2020-11-30 17:17:16)
> On 27/11/2020 13:55, Chris Wilson wrote:
> > Quoting Matthew Auld (2020-11-27 12:06:40)
> >> From: Michel Thierry <michel.thierry@intel.com>
> > 
> > Rationale goes here.
> > 
> > Is this wise? HWSP is very frequently read by the CPU, and expected to
> > be cached on the CPU.
> > 
> > What do the performance profiles indicate?
> 
> Do you have a recommendation for an existing selftest or IGT to help 
> measure this?
> 
> Also are you suggesting moving this to system memory, or just using a 
> different mapping type, if it's placed in local memory? Or maybe try 
> both? Although I'm pretty sceptical about !wc for local memory.

A lot of worries go out of the window if this can be in system memory
and snooped.

For measuring, I suspect there is a lot of chaff that needs to be
removed before individual microbenchmarks like perf/request discern any
difference; although that would be a starting point. We do a lot of
completion checking during execlists interrupt processing, and there
we are sensitive to uncached reads (in CPU profiles at least).

We can trivially construct a benchmark that only shows the impact of the
WC reads; but the point where I think we would first notice from userspace
is client wakeup latency scaling: benchmarks/gem_latency, which was once
a point of major concern. Nowadays, we can couple that with a second
concern about inducing system latency from interrupt processing time.
-Chris
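
A crude way to see the WC-read penalty before reaching for gem_latency
(an illustrative kernel-side sketch only; hwsp stands in for the mapped
status page):

	u32 *hwsp = ...;	/* WC mapping of the HWSP */
	u64 t0, t1;
	u32 sum = 0;
	int i;

	t0 = ktime_get_ns();
	for (i = 0; i < 1024; i++)
		sum += READ_ONCE(*hwsp);	/* every read misses the cache */
	t1 = ktime_get_ns();
	pr_info("avg HWSP read: %llu ns (sum=%u)\n", (t1 - t0) / 1024, sum);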

^ permalink raw reply	[flat|nested] 208+ messages in thread

* Re: [Intel-gfx] [RFC PATCH 092/162] drm/i915/uapi: introduce drm_i915_gem_create_ext
  2020-11-27 12:06 ` [RFC PATCH 092/162] drm/i915/uapi: introduce drm_i915_gem_create_ext Matthew Auld
  2020-11-27 13:25   ` [Intel-gfx] " Chris Wilson
  2020-11-27 19:21   ` Chris Wilson
@ 2020-12-01 12:55   ` Chris Wilson
  2020-12-01 13:43     ` Matthew Auld
  2 siblings, 1 reply; 208+ messages in thread
From: Chris Wilson @ 2020-12-01 12:55 UTC (permalink / raw)
  To: Matthew Auld, intel-gfx; +Cc: dri-devel

Quoting Matthew Auld (2020-11-27 12:06:08)
> Same old gem_create but now with extensions support. This is needed
> to support various upcoming usecases. For now we use the extensions
> mechanism to support setting an immutable-priority-list of potential
> placements, at creation time.
> 
> If we wish to set the placements/regions we can simply do:
> 
> struct drm_i915_gem_object_param region_param = { … }; /* Unchanged */
> struct drm_i915_gem_create_ext_setparam setparam_region = {
>     .base = { .name = I915_GEM_CREATE_EXT_SETPARAM },
>     .param = region_param,
> }
> 
> struct drm_i915_gem_create_ext create_ext = {
>         .size = 16 * PAGE_SIZE,
>         .extensions = (uintptr_t)&setparam_region,
> };
> int err = ioctl(fd, DRM_IOCTL_I915_GEM_CREATE_EXT, &create_ext);
> if (err) ...

Looking at the existing gem_create, there is no detection of an
unsupported extension. That is, there is no rejection of new userspace
asking for placement on an old kernel. (As erroneous as that would be
for many other reasons.)

Unless I've missed something, we need a new ioctl number for CREATEv2.
-Chris
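
A dedicated ioctl number is what makes the failure detectable: the DRM
core rejects unknown numbers (typically with -EINVAL) before the driver
ever sees them. Roughly, the eventual table entry (a sketch; the handler
name is an assumption):

	DRM_IOCTL_DEF_DRV(I915_GEM_CREATE_EXT, i915_gem_create_ext_ioctl,
			  DRM_RENDER_ALLOW),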

^ permalink raw reply	[flat|nested] 208+ messages in thread

* Re: [Intel-gfx] [RFC PATCH 092/162] drm/i915/uapi: introduce drm_i915_gem_create_ext
  2020-12-01 12:55   ` Chris Wilson
@ 2020-12-01 13:43     ` Matthew Auld
  0 siblings, 0 replies; 208+ messages in thread
From: Matthew Auld @ 2020-12-01 13:43 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx; +Cc: dri-devel

On 01/12/2020 12:55, Chris Wilson wrote:
> Quoting Matthew Auld (2020-11-27 12:06:08)
>> Same old gem_create but now with extensions support. This is needed
>> to support various upcoming usecases. For now we use the extensions
>> mechanism to support setting an immutable-priority-list of potential
>> placements, at creation time.
>>
>> If we wish to set the placements/regions we can simply do:
>>
>> struct drm_i915_gem_object_param region_param = { … }; /* Unchanged */
>> struct drm_i915_gem_create_ext_setparam setparam_region = {
>>      .base = { .name = I915_GEM_CREATE_EXT_SETPARAM },
>>      .param = region_param,
>> }
>>
>> struct drm_i915_gem_create_ext create_ext = {
>>          .size = 16 * PAGE_SIZE,
>>          .extensions = (uintptr_t)&setparam_region,
>> };
>> int err = ioctl(fd, DRM_IOCTL_I915_GEM_CREATE_EXT, &create_ext);
>> if (err) ...
> 
> Looking at the existing gem_create, there is no detection of an
> unsupported extension. That is there is no rejection of new userspace
> asking for placement on an old kernel. (As erroneous as that would be
> for many other reasons.)
> 
> Unless I've missed something, we need a new ioctl number for CREATEv2.

+Joonas

Right, and I guess it's not a good idea for userspace to implement 
something like has_gem_create_ext()?
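
Such a probe only works if the kernel rejects the new ioctl number
outright, which is exactly the CREATEv2 argument. A hypothetical
userspace sketch (assuming the usual drm uapi headers):

	static bool has_gem_create_ext(int fd)
	{
		struct drm_i915_gem_create_ext arg = {
			.size = 4096,	/* no extensions chained */
		};
		struct drm_gem_close close_arg = {};

		if (ioctl(fd, DRM_IOCTL_I915_GEM_CREATE_EXT, &arg))
			return false;	/* old kernel: ENOTTY or EINVAL */

		close_arg.handle = arg.handle;
		ioctl(fd, DRM_IOCTL_GEM_CLOSE, &close_arg);
		return true;
	}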

> -Chris
> 

^ permalink raw reply	[flat|nested] 208+ messages in thread

* Re: [Intel-gfx] [RFC PATCH 092/162] drm/i915/uapi: introduce drm_i915_gem_create_ext
  2020-11-27 13:25   ` [Intel-gfx] " Chris Wilson
@ 2020-12-01 15:06     ` Thomas Hellström (Intel)
  0 siblings, 0 replies; 208+ messages in thread
From: Thomas Hellström (Intel) @ 2020-12-01 15:06 UTC (permalink / raw)
  To: Chris Wilson, Matthew Auld, intel-gfx; +Cc: dri-devel


On 11/27/20 2:25 PM, Chris Wilson wrote:
> Quoting Matthew Auld (2020-11-27 12:06:08)
>> Same old gem_create but now with extensions support. This is needed
>> to support various upcoming usecases. For now we use the extensions
>> mechanism to support setting an immutable-priority-list of potential
>> placements, at creation time.
>>
>> If we wish to set the placements/regions we can simply do:
>>
>> struct drm_i915_gem_object_param region_param = { … }; /* Unchanged */
>> struct drm_i915_gem_create_ext_setparam setparam_region = {
>>      .base = { .name = I915_GEM_CREATE_EXT_SETPARAM },
>>      .param = region_param,
>> }
>>
>> struct drm_i915_gem_create_ext create_ext = {
>>          .size = 16 * PAGE_SIZE,
>>          .extensions = (uintptr_t)&setparam_region,
>> };
>> int err = ioctl(fd, DRM_IOCTL_I915_GEM_CREATE_EXT, &create_ext);
>> if (err) ...
>>
>> If we use the normal gem_create or gem_create_ext without the
>> extensions/placements then we still get the old behaviour with only
>> placing the object in system memory.
>>
>> One important change here is that the returned size will now be rounded up to
>> the correct size, depending on the list of placements, where we might
>> have minimum page-size restrictions on some platforms when dealing with
>> device local-memory.
>>
>> Also, we still keep around the i915_gem_object_setparam ioctl, although
>> that is now restricted by the placement list (i.e. we are not allowed
>> add new placements), and longer term that will be going away wrt setting
>> placements, since it was deemed that the kernel doesn't need to support
>> a dynamic list of placements, which is now solidified by this uapi
>> change.
>>
>> Testcase: igt/gem_create/create-ext-placement-sanity-check
>> Testcase: igt/gem_create/create-ext-placement-each
>> Testcase: igt/gem_create/create-ext-placement-all
>> Signed-off-by: Matthew Auld <matthew.auld@intel.com>
>> Signed-off-by: CQ Tang <cq.tang@intel.com>
>> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
>> ---
>>   drivers/gpu/drm/i915/Makefile                 |   1 +
>>   drivers/gpu/drm/i915/gem/i915_gem_create.c    | 398 ++++++++++++++++++
>>   drivers/gpu/drm/i915/gem/i915_gem_object.c    |   2 +
>>   .../gpu/drm/i915/gem/i915_gem_object_types.h  |   9 +
>>   drivers/gpu/drm/i915/gem/i915_gem_region.c    |   4 +
>>   drivers/gpu/drm/i915/i915_drv.c               |   2 +-
>>   drivers/gpu/drm/i915/i915_gem.c               | 103 +----
>>   drivers/gpu/drm/i915/intel_memory_region.c    |  20 +
>>   drivers/gpu/drm/i915/intel_memory_region.h    |   4 +
>>   include/uapi/drm/i915_drm.h                   |  60 +++
>>   10 files changed, 500 insertions(+), 103 deletions(-)
>>   create mode 100644 drivers/gpu/drm/i915/gem/i915_gem_create.c
>>
>> diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile
>> index ec361d61230b..3955134feca7 100644
>> --- a/drivers/gpu/drm/i915/Makefile
>> +++ b/drivers/gpu/drm/i915/Makefile
>> @@ -134,6 +134,7 @@ gem-y += \
>>          gem/i915_gem_clflush.o \
>>          gem/i915_gem_client_blt.o \
>>          gem/i915_gem_context.o \
>> +       gem/i915_gem_create.o \
>>          gem/i915_gem_dmabuf.o \
>>          gem/i915_gem_domain.o \
>>          gem/i915_gem_execbuffer.o \
>> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_create.c b/drivers/gpu/drm/i915/gem/i915_gem_create.c
>> new file mode 100644
>> index 000000000000..6f6dd4f1ce7e
>> --- /dev/null
>> +++ b/drivers/gpu/drm/i915/gem/i915_gem_create.c
>> @@ -0,0 +1,398 @@
>> +// SPDX-License-Identifier: MIT
>> +/*
>> + * Copyright © 2020 Intel Corporation
>> + */
>> +
>> +#include "gem/i915_gem_ioctls.h"
>> +#include "gem/i915_gem_lmem.h"
>> +#include "gem/i915_gem_object_blt.h"
>> +#include "gem/i915_gem_region.h"
>> +
>> +#include "i915_drv.h"
>> +#include "i915_user_extensions.h"
>> +
>> +static u32 max_page_size(struct intel_memory_region **placements,
>> +                        int n_placements)
>> +{
>> +       u32 max_page_size = 0;
>> +       int i;
>> +
>> +       for (i = 0; i < n_placements; ++i) {
>> +               max_page_size = max_t(u32, max_page_size,
>> +                                     placements[i]->min_page_size);
>> +       }
>> +
>> +       GEM_BUG_ON(!max_page_size);
>> +       return max_page_size;
>> +}
>> +
>> +static int
>> +i915_gem_create(struct drm_file *file,
>> +               struct intel_memory_region **placements,
>> +               int n_placements,
>> +               u64 *size_p,
>> +               u32 *handle_p)
>> +{
>> +       struct drm_i915_gem_object *obj;
>> +       u32 handle;
>> +       u64 size;
>> +       int ret;
>> +
>> +       size = round_up(*size_p, max_page_size(placements, n_placements));
>> +       if (size == 0)
>> +               return -EINVAL;
>> +
>> +       /* For most of the ABI (e.g. mmap) we think in system pages */
>> +       GEM_BUG_ON(!IS_ALIGNED(size, PAGE_SIZE));
>> +
>> +       /* Allocate the new object */
>> +       obj = i915_gem_object_create_region(placements[0], size, 0);
>> +       if (IS_ERR(obj))
>> +               return PTR_ERR(obj);
>> +
>> +       if (i915_gem_object_is_lmem(obj)) {
>> +               struct intel_gt *gt = obj->mm.region->gt;
>> +               struct intel_context *ce = gt->engine[BCS0]->blitter_context;
>> +
>> +               /*
>> +                * XXX: We really want to move this to get_pages(), but we
>> +                * require grabbing the BKL for the blitting operation which is
>> +                * annoying. In the pipeline is support for async get_pages()
>> +                * which should fit nicely for this. Also note that the actual
>> +                * clear should be done async (we currently do an object_wait
>> +                * which is pure garbage), we just need to take care if
>> +                * userspace opts out of implicit sync for the execbuf, to avoid any
>> +                * potential info leak.
>> +                */
> Not just XXX, but the design should be completed first.

Matthew, I have a patch series in the making that moves this blit to 
get_pages().

/Thomas



^ permalink raw reply	[flat|nested] 208+ messages in thread

* Re: [Intel-gfx] [RFC PATCH 113/162] drm/i915: Create stolen memory region from local memory
  2020-11-27 12:06 ` [RFC PATCH 113/162] drm/i915: Create stolen memory region from local memory Matthew Auld
@ 2020-12-07 13:39   ` Jani Nikula
  0 siblings, 0 replies; 208+ messages in thread
From: Jani Nikula @ 2020-12-07 13:39 UTC (permalink / raw)
  To: Matthew Auld, intel-gfx
  Cc: Abdiel Janulgue, Lucas De Marchi, dri-devel, Chris P Wilson,
	Neel Desai, Francesco Balestrieri

On Fri, 27 Nov 2020, Matthew Auld <matthew.auld@intel.com> wrote:
> From: CQ Tang <cq.tang@intel.com>
>
> Add "REGION_STOLEN" device info to dg1, create stolen memory
> region from upper portion of local device memory, starting
> from DSMBASE.
>
> The memory region is marked with "is_devmem=true".
>
> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
> Cc: Matthew Auld <matthew.auld@intel.com>
> Cc: Abdiel Janulgue <abdiel.janulgue@linux.intel.com>
> Cc: Chris P Wilson <chris.p.wilson@intel.com>
> Cc: Francesco Balestrieri <francesco.balestrieri@intel.com>
> Cc: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
> Cc: Venkata S Dhanalakota <venkata.s.dhanalakota@intel.com>
> Cc: Neel Desai <neel.desai@intel.com>
> Cc: Matthew Brost <matthew.brost@intel.com>
> Cc: Sudeep Dutt <sudeep.dutt@intel.com>
> Signed-off-by: CQ Tang <cq.tang@intel.com>
> Cc: Lucas De Marchi <lucas.demarchi@intel.com>
> ---
>  drivers/gpu/drm/i915/gem/i915_gem_lmem.c   |  4 +-
>  drivers/gpu/drm/i915/gem/i915_gem_lmem.h   |  7 +++
>  drivers/gpu/drm/i915/gem/i915_gem_stolen.c | 56 +++++++++++++++++++++-
>  drivers/gpu/drm/i915/i915_pci.c            |  2 +-
>  drivers/gpu/drm/i915/i915_reg.h            |  1 +
>  drivers/gpu/drm/i915/intel_memory_region.c |  5 ++
>  drivers/gpu/drm/i915/intel_memory_region.h |  2 +-
>  7 files changed, 71 insertions(+), 6 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_lmem.c b/drivers/gpu/drm/i915/gem/i915_gem_lmem.c
> index 71c07e1f6f26..b2fd2bc862c0 100644
> --- a/drivers/gpu/drm/i915/gem/i915_gem_lmem.c
> +++ b/drivers/gpu/drm/i915/gem/i915_gem_lmem.c
> @@ -111,8 +111,8 @@ int i915_gem_object_lmem_pread(struct drm_i915_gem_object *obj,
>  	return ret;
>  }
>  
> -static int i915_gem_object_lmem_pwrite(struct drm_i915_gem_object *obj,
> -				       const struct drm_i915_gem_pwrite *arg)
> +int i915_gem_object_lmem_pwrite(struct drm_i915_gem_object *obj,
> +				const struct drm_i915_gem_pwrite *arg)
>  {
>  	struct drm_i915_private *i915 = to_i915(obj->base.dev);
>  	struct intel_runtime_pm *rpm = &i915->runtime_pm;
> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_lmem.h b/drivers/gpu/drm/i915/gem/i915_gem_lmem.h
> index e11e0545e39c..c59aa6c014c7 100644
> --- a/drivers/gpu/drm/i915/gem/i915_gem_lmem.h
> +++ b/drivers/gpu/drm/i915/gem/i915_gem_lmem.h
> @@ -11,9 +11,16 @@
>  struct drm_i915_private;
>  struct drm_i915_gem_object;
>  struct intel_memory_region;
> +struct drm_i915_gem_pread;
> +struct drm_i915_gem_pwrite;
>  
>  extern const struct drm_i915_gem_object_ops i915_gem_lmem_obj_ops;
>  
> +int i915_gem_object_lmem_pread(struct drm_i915_gem_object *obj,
> +			       const struct drm_i915_gem_pread *args);
> +int i915_gem_object_lmem_pwrite(struct drm_i915_gem_object *obj,
> +				const struct drm_i915_gem_pwrite *args);
> +
>  void __iomem *
>  i915_gem_object_lmem_io_map(struct drm_i915_gem_object *obj,
>  			    unsigned long n,
> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_stolen.c b/drivers/gpu/drm/i915/gem/i915_gem_stolen.c
> index 0ddf48e472a0..633745336f40 100644
> --- a/drivers/gpu/drm/i915/gem/i915_gem_stolen.c
> +++ b/drivers/gpu/drm/i915/gem/i915_gem_stolen.c
> @@ -10,6 +10,7 @@
>  #include <drm/drm_mm.h>
>  #include <drm/i915_drm.h>
>  
> +#include "gem/i915_gem_lmem.h"
>  #include "gem/i915_gem_region.h"
>  #include "i915_drv.h"
>  #include "i915_gem_stolen.h"
> @@ -121,6 +122,14 @@ static int i915_adjust_stolen(struct drm_i915_private *i915,
>  		}
>  	}
>  
> +	/*
> +	 * With device local memory, we don't need to check the address range;
> +	 * this is a device memory physical address, which may overlap with
> +	 * system memory.
> +	 */
> +	if (HAS_LMEM(i915))
> +		return 0;
> +
>  	/*
>  	 * Verify that nothing else uses this physical address. Stolen
>  	 * memory should be reserved by the BIOS and hidden from the
> @@ -607,7 +616,7 @@ static void i915_gem_object_put_pages_stolen(struct drm_i915_gem_object *obj,
>  	kfree(pages);
>  }
>  
> -static const struct drm_i915_gem_object_ops i915_gem_object_stolen_ops = {
> +static struct drm_i915_gem_object_ops i915_gem_object_stolen_ops = {

Making driver-specific ops non-const seems suspicious...

>  	.name = "i915_gem_object_stolen",
>  	.get_pages = i915_gem_object_get_pages_stolen,
>  	.put_pages = i915_gem_object_put_pages_stolen,
> @@ -716,7 +725,19 @@ i915_gem_object_create_stolen(struct drm_i915_private *i915,
>  
>  static int init_stolen(struct intel_memory_region *mem)
>  {
> -	intel_memory_region_set_name(mem, "stolen");
> +	if (mem->type == INTEL_MEMORY_STOLEN_SYSTEM)
> +		intel_memory_region_set_name(mem, "stolen-system");
> +	else
> +		intel_memory_region_set_name(mem, "stolen-local");
> +
> +	if (HAS_LMEM(mem->i915)) {
> +		i915_gem_object_stolen_ops.pread = i915_gem_object_lmem_pread;
> +		i915_gem_object_stolen_ops.pwrite = i915_gem_object_lmem_pwrite;

...and AFAICT this modifies the ops for all devices, including the
integrated GPU, if any of the devices HAS_LMEM().

BR,
Jani.
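
One way to keep the ops const would be a second table selected per
region (a sketch, not from the series; field lists abbreviated to what
the patch shows):

	static const struct drm_i915_gem_object_ops i915_gem_object_stolen_ops = {
		.name = "i915_gem_object_stolen",
		.get_pages = i915_gem_object_get_pages_stolen,
		.put_pages = i915_gem_object_put_pages_stolen,
	};

	static const struct drm_i915_gem_object_ops i915_gem_object_stolen_lmem_ops = {
		.name = "i915_gem_object_stolen_lmem",
		.get_pages = i915_gem_object_get_pages_stolen,
		.put_pages = i915_gem_object_put_pages_stolen,
		.pread = i915_gem_object_lmem_pread,
		.pwrite = i915_gem_object_lmem_pwrite,
	};

	/* ...then pick the right table when initialising the object: */
	ops = HAS_LMEM(mem->i915) ? &i915_gem_object_stolen_lmem_ops :
				    &i915_gem_object_stolen_ops;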

> +		if (!io_mapping_init_wc(&mem->iomap,
> +					mem->io_start,
> +					resource_size(&mem->region)))
> +			return -EIO;
> +	}
>  
>  	/*
>  	 * Initialise stolen early so that we may reserve preallocated
> @@ -736,8 +757,39 @@ static const struct intel_memory_region_ops i915_region_stolen_ops = {
>  	.create_object = i915_gem_object_create_stolen_region,
>  };
>  
> +static
> +struct intel_memory_region *setup_lmem_stolen(struct drm_i915_private *i915)
> +{
> +	struct intel_uncore *uncore = &i915->uncore;
> +	struct pci_dev *pdev = i915->drm.pdev;
> +	struct intel_memory_region *mem;
> +	resource_size_t io_start;
> +	resource_size_t lmem_size;
> +	u64 lmem_base;
> +
> +	lmem_base = intel_uncore_read64(uncore, GEN12_DSMBASE);
> +	lmem_size = pci_resource_len(pdev, 2) - lmem_base;
> +	io_start = pci_resource_start(pdev, 2) + lmem_base;
> +
> +	mem = intel_memory_region_create(i915, lmem_base, lmem_size,
> +					 I915_GTT_PAGE_SIZE_4K, io_start,
> +					 &i915_region_stolen_ops);
> +	if (!IS_ERR(mem)) {
> +		DRM_INFO("Intel graphics stolen LMEM: %pR\n", &mem->region);
> +		DRM_INFO("Intel graphics stolen LMEM IO start: %llx\n",
> +			 (u64)mem->io_start);
> +		/* this is real device memory */
> +		mem->is_devmem = true;
> +	}
> +
> +	return mem;
> +}
> +
>  struct intel_memory_region *i915_gem_stolen_setup(struct drm_i915_private *i915)
>  {
> +	if (HAS_LMEM(i915))
> +		return setup_lmem_stolen(i915);
> +
>  	return intel_memory_region_create(i915,
>  					  intel_graphics_stolen_res.start,
>  					  resource_size(&intel_graphics_stolen_res),
> diff --git a/drivers/gpu/drm/i915/i915_pci.c b/drivers/gpu/drm/i915/i915_pci.c
> index 8243178a56f9..c3d9b36ef651 100644
> --- a/drivers/gpu/drm/i915/i915_pci.c
> +++ b/drivers/gpu/drm/i915/i915_pci.c
> @@ -907,7 +907,7 @@ static const struct intel_device_info rkl_info = {
>  
>  #define GEN12_DGFX_FEATURES \
>  	GEN12_FEATURES, \
> -	.memory_regions = REGION_SMEM | REGION_LMEM, \
> +	.memory_regions = REGION_SMEM | REGION_LMEM | REGION_STOLEN_LMEM, \
>  	.has_master_unit_irq = 1, \
>  	.has_llc = 0, \
>  	.has_snoop = 1, \
> diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
> index 0e01ea0cb0a4..3c8350f108e4 100644
> --- a/drivers/gpu/drm/i915/i915_reg.h
> +++ b/drivers/gpu/drm/i915/i915_reg.h
> @@ -12067,6 +12067,7 @@ enum skl_power_gate {
>  #define   LMEM_ENABLE			(1 << 31)
>  
>  #define GEN12_GSMBASE			_MMIO(0x108100)
> +#define GEN12_DSMBASE			_MMIO(0x1080C0)
>  
>  /* gamt regs */
>  #define GEN8_L3_LRA_1_GPGPU _MMIO(0x4dd4)
> diff --git a/drivers/gpu/drm/i915/intel_memory_region.c b/drivers/gpu/drm/i915/intel_memory_region.c
> index 043541d409bd..c7a1d84e7ee8 100644
> --- a/drivers/gpu/drm/i915/intel_memory_region.c
> +++ b/drivers/gpu/drm/i915/intel_memory_region.c
> @@ -19,6 +19,10 @@ const struct intel_memory_region_info intel_region_map[] = {
>                 .class = INTEL_MEMORY_STOLEN_SYSTEM,
>                 .instance = 0,
>         },
> +       [INTEL_REGION_STOLEN_LMEM] = {
> +	       .class = INTEL_MEMORY_STOLEN_LOCAL,
> +	       .instance = 0,
> +       },
>  };
>  
>  struct intel_memory_region *
> @@ -311,6 +315,7 @@ int intel_memory_regions_hw_probe(struct drm_i915_private *i915)
>  		case INTEL_MEMORY_SYSTEM:
>  			mem = i915_gem_shmem_setup(i915);
>  			break;
> +		case INTEL_MEMORY_STOLEN_LOCAL: /* fallthrough */
>  		case INTEL_MEMORY_STOLEN_SYSTEM:
>  			mem = i915_gem_stolen_setup(i915);
>  			break;
> diff --git a/drivers/gpu/drm/i915/intel_memory_region.h b/drivers/gpu/drm/i915/intel_memory_region.h
> index b7a9e34faaf1..8da82cb2afe3 100644
> --- a/drivers/gpu/drm/i915/intel_memory_region.h
> +++ b/drivers/gpu/drm/i915/intel_memory_region.h
> @@ -93,7 +93,7 @@ struct intel_memory_region {
>  	u16 type;
>  	u16 instance;
>  	enum intel_region_id id;
> -	char name[8];
> +	char name[16];
>  	struct intel_gt *gt; /* GT closest to this region. */
>  	bool is_devmem;	/* true for device memory */

-- 
Jani Nikula, Intel Open Source Graphics Center

^ permalink raw reply	[flat|nested] 208+ messages in thread

* Re: [Intel-gfx] [RFC PATCH 098/162] drm/i915/gtt: map the PD up front
  2020-11-27 13:31   ` Chris Wilson
@ 2021-01-12 10:47     ` Matthew Auld
  2021-01-12 14:33       ` Daniel Vetter
  0 siblings, 1 reply; 208+ messages in thread
From: Matthew Auld @ 2021-01-12 10:47 UTC (permalink / raw)
  To: Chris Wilson; +Cc: Intel Graphics Development, Matthew Auld, ML dri-devel

On Fri, 27 Nov 2020 at 13:32, Chris Wilson <chris@chris-wilson.co.uk> wrote:
>
> Quoting Matthew Auld (2020-11-27 12:06:14)
> > We need to generalise our accessor for the page directories and tables
> > away from the simple kmap_atomic in order to support local memory, and
> > this setup must be done on acquisition of the backing storage, prior
> > to entering fence execution contexts. Here we replace the kmap with
> > the object mapping code that, for a simple single-page shmemfs object,
> > will return a plain kmap that is then kept for the lifetime of the
> > page directory.
> >
> > Signed-off-by: Matthew Auld <matthew.auld@intel.com>
> > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
>
> We are going to really struggle with this on 32b :(

Just go back to mapping everything on demand like we did previously,
and unmap as soon as we are done with the current directory across
alloc/insert/clear?

> -Chris

^ permalink raw reply	[flat|nested] 208+ messages in thread

* Re: [Intel-gfx] [RFC PATCH 098/162] drm/i915/gtt: map the PD up front
  2021-01-12 10:47     ` [Intel-gfx] " Matthew Auld
@ 2021-01-12 14:33       ` Daniel Vetter
  0 siblings, 0 replies; 208+ messages in thread
From: Daniel Vetter @ 2021-01-12 14:33 UTC (permalink / raw)
  To: Matthew Auld
  Cc: Intel Graphics Development, Matthew Auld, ML dri-devel, Chris Wilson

On Tue, Jan 12, 2021 at 10:47:57AM +0000, Matthew Auld wrote:
> On Fri, 27 Nov 2020 at 13:32, Chris Wilson <chris@chris-wilson.co.uk> wrote:
> >
> > Quoting Matthew Auld (2020-11-27 12:06:14)
> > > We need to generalise our accessor for the page directories and tables
> > > away from the simple kmap_atomic in order to support local memory, and
> > > this setup must be done on acquisition of the backing storage, prior
> > > to entering fence execution contexts. Here we replace the kmap with
> > > the object mapping code that, for a simple single-page shmemfs object,
> > > will return a plain kmap that is then kept for the lifetime of the
> > > page directory.
> > >
> > > Signed-off-by: Matthew Auld <matthew.auld@intel.com>
> > > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> >
> > We are going to really struggle with this on 32b :(
> 
> Just go back to mapping everything on demand like we did previously,
> and unmap as soon as we are done with the current directory across
> alloc/insert/clear?

tbh if you run i915.ko on 32b kernels, on a modern platform, you deserve
all the pain you get. There's quite a bit of work going on to essentially
make kmap functions worse on 32b (we're not yet at the stage where people
propose to nuke them, but getting there slowly), so designing code today
with them in mind as primary justification is backwards.

What we can't do is keep a kmap around forever; it'd need to be something
like vmap, which has a long-term mapping intention behind it. And at that
point it's probably an equal amount of work to just go back to ad-hoc
kmap. Also, the rules have changed somewhat with kmap_local anyway; a kmap
is a lot less painful in the code than it was with kmap_atomic.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
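
The kmap_local pattern Daniel refers to, applied to on-demand PD access
(a sketch; kmap_local_page/kunmap_local are the v5.11+ interfaces, and
pte_encode stands in for whatever encoding the caller needs):

	u64 *vaddr;

	vaddr = kmap_local_page(pd_page);	/* cheap, nestable, CPU-local */
	vaddr[idx] = pte_encode(addr, flags);
	kunmap_local(vaddr);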

^ permalink raw reply	[flat|nested] 208+ messages in thread

end of thread, other threads:[~2021-01-12 14:33 UTC | newest]

Thread overview: 208+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-11-27 12:04 [RFC PATCH 000/162] DG1 + LMEM enabling Matthew Auld
2020-11-27 12:04 ` [RFC PATCH 001/162] drm/i915/selftest: also consider non-contiguous objects Matthew Auld
2020-11-27 19:44   ` Chris Wilson
2020-11-27 12:04 ` [RFC PATCH 002/162] drm/i915/selftest: assert we get 2M GTT pages Matthew Auld
2020-11-27 12:04 ` [RFC PATCH 003/162] drm/i915/selftest: handle local-memory in perf_memcpy Matthew Auld
2020-11-27 12:04 ` [RFC PATCH 004/162] drm/i915/gt: Move move context layout registers and offsets to lrc_reg.h Matthew Auld
2020-11-27 19:55   ` [Intel-gfx] " Chris Wilson
2020-11-27 12:04 ` [RFC PATCH 005/162] drm/i915/gt: Rename lrc.c to execlists_submission.c Matthew Auld
2020-11-27 19:56   ` Chris Wilson
2020-11-27 12:04 ` [RFC PATCH 006/162] drm/i915: split gen8+ flush and bb_start emission functions to their own file Matthew Auld
2020-11-27 19:58   ` Chris Wilson
2020-11-27 12:04 ` [RFC PATCH 007/162] drm/i915: split wa_bb code to its " Matthew Auld
2020-11-27 12:04 ` [RFC PATCH 008/162] HAX drm/i915: Work around the selftest timeline lock splat workaround Matthew Auld
2020-11-27 12:04 ` [RFC PATCH 009/162] drm/i915: Introduce drm_i915_lock_isolated Matthew Auld
2020-11-27 12:04 ` [RFC PATCH 010/162] drm/i915: Lock hwsp objects isolated for pinning at create time Matthew Auld
2020-11-27 12:04 ` [RFC PATCH 011/162] drm/i915: Pin timeline map after first timeline pin, v5 Matthew Auld
2020-11-27 12:04 ` [RFC PATCH 012/162] drm/i915: Move cmd parser pinning to execbuffer Matthew Auld
2020-11-27 12:04 ` [RFC PATCH 013/162] drm/i915: Add missing -EDEADLK handling to execbuf pinning, v2 Matthew Auld
2020-11-27 12:04 ` [RFC PATCH 014/162] drm/i915: Ensure we hold the object mutex in pin correctly v2 Matthew Auld
2020-11-27 12:04 ` [RFC PATCH 015/162] drm/i915: Add gem object locking to madvise Matthew Auld
2020-11-27 12:04 ` [RFC PATCH 016/162] drm/i915: Move HAS_STRUCT_PAGE to obj->flags Matthew Auld
2020-11-27 12:04 ` [RFC PATCH 017/162] drm/i915: Rework struct phys attachment handling Matthew Auld
2020-11-27 12:04 ` [RFC PATCH 018/162] drm/i915: Convert i915_gem_object_attach_phys() to ww locking, v2 Matthew Auld
2020-11-27 12:04 ` [RFC PATCH 019/162] drm/i915: make lockdep slightly happier about execbuf Matthew Auld
2020-11-27 12:04 ` [RFC PATCH 020/162] drm/i915: Disable userptr pread/pwrite support Matthew Auld
2020-11-27 12:04 ` [RFC PATCH 021/162] drm/i915: No longer allow exporting userptr through dma-buf Matthew Auld
2020-11-27 12:04 ` [RFC PATCH 022/162] drm/i915: Reject more ioctls for userptr Matthew Auld
2020-11-27 12:04 ` [RFC PATCH 023/162] drm/i915: Reject UNSYNCHRONIZED for userptr, v2 Matthew Auld
2020-11-27 12:05 ` [RFC PATCH 024/162] drm/i915: Make compilation of userptr code depend on MMU_NOTIFIER Matthew Auld
2020-11-27 12:05 ` [RFC PATCH 025/162] drm/i915: Fix userptr so we do not have to worry about obj->mm.lock, v5 Matthew Auld
2020-11-27 12:05 ` [RFC PATCH 026/162] drm/i915: Flatten obj->mm.lock Matthew Auld
2020-11-27 12:05 ` [RFC PATCH 027/162] drm/i915: Populate logical context during first pin Matthew Auld
2020-11-27 12:05 ` [RFC PATCH 028/162] drm/i915: Make ring submission compatible with obj->mm.lock removal, v2 Matthew Auld
2020-11-27 12:05 ` [RFC PATCH 029/162] drm/i915: Handle ww locking in init_status_page Matthew Auld
2020-11-27 12:05 ` [RFC PATCH 030/162] drm/i915: Rework clflush to work correctly without obj->mm.lock Matthew Auld
2020-11-27 12:05 ` [RFC PATCH 031/162] drm/i915: Pass ww ctx to intel_pin_to_display_plane Matthew Auld
2020-11-27 12:05 ` [RFC PATCH 032/162] drm/i915: Add object locking to vm_fault_cpu Matthew Auld
2020-11-27 12:05 ` [RFC PATCH 033/162] drm/i915: Move pinning to inside engine_wa_list_verify() Matthew Auld
2020-11-27 12:05 ` [RFC PATCH 034/162] drm/i915: Take reservation lock around i915_vma_pin Matthew Auld
2020-11-27 12:05 ` [RFC PATCH 035/162] drm/i915: Make intel_init_workaround_bb more compatible with ww locking Matthew Auld
2020-11-27 12:05 ` [RFC PATCH 036/162] drm/i915: Make __engine_unpark() compatible with ww locking v2 Matthew Auld
2020-11-27 12:05 ` [RFC PATCH 037/162] drm/i915: Take obj lock around set_domain ioctl Matthew Auld
2020-11-27 12:05 ` [RFC PATCH 038/162] drm/i915: Defer pin calls in buffer pool until first use by caller Matthew Auld
2020-11-27 12:05 ` [RFC PATCH 039/162] drm/i915: Fix pread/pwrite to work with new locking rules Matthew Auld
2020-11-27 12:05 ` [RFC PATCH 040/162] drm/i915: Fix workarounds selftest, part 1 Matthew Auld
2020-11-27 12:05 ` [RFC PATCH 041/162] drm/i915: Prepare for obj->mm.lock removal Matthew Auld
2020-11-27 12:05 ` [RFC PATCH 042/162] drm/i915: Add igt_spinner_pin() to allow for ww locking around spinner Matthew Auld
2020-11-27 12:05 ` [RFC PATCH 043/162] drm/i915: Add ww locking around vm_access() Matthew Auld
2020-11-27 12:05 ` [RFC PATCH 044/162] drm/i915: Increase ww locking for perf Matthew Auld
2020-11-27 12:05 ` [RFC PATCH 045/162] drm/i915: Lock ww in ucode objects correctly Matthew Auld
2020-11-27 12:05 ` [RFC PATCH 046/162] drm/i915: Add ww locking to dma-buf ops Matthew Auld
2020-11-27 12:05 ` [RFC PATCH 047/162] drm/i915: Add missing ww lock in intel_dsb_prepare Matthew Auld
2020-11-27 12:05 ` [RFC PATCH 048/162] drm/i915: Fix ww locking in shmem_create_from_object Matthew Auld
2020-11-27 12:05 ` [RFC PATCH 049/162] drm/i915: Use a single page table lock for each gtt Matthew Auld
2020-11-27 12:05 ` [RFC PATCH 050/162] drm/i915/selftests: Prepare huge_pages testcases for obj->mm.lock removal Matthew Auld
2020-11-27 12:05 ` [RFC PATCH 051/162] drm/i915/selftests: Prepare client blit " Matthew Auld
2020-11-27 12:05 ` [RFC PATCH 052/162] drm/i915/selftests: Prepare coherency tests " Matthew Auld
2020-11-27 12:05 ` [RFC PATCH 053/162] drm/i915/selftests: Prepare context " Matthew Auld
2020-11-27 12:05 ` [RFC PATCH 054/162] drm/i915/selftests: Prepare dma-buf " Matthew Auld
2020-11-27 12:05 ` [RFC PATCH 055/162] drm/i915/selftests: Prepare execbuf " Matthew Auld
2020-11-27 12:05 ` [RFC PATCH 056/162] drm/i915/selftests: Prepare mman testcases " Matthew Auld
2020-11-27 12:05 ` [RFC PATCH 057/162] drm/i915/selftests: Prepare object tests " Matthew Auld
2020-11-27 12:05 ` [RFC PATCH 058/162] drm/i915/selftests: Prepare object blit " Matthew Auld
2020-11-27 12:05 ` [RFC PATCH 059/162] drm/i915/selftests: Prepare igt_gem_utils " Matthew Auld
2020-11-27 12:05 ` [RFC PATCH 060/162] drm/i915/selftests: Prepare context selftest " Matthew Auld
2020-11-27 12:05 ` [RFC PATCH 061/162] drm/i915/selftests: Prepare hangcheck " Matthew Auld
2020-11-27 12:05 ` [RFC PATCH 062/162] drm/i915/selftests: Prepare execlists " Matthew Auld
2020-11-27 12:05 ` [RFC PATCH 063/162] drm/i915/selftests: Prepare mocs tests " Matthew Auld
2020-11-27 12:05 ` [RFC PATCH 064/162] drm/i915/selftests: Prepare ring submission " Matthew Auld
2020-11-27 12:05 ` [RFC PATCH 065/162] drm/i915/selftests: Prepare timeline tests " Matthew Auld
2020-11-27 12:05 ` [RFC PATCH 066/162] drm/i915/selftests: Prepare i915_request " Matthew Auld
2020-11-27 12:05 ` [RFC PATCH 067/162] drm/i915/selftests: Prepare memory region " Matthew Auld
2020-11-27 12:05 ` [RFC PATCH 068/162] drm/i915/selftests: Prepare cs engine " Matthew Auld
2020-11-27 12:05 ` [RFC PATCH 069/162] drm/i915/selftests: Prepare gtt " Matthew Auld
2020-11-27 12:05 ` [RFC PATCH 070/162] drm/i915: Finally remove obj->mm.lock Matthew Auld
2020-11-27 12:05 ` [RFC PATCH 071/162] drm/i915: Keep userpointer bindings if seqcount is unchanged, v2 Matthew Auld
2020-11-27 12:05 ` [RFC PATCH 072/162] drm/i915: Avoid some false positives in assert_object_held() Matthew Auld
2020-11-27 12:05 ` [RFC PATCH 073/162] drm/i915: Reference contending lock objects Matthew Auld
2020-11-27 12:05 ` [RFC PATCH 074/162] drm/i915: Break out dma_resv ww locking utilities to separate files Matthew Auld
2020-11-27 12:05 ` [RFC PATCH 075/162] drm/i915: Introduce a for_i915_gem_ww(){} Matthew Auld
2020-11-27 12:05 ` [RFC PATCH 076/162] drm/i915: Untangle the vma pages_mutex Matthew Auld
2020-11-27 12:05 ` [RFC PATCH 077/162] drm/i915/fbdev: Use lmem physical addresses for fb_mmap() on discrete Matthew Auld
2020-11-27 12:05 ` [RFC PATCH 078/162] drm/i915: Return error value when bo not in LMEM for discrete Matthew Auld
2020-11-27 12:05 ` [RFC PATCH 079/162] drm/i915/dmabuf: Disallow LMEM objects from dma-buf Matthew Auld
2020-11-27 12:05 ` [RFC PATCH 080/162] drm/i915/lmem: Fail driver init if LMEM training failed Matthew Auld
2020-11-27 12:05 ` [RFC PATCH 081/162] HAX drm/i915/lmem: support CPU relocations Matthew Auld
2020-11-27 12:05 ` [RFC PATCH 082/162] HAX drm/i915/lmem: support pread and pwrite Matthew Auld
2020-11-27 12:05 ` [RFC PATCH 083/162] drm/i915: Update the helper to set correct mapping Matthew Auld
2020-11-27 12:06 ` [RFC PATCH 084/162] drm/i915: introduce kernel blitter_context Matthew Auld
2020-11-27 12:06 ` [RFC PATCH 085/162] drm/i915/region: support basic eviction Matthew Auld
2020-11-27 12:06 ` [RFC PATCH 086/162] drm/i915: Add blit functions that can be called from within a WW transaction Matthew Auld
2020-11-27 12:06 ` [RFC PATCH 087/162] drm/i915: Delay publishing objects on the eviction lists Matthew Auld
2020-11-27 12:06 ` [RFC PATCH 088/162] drm/i915: support basic object migration Matthew Auld
2020-11-27 12:06 ` [RFC PATCH 089/162] drm/i915/dg1: Fix occasional migration error Matthew Auld
2020-11-27 12:06 ` [RFC PATCH 090/162] drm/i915/query: Expose memory regions through the query uAPI Matthew Auld
2020-11-27 12:06 ` [RFC PATCH 091/162] drm/i915: Store gt in memory region Matthew Auld
2020-11-27 12:06 ` [RFC PATCH 092/162] drm/i915/uapi: introduce drm_i915_gem_create_ext Matthew Auld
2020-11-27 13:25   ` [Intel-gfx] " Chris Wilson
2020-12-01 15:06     ` Thomas Hellström (Intel)
2020-11-27 19:21   ` Chris Wilson
2020-12-01 12:55   ` Chris Wilson
2020-12-01 13:43     ` Matthew Auld
2020-11-27 12:06 ` [RFC PATCH 093/162] drm/i915/lmem: allocate cmd ring in lmem Matthew Auld
2020-11-27 13:27   ` Chris Wilson
2020-11-27 12:06 ` [RFC PATCH 094/162] drm/i915/dg1: Do not check r->sgt.pfn for NULL Matthew Auld
2020-11-27 12:06 ` [RFC PATCH 095/162] drm/i915/dg1: Introduce dmabuf mmap to LMEM Matthew Auld
2020-11-27 12:06 ` [RFC PATCH 096/162] drm/i915: setup the LMEM region Matthew Auld
2020-11-30 10:14   ` Jani Nikula
2020-11-27 12:06 ` [RFC PATCH 097/162] drm/i915: Distinction of memory regions Matthew Auld
2020-11-27 13:30   ` [Intel-gfx] " Chris Wilson
2020-11-27 12:06 ` [RFC PATCH 098/162] drm/i915/gtt: map the PD up front Matthew Auld
2020-11-27 13:31   ` Chris Wilson
2021-01-12 10:47     ` [Intel-gfx] " Matthew Auld
2021-01-12 14:33       ` Daniel Vetter
2020-11-27 12:06 ` [RFC PATCH 099/162] drm/i915/gtt/dgfx: place the PD in LMEM Matthew Auld
2020-11-27 12:06 ` [RFC PATCH 100/162] drm/i915/gtt: make flushing conditional Matthew Auld
2020-11-27 12:06 ` [RFC PATCH 101/162] drm/i915/gtt/dg1: add PTE_LM plumbing for PPGTT Matthew Auld
2020-11-27 13:35   ` [Intel-gfx] " Chris Wilson
2020-11-27 12:06 ` [RFC PATCH 102/162] drm/i915/gtt/dg1: add PTE_LM plumbing for GGTT Matthew Auld
2020-11-27 12:06 ` [RFC PATCH 103/162] drm/i915: allocate context from LMEM Matthew Auld
2020-11-27 13:37   ` [Intel-gfx] " Chris Wilson
2020-11-27 12:06 ` [RFC PATCH 104/162] drm/i915: move engine scratch to LMEM Matthew Auld
2020-11-27 12:06 ` [RFC PATCH 105/162] drm/i915: Provide a way to disable PCIe relaxed write ordering Matthew Auld
2020-11-27 12:06 ` [RFC PATCH 106/162] drm/i915: i915 returns -EBUSY on thread contention Matthew Auld
2020-11-27 12:06 ` [RFC PATCH 107/162] drm/i915: setup GPU device lmem region Matthew Auld
2020-11-30 11:18   ` Chris Wilson
2020-11-27 12:06 ` [RFC PATCH 108/162] drm/i915: Fix object page offset within a region Matthew Auld
2020-11-27 12:06 ` [RFC PATCH 109/162] drm/i915: add i915_gem_object_is_devmem() function Matthew Auld
2020-11-27 12:06 ` [RFC PATCH 110/162] drm/i915: finish memory region support for stolen objects Matthew Auld
2020-11-27 12:06 ` [RFC PATCH 111/162] drm/i915/lmem: support optional CPU clearing for special internal use Matthew Auld
2020-11-27 12:06 ` [RFC PATCH 112/162] drm/i915/guc: put all guc objects in lmem when available Matthew Auld
2020-11-27 12:06 ` [RFC PATCH 113/162] drm/i915: Create stolen memory region from local memory Matthew Auld
2020-12-07 13:39   ` [Intel-gfx] " Jani Nikula
2020-11-27 12:06 ` [RFC PATCH 114/162] drm/i915/lmem: Bypass aperture when lmem is available Matthew Auld
2020-11-27 12:06 ` [RFC PATCH 115/162] drm/i915/lmem: reset the lmem buffer created by fbdev Matthew Auld
2020-11-27 12:06 ` [RFC PATCH 116/162] drm/i915/dsb: Enable lmem for dsb Matthew Auld
2020-11-27 12:06 ` [RFC PATCH 117/162] drm/i915: Reintroduce mem->reserved Matthew Auld
2020-11-27 12:06 ` [RFC PATCH 118/162] drm/i915/dg1: Reserve first 1MB of local memory Matthew Auld
2020-11-27 13:52   ` [Intel-gfx] " Chris Wilson
2020-11-30 11:09     ` Matthew Auld
2020-11-30 11:22       ` Chris Wilson
2020-11-27 12:06 ` [RFC PATCH 119/162] drm/i915/dg1: Read OPROM via SPI controller Matthew Auld
2020-11-30 10:16   ` [Intel-gfx] " Jani Nikula
2020-11-27 12:06 ` [RFC PATCH 120/162] drm/i915/oprom: Basic sanitization Matthew Auld
2020-11-30 10:24   ` [Intel-gfx] " Jani Nikula
2020-11-27 12:06 ` [RFC PATCH 121/162] drm/i915: WA for zero memory channel Matthew Auld
2020-11-27 12:06 ` [RFC PATCH 122/162] drm/i915/dg1: Compute MEM Bandwidth using MCHBAR Matthew Auld
2020-11-27 12:06 ` [RFC PATCH 123/162] drm/i915/dg1: Double memory bandwidth available Matthew Auld
2020-11-27 12:06 ` [RFC PATCH 124/162] drm/i915/lmem: allocate HWSP in lmem Matthew Auld
2020-11-27 13:55   ` [Intel-gfx] " Chris Wilson
2020-11-30 17:17     ` Matthew Auld
2020-11-30 17:35       ` Chris Wilson
2020-11-27 12:06 ` [RFC PATCH 125/162] drm/i915/lmem: Limit block size to 4G Matthew Auld
2020-11-27 14:02   ` [Intel-gfx] " Chris Wilson
2020-11-27 12:06 ` [RFC PATCH 126/162] drm/i915/gem: Update shmem available memory Matthew Auld
2020-11-27 14:04   ` Chris Wilson
2020-11-27 12:06 ` [RFC PATCH 127/162] drm/i915: Allow non-uniform subslices in gen12+ Matthew Auld
2020-11-27 12:06 ` [RFC PATCH 128/162] drm/i915/dg1: intel_memory_region_evict() changes for eviction Matthew Auld
2020-11-27 14:07   ` [Intel-gfx] " Chris Wilson
2020-11-27 12:06 ` [RFC PATCH 129/162] drm/i915/dg1: i915_gem_object_memcpy(..) infrastructure Matthew Auld
2020-11-27 12:06 ` [RFC PATCH 130/162] drm/i915/dg1: Eviction logic Matthew Auld
2020-11-27 12:06 ` [RFC PATCH 131/162] drm/i915/dg1: Add enable_eviction modparam Matthew Auld
2020-11-30 12:20   ` Jani Nikula
2020-11-27 12:06 ` [RFC PATCH 132/162] drm/i915/dg1: Add lmem_size modparam Matthew Auld
2020-11-27 12:06 ` [RFC PATCH 133/162] drm/i915/dg1: Track swap in/out stats via debugfs Matthew Auld
2020-11-27 14:09   ` [Intel-gfx] " Chris Wilson
2020-11-27 12:06 ` [RFC PATCH 134/162] drm/i915/dg1: Measure swap in/out timing stats Matthew Auld
2020-11-27 14:11   ` [Intel-gfx] " Chris Wilson
2020-11-27 12:06 ` [RFC PATCH 135/162] drm/i915: define intel_partial_pages_for_sg_table Matthew Auld
2020-11-27 12:06 ` [RFC PATCH 136/162] drm/i915: create and destroy dummy vma Matthew Auld
2020-11-27 12:06 ` [RFC PATCH 137/162] drm/i915: blt copy between objs using pre-created vma windows Matthew Auld
2020-11-27 14:19   ` [Intel-gfx] " Chris Wilson
2020-11-27 12:06 ` [RFC PATCH 138/162] drm/i915/dg1: Eliminate eviction mutex Matthew Auld
2020-11-27 12:06 ` [RFC PATCH 139/162] drm/i915/dg1: Keep engine awake across whole blit Matthew Auld
2020-11-27 12:06 ` [RFC PATCH 140/162] drm/i915: window_blt_copy is used for swapin and swapout Matthew Auld
2020-11-27 14:20   ` [Intel-gfx] " Chris Wilson
2020-11-27 12:06 ` [RFC PATCH 141/162] drm/i915: Lmem eviction statistics by category Matthew Auld
2020-11-27 14:21   ` [Intel-gfx] " Chris Wilson
2020-11-27 12:06 ` [RFC PATCH 142/162] drm/i915/gem/selftest: test and measure window based blt cpy Matthew Auld
2020-11-27 12:06 ` [RFC PATCH 143/162] drm/i915: suspend/resume eviction Matthew Auld
2020-11-27 14:22   ` Chris Wilson
2020-11-27 12:07 ` [RFC PATCH 144/162] drm/i915: Reset blitter context when unpark engine Matthew Auld
2020-11-27 14:26   ` Chris Wilson
2020-11-27 12:07 ` [RFC PATCH 145/162] drm/i915/dg1: Add dedicated context for blitter eviction Matthew Auld
2020-11-27 12:07 ` [RFC PATCH 146/162] drm/i915/pm: suspend and restore ppgtt mapping Matthew Auld
2020-11-27 14:29   ` [Intel-gfx] " Chris Wilson
2020-11-27 12:07 ` [RFC PATCH 147/162] drm/i915/gt: Allocate default ctx objects in SMEM Matthew Auld
2020-11-27 14:30   ` Chris Wilson
2020-11-27 12:07 ` [RFC PATCH 148/162] drm/i915: suspend/resume enable blitter eviction Matthew Auld
2020-11-27 14:32   ` [Intel-gfx] " Chris Wilson
2020-11-27 12:07 ` [RFC PATCH 149/162] drm/i915: suspend/resume handling of perma-pinned objects Matthew Auld
2020-11-27 12:07 ` [RFC PATCH 150/162] drm/i915: need consider system BO snoop for dgfx Matthew Auld
2020-11-27 14:36   ` Chris Wilson
2020-11-27 12:07 ` [RFC PATCH 151/162] drm/i915: move eviction to prepare hook Matthew Auld
2020-11-27 12:07 ` [RFC PATCH 152/162] drm/i915: Perform execbuffer object locking as a separate step Matthew Auld
2020-11-27 12:07 ` [RFC PATCH 153/162] drm/i915: Implement eviction locking v2 Matthew Auld
2020-11-27 12:07 ` [RFC PATCH 154/162] drm/i915: Support ww eviction Matthew Auld
2020-11-27 12:07 ` [RFC PATCH 155/162] drm/i915: Use a ww transaction in the fault handler Matthew Auld
2020-11-27 12:07 ` [RFC PATCH 156/162] drm/i915: Use a ww transaction in i915_gem_object_pin_map_unlocked() Matthew Auld
2020-11-27 12:07 ` [RFC PATCH 157/162] drm/i915: Improve accuracy of eviction stats Matthew Auld
2020-11-27 14:40   ` [Intel-gfx] " Chris Wilson
2020-11-30 10:36     ` Tvrtko Ursulin
2020-11-27 12:07 ` [RFC PATCH 158/162] drm/i915: Support ww locks in suspend/resume Matthew Auld
2020-11-27 12:07 ` [RFC PATCH 159/162] drm/i915/dg1: Fix mapping type for default state object Matthew Auld
2020-11-27 12:07 ` [RFC PATCH 160/162] drm/i915/dg1: Fix GPU hang due to shmemfs page drop Matthew Auld
2020-11-27 14:44   ` [Intel-gfx] " Chris Wilson
2020-11-27 12:07 ` [RFC PATCH 161/162] drm/i915/dg1: allow pci to auto probe Matthew Auld
2020-11-27 12:07 ` [RFC PATCH 162/162] drm/i915: drop fake lmem Matthew Auld

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).