All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 0/5] drm/i915: Get rid of fence error propagation (v2)
@ 2021-06-03 15:40 ` Jason Ekstrand
  0 siblings, 0 replies; 24+ messages in thread
From: Jason Ekstrand @ 2021-06-03 15:40 UTC (permalink / raw)
  To: intel-gfx, dri-devel; +Cc: Daniel Vetter, Jon Bloomfield, Jason Ekstrand

Fence error propagation is sketchy at best.  Instead of explicitly handling
fences which might have errors set in the code which is aware of errors, we
just kick them down the line and hope that userspace knows what to do when
a wait eventually fails.  This is sketchy at best because most userspace
isn't prepared to handle errors in those places.  To make things worse, it
allows errors to propagate across processes in unpredictable ways.  This is
causing hangs in one client to kill X11.

Unfortunately, there's no quick path from here to there thanks to the fact
that we're now running the command parser asynchronously and relying on
fence errors for when it fails.  This series first gets rid of asynchronous
command parsing and then cleans up from there.  There was never any real
use-case for asynchronous parsing and the platforms that rely heavily on
the command parser are old enough (Gen7) that, when we changed the way the
command parser works, it wasn't really a change anyone was asking for
anyway.

I think we probably want this whole mess back-ported.  I'm happy to take
suggestions on the strategy there because the history there is a bit
annoying and I'm not 100% sure where the Linux release cuts land.  In any
case, I'm happy to make a version of this series per-release if needed for
Greg to back-port.

v2 (Daniel Vetter):
 - Re-order to put the reverts first
 - Add ACKs from Daniel
 - Add better CC and Fixes tags

Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Jon Bloomfield <jon.bloomfield@intel.com>

Jason Ekstrand (5):
  drm/i915: Revert "drm/i915/gem: Asynchronous cmdparser"
  Revert "drm/i915: Propagate errors on awaiting already signaled
    fences"
  drm/i915: Remove allow_alloc from i915_gem_object_get_sg*
  drm/i915: Drop error handling from dma_fence_work
  Revert "drm/i915: Skip over MI_NOOP when parsing"

 drivers/gpu/drm/i915/gem/i915_gem_clflush.c   |   4 +-
 .../gpu/drm/i915/gem/i915_gem_execbuffer.c    | 227 +-----------------
 drivers/gpu/drm/i915/gem/i915_gem_object.h    |  10 +-
 drivers/gpu/drm/i915/gem/i915_gem_pages.c     |  21 +-
 .../i915/gem/selftests/i915_gem_execbuffer.c  |   4 +
 drivers/gpu/drm/i915/gt/intel_ggtt.c          |   2 +-
 drivers/gpu/drm/i915/i915_cmd_parser.c        | 199 ++++++++-------
 drivers/gpu/drm/i915/i915_drv.h               |   7 +-
 drivers/gpu/drm/i915/i915_request.c           |   8 +-
 drivers/gpu/drm/i915/i915_sw_fence_work.c     |   5 +-
 drivers/gpu/drm/i915/i915_sw_fence_work.h     |   2 +-
 drivers/gpu/drm/i915/i915_vma.c               |   3 +-
 12 files changed, 141 insertions(+), 351 deletions(-)

-- 
2.31.1


^ permalink raw reply	[flat|nested] 24+ messages in thread
* [PATCH 0/5] drm/i915: Get rid of fence error propagation (v4)
@ 2021-07-14 19:34 Jason Ekstrand
  2021-07-14 19:34   ` Jason Ekstrand
  0 siblings, 1 reply; 24+ messages in thread
From: Jason Ekstrand @ 2021-07-14 19:34 UTC (permalink / raw)
  To: intel-gfx, dri-devel; +Cc: Jason Ekstrand

Fence error propagation is sketchy at best.  Instead of explicitly handling
fences which might have errors set in the code which is aware of errors, we
just kick them down the line and hope that userspace knows what to do when
a wait eventually fails.  This is sketchy at best because most userspace
isn't prepared to handle errors in those places.  To make things worse, it
allows errors to propagate across processes in unpredictable ways.  This is
causing hangs in one client to kill X11.

Unfortunately, there's no quick path from here to there thanks to the fact
that we're now running the command parser asynchronously and relying on
fence errors for when it fails.  This series first gets rid of asynchronous
command parsing and then cleans up from there.  There was never any real
use-case for asynchronous parsing and the platforms that rely heavily on
the command parser are old enough (Gen7) that, when we changed the way the
command parser works, it wasn't really a change anyone was asking for
anyway.

I think we probably want this whole mess back-ported.  I'm happy to take
suggestions on the strategy there because the history there is a bit
annoying and I'm not 100% sure where the Linux release cuts land.  In any
case, I'm happy to make a version of this series per-release if needed for
Greg to back-port.

v2 (Daniel Vetter):
 - Re-order to put the reverts first
 - Add ACKs from Daniel
 - Add better CC and Fixes tags

v3 (Jason Ekstrand):
 - Rebase on drm-tip

v4 (Jason Ekstrand):
 - Rebase on drm-tip

Test-with: 20210714173141.1381686-1-jason@jlekstrand.net

Jason Ekstrand (5):
  drm/i915: Revert "drm/i915/gem: Asynchronous cmdparser"
  Revert "drm/i915: Propagate errors on awaiting already signaled
    fences"
  drm/i915: Remove allow_alloc from i915_gem_object_get_sg*
  drm/i915: Drop error handling from dma_fence_work
  Revert "drm/i915: Skip over MI_NOOP when parsing"

 drivers/gpu/drm/i915/gem/i915_gem_clflush.c   |   4 +-
 .../gpu/drm/i915/gem/i915_gem_execbuffer.c    | 227 +-----------------
 drivers/gpu/drm/i915/gem/i915_gem_object.h    |  10 +-
 drivers/gpu/drm/i915/gem/i915_gem_pages.c     |  20 +-
 drivers/gpu/drm/i915/gem/i915_gem_ttm.c       |   2 +-
 .../i915/gem/selftests/i915_gem_execbuffer.c  |   4 +
 drivers/gpu/drm/i915/gt/intel_ggtt.c          |   2 +-
 drivers/gpu/drm/i915/i915_cmd_parser.c        | 199 ++++++++-------
 drivers/gpu/drm/i915/i915_drv.h               |   7 +-
 drivers/gpu/drm/i915/i915_request.c           |   8 +-
 drivers/gpu/drm/i915/i915_sw_fence_work.c     |   5 +-
 drivers/gpu/drm/i915/i915_sw_fence_work.h     |   2 +-
 drivers/gpu/drm/i915/i915_vma.c               |   3 +-
 13 files changed, 142 insertions(+), 351 deletions(-)

-- 
2.31.1


^ permalink raw reply	[flat|nested] 24+ messages in thread
* [PATCH 0/5] drm/i915: Get rid of fence error propagation (v2)
@ 2021-07-11  3:53 Jason Ekstrand
  2021-07-11  3:53   ` Jason Ekstrand
  0 siblings, 1 reply; 24+ messages in thread
From: Jason Ekstrand @ 2021-07-11  3:53 UTC (permalink / raw)
  To: intel-gfx, dri-devel; +Cc: Daniel Vetter, Jon Bloomfield, Jason Ekstrand

Fence error propagation is sketchy at best.  Instead of explicitly handling
fences which might have errors set in the code which is aware of errors, we
just kick them down the line and hope that userspace knows what to do when
a wait eventually fails.  This is sketchy at best because most userspace
isn't prepared to handle errors in those places.  To make things worse, it
allows errors to propagate across processes in unpredictable ways.  This is
causing hangs in one client to kill X11.

Unfortunately, there's no quick path from here to there thanks to the fact
that we're now running the command parser asynchronously and relying on
fence errors for when it fails.  This series first gets rid of asynchronous
command parsing and then cleans up from there.  There was never any real
use-case for asynchronous parsing and the platforms that rely heavily on
the command parser are old enough (Gen7) that, when we changed the way the
command parser works, it wasn't really a change anyone was asking for
anyway.

I think we probably want this whole mess back-ported.  I'm happy to take
suggestions on the strategy there because the history there is a bit
annoying and I'm not 100% sure where the Linux release cuts land.  In any
case, I'm happy to make a version of this series per-release if needed for
Greg to back-port.

v2 (Daniel Vetter):
 - Re-order to put the reverts first
 - Add ACKs from Daniel
 - Add better CC and Fixes tags

v3 (Daniel Vetter):
 - Rebase on drm-tip

Test-with: 20210711035204.802908-1-jason@jlekstrand.net
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Jon Bloomfield <jon.bloomfield@intel.com>

Jason Ekstrand (5):
  drm/i915: Revert "drm/i915/gem: Asynchronous cmdparser"
  Revert "drm/i915: Propagate errors on awaiting already signaled
    fences"
  drm/i915: Remove allow_alloc from i915_gem_object_get_sg*
  drm/i915: Drop error handling from dma_fence_work
  Revert "drm/i915: Skip over MI_NOOP when parsing"

 drivers/gpu/drm/i915/gem/i915_gem_clflush.c   |   4 +-
 .../gpu/drm/i915/gem/i915_gem_execbuffer.c    | 227 +-----------------
 drivers/gpu/drm/i915/gem/i915_gem_object.h    |  10 +-
 drivers/gpu/drm/i915/gem/i915_gem_pages.c     |  20 +-
 drivers/gpu/drm/i915/gem/i915_gem_ttm.c       |   2 +-
 .../i915/gem/selftests/i915_gem_execbuffer.c  |   4 +
 drivers/gpu/drm/i915/gt/intel_ggtt.c          |   2 +-
 drivers/gpu/drm/i915/i915_cmd_parser.c        | 199 ++++++++-------
 drivers/gpu/drm/i915/i915_drv.h               |   7 +-
 drivers/gpu/drm/i915/i915_request.c           |   8 +-
 drivers/gpu/drm/i915/i915_sw_fence_work.c     |   5 +-
 drivers/gpu/drm/i915/i915_sw_fence_work.h     |   2 +-
 drivers/gpu/drm/i915/i915_vma.c               |   3 +-
 13 files changed, 142 insertions(+), 351 deletions(-)

-- 
2.31.1


^ permalink raw reply	[flat|nested] 24+ messages in thread
* [PATCH 0/5] drm/i915: Get rid of fence error propagation
@ 2021-06-02 16:41 Jason Ekstrand
  2021-06-02 16:41 ` [PATCH 1/5] drm/i915: Revert "drm/i915/gem: Asynchronous cmdparser" Jason Ekstrand
  0 siblings, 1 reply; 24+ messages in thread
From: Jason Ekstrand @ 2021-06-02 16:41 UTC (permalink / raw)
  To: dri-devel, intel-gfx; +Cc: Daniel Vetter, Jon Bloomfield, Jason Ekstrand

Fence error propagation is sketchy at best.  Instead of explicitly handling
fences which might have errors set in the code which is aware of errors, we
just kick them down the line and hope that userspace knows what to do when
a wait eventually fails.  This is sketchy at best because most userspace
isn't prepared to handle errors in those places.  To make things worse, it
allows errors to propagate across processes in unpredictable ways.  This is
causing hangs in one client to kill X11.

Unfortunately, there's no quick path from here to there thanks to the fact
that we're now running the command parser asynchronously and relying on
fence errors for when it fails.  This series first gets rid of asynchronous
command parsing and then cleans up from there.  There was never any real
use-case for asynchronous parsing and the platforms that rely heavily on
the command parser are old enough (Gen7) that, when we changed the way the
command parser works, it wasn't really a change anyone was asking for
anyway.

I think we probably want this whole mess back-ported.  I'm happy to take
suggestions on the strategy there because the history there is a bit
annoying and I'm not 100% sure where the Linux release cuts land.  In any
case, I'm happy to make a version of this series per-release if needed for
Greg to back-port.

Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Jon Bloomfield <jon.bloomfield@intel.com>

Jason Ekstrand (5):
  drm/i915: Revert "drm/i915/gem: Asynchronous cmdparser"
  drm/i915: Remove allow_alloc from i915_gem_object_get_sg*
  drm/i915: Drop error handling from dma_fence_work
  Revert "drm/i915: Propagate errors on awaiting already signaled
    fences"
  Revert "drm/i915: Skip over MI_NOOP when parsing"

 drivers/gpu/drm/i915/gem/i915_gem_clflush.c   |   4 +-
 .../gpu/drm/i915/gem/i915_gem_execbuffer.c    | 227 +-----------------
 drivers/gpu/drm/i915/gem/i915_gem_object.h    |  10 +-
 drivers/gpu/drm/i915/gem/i915_gem_pages.c     |  21 +-
 .../i915/gem/selftests/i915_gem_execbuffer.c  |   4 +
 drivers/gpu/drm/i915/gt/intel_ggtt.c          |   2 +-
 drivers/gpu/drm/i915/i915_cmd_parser.c        | 199 ++++++++-------
 drivers/gpu/drm/i915/i915_drv.h               |   7 +-
 drivers/gpu/drm/i915/i915_request.c           |   8 +-
 drivers/gpu/drm/i915/i915_sw_fence_work.c     |   5 +-
 drivers/gpu/drm/i915/i915_sw_fence_work.h     |   2 +-
 drivers/gpu/drm/i915/i915_vma.c               |   3 +-
 12 files changed, 141 insertions(+), 351 deletions(-)

-- 
2.31.1


^ permalink raw reply	[flat|nested] 24+ messages in thread

end of thread, other threads:[~2021-07-14 19:34 UTC | newest]

Thread overview: 24+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-06-03 15:40 [PATCH 0/5] drm/i915: Get rid of fence error propagation (v2) Jason Ekstrand
2021-06-03 15:40 ` [Intel-gfx] " Jason Ekstrand
2021-06-03 15:40 ` [PATCH 1/5] drm/i915: Revert "drm/i915/gem: Asynchronous cmdparser" Jason Ekstrand
2021-06-03 15:40   ` [Intel-gfx] " Jason Ekstrand
2021-06-03 15:40   ` Jason Ekstrand
2021-06-03 15:40 ` [PATCH 2/5] Revert "drm/i915: Propagate errors on awaiting already signaled fences" Jason Ekstrand
2021-06-03 15:40   ` [Intel-gfx] " Jason Ekstrand
2021-06-03 15:40   ` Jason Ekstrand
2021-06-03 15:40 ` [PATCH 3/5] drm/i915: Remove allow_alloc from i915_gem_object_get_sg* Jason Ekstrand
2021-06-03 15:40   ` [Intel-gfx] " Jason Ekstrand
2021-06-03 15:40 ` [PATCH 4/5] drm/i915: Drop error handling from dma_fence_work Jason Ekstrand
2021-06-03 15:40   ` [Intel-gfx] " Jason Ekstrand
2021-06-03 15:40 ` [PATCH 5/5] Revert "drm/i915: Skip over MI_NOOP when parsing" Jason Ekstrand
2021-06-03 15:40   ` [Intel-gfx] " Jason Ekstrand
2021-06-03 16:32 ` [Intel-gfx] ✗ Fi.CI.CHECKPATCH: warning for drm/i915: Get rid of fence error propagation (rev2) Patchwork
2021-06-03 16:33 ` [Intel-gfx] ✗ Fi.CI.SPARSE: " Patchwork
2021-06-03 16:37 ` [Intel-gfx] ✗ Fi.CI.DOCS: " Patchwork
2021-06-03 17:04 ` [Intel-gfx] ✓ Fi.CI.BAT: success " Patchwork
2021-06-03 18:09 ` [Intel-gfx] ✗ Fi.CI.IGT: failure " Patchwork
  -- strict thread matches above, loose matches on Subject: below --
2021-07-14 19:34 [PATCH 0/5] drm/i915: Get rid of fence error propagation (v4) Jason Ekstrand
2021-07-14 19:34 ` [PATCH 1/5] drm/i915: Revert "drm/i915/gem: Asynchronous cmdparser" Jason Ekstrand
2021-07-14 19:34   ` Jason Ekstrand
2021-07-11  3:53 [PATCH 0/5] drm/i915: Get rid of fence error propagation (v2) Jason Ekstrand
2021-07-11  3:53 ` [PATCH 1/5] drm/i915: Revert "drm/i915/gem: Asynchronous cmdparser" Jason Ekstrand
2021-07-11  3:53   ` Jason Ekstrand
2021-06-02 16:41 [PATCH 0/5] drm/i915: Get rid of fence error propagation Jason Ekstrand
2021-06-02 16:41 ` [PATCH 1/5] drm/i915: Revert "drm/i915/gem: Asynchronous cmdparser" Jason Ekstrand

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.