From: Daniel Vetter <daniel@ffwll.ch>
To: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>,
	dri-devel <dri-devel@lists.freedesktop.org>
Cc: "Ville Syrjala" <ville.syrjala@linux.intel.com>,
	intel-gfx <intel-gfx@lists.freedesktop.org>,
	"Thomas Hellström" <thomas.hellstrom@intel.com>,
	stable <stable@vger.kernel.org>
Subject: Re: [Intel-gfx] [PATCH] drm/i915/gt: Fix -EDEADLK handling regression
Date: Tue, 13 Jul 2021 21:58:03 +0200
Message-ID: <CAKMK7uFTYgK9rmXTNSczPdBWPTNaLBp-GitzBQb0-gX5wZWHNQ@mail.gmail.com>
In-Reply-To: <2edf584b-3835-53ed-f6e3-76c7e8d581ed@linux.intel.com>

On Thu, Jul 1, 2021 at 9:07 AM Maarten Lankhorst
<maarten.lankhorst@linux.intel.com> wrote:
> Op 30-06-2021 om 18:44 schreef Ville Syrjala:
> > From: Ville Syrjälä <ville.syrjala@linux.intel.com>
> >
> > The conversion to ww mutexes failed to address the fence code which
> > already returns -EDEADLK when we run out of fences. Ww mutexes on
> > the other hand treat -EDEADLK as an internal errno value indicating
> > a need to restart the operation due to a deadlock. So now when the
> > fence code returns -EDEADLK the higher level code erroneously
> > restarts everything instead of returning the error to userspace
> > as is expected.
> >
> > To remedy this let's switch the fence code to use a different errno
> > value for this. -ENOBUFS seems like a semi-reasonable unique choice.
> > Apart from igt the only user of this I could find is sna, and even
> > there all we do is dump the current fence registers from debugfs
> > into the X server log. So no user visible functionality is affected.
> > If we really cared about preserving this we could of course convert
> > back to -EDEADLK higher up, but doesn't seem like that's worth
> > the hassle here.
> >
> > Not quite sure which commit specifically broke this, but I'll
> > just attribute it to the general gem ww mutex work.
> >
> > Cc: stable@vger.kernel.org
> > Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
> > Cc: Thomas Hellström <thomas.hellstrom@intel.com>
> > Testcase: igt/gem_pread/exhaustion
> > Testcase: igt/gem_pwrite/basic-exhaustion
> > Testcase: igt/gem_fenced_exec_thrash/too-many-fences
> > Fixes: 80f0b679d6f0 ("drm/i915: Add an implementation for i915_gem_ww_ctx locking, v2.")
> > Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
> > ---
> >  drivers/gpu/drm/i915/gt/intel_ggtt_fencing.c | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/drivers/gpu/drm/i915/gt/intel_ggtt_fencing.c b/drivers/gpu/drm/i915/gt/intel_ggtt_fencing.c
> > index cac7f3f44642..f8948de72036 100644
> > --- a/drivers/gpu/drm/i915/gt/intel_ggtt_fencing.c
> > +++ b/drivers/gpu/drm/i915/gt/intel_ggtt_fencing.c
> > @@ -348,7 +348,7 @@ static struct i915_fence_reg *fence_find(struct i915_ggtt *ggtt)
> >       if (intel_has_pending_fb_unpin(ggtt->vm.i915))
> >               return ERR_PTR(-EAGAIN);
> >
> > -     return ERR_PTR(-EDEADLK);
> > +     return ERR_PTR(-ENOBUFS);
> >  }
> >
> >  int __i915_vma_pin_fence(struct i915_vma *vma)
>
> Makes sense..
>
> Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
>
> Is it a slightly more recent commit? It is probably the part that converts execbuffer to use ww locks.

- Please cc: dri-devel on anything gem/gt related.
- This should probably be ENOSPC or something like that, to keep at least
a semblance of errno consistency:

https://dri.freedesktop.org/docs/drm/gpu/drm-uapi.html#recommended-ioctl-return-values

Cheers, Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
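
The errno clash described in the quoted commit message can be sketched in
plain userspace C. This is only an illustrative model, not the i915 code:
fake_fence_find() and pin_with_retry() are hypothetical names, and the
max_restarts cap exists purely so the demo terminates (the commit message
describes the real caller as restarting everything on -EDEADLK).

```c
#include <assert.h>
#include <errno.h>

/*
 * Hypothetical stand-in for fence_find() when every fence register is
 * in use: it fails with whichever errno the caller selects.
 */
static int fake_fence_find(int exhaustion_errno)
{
	return -exhaustion_errno;
}

/*
 * Simplified model of a ww-mutex style transaction loop. -EDEADLK is
 * interpreted as "back off and restart"; any other error is returned
 * to the caller. *restarts receives the number of restarts performed.
 * max_restarts is a demo-only bound (an assumption of this sketch).
 */
static int pin_with_retry(int exhaustion_errno, int max_restarts,
			  int *restarts)
{
	int err;

	assert(max_restarts > 0);
	*restarts = 0;

	for (;;) {
		err = fake_fence_find(exhaustion_errno);
		if (err != -EDEADLK)
			return err;	/* genuine error: propagate it */
		if (*restarts == max_restarts)
			return err;	/* demo-only cap reached */
		(*restarts)++;		/* treated as a deadlock: retry */
	}
}
```

Calling pin_with_retry(EDEADLK, 5, &n) spins through all five restarts
before giving up, while pin_with_retry(ENOBUFS, 5, &n) returns -ENOBUFS
on the first attempt with no restarts, which is the behaviour the patch
is after. Any errno other than EDEADLK (ENOSPC, for instance) would
propagate the same way in this model.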

Thread overview: 20+ messages
2021-06-30 16:44 [PATCH] drm/i915/gt: Fix -EDEADLK handling regression Ville Syrjala
2021-06-30 16:44 ` [Intel-gfx] " Ville Syrjala
2021-06-30 19:42 ` [Intel-gfx] ✓ Fi.CI.BAT: success for " Patchwork
2021-07-01  2:48 ` [Intel-gfx] ✗ Fi.CI.IGT: failure " Patchwork
2021-07-01  7:07 ` [PATCH] " Maarten Lankhorst
2021-07-01  7:07   ` [Intel-gfx] " Maarten Lankhorst
2021-07-01 17:00   ` Ville Syrjälä
2021-07-01 17:00     ` [Intel-gfx] " Ville Syrjälä
2021-07-13 19:58   ` Daniel Vetter [this message]
2021-07-13 19:58     ` Daniel Vetter
2021-07-13 19:58     ` Daniel Vetter
2021-07-13 19:59     ` Daniel Vetter
2021-07-13 19:59       ` Daniel Vetter
2021-07-13 19:59       ` Daniel Vetter
2021-07-13 20:19       ` Rodrigo Vivi
2021-07-13 20:19         ` Rodrigo Vivi
2021-07-13 20:19         ` Rodrigo Vivi
2021-07-13 20:22       ` Ville Syrjälä
2021-07-13 20:22         ` Ville Syrjälä
2021-07-13 20:22         ` Ville Syrjälä
