All of lore.kernel.org
 help / color / mirror / Atom feed
From: Matthew Auld <matthew.auld@intel.com>
To: Daniel Vetter <daniel.vetter@ffwll.ch>,
	Intel Graphics Development <intel-gfx@lists.freedesktop.org>
Cc: "Thomas Hellström" <thomas.hellstrom@linux.intel.com>,
	"DRI Development" <dri-devel@lists.freedesktop.org>,
	"Chris Wilson" <chris@chris-wilson.co.uk>,
	"Jon Bloomfield" <jon.bloomfield@intel.com>,
	"Jason Ekstrand" <jason@jlekstrand.net>,
	"Dave Airlie" <airlied@redhat.com>,
	"Daniel Vetter" <daniel.vetter@intel.com>
Subject: Re: [PATCH] drm/i915/eb: Fix pagefault disabling in the first slowpath
Date: Mon, 21 Jun 2021 10:33:01 +0100	[thread overview]
Message-ID: <8678b7b8-ad42-3e14-158e-77994b299c6e@intel.com> (raw)
In-Reply-To: <20210618214503.1773805-1-daniel.vetter@ffwll.ch>

On 18/06/2021 22:45, Daniel Vetter wrote:
> In
> 
> commit ebc0808fa2da0548a78e715858024cb81cd732bc
> Author: Chris Wilson <chris@chris-wilson.co.uk>
> Date:   Tue Oct 18 13:02:51 2016 +0100
> 
>      drm/i915: Restrict pagefault disabling to just around copy_from_user()
> 
> we entirely missed that there's a slow path call to eb_relocate_entry
> (or i915_gem_execbuffer_relocate_entry as it was called back then)
> which was left fully wrapped by pagefault_disable/enable() calls.
> Previously any issues with blocking calls where handled by the
> following code:
> 
> 	/* we can't wait for rendering with pagefaults disabled */
> 	if (pagefault_disabled() && !object_is_idle(obj))
> 		return -EFAULT;
> 
> Now at this point the prefaulting was still around, which means in
> normal applications it was very hard to hit this bug. No idea why the
> regressions in igts weren't caught.
> 
> Now this all changed big time with 2 patches merged closely together.
> 
> First
> 
> commit 2889caa9232109afc8881f29a2205abeb5709d0c
> Author: Chris Wilson <chris@chris-wilson.co.uk>
> Date:   Fri Jun 16 15:05:19 2017 +0100
> 
>      drm/i915: Eliminate lots of iterations over the execobjects array
> 
> removes the prefaulting from the first relocation path, pushing it into
> the first slowpath (of which this patch added a total of 3 escalation
> levels). This would have really quickly uncovered the above bug, were
> it not for immediate adding a duct-tape on top with
> 
> commit 7dd4f6729f9243bd7046c6f04c107a456bda38eb
> Author: Chris Wilson <chris@chris-wilson.co.uk>
> Date:   Fri Jun 16 15:05:24 2017 +0100
> 
>      drm/i915: Async GPU relocation processing
> 
> by pushing all all the relocation patching to the gpu if the buffer
> was busy, which avoided all the possible blocking calls.
> 
> The entire slowpath was then furthermore ditched in
> 
> commit 7dc8f1143778a35b190f9413f228b3cf28f67f8d
> Author: Chris Wilson <chris@chris-wilson.co.uk>
> Date:   Wed Mar 11 16:03:10 2020 +0000
> 
>          drm/i915/gem: Drop relocation slowpath
> 
> and resurrected in
> 
> commit fd1500fcd4420eee06e2c7f3aa6067b78ac05871
> Author: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
> Date:   Wed Aug 19 16:08:43 2020 +0200
> 
>          Revert "drm/i915/gem: Drop relocation slowpath".
> 
> but this did not further impact what's going on.
> 
> Since pagefault_disable/enable is an atomic section, any sleeping in
> there is prohibited, and we definitely do that without gpu relocations
> since we have to wait for the gpu usage to finish before we can patch
> up the relocations.

Why do we also need the __copy_from_user_inatomic in eb_relocate_vma()?

Reviewed-by: Matthew Auld <matthew.auld@intel.com>

> 
> Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
> Cc: Jon Bloomfield <jon.bloomfield@intel.com>
> Cc: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
> Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
> Cc: "Thomas Hellström" <thomas.hellstrom@linux.intel.com>
> Cc: Matthew Auld <matthew.auld@intel.com>
> Cc: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
> Cc: Dave Airlie <airlied@redhat.com>
> Cc: Jason Ekstrand <jason@jlekstrand.net>
> ---
>   drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c | 2 --
>   1 file changed, 2 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
> index 6539b82dda54..7ff2fc3c0b2c 100644
> --- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
> +++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
> @@ -2082,9 +2082,7 @@ static noinline int eb_relocate_parse_slow(struct i915_execbuffer *eb,
>   
>   	list_for_each_entry(ev, &eb->relocs, reloc_link) {
>   		if (!have_copy) {
> -			pagefault_disable();
>   			err = eb_relocate_vma(eb, ev);
> -			pagefault_enable();
>   			if (err)
>   				break;
>   		} else {
> 

WARNING: multiple messages have this Message-ID (diff)
From: Matthew Auld <matthew.auld@intel.com>
To: Daniel Vetter <daniel.vetter@ffwll.ch>,
	Intel Graphics Development <intel-gfx@lists.freedesktop.org>
Cc: "Thomas Hellström" <thomas.hellstrom@linux.intel.com>,
	"DRI Development" <dri-devel@lists.freedesktop.org>,
	"Chris Wilson" <chris@chris-wilson.co.uk>,
	"Dave Airlie" <airlied@redhat.com>,
	"Daniel Vetter" <daniel.vetter@intel.com>
Subject: Re: [Intel-gfx] [PATCH] drm/i915/eb: Fix pagefault disabling in the first slowpath
Date: Mon, 21 Jun 2021 10:33:01 +0100	[thread overview]
Message-ID: <8678b7b8-ad42-3e14-158e-77994b299c6e@intel.com> (raw)
In-Reply-To: <20210618214503.1773805-1-daniel.vetter@ffwll.ch>

On 18/06/2021 22:45, Daniel Vetter wrote:
> In
> 
> commit ebc0808fa2da0548a78e715858024cb81cd732bc
> Author: Chris Wilson <chris@chris-wilson.co.uk>
> Date:   Tue Oct 18 13:02:51 2016 +0100
> 
>      drm/i915: Restrict pagefault disabling to just around copy_from_user()
> 
> we entirely missed that there's a slow path call to eb_relocate_entry
> (or i915_gem_execbuffer_relocate_entry as it was called back then)
> which was left fully wrapped by pagefault_disable/enable() calls.
> Previously any issues with blocking calls where handled by the
> following code:
> 
> 	/* we can't wait for rendering with pagefaults disabled */
> 	if (pagefault_disabled() && !object_is_idle(obj))
> 		return -EFAULT;
> 
> Now at this point the prefaulting was still around, which means in
> normal applications it was very hard to hit this bug. No idea why the
> regressions in igts weren't caught.
> 
> Now this all changed big time with 2 patches merged closely together.
> 
> First
> 
> commit 2889caa9232109afc8881f29a2205abeb5709d0c
> Author: Chris Wilson <chris@chris-wilson.co.uk>
> Date:   Fri Jun 16 15:05:19 2017 +0100
> 
>      drm/i915: Eliminate lots of iterations over the execobjects array
> 
> removes the prefaulting from the first relocation path, pushing it into
> the first slowpath (of which this patch added a total of 3 escalation
> levels). This would have really quickly uncovered the above bug, were
> it not for immediate adding a duct-tape on top with
> 
> commit 7dd4f6729f9243bd7046c6f04c107a456bda38eb
> Author: Chris Wilson <chris@chris-wilson.co.uk>
> Date:   Fri Jun 16 15:05:24 2017 +0100
> 
>      drm/i915: Async GPU relocation processing
> 
> by pushing all all the relocation patching to the gpu if the buffer
> was busy, which avoided all the possible blocking calls.
> 
> The entire slowpath was then furthermore ditched in
> 
> commit 7dc8f1143778a35b190f9413f228b3cf28f67f8d
> Author: Chris Wilson <chris@chris-wilson.co.uk>
> Date:   Wed Mar 11 16:03:10 2020 +0000
> 
>          drm/i915/gem: Drop relocation slowpath
> 
> and resurrected in
> 
> commit fd1500fcd4420eee06e2c7f3aa6067b78ac05871
> Author: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
> Date:   Wed Aug 19 16:08:43 2020 +0200
> 
>          Revert "drm/i915/gem: Drop relocation slowpath".
> 
> but this did not further impact what's going on.
> 
> Since pagefault_disable/enable is an atomic section, any sleeping in
> there is prohibited, and we definitely do that without gpu relocations
> since we have to wait for the gpu usage to finish before we can patch
> up the relocations.

Why do we also need the __copy_from_user_inatomic in eb_relocate_vma()?

Reviewed-by: Matthew Auld <matthew.auld@intel.com>

> 
> Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
> Cc: Jon Bloomfield <jon.bloomfield@intel.com>
> Cc: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
> Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
> Cc: "Thomas Hellström" <thomas.hellstrom@linux.intel.com>
> Cc: Matthew Auld <matthew.auld@intel.com>
> Cc: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
> Cc: Dave Airlie <airlied@redhat.com>
> Cc: Jason Ekstrand <jason@jlekstrand.net>
> ---
>   drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c | 2 --
>   1 file changed, 2 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
> index 6539b82dda54..7ff2fc3c0b2c 100644
> --- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
> +++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
> @@ -2082,9 +2082,7 @@ static noinline int eb_relocate_parse_slow(struct i915_execbuffer *eb,
>   
>   	list_for_each_entry(ev, &eb->relocs, reloc_link) {
>   		if (!have_copy) {
> -			pagefault_disable();
>   			err = eb_relocate_vma(eb, ev);
> -			pagefault_enable();
>   			if (err)
>   				break;
>   		} else {
> 
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

  parent reply	other threads:[~2021-06-21  9:33 UTC|newest]

Thread overview: 11+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-06-18 21:45 [PATCH] drm/i915/eb: Fix pagefault disabling in the first slowpath Daniel Vetter
2021-06-18 21:45 ` [Intel-gfx] " Daniel Vetter
2021-06-18 22:08 ` [Intel-gfx] ✗ Fi.CI.CHECKPATCH: warning for " Patchwork
2021-06-18 22:32 ` [Intel-gfx] ✓ Fi.CI.BAT: success " Patchwork
2021-06-19  1:12 ` [Intel-gfx] ✓ Fi.CI.IGT: " Patchwork
2021-06-21  9:33 ` Matthew Auld [this message]
2021-06-21  9:33   ` [Intel-gfx] [PATCH] " Matthew Auld
2021-06-21 14:30   ` Maarten Lankhorst
2021-06-21 14:30     ` [Intel-gfx] " Maarten Lankhorst
2021-06-21 14:51     ` Daniel Vetter
2021-06-21 14:51       ` Daniel Vetter

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=8678b7b8-ad42-3e14-158e-77994b299c6e@intel.com \
    --to=matthew.auld@intel.com \
    --cc=airlied@redhat.com \
    --cc=chris@chris-wilson.co.uk \
    --cc=daniel.vetter@ffwll.ch \
    --cc=daniel.vetter@intel.com \
    --cc=dri-devel@lists.freedesktop.org \
    --cc=intel-gfx@lists.freedesktop.org \
    --cc=jason@jlekstrand.net \
    --cc=jon.bloomfield@intel.com \
    --cc=thomas.hellstrom@linux.intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.