From: Chris Wilson <chris@chris-wilson.co.uk>
To: ankitprasad.r.sharma@intel.com
Cc: intel-gfx@lists.freedesktop.org, akash.goel@intel.com,
	shashidhar.hiremath@intel.com
Subject: Re: [PATCH 08/11] drm/i915: Support for pread/pwrite from/to non shmem backed objects
Date: Thu, 14 Jan 2016 10:51:45 +0000	[thread overview]
Message-ID: <20160114105145.GQ29867@nuc-i3427.alporthouse.com> (raw)
In-Reply-To: <1452752207-30382-9-git-send-email-ankitprasad.r.sharma@intel.com>

On Thu, Jan 14, 2016 at 11:46:44AM +0530, ankitprasad.r.sharma@intel.com wrote:
> From: Ankitprasad Sharma <ankitprasad.r.sharma@intel.com>
> 
> This patch adds support for extending the pread/pwrite functionality
> for objects not backed by shmem. The access will be made through
> gtt interface. This will cover objects backed by stolen memory as well
> as other non-shmem backed objects.
> 
> v2: Drop locks around slow_user_access, prefault the pages before
> access (Chris)
> 
> v3: Rebased to the latest drm-intel-nightly (Ankit)
> 
> v4: Moved page base & offset calculations outside the copy loop,
> corrected data types for size and offset variables, corrected if-else
> braces format (Tvrtko/kerneldocs)
> 
> v5: Enabled pread/pwrite for all non-shmem backed objects including
> without tiling restrictions (Ankit)
> 
> v6: Using pwrite_fast for non-shmem backed objects as well (Chris)
> 
> v7: Updated commit message, Renamed i915_gem_gtt_read to i915_gem_gtt_copy,
> added pwrite slow path for non-shmem backed objects (Chris/Tvrtko)
> 
> v8: Updated v7 commit message, mutex unlock around pwrite slow path for
> non-shmem backed objects (Tvrtko)
> 
> v9: Corrected check during pread_ioctl, to avoid shmem_pread being
> called for non-shmem backed objects (Tvrtko)
> 
> v10: Moved the write_domain check to needs_clflush and tiling mode check
> to pwrite_fast (Chris)
> 
> v11: Use pwrite_fast fallback for all objects (shmem and non-shmem backed),
> call fast_user_write regardless of pagefault in previous iteration
> 
> Testcase: igt/gem_stolen

Presumably igt/gem_pread and igt/gem_pwrite didn't break?

With the help of, say, vgem we could create a few 1+GiB objects that have
no obj->base.filp and so stress the pinning code better. (Or on some
machines stolen alone would be enough to create an unmappable object.)
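
Something along these lines, for example (very rough sketch, untested;
vgem_fd/i915_fd are assumed to be already-open DRM fds, and
do_ioctl()/prime_fd_to_handle()/gem_read() are the usual igt helpers):

	struct drm_mode_create_dumb create = {
		.width = 4096, .bpp = 32, .height = 96 * 1024, /* ~1.5GiB */
	};
	struct drm_prime_handle args = { .flags = DRM_CLOEXEC };
	uint32_t handle;
	char buf[4096];

	/* a huge vgem dumb buffer, exported as a dma-buf */
	do_ioctl(vgem_fd, DRM_IOCTL_MODE_CREATE_DUMB, &create);
	args.handle = create.handle;
	do_ioctl(vgem_fd, DRM_IOCTL_PRIME_HANDLE_TO_FD, &args);

	/* imported into i915, so the object has no obj->base.filp */
	handle = prime_fd_to_handle(i915_fd, args.fd);

	/* read the last page through the new non-shmem pread path */
	gem_read(i915_fd, handle, create.size - 4096, buf, 4096);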

> Signed-off-by: Ankitprasad Sharma <ankitprasad.r.sharma@intel.com>
> ---
>  drivers/gpu/drm/i915/i915_gem.c | 155 +++++++++++++++++++++++++++++++++-------
>  1 file changed, 129 insertions(+), 26 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> index 7642b1b..ab1d043 100644
> --- a/drivers/gpu/drm/i915/i915_gem.c
> +++ b/drivers/gpu/drm/i915/i915_gem.c
> @@ -55,6 +55,9 @@ static bool cpu_cache_is_coherent(struct drm_device *dev,
>  
>  static bool cpu_write_needs_clflush(struct drm_i915_gem_object *obj)
>  {
> +	if (obj->base.write_domain == I915_GEM_DOMAIN_CPU)
> +		return false;
> +
>  	if (!cpu_cache_is_coherent(obj->base.dev, obj->cache_level))
>  		return true;
>  
> @@ -646,6 +649,99 @@ shmem_pread_slow(struct page *page, int shmem_page_offset, int page_length,
>  	return ret ? - EFAULT : 0;
>  }
>  
> +static inline uint64_t
> +slow_user_access(struct io_mapping *mapping,
> +		 uint64_t page_base, int page_offset,
> +		 char __user *user_data,
> +		 int length, bool pwrite)

static inline bool
> +{
> +	void __iomem *vaddr_inatomic;

inatomic? We are not in atomic context here (this is io_mapping_map_wc(),
not the atomic variant), so how about:
void __iomem *ioaddr;
void *vaddr;

> +	void *vaddr;
> +	uint64_t unwritten;
> +
> +	vaddr_inatomic = io_mapping_map_wc(mapping, page_base);
> +	/* We can use the cpu mem copy function because this is X86. */
> +	vaddr = (void __force *)vaddr_inatomic + page_offset;
> +	if (pwrite)
> +		unwritten = __copy_from_user(vaddr, user_data, length);
> +	else
> +		unwritten = __copy_to_user(user_data, vaddr, length);

> +
> +	io_mapping_unmap(vaddr_inatomic);
> +	return unwritten;
> +}
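
Putting those two together, something along these lines is what I had in
mind (sketch only, untested):

static inline bool
slow_user_access(struct io_mapping *mapping,
		 uint64_t page_base, int page_offset,
		 char __user *user_data,
		 int length, bool pwrite)
{
	void __iomem *ioaddr;
	void *vaddr;
	unsigned long unwritten;

	ioaddr = io_mapping_map_wc(mapping, page_base);
	/* We can use the cpu mem copy function because this is X86. */
	vaddr = (void __force *)ioaddr + page_offset;
	if (pwrite)
		unwritten = __copy_from_user(vaddr, user_data, length);
	else
		unwritten = __copy_to_user(user_data, vaddr, length);
	io_mapping_unmap(ioaddr);

	return unwritten == 0;
}

with the callers then treating !slow_user_access() as the failure case,
which also sidesteps the u64 -> int truncation I mention below.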
> +
> +static int
> +i915_gem_gtt_copy(struct drm_device *dev,
> +		   struct drm_i915_gem_object *obj, uint64_t size,
> +		   uint64_t data_offset, uint64_t data_ptr)
> +{
> +	struct drm_i915_private *dev_priv = dev->dev_private;
> +	char __user *user_data;
> +	uint64_t remain;
> +	uint64_t offset, page_base;
> +	int page_offset, page_length, ret = 0;
> +
> +	ret = i915_gem_obj_ggtt_pin(obj, 0, PIN_MAPPABLE);
> +	if (ret)
> +		goto out;
> +
> +	ret = i915_gem_object_set_to_gtt_domain(obj, false);
> +	if (ret)
> +		goto out_unpin;
> +
> +	ret = i915_gem_object_put_fence(obj);
> +	if (ret)
> +		goto out_unpin;
> +
> +	user_data = to_user_ptr(data_ptr);
> +	remain = size;
> +	offset = i915_gem_obj_ggtt_offset(obj) + data_offset;
> +
> +	mutex_unlock(&dev->struct_mutex);
> +	if (likely(!i915.prefault_disable))
> +		ret = fault_in_multipages_writeable(user_data, remain);
> +
> +	/*
> +	 * page_offset = offset within page
> +	 * page_base = page offset within aperture
> +	 */
> +	page_offset = offset_in_page(offset);
> +	page_base = offset & PAGE_MASK;

Where's the page-by-page copy? (Think about objects larger than the
aperture, or many such copies running in parallel - pinning aperture
space is bad practice.)
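
For illustration, roughly the loop shape I would expect (borrowing the
insert_page trick from patch 03; "node" is a single page of GGTT space
reserved up front, "offset" is the byte offset into the object, and the
insert_page signature here is from memory, so treat it as a sketch):

	while (remain > 0) {
		/* rebind the one reserved GGTT page at the next object page */
		wmb();
		dev_priv->gtt.base.insert_page(&dev_priv->gtt.base,
				i915_gem_object_get_dma_address(obj,
						offset >> PAGE_SHIFT),
				node.start, I915_CACHE_NONE, 0);
		wmb();

		page_length = min_t(u64, remain,
				    PAGE_SIZE - offset_in_page(offset));
		if (!slow_user_access(dev_priv->gtt.mappable, node.start,
				      offset_in_page(offset), user_data,
				      page_length, false)) {
			ret = -EFAULT;
			break;
		}

		remain -= page_length;
		user_data += page_length;
		offset += page_length;
	}

That way only a single scratch page of the aperture is tied up, no matter
how large the object is or how many readers are running.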

> +	while (remain > 0) {
> +		/* page_length = bytes to copy for this page */
> +		page_length = remain;
> +		if ((page_offset + remain) > PAGE_SIZE)
> +			page_length = PAGE_SIZE - page_offset;
> +
> +		/* This is a slow read/write as it tries to read from
> +		 * and write to user memory which may result into page
> +		 * faults

...and so we cannot perform this under the struct_mutex.

> +		 */
> +		ret = slow_user_access(dev_priv->gtt.mappable, page_base,
> +				       page_offset, user_data,
> +				       page_length, false);
> +
> +		if (ret) {

Fortuitously we don't copy more than a page, so this truncation error
should never hit. Still sloppy though.

> +			ret = -EFAULT;
> +			break;
> +		}
> +
> +		remain -= page_length;
> +		user_data += page_length;
> +		page_base += page_length;
> +		page_offset = 0;
> +	}
> +
> +	mutex_lock(&dev->struct_mutex);

This was the unlocked path, so reset the domain. We could do some error
detection, but it is a user error so they get to keep the broken
pieces. (We should be consistent: always restore the domain after the
unlock/lock, or never. I'm actually thinking never, user error and all.)
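
(For completeness, the "always restore" variant would just be, sketch:

	mutex_lock(&dev->struct_mutex);
	if (ret == 0)
		ret = i915_gem_object_set_to_gtt_domain(obj, false);

but as I said, I lean towards never bothering.)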

> +
> +out_unpin:
> +	i915_gem_object_ggtt_unpin(obj);
> +out:
> +	return ret;
> +}
> +
>  static int
>  i915_gem_shmem_pread(struct drm_device *dev,
>  		     struct drm_i915_gem_object *obj,
> @@ -769,17 +865,14 @@ i915_gem_pread_ioctl(struct drm_device *dev, void *data,
>  		goto out;
>  	}
>  
> -	/* prime objects have no backing filp to GEM pread/pwrite
> -	 * pages from.
> -	 */
> -	if (!obj->base.filp) {
> -		ret = -EINVAL;
> -		goto out;
> -	}
> -
>  	trace_i915_gem_object_pread(obj, args->offset, args->size);
>  
> -	ret = i915_gem_shmem_pread(dev, obj, args, file);
> +	/* pread for non shmem backed objects */
> +	if (!obj->base.filp && obj->tiling_mode == I915_TILING_NONE)
> +		ret = i915_gem_gtt_copy(dev, obj, args->size,
> +					args->offset, args->data_ptr);
> +	else if (obj->base.filp)
> +		ret = i915_gem_shmem_pread(dev, obj, args, file);

I think this is much more readable as

ret = i915_gem_shmem_pread(##args);
if (ret == -ENODEV) /* or -EFAULT */
	ret = i915_gem_ggtt_pread(##args);

i.e. push the condition checking down, since we know that a read of the
backing storage + clflush is the preferred option and should *always* be
tried first.
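
Spelled out with the actual arguments (sketch; shmem_pread would then
bail out early with -ENODEV, or whatever error we pick, when there is no
obj->base.filp, and the tiling_mode check similarly moves down into the
gtt path):

	trace_i915_gem_object_pread(obj, args->offset, args->size);

	ret = i915_gem_shmem_pread(dev, obj, args, file);
	if (ret == -ENODEV)
		ret = i915_gem_gtt_copy(dev, obj, args->size,
					args->offset, args->data_ptr);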
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

