From: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
To: Chris Wilson <chris@chris-wilson.co.uk>, intel-gfx@lists.freedesktop.org
Subject: Re: [PATCH 11/42] drm/i915: Introduce an internal allocator for disposable private objects
Date: Sat, 8 Oct 2016 09:12:13 +0100
Message-ID: <5c234e90-abfd-301b-5743-2116b15d67c9@linux.intel.com>
In-Reply-To: <20161007170801.GU22676@nuc-i3427.alporthouse.com>


On 07/10/2016 18:08, Chris Wilson wrote:
> On Fri, Oct 07, 2016 at 05:52:47PM +0100, Tvrtko Ursulin wrote:
>> On 07/10/2016 10:46, Chris Wilson wrote:
>>> Quite a few of our objects used for internal hardware programming do not
>>> benefit from being swappable or from being zero initialised. As such
>>> they do not benefit from using a shmemfs backing storage and since they
>>> are internal and never directly exposed to the user, we do not need to
>>> worry about providing a filp. For these we can use a
>>> drm_i915_gem_object wrapper around a sg_table of plain struct page. They
>>> are not swap backed and not automatically pinned. If they are reaped
>>> by the shrinker, the pages are released and the contents discarded. For
>>> the internal use case, this is fine as for example, ringbuffers are
>>> pinned from being written by a request to be read by the hardware. Once
>>> they are idle, they can be discarded entirely. As such they are a good
>>> match for execlist ringbuffers and a small variety of other internal
>>> objects.
>>>
>>> In the first iteration, this is limited to the scratch batch buffers we
>>> use (for command parsing and state initialisation).
>>>
>>> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
>>> ---
>>>   drivers/gpu/drm/i915/Makefile                |   1 +
>>>   drivers/gpu/drm/i915/i915_drv.h              |   5 +
>>>   drivers/gpu/drm/i915/i915_gem_batch_pool.c   |  28 ++---
>>>   drivers/gpu/drm/i915/i915_gem_internal.c     | 161 +++++++++++++++++++++++++++
>>>   drivers/gpu/drm/i915/i915_gem_render_state.c |   2 +-
>>>   drivers/gpu/drm/i915/intel_engine_cs.c       |   2 +-
>>>   drivers/gpu/drm/i915/intel_ringbuffer.c      |  14 ++-
>>>   7 files changed, 189 insertions(+), 24 deletions(-)
>>>   create mode 100644 drivers/gpu/drm/i915/i915_gem_internal.c
>>>
>>> diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile
>>> index a998c2bce70a..b94a90f34d2d 100644
>>> --- a/drivers/gpu/drm/i915/Makefile
>>> +++ b/drivers/gpu/drm/i915/Makefile
>>> @@ -35,6 +35,7 @@ i915-y += i915_cmd_parser.o \
>>>   	  i915_gem_execbuffer.o \
>>>   	  i915_gem_fence.o \
>>>   	  i915_gem_gtt.o \
>>> +	  i915_gem_internal.o \
>>>   	  i915_gem.o \
>>>   	  i915_gem_render_state.o \
>>>   	  i915_gem_request.o \
>>> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
>>> index fee5cc92e2f2..bad97f1e5265 100644
>>> --- a/drivers/gpu/drm/i915/i915_drv.h
>>> +++ b/drivers/gpu/drm/i915/i915_drv.h
>>> @@ -3538,6 +3538,11 @@ i915_gem_object_create_stolen_for_preallocated(struct drm_device *dev,
>>>   					       u32 gtt_offset,
>>>   					       u32 size);
>>> +/* i915_gem_internal.c */
>>> +struct drm_i915_gem_object *
>>> +i915_gem_object_create_internal(struct drm_device *dev,
>>> +				unsigned int size);
>>> +
>> Wasn't size_t our convention for GEM objects?
> Not our convention, no. Using size_t has caused too many bugs, so we
> started to eradicate it (in our usual piecemeal approach).

Oh wow, I missed that decision. Last thing I remember was that someone 
was trying to convert it all to size_t. Fine by me.

>>> +		/* 965gm cannot relocate objects above 4GiB. */
>>> +		gfp &= ~__GFP_HIGHMEM;
>>> +		gfp |= __GFP_DMA32;
>>> +	}
>>> +
>>> +	for (i = 0; i < npages; i++) {
>>> +		struct page *page;
>>> +
>>> +		page = alloc_page(gfp);
>>> +		if (!page)
>>> +			goto err;
>>> +
>>> +#ifdef CONFIG_SWIOTLB
>>> +		if (swiotlb_nr_tbl()) {
>>> +			st->nents++;
>>> +			sg_set_page(sg, page, PAGE_SIZE, 0);
>>> +			sg = sg_next(sg);
>>> +			continue;
>>> +		}
>>> +#endif
>>> +		if (!i || page_to_pfn(page) != last_pfn + 1) {
>>> +			if (i)
>>> +				sg = sg_next(sg);
>>> +			st->nents++;
>>> +			sg_set_page(sg, page, PAGE_SIZE, 0);
>>> +		} else {
>>> +			sg->length += PAGE_SIZE;
>>> +		}
>>> +		last_pfn = page_to_pfn(page);
>>> +	}
>>> +#ifdef CONFIG_SWIOTLB
>>> +	if (!swiotlb_nr_tbl())
>>> +#endif
>>> +		sg_mark_end(sg);
>> Looks like the loop above could be moved into a helper and shared
>> with i915_gem_object_get_pages_gtt. Maybe just a page-alloc and
>> page-alloc-error callbacks would be required.
> So just the entire thing as a callback... I would have thought you might
> suggest trying high order allocations and falling back to low order.

I am not sure higher order attempts would be worth it. What I was
thinking was to implement the loop above generically in one place, and
then two (or even three with userptr) callers would call that with their
own page-alloc and page-alloc-error callbacks, so that there is a
single implementation of the coalescing logic etc.
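
To make the suggestion concrete, here is a rough userspace model of the idea, not kernel code and not a proposal for the actual interface. Every name in it (build_segments, struct page_source, struct seg, MODEL_PAGE_SIZE, struct array_source) is made up for illustration. Pages are modelled as bare pfn numbers, a plain array stands in for the sg_table, and the "coalesce" flag stands in for the swiotlb check in the quoted loop.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MODEL_PAGE_SIZE 4096u	/* hypothetical stand-in for PAGE_SIZE */

/* One run of contiguous pages; a stand-in for a scatterlist entry. */
struct seg {
	unsigned long first_pfn;
	size_t length;		/* in bytes */
};

/* Hypothetical per-backend hooks: how to get a page, how to give it back. */
struct page_source {
	bool (*alloc_page)(void *ctx, unsigned long *pfn);
	void (*free_page)(void *ctx, unsigned long pfn);
	void *ctx;
};

/*
 * Shared loop: allocate npages via the backend and merge pages with
 * consecutive pfns into one segment. With coalesce == false every page
 * gets its own segment. segs must have room for npages entries.
 * Returns the number of segments, or -1 after freeing everything
 * that had been allocated.
 */
static int build_segments(const struct page_source *src, size_t npages,
			  struct seg *segs, bool coalesce)
{
	unsigned long last_pfn = 0;
	int nents = 0;
	size_t i;

	assert(src && src->alloc_page && src->free_page && segs);

	for (i = 0; i < npages; i++) {
		unsigned long pfn;

		if (!src->alloc_page(src->ctx, &pfn))
			goto err;

		if (!coalesce || !i || pfn != last_pfn + 1) {
			/* Start a new segment. */
			segs[nents].first_pfn = pfn;
			segs[nents].length = MODEL_PAGE_SIZE;
			nents++;
		} else {
			/* Contiguous with the previous page: extend it. */
			segs[nents - 1].length += MODEL_PAGE_SIZE;
		}
		last_pfn = pfn;
	}
	return nents;

err:
	/* Unwind: hand every page of every segment back to the backend. */
	while (nents--) {
		size_t n = segs[nents].length / MODEL_PAGE_SIZE;

		while (n--)
			src->free_page(src->ctx, segs[nents].first_pfn + n);
	}
	return -1;
}

/* Toy backend for experiments: hands out pfns from a fixed array. */
struct array_source {
	const unsigned long *pfns;
	size_t count, next, freed;
};

static bool array_alloc(void *ctx, unsigned long *pfn)
{
	struct array_source *a = ctx;

	if (a->next >= a->count)
		return false;	/* models alloc_page() failing */
	*pfn = a->pfns[a->next++];
	return true;
}

static void array_free(void *ctx, unsigned long pfn)
{
	struct array_source *a = ctx;

	(void)pfn;
	a->freed++;
}
```

In this model each caller would only supply its own struct page_source, and the coalescing and error unwinding would live in the one shared function. Whether that maps cleanly onto the shmemfs and userptr paths is an open question that the sketch does not answer.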

Regards,

Tvrtko



_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
