From: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
To: Chris Wilson <chris@chris-wilson.co.uk>, intel-gfx@lists.freedesktop.org
Subject: Re: [PATCH v2] drm/i915: Introduce an internal allocator for disposable private objects
Date: Mon, 10 Oct 2016 09:25:08 +0100
Message-ID: <b2eb5878-ad0e-3eba-9dbc-864da6b2fcda@linux.intel.com>
In-Reply-To: <20161010081900.GC2718@nuc-i3427.alporthouse.com>


On 10/10/2016 09:19, Chris Wilson wrote:
> On Mon, Oct 10, 2016 at 09:11:49AM +0100, Tvrtko Ursulin wrote:
>> On 08/10/2016 09:34, Chris Wilson wrote:
>>> Quite a few of our objects used for internal hardware programming do not
>>> benefit from being swappable or from being zero initialised. As such
>>> they do not benefit from using a shmemfs backing storage and since they
>>> are internal and never directly exposed to the user, we do not need to
>>> worry about providing a filp. For these we can use a
>>> drm_i915_gem_object wrapper around a sg_table of plain struct page. They
>>> are not swap backed and not automatically pinned. If they are reaped
>>> by the shrinker, the pages are released and the contents discarded. For
>>> the internal use case, this is fine as for example, ringbuffers are
>>> pinned from being written by a request to be read by the hardware. Once
>>> they are idle, they can be discarded entirely. As such they are a good
>>> match for execlist ringbuffers and a small variety of other internal
>>> objects.
>>>
>>> In the first iteration, this is limited to the scratch batch buffers we
>>> use (for command parsing and state initialisation).
>>>
>>> v2: Allocate physically contiguous pages, where possible.
>> Since the allocator will be used constantly at runtime, my
>> recollection is that higher order allocations can easily become next
>> to impossible, so I am wondering why? Also, in your last reply you
>> did not say why you think this is interesting to try.
>>
>>> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
>>> Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
>>> ---
>>>   drivers/gpu/drm/i915/Makefile                |   1 +
>>>   drivers/gpu/drm/i915/i915_drv.h              |   5 +
>>>   drivers/gpu/drm/i915/i915_gem_batch_pool.c   |  28 ++---
>>>   drivers/gpu/drm/i915/i915_gem_internal.c     | 162 +++++++++++++++++++++++++++
>>>   drivers/gpu/drm/i915/i915_gem_render_state.c |   2 +-
>>>   drivers/gpu/drm/i915/intel_engine_cs.c       |   2 +-
>>>   drivers/gpu/drm/i915/intel_ringbuffer.c      |  14 ++-
>>>   7 files changed, 190 insertions(+), 24 deletions(-)
>>>   create mode 100644 drivers/gpu/drm/i915/i915_gem_internal.c
>>>
>>> diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile
>>> index a998c2bce70a..b94a90f34d2d 100644
>>> --- a/drivers/gpu/drm/i915/Makefile
>>> +++ b/drivers/gpu/drm/i915/Makefile
>>> @@ -35,6 +35,7 @@ i915-y += i915_cmd_parser.o \
>>>   	  i915_gem_execbuffer.o \
>>>   	  i915_gem_fence.o \
>>>   	  i915_gem_gtt.o \
>>> +	  i915_gem_internal.o \
>>>   	  i915_gem.o \
>>>   	  i915_gem_render_state.o \
>>>   	  i915_gem_request.o \
>>> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
>>> index fee5cc92e2f2..bad97f1e5265 100644
>>> --- a/drivers/gpu/drm/i915/i915_drv.h
>>> +++ b/drivers/gpu/drm/i915/i915_drv.h
>>> @@ -3538,6 +3538,11 @@ i915_gem_object_create_stolen_for_preallocated(struct drm_device *dev,
>>>   					       u32 gtt_offset,
>>>   					       u32 size);
>>> +/* i915_gem_internal.c */
>>> +struct drm_i915_gem_object *
>>> +i915_gem_object_create_internal(struct drm_device *dev,
>>> +				unsigned int size);
>>> +
>>>   /* i915_gem_shrinker.c */
>>>   unsigned long i915_gem_shrink(struct drm_i915_private *dev_priv,
>>>   			      unsigned long target,
>>> diff --git a/drivers/gpu/drm/i915/i915_gem_batch_pool.c b/drivers/gpu/drm/i915/i915_gem_batch_pool.c
>>> index cb25cad3318c..3934c9103cf2 100644
>>> --- a/drivers/gpu/drm/i915/i915_gem_batch_pool.c
>>> +++ b/drivers/gpu/drm/i915/i915_gem_batch_pool.c
>>> @@ -97,9 +97,9 @@ i915_gem_batch_pool_get(struct i915_gem_batch_pool *pool,
>>>   			size_t size)
>>>   {
>>>   	struct drm_i915_gem_object *obj = NULL;
>>> -	struct drm_i915_gem_object *tmp, *next;
>>> +	struct drm_i915_gem_object *tmp;
>>>   	struct list_head *list;
>>> -	int n;
>>> +	int n, ret;
>>>   	lockdep_assert_held(&pool->engine->i915->drm.struct_mutex);
>>> @@ -112,19 +112,12 @@ i915_gem_batch_pool_get(struct i915_gem_batch_pool *pool,
>>>   		n = ARRAY_SIZE(pool->cache_list) - 1;
>>>   	list = &pool->cache_list[n];
>>> -	list_for_each_entry_safe(tmp, next, list, batch_pool_link) {
>>> +	list_for_each_entry(tmp, list, batch_pool_link) {
>>>   		/* The batches are strictly LRU ordered */
>>>   		if (!i915_gem_active_is_idle(&tmp->last_read[pool->engine->id],
>>>   					     &tmp->base.dev->struct_mutex))
>>>   			break;
>>> -		/* While we're looping, do some clean up */
>>> -		if (tmp->madv == __I915_MADV_PURGED) {
>>> -			list_del(&tmp->batch_pool_link);
>>> -			i915_gem_object_put(tmp);
>>> -			continue;
>>> -		}
>>> -
>>>   		if (tmp->base.size >= size) {
>>>   			obj = tmp;
>>>   			break;
>>> @@ -132,19 +125,16 @@ i915_gem_batch_pool_get(struct i915_gem_batch_pool *pool,
>>>   	}
>>>   	if (obj == NULL) {
>>> -		int ret;
>>> -
>>> -		obj = i915_gem_object_create(&pool->engine->i915->drm, size);
>>> +		obj = i915_gem_object_create_internal(&pool->engine->i915->drm,
>>> +						      size);
>>>   		if (IS_ERR(obj))
>>>   			return obj;
>>> -
>>> -		ret = i915_gem_object_get_pages(obj);
>>> -		if (ret)
>>> -			return ERR_PTR(ret);
>>> -
>>> -		obj->madv = I915_MADV_DONTNEED;
>>>   	}
>>> +	ret = i915_gem_object_get_pages(obj);
>>> +	if (ret)
>>> +		return ERR_PTR(ret);
>>> +
>>>   	list_move_tail(&obj->batch_pool_link, list);
>>>   	i915_gem_object_pin_pages(obj);
>>>   	return obj;
>>> diff --git a/drivers/gpu/drm/i915/i915_gem_internal.c b/drivers/gpu/drm/i915/i915_gem_internal.c
>>> new file mode 100644
>>> index 000000000000..255b9232b8ef
>>> --- /dev/null
>>> +++ b/drivers/gpu/drm/i915/i915_gem_internal.c
>>> @@ -0,0 +1,162 @@
>>> +/*
>>> + * Copyright © 2014-2016 Intel Corporation
>>> + *
>>> + * Permission is hereby granted, free of charge, to any person obtaining a
>>> + * copy of this software and associated documentation files (the "Software"),
>>> + * to deal in the Software without restriction, including without limitation
>>> + * the rights to use, copy, modify, merge, publish, distribute, sublicense,
>>> + * and/or sell copies of the Software, and to permit persons to whom the
>>> + * Software is furnished to do so, subject to the following conditions:
>>> + *
>>> + * The above copyright notice and this permission notice (including the next
>>> + * paragraph) shall be included in all copies or substantial portions of the
>>> + * Software.
>>> + *
>>> + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
>>> + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
>>> + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
>>> + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
>>> + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
>>> + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
>>> + * IN THE SOFTWARE.
>>> + *
>>> + */
>>> +
>>> +#include <drm/drmP.h>
>>> +#include <drm/i915_drm.h>
>>> +#include "i915_drv.h"
>>> +
>>> +#define QUIET (__GFP_NORETRY | __GFP_NOWARN)
>>> +
>>> +static void internal_free_pages(struct sg_table *st)
>>> +{
>>> +	struct scatterlist *sg;
>>> +
>>> +	for (sg = st->sgl; sg; sg = __sg_next(sg))
>>> +		__free_pages(sg_page(sg), get_order(sg->length));
>> Fragile, since it won't work together with coalescing, which I am sad
>> to see is not implemented. For some reason it makes me feel real good
>> when it is there.
> We do the coalescing.

Where? It is not in this patch. One allocation == one st->nents++. Even
if the next allocation is physically contiguous with the previous one,
there is no coalescing.
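
To illustrate what I mean by coalescing, here is a minimal userspace sketch, not code from the patch. The names seg, seg_table, seg_table_add and SKETCH_PAGE_SIZE are made up for illustration; each entry stands in for a scatterlist entry and is described by its first page frame number and a length in bytes.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_PAGE_SIZE 4096UL /* assumed page size, illustration only */

/* Hypothetical stand-in for one scatterlist entry: a run of contiguous pages. */
struct seg {
	unsigned long pfn;    /* first page frame number of the run */
	unsigned long length; /* length of the run in bytes */
};

struct seg_table {
	struct seg segs[64];
	size_t nents;
};

/*
 * Append a chunk of (1 << order) pages starting at pfn. If the chunk starts
 * exactly where the previous entry ends, grow that entry instead of using a
 * new one. Returns 0 on success, -1 if the table is full.
 */
int seg_table_add(struct seg_table *st, unsigned long pfn, unsigned int order)
{
	unsigned long len = SKETCH_PAGE_SIZE << order;

	assert(st != NULL);

	if (st->nents) {
		struct seg *last = &st->segs[st->nents - 1];

		/* Contiguous with the previous run: coalesce. */
		if (last->pfn + last->length / SKETCH_PAGE_SIZE == pfn) {
			last->length += len;
			return 0;
		}
	}

	if (st->nents == sizeof(st->segs) / sizeof(st->segs[0]))
		return -1;

	st->segs[st->nents].pfn = pfn;
	st->segs[st->nents].length = len;
	st->nents++;
	return 0;
}
```

With this, an order-1 chunk at pfn 100 followed by an order-0 chunk at pfn 102 ends up as a single 3-page entry. That is also why freeing with get_order(sg->length) would be fragile once entries are merged: the length no longer corresponds to any single allocation.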

>>> +		do {
>>> +			page = alloc_pages(gfp | (order ? QUIET : 0), order);
>>> +			if (page)
>>> +				break;
>>> +			if (!order--)
>>> +				goto err;
>>> +		} while (1);
>> Feels like it could hammer hard on a fragmented system, I have big
>> concerns about this.
> You did notice that these were marked as temporary allocations (because
> they are reclaimed under memory pressure), and on an already fragmented
> system we do the right thing anyway?

You try a smaller order, and then for the next allocation you start
again from the initial order, which did not work previously.
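
One possible way to avoid that, sketched in userspace C under stated assumptions rather than as the patch's implementation: remember the order that last failed and never go back above it while filling the object. The names try_alloc_fn, fake_allocator, fake_try_alloc and fill_pages are hypothetical; the callback stands in for alloc_pages() and simply reports success or failure.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical hook: true if an allocation of (1 << order) pages succeeds. */
typedef bool (*try_alloc_fn)(unsigned int order, void *ctx);

/* Test double: pretends that only orders <= max_ok can be satisfied. */
struct fake_allocator {
	int max_ok; /* -1 means every allocation fails */
};

bool fake_try_alloc(unsigned int order, void *ctx)
{
	struct fake_allocator *fa = ctx;

	return (int)order <= fa->max_ok;
}

/*
 * Allocate npages pages in chunks. The starting order is max_order, but
 * `order` is only ever lowered, so an order that has failed once is not
 * retried for later chunks of the same object.
 * Returns the number of allocation attempts, or -1 if order 0 fails.
 */
long fill_pages(unsigned long npages, unsigned int max_order,
		try_alloc_fn try_alloc, void *ctx)
{
	unsigned int order = max_order;
	long attempts = 0;

	assert(try_alloc != NULL);

	while (npages) {
		/* Do not ask for more pages than are still needed. */
		while (order && (1UL << order) > npages)
			order--;

		for (;;) {
			attempts++;
			if (try_alloc(order, ctx))
				break;
			if (order == 0)
				return -1;
			order--; /* sticks for all later chunks */
		}
		npages -= 1UL << order;
	}
	return attempts;
}
```

For 16 pages with max_order 4 and an allocator that can only satisfy order 1 or lower, this makes 11 attempts: three failures at orders 4, 3 and 2, paid once, followed by eight successful order-1 allocations.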

Regards,

Tvrtko


