* [PATCH] drm/i915: tidy up request alloc
@ 2016-06-30 1:35 Hong Liu
2016-06-30 5:43 ` ✓ Ro.CI.BAT: success for " Patchwork
2016-06-30 8:50 ` [PATCH] " Tvrtko Ursulin
0 siblings, 2 replies; 9+ messages in thread
From: Hong Liu @ 2016-06-30 1:35 UTC
To: intel-gfx
Return the allocated request pointer directly to remove
the double pointer parameter.
Signed-off-by: Hong Liu <hong.liu@intel.com>
---
drivers/gpu/drm/i915/i915_gem.c | 25 +++++++------------------
1 file changed, 7 insertions(+), 18 deletions(-)
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 1d98782..9881455 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -2988,32 +2988,26 @@ void i915_gem_request_free(struct kref *req_ref)
kmem_cache_free(req->i915->requests, req);
}
-static inline int
+static inline struct drm_i915_gem_request *
__i915_gem_request_alloc(struct intel_engine_cs *engine,
- struct i915_gem_context *ctx,
- struct drm_i915_gem_request **req_out)
+ struct i915_gem_context *ctx)
{
struct drm_i915_private *dev_priv = engine->i915;
unsigned reset_counter = i915_reset_counter(&dev_priv->gpu_error);
struct drm_i915_gem_request *req;
int ret;
- if (!req_out)
- return -EINVAL;
-
- *req_out = NULL;
-
/* ABI: Before userspace accesses the GPU (e.g. execbuffer), report
* EIO if the GPU is already wedged, or EAGAIN to drop the struct_mutex
* and restart.
*/
ret = i915_gem_check_wedge(reset_counter, dev_priv->mm.interruptible);
if (ret)
- return ret;
+ return ERR_PTR(ret);
req = kmem_cache_zalloc(dev_priv->requests, GFP_KERNEL);
if (req == NULL)
- return -ENOMEM;
+ return ERR_PTR(-ENOMEM);
ret = i915_gem_get_seqno(engine->i915, &req->seqno);
if (ret)
@@ -3041,14 +3035,13 @@ __i915_gem_request_alloc(struct intel_engine_cs *engine,
if (ret)
goto err_ctx;
- *req_out = req;
- return 0;
+ return req;
err_ctx:
i915_gem_context_unreference(ctx);
err:
kmem_cache_free(dev_priv->requests, req);
- return ret;
+ return ERR_PTR(ret);
}
/**
@@ -3067,13 +3060,9 @@ struct drm_i915_gem_request *
i915_gem_request_alloc(struct intel_engine_cs *engine,
struct i915_gem_context *ctx)
{
- struct drm_i915_gem_request *req;
- int err;
-
if (ctx == NULL)
ctx = engine->i915->kernel_context;
- err = __i915_gem_request_alloc(engine, ctx, &req);
- return err ? ERR_PTR(err) : req;
+ return __i915_gem_request_alloc(engine, ctx);
}
struct drm_i915_gem_request *
--
2.7.4
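
For readers less familiar with the convention the patch switches to, here is a minimal userspace sketch of returning either a valid pointer or an encoded errno from an allocator. It is not the i915 code: ERR_PTR(), PTR_ERR() and IS_ERR() below are simplified stand-ins for the helpers in <linux/err.h>, and struct request, request_alloc() and request_try() are hypothetical names used only for illustration.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/* Simplified userspace stand-ins for the <linux/err.h> helpers. */
#define MAX_ERRNO 4095

static inline void *ERR_PTR(long error) { return (void *)error; }
static inline long PTR_ERR(const void *ptr) { return (long)ptr; }
static inline int IS_ERR(const void *ptr)
{
	/* The top MAX_ERRNO addresses are reserved for encoded errors. */
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

/* Hypothetical request type, standing in for drm_i915_gem_request. */
struct request {
	unsigned int seqno;
};

/* Returns a valid request, or a negative errno encoded in the pointer. */
static struct request *request_alloc(int wedged)
{
	struct request *req;

	if (wedged)
		return ERR_PTR(-EIO);

	req = calloc(1, sizeof(*req));
	if (!req)
		return ERR_PTR(-ENOMEM);

	req->seqno = 1;
	return req;
}

/* Caller side: test with IS_ERR() before use, decode with PTR_ERR(). */
static long request_try(int wedged)
{
	struct request *req = request_alloc(wedged);

	if (IS_ERR(req))
		return PTR_ERR(req);

	assert(req->seqno == 1);
	free(req);
	return 0;
}
```

With this shape the caller needs neither an output parameter nor a separate int for the error, which is the simplification the patch makes to __i915_gem_request_alloc().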
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
* ✓ Ro.CI.BAT: success for drm/i915: tidy up request alloc
2016-06-30 1:35 [PATCH] drm/i915: tidy up request alloc Hong Liu
@ 2016-06-30 5:43 ` Patchwork
2016-06-30 8:50 ` [PATCH] " Tvrtko Ursulin
1 sibling, 0 replies; 9+ messages in thread
From: Patchwork @ 2016-06-30 5:43 UTC
To: Liu, Hong; +Cc: intel-gfx
== Series Details ==
Series: drm/i915: tidy up request alloc
URL : https://patchwork.freedesktop.org/series/9300/
State : success
== Summary ==
Series 9300v1 drm/i915: tidy up request alloc
http://patchwork.freedesktop.org/api/1.0/series/9300/revisions/1/mbox
Test kms_pipe_crc_basic:
Subgroup suspend-read-crc-pipe-b:
dmesg-warn -> SKIP (ro-bdw-i5-5250u)
Subgroup suspend-read-crc-pipe-c:
dmesg-warn -> SKIP (ro-bdw-i5-5250u)
fi-kbl-qkkr total:229 pass:161 dwarn:27 dfail:0 fail:0 skip:41
fi-skl-i5-6260u total:229 pass:202 dwarn:0 dfail:0 fail:2 skip:25
fi-snb-i7-2600 total:229 pass:174 dwarn:0 dfail:0 fail:2 skip:53
ro-bdw-i5-5250u total:229 pass:202 dwarn:1 dfail:1 fail:2 skip:23
ro-bdw-i7-5557U total:229 pass:202 dwarn:1 dfail:1 fail:2 skip:23
ro-bdw-i7-5600u total:229 pass:190 dwarn:0 dfail:1 fail:0 skip:38
ro-bsw-n3050 total:229 pass:177 dwarn:0 dfail:1 fail:2 skip:49
ro-byt-n2820 total:229 pass:178 dwarn:0 dfail:1 fail:5 skip:45
ro-hsw-i3-4010u total:229 pass:195 dwarn:0 dfail:1 fail:2 skip:31
ro-hsw-i7-4770r total:229 pass:195 dwarn:0 dfail:1 fail:2 skip:31
ro-ilk-i7-620lm total:229 pass:155 dwarn:0 dfail:1 fail:3 skip:70
ro-ilk1-i5-650 total:224 pass:155 dwarn:0 dfail:1 fail:3 skip:65
ro-ivb-i7-3770 total:229 pass:186 dwarn:0 dfail:1 fail:2 skip:40
ro-ivb2-i7-3770 total:229 pass:190 dwarn:0 dfail:1 fail:2 skip:36
ro-skl3-i5-6260u total:229 pass:206 dwarn:1 dfail:1 fail:2 skip:19
ro-snb-i7-2620M total:229 pass:179 dwarn:0 dfail:1 fail:1 skip:48
Results at /archive/results/CI_IGT_test/RO_Patchwork_1336/
8a6521c drm-intel-nightly: 2016y-06m-29d-16h-08m-16s UTC integration manifest
0c7486c7 drm/i915: tidy up request alloc
* Re: [PATCH] drm/i915: tidy up request alloc
2016-06-30 1:35 [PATCH] drm/i915: tidy up request alloc Hong Liu
2016-06-30 5:43 ` ✓ Ro.CI.BAT: success for " Patchwork
@ 2016-06-30 8:50 ` Tvrtko Ursulin
2016-06-30 10:22 ` Chris Wilson
1 sibling, 1 reply; 9+ messages in thread
From: Tvrtko Ursulin @ 2016-06-30 8:50 UTC
To: Hong Liu, intel-gfx
On 30/06/16 02:35, Hong Liu wrote:
> Return the allocated request pointer directly to remove
> the double pointer parameter.
>
> Signed-off-by: Hong Liu <hong.liu@intel.com>
> ---
> drivers/gpu/drm/i915/i915_gem.c | 25 +++++++------------------
> 1 file changed, 7 insertions(+), 18 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> index 1d98782..9881455 100644
> --- a/drivers/gpu/drm/i915/i915_gem.c
> +++ b/drivers/gpu/drm/i915/i915_gem.c
> @@ -2988,32 +2988,26 @@ void i915_gem_request_free(struct kref *req_ref)
> kmem_cache_free(req->i915->requests, req);
> }
>
> -static inline int
> +static inline struct drm_i915_gem_request *
> __i915_gem_request_alloc(struct intel_engine_cs *engine,
> - struct i915_gem_context *ctx,
> - struct drm_i915_gem_request **req_out)
> + struct i915_gem_context *ctx)
> {
> struct drm_i915_private *dev_priv = engine->i915;
> unsigned reset_counter = i915_reset_counter(&dev_priv->gpu_error);
> struct drm_i915_gem_request *req;
> int ret;
>
> - if (!req_out)
> - return -EINVAL;
> -
> - *req_out = NULL;
> -
> /* ABI: Before userspace accesses the GPU (e.g. execbuffer), report
> * EIO if the GPU is already wedged, or EAGAIN to drop the struct_mutex
> * and restart.
> */
> ret = i915_gem_check_wedge(reset_counter, dev_priv->mm.interruptible);
> if (ret)
> - return ret;
> + return ERR_PTR(ret);
>
> req = kmem_cache_zalloc(dev_priv->requests, GFP_KERNEL);
> if (req == NULL)
> - return -ENOMEM;
> + return ERR_PTR(-ENOMEM);
>
> ret = i915_gem_get_seqno(engine->i915, &req->seqno);
> if (ret)
> @@ -3041,14 +3035,13 @@ __i915_gem_request_alloc(struct intel_engine_cs *engine,
> if (ret)
> goto err_ctx;
>
> - *req_out = req;
> - return 0;
> + return req;
>
> err_ctx:
> i915_gem_context_unreference(ctx);
> err:
> kmem_cache_free(dev_priv->requests, req);
> - return ret;
> + return ERR_PTR(ret);
> }
>
> /**
> @@ -3067,13 +3060,9 @@ struct drm_i915_gem_request *
> i915_gem_request_alloc(struct intel_engine_cs *engine,
> struct i915_gem_context *ctx)
> {
> - struct drm_i915_gem_request *req;
> - int err;
> -
> if (ctx == NULL)
> ctx = engine->i915->kernel_context;
> - err = __i915_gem_request_alloc(engine, ctx, &req);
> - return err ? ERR_PTR(err) : req;
> + return __i915_gem_request_alloc(engine, ctx);
> }
>
> struct drm_i915_gem_request *
>
Looks good to me. And have this feeling I've seen this somewhere before.
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Regards,
Tvrtko
* Re: [PATCH] drm/i915: tidy up request alloc
2016-06-30 8:50 ` [PATCH] " Tvrtko Ursulin
@ 2016-06-30 10:22 ` Chris Wilson
2016-06-30 12:49 ` Tvrtko Ursulin
0 siblings, 1 reply; 9+ messages in thread
From: Chris Wilson @ 2016-06-30 10:22 UTC
To: Tvrtko Ursulin; +Cc: intel-gfx
On Thu, Jun 30, 2016 at 09:50:20AM +0100, Tvrtko Ursulin wrote:
>
> On 30/06/16 02:35, Hong Liu wrote:
> >Return the allocated request pointer directly to remove
> >the double pointer parameter.
> >
> >Signed-off-by: Hong Liu <hong.liu@intel.com>
> >---
> > drivers/gpu/drm/i915/i915_gem.c | 25 +++++++------------------
> > 1 file changed, 7 insertions(+), 18 deletions(-)
> >
> >diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> >index 1d98782..9881455 100644
> >--- a/drivers/gpu/drm/i915/i915_gem.c
> >+++ b/drivers/gpu/drm/i915/i915_gem.c
> >@@ -2988,32 +2988,26 @@ void i915_gem_request_free(struct kref *req_ref)
> > kmem_cache_free(req->i915->requests, req);
> > }
> >
> >-static inline int
> >+static inline struct drm_i915_gem_request *
> > __i915_gem_request_alloc(struct intel_engine_cs *engine,
> >- struct i915_gem_context *ctx,
> >- struct drm_i915_gem_request **req_out)
> >+ struct i915_gem_context *ctx)
> > {
> > struct drm_i915_private *dev_priv = engine->i915;
> > unsigned reset_counter = i915_reset_counter(&dev_priv->gpu_error);
> > struct drm_i915_gem_request *req;
> > int ret;
> >
> >- if (!req_out)
> >- return -EINVAL;
> >-
> >- *req_out = NULL;
> >-
> > /* ABI: Before userspace accesses the GPU (e.g. execbuffer), report
> > * EIO if the GPU is already wedged, or EAGAIN to drop the struct_mutex
> > * and restart.
> > */
> > ret = i915_gem_check_wedge(reset_counter, dev_priv->mm.interruptible);
> > if (ret)
> >- return ret;
> >+ return ERR_PTR(ret);
> >
> > req = kmem_cache_zalloc(dev_priv->requests, GFP_KERNEL);
> > if (req == NULL)
> >- return -ENOMEM;
> >+ return ERR_PTR(-ENOMEM);
> >
> > ret = i915_gem_get_seqno(engine->i915, &req->seqno);
> > if (ret)
> >@@ -3041,14 +3035,13 @@ __i915_gem_request_alloc(struct intel_engine_cs *engine,
> > if (ret)
> > goto err_ctx;
> >
> >- *req_out = req;
> >- return 0;
> >+ return req;
> >
> > err_ctx:
> > i915_gem_context_unreference(ctx);
> > err:
> > kmem_cache_free(dev_priv->requests, req);
> >- return ret;
> >+ return ERR_PTR(ret);
> > }
> >
> > /**
> >@@ -3067,13 +3060,9 @@ struct drm_i915_gem_request *
> > i915_gem_request_alloc(struct intel_engine_cs *engine,
> > struct i915_gem_context *ctx)
> > {
> >- struct drm_i915_gem_request *req;
> >- int err;
> >-
> > if (ctx == NULL)
> > ctx = engine->i915->kernel_context;
> >- err = __i915_gem_request_alloc(engine, ctx, &req);
> >- return err ? ERR_PTR(err) : req;
> >+ return __i915_gem_request_alloc(engine, ctx);
> > }
> >
> > struct drm_i915_gem_request *
> >
>
> Looks good to me. And have this feeling I've seen this somewhere before.
Several times. This is not the full tidy, nor does it realise the
ramifications of request alloc through the stack.
-Chris
--
Chris Wilson, Intel Open Source Technology Centre
* Re: [PATCH] drm/i915: tidy up request alloc
2016-06-30 10:22 ` Chris Wilson
@ 2016-06-30 12:49 ` Tvrtko Ursulin
2016-07-01 16:58 ` Dave Gordon
0 siblings, 1 reply; 9+ messages in thread
From: Tvrtko Ursulin @ 2016-06-30 12:49 UTC
To: Chris Wilson, Hong Liu, intel-gfx
On 30/06/16 11:22, Chris Wilson wrote:
> On Thu, Jun 30, 2016 at 09:50:20AM +0100, Tvrtko Ursulin wrote:
>>
>> On 30/06/16 02:35, Hong Liu wrote:
>>> Return the allocated request pointer directly to remove
>>> the double pointer parameter.
>>>
>>> Signed-off-by: Hong Liu <hong.liu@intel.com>
>>> ---
>>> drivers/gpu/drm/i915/i915_gem.c | 25 +++++++------------------
>>> 1 file changed, 7 insertions(+), 18 deletions(-)
>>>
>>> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
>>> index 1d98782..9881455 100644
>>> --- a/drivers/gpu/drm/i915/i915_gem.c
>>> +++ b/drivers/gpu/drm/i915/i915_gem.c
>>> @@ -2988,32 +2988,26 @@ void i915_gem_request_free(struct kref *req_ref)
>>> kmem_cache_free(req->i915->requests, req);
>>> }
>>>
>>> -static inline int
>>> +static inline struct drm_i915_gem_request *
>>> __i915_gem_request_alloc(struct intel_engine_cs *engine,
>>> - struct i915_gem_context *ctx,
>>> - struct drm_i915_gem_request **req_out)
>>> + struct i915_gem_context *ctx)
>>> {
>>> struct drm_i915_private *dev_priv = engine->i915;
>>> unsigned reset_counter = i915_reset_counter(&dev_priv->gpu_error);
>>> struct drm_i915_gem_request *req;
>>> int ret;
>>>
>>> - if (!req_out)
>>> - return -EINVAL;
>>> -
>>> - *req_out = NULL;
>>> -
>>> /* ABI: Before userspace accesses the GPU (e.g. execbuffer), report
>>> * EIO if the GPU is already wedged, or EAGAIN to drop the struct_mutex
>>> * and restart.
>>> */
>>> ret = i915_gem_check_wedge(reset_counter, dev_priv->mm.interruptible);
>>> if (ret)
>>> - return ret;
>>> + return ERR_PTR(ret);
>>>
>>> req = kmem_cache_zalloc(dev_priv->requests, GFP_KERNEL);
>>> if (req == NULL)
>>> - return -ENOMEM;
>>> + return ERR_PTR(-ENOMEM);
>>>
>>> ret = i915_gem_get_seqno(engine->i915, &req->seqno);
>>> if (ret)
>>> @@ -3041,14 +3035,13 @@ __i915_gem_request_alloc(struct intel_engine_cs *engine,
>>> if (ret)
>>> goto err_ctx;
>>>
>>> - *req_out = req;
>>> - return 0;
>>> + return req;
>>>
>>> err_ctx:
>>> i915_gem_context_unreference(ctx);
>>> err:
>>> kmem_cache_free(dev_priv->requests, req);
>>> - return ret;
>>> + return ERR_PTR(ret);
>>> }
>>>
>>> /**
>>> @@ -3067,13 +3060,9 @@ struct drm_i915_gem_request *
>>> i915_gem_request_alloc(struct intel_engine_cs *engine,
>>> struct i915_gem_context *ctx)
>>> {
>>> - struct drm_i915_gem_request *req;
>>> - int err;
>>> -
>>> if (ctx == NULL)
>>> ctx = engine->i915->kernel_context;
>>> - err = __i915_gem_request_alloc(engine, ctx, &req);
>>> - return err ? ERR_PTR(err) : req;
>>> + return __i915_gem_request_alloc(engine, ctx);
>>> }
>>>
>>> struct drm_i915_gem_request *
>>>
>>
>> Looks good to me. And have this feeling I've seen this somewhere before.
>
> Several times. This is not the full tidy, nor does it realise the
> ramifications of request alloc through the stack.
Hm I can't spot that it is doing anything wrong or making anything
worse. You don't want to let the small cleanup in?
Regards,
Tvrtko
* Re: [PATCH] drm/i915: tidy up request alloc
2016-06-30 12:49 ` Tvrtko Ursulin
@ 2016-07-01 16:58 ` Dave Gordon
2016-07-01 18:34 ` Chris Wilson
0 siblings, 1 reply; 9+ messages in thread
From: Dave Gordon @ 2016-07-01 16:58 UTC
To: Tvrtko Ursulin, Chris Wilson, Hong Liu, intel-gfx
On 30/06/16 13:49, Tvrtko Ursulin wrote:
>
> On 30/06/16 11:22, Chris Wilson wrote:
>> On Thu, Jun 30, 2016 at 09:50:20AM +0100, Tvrtko Ursulin wrote:
>>>
>>> On 30/06/16 02:35, Hong Liu wrote:
>>>> Return the allocated request pointer directly to remove
>>>> the double pointer parameter.
>>>>
>>>> Signed-off-by: Hong Liu <hong.liu@intel.com>
>>>> ---
>>>> drivers/gpu/drm/i915/i915_gem.c | 25 +++++++------------------
>>>> 1 file changed, 7 insertions(+), 18 deletions(-)
>>>>
>>>> diff --git a/drivers/gpu/drm/i915/i915_gem.c
>>>> b/drivers/gpu/drm/i915/i915_gem.c
>>>> index 1d98782..9881455 100644
>>>> --- a/drivers/gpu/drm/i915/i915_gem.c
>>>> +++ b/drivers/gpu/drm/i915/i915_gem.c
>>>> @@ -2988,32 +2988,26 @@ void i915_gem_request_free(struct kref
>>>> *req_ref)
>>>> kmem_cache_free(req->i915->requests, req);
>>>> }
>>>>
>>>> -static inline int
>>>> +static inline struct drm_i915_gem_request *
>>>> __i915_gem_request_alloc(struct intel_engine_cs *engine,
>>>> - struct i915_gem_context *ctx,
>>>> - struct drm_i915_gem_request **req_out)
>>>> + struct i915_gem_context *ctx)
>>>> {
>>>> struct drm_i915_private *dev_priv = engine->i915;
>>>> unsigned reset_counter =
>>>> i915_reset_counter(&dev_priv->gpu_error);
>>>> struct drm_i915_gem_request *req;
>>>> int ret;
>>>>
>>>> - if (!req_out)
>>>> - return -EINVAL;
>>>> -
>>>> - *req_out = NULL;
>>>> -
>>>> /* ABI: Before userspace accesses the GPU (e.g. execbuffer),
>>>> report
>>>> * EIO if the GPU is already wedged, or EAGAIN to drop the
>>>> struct_mutex
>>>> * and restart.
>>>> */
>>>> ret = i915_gem_check_wedge(reset_counter,
>>>> dev_priv->mm.interruptible);
>>>> if (ret)
>>>> - return ret;
>>>> + return ERR_PTR(ret);
>>>>
>>>> req = kmem_cache_zalloc(dev_priv->requests, GFP_KERNEL);
>>>> if (req == NULL)
>>>> - return -ENOMEM;
>>>> + return ERR_PTR(-ENOMEM);
>>>>
>>>> ret = i915_gem_get_seqno(engine->i915, &req->seqno);
>>>> if (ret)
>>>> @@ -3041,14 +3035,13 @@ __i915_gem_request_alloc(struct
>>>> intel_engine_cs *engine,
>>>> if (ret)
>>>> goto err_ctx;
>>>>
>>>> - *req_out = req;
>>>> - return 0;
>>>> + return req;
>>>>
>>>> err_ctx:
>>>> i915_gem_context_unreference(ctx);
>>>> err:
>>>> kmem_cache_free(dev_priv->requests, req);
>>>> - return ret;
>>>> + return ERR_PTR(ret);
>>>> }
>>>>
>>>> /**
>>>> @@ -3067,13 +3060,9 @@ struct drm_i915_gem_request *
>>>> i915_gem_request_alloc(struct intel_engine_cs *engine,
>>>> struct i915_gem_context *ctx)
>>>> {
>>>> - struct drm_i915_gem_request *req;
>>>> - int err;
>>>> -
>>>> if (ctx == NULL)
>>>> ctx = engine->i915->kernel_context;
>>>> - err = __i915_gem_request_alloc(engine, ctx, &req);
>>>> - return err ? ERR_PTR(err) : req;
>>>> + return __i915_gem_request_alloc(engine, ctx);
>>>> }
>>>>
>>>> struct drm_i915_gem_request *
>>>>
>>>
>>> Looks good to me. And have this feeling I've seen this somewhere before.
>>
>> Several times. This is not the full tidy, nor does it realise the
>> ramifications of request alloc through the stack.
>
> Hm I can't spot that it is doing anything wrong or making anything
> worse. You don't want to let the small cleanup in?
>
> Regards,
> Tvrtko
It ought to make almost no difference, because the *only* place the
inner function is called is from the outer one, which passes a pointer
to a local for the returned object; and the inner one is then inlined,
so the compiler doesn't actually put it on the stack and call to the
inner allocator anyway.
Strangely, however, with this change the code becomes ~400 bytes bigger!
Disassembly reveals that while the code for the externally-callable
outer function is indeed almost identical, a second copy of it has also
been inlined at the one callsite in this file:
__i915_gem_object_sync() ...
req = i915_gem_request_alloc(to, NULL);
I don't think that's a critical path and would rather have 400 bytes
smaller codespace. We can get that back by adding /noinline/ to the
outer function i915_gem_request_alloc() (not, of course, to the inner
one, that definitely *should* be inline).
.Dave.
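
The noinline suggestion above can be sketched in userspace as follows. This is an illustration under stated assumptions, not the driver code: alloc_inner(), alloc_outer() and sync_path() are hypothetical stand-ins for __i915_gem_request_alloc(), i915_gem_request_alloc() and a caller in the same file, and the noinline macro is defined locally to the GCC attribute, which is how the kernel spells it for GCC.

```c
#include <assert.h>
#include <stddef.h>

/* Local definition for this sketch; the kernel provides its own macro. */
#define noinline __attribute__((noinline))

static int pool[4];

/* Small inner helper: cheap to fold into its single caller. */
static inline int *alloc_inner(size_t idx)
{
	if (idx >= 4)
		return NULL;
	return &pool[idx];
}

/*
 * Externally callable wrapper. noinline asks the compiler to keep one
 * out-of-line copy, even for callers in the same translation unit.
 */
noinline int *alloc_outer(size_t idx)
{
	return alloc_inner(idx);
}

/* Same-file caller: with noinline this should be a call, not a copy. */
int sync_path(size_t idx)
{
	int *slot = alloc_outer(idx);

	if (!slot)
		return -1;
	*slot += 1;
	return *slot;
}

/*
 * Behavioural check only. Whether a second copy was emitted shows up in
 * the disassembly or symbol sizes, not in the results.
 */
void sync_path_selftest(void)
{
	int before = sync_path(0);

	assert(sync_path(0) == before + 1);
	assert(sync_path(4) == -1);
}
```

Comparing the size of the calling function with and without the attribute (for example from the symbol sizes in the object file) is one way to check whether the extra copy has gone; the result can differ between compiler versions.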
* Re: [PATCH] drm/i915: tidy up request alloc
2016-07-01 16:58 ` Dave Gordon
@ 2016-07-01 18:34 ` Chris Wilson
2016-07-04 4:08 ` Liu, Hong
0 siblings, 1 reply; 9+ messages in thread
From: Chris Wilson @ 2016-07-01 18:34 UTC
To: Dave Gordon; +Cc: intel-gfx
On Fri, Jul 01, 2016 at 05:58:18PM +0100, Dave Gordon wrote:
> On 30/06/16 13:49, Tvrtko Ursulin wrote:
> >
> >On 30/06/16 11:22, Chris Wilson wrote:
> >>On Thu, Jun 30, 2016 at 09:50:20AM +0100, Tvrtko Ursulin wrote:
> >>>
> >>>On 30/06/16 02:35, Hong Liu wrote:
> >>>>Return the allocated request pointer directly to remove
> >>>>the double pointer parameter.
> >>>>
> >>>>Signed-off-by: Hong Liu <hong.liu@intel.com>
> >>>>---
> >>>> drivers/gpu/drm/i915/i915_gem.c | 25 +++++++------------------
> >>>> 1 file changed, 7 insertions(+), 18 deletions(-)
> >>>>
> >>>>diff --git a/drivers/gpu/drm/i915/i915_gem.c
> >>>>b/drivers/gpu/drm/i915/i915_gem.c
> >>>>index 1d98782..9881455 100644
> >>>>--- a/drivers/gpu/drm/i915/i915_gem.c
> >>>>+++ b/drivers/gpu/drm/i915/i915_gem.c
> >>>>@@ -2988,32 +2988,26 @@ void i915_gem_request_free(struct kref
> >>>>*req_ref)
> >>>> kmem_cache_free(req->i915->requests, req);
> >>>> }
> >>>>
> >>>>-static inline int
> >>>>+static inline struct drm_i915_gem_request *
> >>>> __i915_gem_request_alloc(struct intel_engine_cs *engine,
> >>>>- struct i915_gem_context *ctx,
> >>>>- struct drm_i915_gem_request **req_out)
> >>>>+ struct i915_gem_context *ctx)
> >>>> {
> >>>> struct drm_i915_private *dev_priv = engine->i915;
> >>>> unsigned reset_counter =
> >>>>i915_reset_counter(&dev_priv->gpu_error);
> >>>> struct drm_i915_gem_request *req;
> >>>> int ret;
> >>>>
> >>>>- if (!req_out)
> >>>>- return -EINVAL;
> >>>>-
> >>>>- *req_out = NULL;
> >>>>-
> >>>> /* ABI: Before userspace accesses the GPU (e.g. execbuffer),
> >>>>report
> >>>> * EIO if the GPU is already wedged, or EAGAIN to drop the
> >>>>struct_mutex
> >>>> * and restart.
> >>>> */
> >>>> ret = i915_gem_check_wedge(reset_counter,
> >>>>dev_priv->mm.interruptible);
> >>>> if (ret)
> >>>>- return ret;
> >>>>+ return ERR_PTR(ret);
> >>>>
> >>>> req = kmem_cache_zalloc(dev_priv->requests, GFP_KERNEL);
> >>>> if (req == NULL)
> >>>>- return -ENOMEM;
> >>>>+ return ERR_PTR(-ENOMEM);
> >>>>
> >>>> ret = i915_gem_get_seqno(engine->i915, &req->seqno);
> >>>> if (ret)
> >>>>@@ -3041,14 +3035,13 @@ __i915_gem_request_alloc(struct
> >>>>intel_engine_cs *engine,
> >>>> if (ret)
> >>>> goto err_ctx;
> >>>>
> >>>>- *req_out = req;
> >>>>- return 0;
> >>>>+ return req;
> >>>>
> >>>> err_ctx:
> >>>> i915_gem_context_unreference(ctx);
> >>>> err:
> >>>> kmem_cache_free(dev_priv->requests, req);
> >>>>- return ret;
> >>>>+ return ERR_PTR(ret);
> >>>> }
> >>>>
> >>>> /**
> >>>>@@ -3067,13 +3060,9 @@ struct drm_i915_gem_request *
> >>>> i915_gem_request_alloc(struct intel_engine_cs *engine,
> >>>> struct i915_gem_context *ctx)
> >>>> {
> >>>>- struct drm_i915_gem_request *req;
> >>>>- int err;
> >>>>-
> >>>> if (ctx == NULL)
> >>>> ctx = engine->i915->kernel_context;
> >>>>- err = __i915_gem_request_alloc(engine, ctx, &req);
> >>>>- return err ? ERR_PTR(err) : req;
> >>>>+ return __i915_gem_request_alloc(engine, ctx);
> >>>> }
> >>>>
> >>>> struct drm_i915_gem_request *
> >>>>
> >>>
> >>>Looks good to me. And have this feeling I've seen this somewhere before.
> >>
> >>Several times. This is not the full tidy, nor does it realise the
> >>ramifications of request alloc through the stack.
> >
> >Hm I can't spot that it is doing anything wrong or making anything
> >worse. You don't want to let the small cleanup in?
> >
> >Regards,
> >Tvrtko
>
> It ought to make almost no difference, because the *only* place the
> inner function is called is from the outer one, which passes a
> pointer to a local for the returned object; and the inner one is
> then inlined, so the compiler doesn't actually put it on the stack
> and call to the inner allocator anyway.
>
> Strangely, however, with this change the code becomes ~400 bytes bigger!
>
> Disassembly reveals that while the code for the externally-callable
> outer function is indeed almost identical, a second copy of it has
> also been inlined at the one callsite in this file:
>
> __i915_gem_object_sync() ...
> req = i915_gem_request_alloc(to, NULL);
>
> I don't think that's a critical path and would rather have 400 bytes
> smaller codespace. We can get that back by adding /noinline/ to the
> outer function i915_gem_request_alloc() (not, of course, to the
> inner one, that definitely *should* be inline).
__i915_gem_object_sync() should not be calling i915_gem_request_alloc().
That's the issue with this patch, your patch and John's patch.
-Chris
--
Chris Wilson, Intel Open Source Technology Centre
* Re: [PATCH] drm/i915: tidy up request alloc
2016-07-01 18:34 ` Chris Wilson
@ 2016-07-04 4:08 ` Liu, Hong
2016-07-04 10:36 ` Dave Gordon
0 siblings, 1 reply; 9+ messages in thread
From: Liu, Hong @ 2016-07-04 4:08 UTC
To: Gordon, David S, chris; +Cc: intel-gfx
On Fri, 2016-07-01 at 19:34 +0100, Chris Wilson wrote:
> On Fri, Jul 01, 2016 at 05:58:18PM +0100, Dave Gordon wrote:
> > On 30/06/16 13:49, Tvrtko Ursulin wrote:
> > >
> > > On 30/06/16 11:22, Chris Wilson wrote:
> > > > On Thu, Jun 30, 2016 at 09:50:20AM +0100, Tvrtko Ursulin wrote:
> > > > >
> > > > > On 30/06/16 02:35, Hong Liu wrote:
> > > > > > Return the allocated request pointer directly to remove
> > > > > > the double pointer parameter.
> > > > > >
> > > > > > Signed-off-by: Hong Liu <hong.liu@intel.com>
> > > > > > ---
> > > > > > drivers/gpu/drm/i915/i915_gem.c | 25 +++++++------------
> > > > > > ------
> > > > > > 1 file changed, 7 insertions(+), 18 deletions(-)
> > > > > >
> > > > > > diff --git a/drivers/gpu/drm/i915/i915_gem.c
> > > > > > b/drivers/gpu/drm/i915/i915_gem.c
> > > > > > index 1d98782..9881455 100644
> > > > > > --- a/drivers/gpu/drm/i915/i915_gem.c
> > > > > > +++ b/drivers/gpu/drm/i915/i915_gem.c
> > > > > > @@ -2988,32 +2988,26 @@ void i915_gem_request_free(struct
> > > > > > kref
> > > > > > *req_ref)
> > > > > > kmem_cache_free(req->i915->requests, req);
> > > > > > }
> > > > > >
> > > > > > -static inline int
> > > > > > +static inline struct drm_i915_gem_request *
> > > > > > __i915_gem_request_alloc(struct intel_engine_cs *engine,
> > > > > > - struct i915_gem_context *ctx,
> > > > > > - struct drm_i915_gem_request **req_out)
> > > > > > + struct i915_gem_context *ctx)
> > > > > > {
> > > > > > struct drm_i915_private *dev_priv = engine->i915;
> > > > > > unsigned reset_counter =
> > > > > > i915_reset_counter(&dev_priv->gpu_error);
> > > > > > struct drm_i915_gem_request *req;
> > > > > > int ret;
> > > > > >
> > > > > > - if (!req_out)
> > > > > > - return -EINVAL;
> > > > > > -
> > > > > > - *req_out = NULL;
> > > > > > -
> > > > > > /* ABI: Before userspace accesses the GPU (e.g.
> > > > > > execbuffer),
> > > > > > report
> > > > > > * EIO if the GPU is already wedged, or EAGAIN to drop
> > > > > > the
> > > > > > struct_mutex
> > > > > > * and restart.
> > > > > > */
> > > > > > ret = i915_gem_check_wedge(reset_counter,
> > > > > > dev_priv->mm.interruptible);
> > > > > > if (ret)
> > > > > > - return ret;
> > > > > > + return ERR_PTR(ret);
> > > > > >
> > > > > > req = kmem_cache_zalloc(dev_priv->requests,
> > > > > > GFP_KERNEL);
> > > > > > if (req == NULL)
> > > > > > - return -ENOMEM;
> > > > > > + return ERR_PTR(-ENOMEM);
> > > > > >
> > > > > > ret = i915_gem_get_seqno(engine->i915, &req->seqno);
> > > > > > if (ret)
> > > > > > @@ -3041,14 +3035,13 @@ __i915_gem_request_alloc(struct
> > > > > > intel_engine_cs *engine,
> > > > > > if (ret)
> > > > > > goto err_ctx;
> > > > > >
> > > > > > - *req_out = req;
> > > > > > - return 0;
> > > > > > + return req;
> > > > > >
> > > > > > err_ctx:
> > > > > > i915_gem_context_unreference(ctx);
> > > > > > err:
> > > > > > kmem_cache_free(dev_priv->requests, req);
> > > > > > - return ret;
> > > > > > + return ERR_PTR(ret);
> > > > > > }
> > > > > >
> > > > > > /**
> > > > > > @@ -3067,13 +3060,9 @@ struct drm_i915_gem_request *
> > > > > > i915_gem_request_alloc(struct intel_engine_cs *engine,
> > > > > > struct i915_gem_context *ctx)
> > > > > > {
> > > > > > - struct drm_i915_gem_request *req;
> > > > > > - int err;
> > > > > > -
> > > > > > if (ctx == NULL)
> > > > > > ctx = engine->i915->kernel_context;
> > > > > > - err = __i915_gem_request_alloc(engine, ctx, &req);
> > > > > > - return err ? ERR_PTR(err) : req;
> > > > > > + return __i915_gem_request_alloc(engine, ctx);
> > > > > > }
> > > > > >
> > > > > > struct drm_i915_gem_request *
> > > > > >
> > > > >
> > > > > Looks good to me. And have this feeling I've seen this
> > > > > somewhere before.
> > > >
> > > > Several times. This is not the full tidy, nor does it realise
> > > > the
> > > > ramifications of request alloc through the stack.
> > >
> > > Hm I can't spot that it is doing anything wrong or making
> > > anything
> > > worse. You don't want to let the small cleanup in?
> > >
> > > Regards,
> > > Tvrtko
> >
> > It ought to make almost no difference, because the *only* place the
> > inner function is called is from the outer one, which passes a
> > pointer to a local for the returned object; and the inner one is
> > then inlined, so the compiler doesn't actually put it on the stack
> > and call to the inner allocator anyway.
> >
> > Strangely, however, with this change the code becomes ~400 bytes
> > bigger!
> >
> > Disassembly reveals that while the code for the externally-callable
> > outer function is indeed almost identical, a second copy of it has
> > also been inlined at the one callsite in this file:
> >
> > __i915_gem_object_sync() ...
> > req = i915_gem_request_alloc(to, NULL);
> >
> > I don't think that's a critical path and would rather have 400
> > bytes
> > smaller codespace. We can get that back by adding /noinline/ to the
> > outer function i915_gem_request_alloc() (not, of course, to the
> > inner one, that definitely *should* be inline).
>
> __i915_gem_object_sync() should not be calling
> i915_gem_request_alloc().
>
> That's the issue with this patch, your patch and John's patch.
So the reason i915_gem_request_alloc() was written this way is to avoid
it being inlined into callers like __i915_gem_object_sync()?
I checked the file with GCC 4.8.5 on my CentOS environment, and the
result matches what Dave found: with the patch, i915_gem_object_sync()
is 368 bytes bigger.
But when I checked it with GCC 6.1.1 on Fedora 24, it seems to inline
i915_gem_request_alloc() even with the current implementation.
With the patch, i915_gem_object_sync() is 80 bytes smaller.
> -Chris
>
* Re: [PATCH] drm/i915: tidy up request alloc
2016-07-04 4:08 ` Liu, Hong
@ 2016-07-04 10:36 ` Dave Gordon
0 siblings, 0 replies; 9+ messages in thread
From: Dave Gordon @ 2016-07-04 10:36 UTC
To: Liu, Hong, chris; +Cc: intel-gfx
On 04/07/16 05:08, Liu, Hong wrote:
> On Fri, 2016-07-01 at 19:34 +0100, Chris Wilson wrote:
>> On Fri, Jul 01, 2016 at 05:58:18PM +0100, Dave Gordon wrote:
>>> On 30/06/16 13:49, Tvrtko Ursulin wrote:
>>>>
>>>> On 30/06/16 11:22, Chris Wilson wrote:
>>>>> On Thu, Jun 30, 2016 at 09:50:20AM +0100, Tvrtko Ursulin wrote:
>>>>>>
>>>>>> On 30/06/16 02:35, Hong Liu wrote:
>>>>>>> Return the allocated request pointer directly to remove
>>>>>>> the double pointer parameter.
>>>>>>>
>>>>>>> Signed-off-by: Hong Liu <hong.liu@intel.com>
>>>>>>> ---
>>>>>>> drivers/gpu/drm/i915/i915_gem.c | 25 +++++++------------
>>>>>>> ------
>>>>>>> 1 file changed, 7 insertions(+), 18 deletions(-)
>>>>>>>
>>>>>>> diff --git a/drivers/gpu/drm/i915/i915_gem.c
>>>>>>> b/drivers/gpu/drm/i915/i915_gem.c
>>>>>>> index 1d98782..9881455 100644
>>>>>>> --- a/drivers/gpu/drm/i915/i915_gem.c
>>>>>>> +++ b/drivers/gpu/drm/i915/i915_gem.c
>>>>>>> @@ -2988,32 +2988,26 @@ void i915_gem_request_free(struct
>>>>>>> kref
>>>>>>> *req_ref)
>>>>>>> kmem_cache_free(req->i915->requests, req);
>>>>>>> }
>>>>>>>
>>>>>>> -static inline int
>>>>>>> +static inline struct drm_i915_gem_request *
>>>>>>> __i915_gem_request_alloc(struct intel_engine_cs *engine,
>>>>>>> - struct i915_gem_context *ctx,
>>>>>>> - struct drm_i915_gem_request **req_out)
>>>>>>> + struct i915_gem_context *ctx)
>>>>>>> {
>>>>>>> struct drm_i915_private *dev_priv = engine->i915;
>>>>>>> unsigned reset_counter = i915_reset_counter(&dev_priv->gpu_error);
>>>>>>> struct drm_i915_gem_request *req;
>>>>>>> int ret;
>>>>>>>
>>>>>>> - if (!req_out)
>>>>>>> - return -EINVAL;
>>>>>>> -
>>>>>>> - *req_out = NULL;
>>>>>>> -
>>>>>>> /* ABI: Before userspace accesses the GPU (e.g. execbuffer), report
>>>>>>> * EIO if the GPU is already wedged, or EAGAIN to drop the struct_mutex
>>>>>>> * and restart.
>>>>>>> */
>>>>>>> ret = i915_gem_check_wedge(reset_counter, dev_priv->mm.interruptible);
>>>>>>> if (ret)
>>>>>>> - return ret;
>>>>>>> + return ERR_PTR(ret);
>>>>>>>
>>>>>>> req = kmem_cache_zalloc(dev_priv->requests, GFP_KERNEL);
>>>>>>> if (req == NULL)
>>>>>>> - return -ENOMEM;
>>>>>>> + return ERR_PTR(-ENOMEM);
>>>>>>>
>>>>>>> ret = i915_gem_get_seqno(engine->i915, &req->seqno);
>>>>>>> if (ret)
>>>>>>> @@ -3041,14 +3035,13 @@ __i915_gem_request_alloc(struct intel_engine_cs *engine,
>>>>>>> if (ret)
>>>>>>> goto err_ctx;
>>>>>>>
>>>>>>> - *req_out = req;
>>>>>>> - return 0;
>>>>>>> + return req;
>>>>>>>
>>>>>>> err_ctx:
>>>>>>> i915_gem_context_unreference(ctx);
>>>>>>> err:
>>>>>>> kmem_cache_free(dev_priv->requests, req);
>>>>>>> - return ret;
>>>>>>> + return ERR_PTR(ret);
>>>>>>> }
>>>>>>>
>>>>>>> /**
>>>>>>> @@ -3067,13 +3060,9 @@ struct drm_i915_gem_request *
>>>>>>> i915_gem_request_alloc(struct intel_engine_cs *engine,
>>>>>>> struct i915_gem_context *ctx)
>>>>>>> {
>>>>>>> - struct drm_i915_gem_request *req;
>>>>>>> - int err;
>>>>>>> -
>>>>>>> if (ctx == NULL)
>>>>>>> ctx = engine->i915->kernel_context;
>>>>>>> - err = __i915_gem_request_alloc(engine, ctx, &req);
>>>>>>> - return err ? ERR_PTR(err) : req;
>>>>>>> + return __i915_gem_request_alloc(engine, ctx);
>>>>>>> }
>>>>>>>
>>>>>>> struct drm_i915_gem_request *
>>>>>>
>>>>>> Looks good to me. And have this feeling I've seen this
>>>>>> somewhere before.
>>>>>
>>>>> Several times. This is not the full tidy, nor does it realise
>>>>> the ramifications of request alloc through the stack.
>>>>
>>>> Hm I can't spot that it is doing anything wrong or making
>>>> anything worse. You don't want to let the small cleanup in?
>>>>
>>>> Regards,
>>>> Tvrtko
>>>
>>> It ought to make almost no difference, because the *only* place the
>>> inner function is called is from the outer one, which passes a
>>> pointer to a local for the returned object; and the inner one is
>>> then inlined, so the compiler doesn't actually put it on the stack
>>> and call the inner allocator anyway.
>>>
>>> Strangely, however, with this change the code becomes ~400 bytes
>>> bigger!
>>>
>>> Disassembly reveals that while the code for the externally-callable
>>> outer function is indeed almost identical, a second copy of it has
>>> also been inlined at the one callsite in this file:
>>>
>>> __i915_gem_object_sync() ...
>>> req = i915_gem_request_alloc(to, NULL);
>>>
>>> I don't think that's a critical path and would rather have 400 bytes
>>> smaller codespace. We can get that back by adding /noinline/ to the
>>> outer function i915_gem_request_alloc() (not, of course, to the
>>> inner one, that definitely *should* be inline).
>>
>> __i915_gem_object_sync() should not be calling
>> i915_gem_request_alloc().
>>
>> That's the issue with this patch, your patch and John's patch.
>
> So was i915_gem_request_alloc() written this way to avoid being
> inlined into callers like __i915_gem_object_sync()?
Not specifically, as the description of commit 268270883 says,
"... this patch renames the existing i915_gem_request_alloc(), and
makes it local (static inline), and replaces it with a wrapper that
provides a default if the context is NULL, and also has a nicer
calling convention (doesn't require a pointer to an output parameter).
Then we change all callers to use the new convention:
OLD:
err = i915_gem_request_alloc(ring, user_ctx, &req);
if (err) ...
NEW:
req = i915_gem_request_alloc(ring, user_ctx);
if (IS_ERR(req)) ...
OLD:
err = i915_gem_request_alloc(ring, ring->default_context, &req);
if (err) ...
NEW:
req = i915_gem_request_alloc(ring, NULL);
if (IS_ERR(req)) ..."
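For readers unfamiliar with the two conventions, here is a minimal
userspace sketch of the difference. It is not the driver code: struct
request, request_alloc_old(), request_alloc_new() and
compare_conventions() are hypothetical names, and ERR_PTR(), PTR_ERR()
and IS_ERR() are simplified stand-ins for the kernel helpers of the same
names.

```c
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/* Simplified userspace stand-ins for the kernel's ERR_PTR helpers:
 * the top MAX_ERRNO values of the address space encode a negative errno.
 */
#define MAX_ERRNO 4095

static inline void *ERR_PTR(long error) { return (void *)(intptr_t)error; }
static inline long PTR_ERR(const void *ptr) { return (long)(intptr_t)ptr; }
static inline int IS_ERR(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

/* Hypothetical stand-in for the request object. */
struct request { int seqno; };

/* OLD convention: errno as return value, object via output parameter. */
static int request_alloc_old(int fail, struct request **req_out)
{
	struct request *req;

	if (!req_out)
		return -EINVAL;
	*req_out = NULL;

	if (fail)
		return -EIO;

	req = calloc(1, sizeof(*req));
	if (!req)
		return -ENOMEM;

	*req_out = req;
	return 0;
}

/* NEW convention: object or encoded errno in the one return value. */
static struct request *request_alloc_new(int fail)
{
	struct request *req;

	if (fail)
		return ERR_PTR(-EIO);

	req = calloc(1, sizeof(*req));
	if (!req)
		return ERR_PTR(-ENOMEM);

	return req;
}

/* Exercise both conventions; returns 0 if they report the same results. */
int compare_conventions(void)
{
	struct request *req = NULL;
	int err;

	err = request_alloc_old(1, &req);
	if (err != -EIO || req != NULL)
		return 1;

	req = request_alloc_new(1);
	if (!IS_ERR(req) || PTR_ERR(req) != -EIO)
		return 2;

	err = request_alloc_old(0, &req);
	if (err)
		return 3;
	free(req);

	req = request_alloc_new(0);
	if (IS_ERR(req))
		return 4;
	free(req);

	return 0;
}
```

The new form drops both the output parameter and the NULL check on it,
which is the same simplification the patch under review applies to the
inner function.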
At the time, it was more a matter of simplifying review and minimising
lines-of-change by adding the wrapper to implement the new convention
rather than rewriting the real function at the same time as changing all
the callers. So I have no objection to continuing the process of making
this nicer by now flattening the two layers into a single function; I
just noted the oddity that the flattening made the codespace larger.
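As a rough illustration of the /noinline/ idea mentioned earlier in the
thread, the following userspace sketch keeps the inner helper inline
while asking the compiler not to inline the outer wrapper into its
callers. All names here are hypothetical; the kernel spells the
attribute "noinline", whereas this sketch uses the GCC/Clang attribute
directly. Whether it actually saves space depends on the compiler
version and options, as the GCC 4.8.5 and 6.1.1 measurements quoted
below suggest.

```c
#include <stddef.h>

/* Hypothetical macro name; expands to the GCC/Clang function attribute. */
#define sketch_noinline __attribute__((noinline))

/* Hypothetical stand-ins for the driver structures. */
struct context { int id; };
struct engine { struct context *kernel_context; };
struct request { struct context *ctx; };

/* A single static slot keeps the sketch free of real allocation. */
static struct request request_slot;

/* Inner helper: has one caller, so folding it in costs nothing extra. */
static inline struct request *
request_alloc_inner(struct engine *engine, struct context *ctx)
{
	(void)engine;
	request_slot.ctx = ctx;
	return &request_slot;
}

/* Outer wrapper: supplies the default context and stays out of line,
 * so each callsite gets a call instruction rather than a full copy.
 */
sketch_noinline struct request *
request_alloc(struct engine *engine, struct context *ctx)
{
	if (ctx == NULL)
		ctx = engine->kernel_context;

	return request_alloc_inner(engine, ctx);
}
```

The trade-off is one extra call at each callsite in exchange for a
single copy of the allocation path in the object file.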
Chris, do you have a patch for *not* calling i915_gem_request_alloc()
from inside __i915_gem_object_sync() ? I don't really know why John's
original conversion to passing requests everywhere needed this.
.Dave.
> I checked the file with GCC 4.8.5 on my CentOS environment, and the
> result matches what Dave found: with the patch, i915_gem_object_sync()
> is 368 bytes bigger.
>
> But with GCC 6.1.1 on Fedora 24, the compiler seems to inline
> i915_gem_request_alloc() even with the current implementation. With the
> patch, i915_gem_object_sync() is 80 bytes smaller.
>
>> -Chris
end of thread, other threads:[~2016-07-04 10:36 UTC | newest]
Thread overview: 9+ messages
2016-06-30 1:35 [PATCH] drm/i915: tidy up request alloc Hong Liu
2016-06-30 5:43 ` ✓ Ro.CI.BAT: success for " Patchwork
2016-06-30 8:50 ` [PATCH] " Tvrtko Ursulin
2016-06-30 10:22 ` Chris Wilson
2016-06-30 12:49 ` Tvrtko Ursulin
2016-07-01 16:58 ` Dave Gordon
2016-07-01 18:34 ` Chris Wilson
2016-07-04 4:08 ` Liu, Hong
2016-07-04 10:36 ` Dave Gordon