All of lore.kernel.org
 help / color / mirror / Atom feed
* ✗ Fi.CI.BAT: failure for series starting with [1/2] drm/i915: Cache elsp submit register
  2016-03-22 17:16 [PATCH 1/2] drm/i915: Cache elsp submit register Tvrtko Ursulin
@ 2016-03-22 17:13 ` Patchwork
  2016-03-22 17:16 ` [PATCH 2/2] drm/i915: Shrink i915_gem_request_add_to_client Tvrtko Ursulin
  2016-03-22 17:29 ` [PATCH 1/2] drm/i915: Cache elsp submit register Ville Syrjälä
  2 siblings, 0 replies; 8+ messages in thread
From: Patchwork @ 2016-03-22 17:13 UTC (permalink / raw)
  To: Tvrtko Ursulin; +Cc: intel-gfx

== Series Details ==

Series: series starting with [1/2] drm/i915: Cache elsp submit register
URL   : https://patchwork.freedesktop.org/series/4761/
State : failure

== Summary ==

Series 4761v1 Series without cover letter
2016-03-22T16:54:47.686093 http://patchwork.freedesktop.org/api/1.0/series/4761/revisions/1/mbox/
Applying: drm/i915: Cache elsp submit register
Using index info to reconstruct a base tree...
M	drivers/gpu/drm/i915/intel_lrc.c
Falling back to patching base and 3-way merge...
Auto-merging drivers/gpu/drm/i915/intel_lrc.c
CONFLICT (content): Merge conflict in drivers/gpu/drm/i915/intel_lrc.c
Patch failed at 0001 drm/i915: Cache elsp submit register

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 8+ messages in thread

* [PATCH 1/2] drm/i915: Cache elsp submit register
@ 2016-03-22 17:16 Tvrtko Ursulin
  2016-03-22 17:13 ` ✗ Fi.CI.BAT: failure for series starting with [1/2] " Patchwork
                   ` (2 more replies)
  0 siblings, 3 replies; 8+ messages in thread
From: Tvrtko Ursulin @ 2016-03-22 17:16 UTC (permalink / raw)
  To: Intel-gfx

From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

Since we write four times to the same register, caching
the mmio register saves a tiny amount of generated code.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
---
 drivers/gpu/drm/i915/intel_lrc.c | 9 +++++----
 1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index e733795b57e0..6916991bdceb 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -362,6 +362,7 @@ static void execlists_elsp_write(struct drm_i915_gem_request *rq0,
 
 	struct intel_engine_cs *engine = rq0->engine;
 	struct drm_i915_private *dev_priv = rq0->i915;
+	i915_reg_t elsp_reg = RING_ELSP(engine);
 	uint64_t desc[2];
 
 	if (rq1) {
@@ -375,12 +376,12 @@ static void execlists_elsp_write(struct drm_i915_gem_request *rq0,
 	rq0->elsp_submitted++;
 
 	/* You must always write both descriptors in the order below. */
-	I915_WRITE_FW(RING_ELSP(engine), upper_32_bits(desc[1]));
-	I915_WRITE_FW(RING_ELSP(engine), lower_32_bits(desc[1]));
+	I915_WRITE_FW(elsp_reg, upper_32_bits(desc[1]));
+	I915_WRITE_FW(elsp_reg, lower_32_bits(desc[1]));
 
-	I915_WRITE_FW(RING_ELSP(engine), upper_32_bits(desc[0]));
+	I915_WRITE_FW(elsp_reg, upper_32_bits(desc[0]));
 	/* The context is automatically loaded after the following */
-	I915_WRITE_FW(RING_ELSP(engine), lower_32_bits(desc[0]));
+	I915_WRITE_FW(elsp_reg, lower_32_bits(desc[0]));
 
 	/* ELSP is a wo register, use another nearby reg for posting */
 	POSTING_READ_FW(RING_EXECLIST_STATUS_LO(engine));
-- 
1.9.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 8+ messages in thread

* [PATCH 2/2] drm/i915: Shrink i915_gem_request_add_to_client
  2016-03-22 17:16 [PATCH 1/2] drm/i915: Cache elsp submit register Tvrtko Ursulin
  2016-03-22 17:13 ` ✗ Fi.CI.BAT: failure for series starting with [1/2] " Patchwork
@ 2016-03-22 17:16 ` Tvrtko Ursulin
  2016-03-22 17:59   ` Chris Wilson
  2016-03-22 17:29 ` [PATCH 1/2] drm/i915: Cache elsp submit register Ville Syrjälä
  2 siblings, 1 reply; 8+ messages in thread
From: Tvrtko Ursulin @ 2016-03-22 17:16 UTC (permalink / raw)
  To: Intel-gfx

From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

We do not need to check twice for the same conditions.

Results in clearer code and smaller binary.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
---
 drivers/gpu/drm/i915/i915_gem.c | 7 +------
 1 file changed, 1 insertion(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 8588c83abb35..f7051df781d2 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -1372,12 +1372,7 @@ int i915_gem_request_add_to_client(struct drm_i915_gem_request *req,
 {
 	struct drm_i915_file_private *file_priv;
 
-	WARN_ON(!req || !file || req->file_priv);
-
-	if (!req || !file)
-		return -EINVAL;
-
-	if (req->file_priv)
+	if (WARN_ON(!req || !file || req->file_priv))
 		return -EINVAL;
 
 	file_priv = file->driver_priv;
-- 
1.9.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 8+ messages in thread

* Re: [PATCH 1/2] drm/i915: Cache elsp submit register
  2016-03-22 17:16 [PATCH 1/2] drm/i915: Cache elsp submit register Tvrtko Ursulin
  2016-03-22 17:13 ` ✗ Fi.CI.BAT: failure for series starting with [1/2] " Patchwork
  2016-03-22 17:16 ` [PATCH 2/2] drm/i915: Shrink i915_gem_request_add_to_client Tvrtko Ursulin
@ 2016-03-22 17:29 ` Ville Syrjälä
  2016-03-22 17:39   ` Tvrtko Ursulin
  2 siblings, 1 reply; 8+ messages in thread
From: Ville Syrjälä @ 2016-03-22 17:29 UTC (permalink / raw)
  To: Tvrtko Ursulin; +Cc: Intel-gfx

On Tue, Mar 22, 2016 at 05:16:52PM +0000, Tvrtko Ursulin wrote:
> From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> 
> Since we write four times to the same register, caching
> the mmio register saves a tiny amount of generated code.

The compiler can't figure this out on its own?

> 
> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> ---
>  drivers/gpu/drm/i915/intel_lrc.c | 9 +++++----
>  1 file changed, 5 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
> index e733795b57e0..6916991bdceb 100644
> --- a/drivers/gpu/drm/i915/intel_lrc.c
> +++ b/drivers/gpu/drm/i915/intel_lrc.c
> @@ -362,6 +362,7 @@ static void execlists_elsp_write(struct drm_i915_gem_request *rq0,
>  
>  	struct intel_engine_cs *engine = rq0->engine;
>  	struct drm_i915_private *dev_priv = rq0->i915;
> +	i915_reg_t elsp_reg = RING_ELSP(engine);
>  	uint64_t desc[2];
>  
>  	if (rq1) {
> @@ -375,12 +376,12 @@ static void execlists_elsp_write(struct drm_i915_gem_request *rq0,
>  	rq0->elsp_submitted++;
>  
>  	/* You must always write both descriptors in the order below. */
> -	I915_WRITE_FW(RING_ELSP(engine), upper_32_bits(desc[1]));
> -	I915_WRITE_FW(RING_ELSP(engine), lower_32_bits(desc[1]));
> +	I915_WRITE_FW(elsp_reg, upper_32_bits(desc[1]));
> +	I915_WRITE_FW(elsp_reg, lower_32_bits(desc[1]));
>  
> -	I915_WRITE_FW(RING_ELSP(engine), upper_32_bits(desc[0]));
> +	I915_WRITE_FW(elsp_reg, upper_32_bits(desc[0]));
>  	/* The context is automatically loaded after the following */
> -	I915_WRITE_FW(RING_ELSP(engine), lower_32_bits(desc[0]));
> +	I915_WRITE_FW(elsp_reg, lower_32_bits(desc[0]));
>  
>  	/* ELSP is a wo register, use another nearby reg for posting */
>  	POSTING_READ_FW(RING_EXECLIST_STATUS_LO(engine));
> -- 
> 1.9.1
> 
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/intel-gfx

-- 
Ville Syrjälä
Intel OTC
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH 1/2] drm/i915: Cache elsp submit register
  2016-03-22 17:29 ` [PATCH 1/2] drm/i915: Cache elsp submit register Ville Syrjälä
@ 2016-03-22 17:39   ` Tvrtko Ursulin
  2016-03-30 15:05     ` Dave Gordon
  0 siblings, 1 reply; 8+ messages in thread
From: Tvrtko Ursulin @ 2016-03-22 17:39 UTC (permalink / raw)
  To: Ville Syrjälä; +Cc: Intel-gfx


On 22/03/16 17:29, Ville Syrjälä wrote:
> On Tue, Mar 22, 2016 at 05:16:52PM +0000, Tvrtko Ursulin wrote:
>> From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>>
>> Since we write four times to the same register, caching
>> the mmio register saves a tiny amount of generated code.
>
> The compiler can't figure this out on its own?

Nope, at least gcc 4.84 I am running here can't. :(

And this only solves one part of the things it can't figure out in that 
code. It still recalculates one part, can't remember which one is which 
now without revisiting the generated assembly. It used to be for times 
in a row: load register, add 0x230, displace 0x78, store[0-4]. This only 
solves the add 0x230 redundancy. But working around that would possibly 
be a bit too low level.

Regards,

Tvrtko
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH 2/2] drm/i915: Shrink i915_gem_request_add_to_client
  2016-03-22 17:16 ` [PATCH 2/2] drm/i915: Shrink i915_gem_request_add_to_client Tvrtko Ursulin
@ 2016-03-22 17:59   ` Chris Wilson
  2016-03-23 10:10     ` Tvrtko Ursulin
  0 siblings, 1 reply; 8+ messages in thread
From: Chris Wilson @ 2016-03-22 17:59 UTC (permalink / raw)
  To: Tvrtko Ursulin; +Cc: Intel-gfx

On Tue, Mar 22, 2016 at 05:16:53PM +0000, Tvrtko Ursulin wrote:
> From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> 
> We do not need to check twice for the same conditions.
> 
> Results in clearer code and smaller binary.

The checks are entirely garbage.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH 2/2] drm/i915: Shrink i915_gem_request_add_to_client
  2016-03-22 17:59   ` Chris Wilson
@ 2016-03-23 10:10     ` Tvrtko Ursulin
  0 siblings, 0 replies; 8+ messages in thread
From: Tvrtko Ursulin @ 2016-03-23 10:10 UTC (permalink / raw)
  To: Chris Wilson, Intel-gfx


On 22/03/16 17:59, Chris Wilson wrote:
> On Tue, Mar 22, 2016 at 05:16:53PM +0000, Tvrtko Ursulin wrote:
>> From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>>
>> We do not need to check twice for the same conditions.
>>
>> Results in clearer code and smaller binary.
>
> The checks are entirely garbage.

On actually looking wider than the mechanical cleanup your are 
absolutely right. :)

Regards,

Tvrtko

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH 1/2] drm/i915: Cache elsp submit register
  2016-03-22 17:39   ` Tvrtko Ursulin
@ 2016-03-30 15:05     ` Dave Gordon
  0 siblings, 0 replies; 8+ messages in thread
From: Dave Gordon @ 2016-03-30 15:05 UTC (permalink / raw)
  To: Tvrtko Ursulin, Ville Syrjälä; +Cc: Intel-gfx

On 22/03/16 17:39, Tvrtko Ursulin wrote:
>
> On 22/03/16 17:29, Ville Syrjälä wrote:
>> On Tue, Mar 22, 2016 at 05:16:52PM +0000, Tvrtko Ursulin wrote:
>>> From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>>>
>>> Since we write four times to the same register, caching
>>> the mmio register saves a tiny amount of generated code.
>>
>> The compiler can't figure this out on its own?
>
> Nope, at least gcc 4.84 I am running here can't. :(
>
> And this only solves one part of the things it can't figure out in that
> code. It still recalculates one part, can't remember which one is which
> now without revisiting the generated assembly. It used to be for times
> in a row: load register, add 0x230, displace 0x78, store[0-4]. This only
> solves the add 0x230 redundancy. But working around that would possibly
> be a bit too low level.
>
> Regards,
> Tvrtko

Compiler's probably assuming aliasing.

RING_ELSP(engine) is actually (engine->mmio_base+0x230).

I915_WRITE_FW(reg, val) is actually __raw_i915_write32(dev_priv, 
(reg__), (val__)) which ultimately translates to a store to some address.

The compiler can't be sure that this store isn't actually to 
(engine->mmio_base), so it refetches it and adds the 0x230 again. Saving 
the (struct-valued) result of the RING_ELSP() macro means the compiler 
knows it isn't aliased, so can reuse it four times.

We could try adding __restrict to various key pointers, starting with 
dev_priv and all pointers-to-engines?

.Dave.

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2016-03-30 15:05 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-03-22 17:16 [PATCH 1/2] drm/i915: Cache elsp submit register Tvrtko Ursulin
2016-03-22 17:13 ` ✗ Fi.CI.BAT: failure for series starting with [1/2] " Patchwork
2016-03-22 17:16 ` [PATCH 2/2] drm/i915: Shrink i915_gem_request_add_to_client Tvrtko Ursulin
2016-03-22 17:59   ` Chris Wilson
2016-03-23 10:10     ` Tvrtko Ursulin
2016-03-22 17:29 ` [PATCH 1/2] drm/i915: Cache elsp submit register Ville Syrjälä
2016-03-22 17:39   ` Tvrtko Ursulin
2016-03-30 15:05     ` Dave Gordon

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.