* [PATCH 11/27] drm/i915/icl: Gen11 render context size
2018-01-09 23:28 ` [PATCH 10/27] drm/i915/icl: Enhanced execution list support Paulo Zanoni
@ 2018-01-09 23:28 ` Paulo Zanoni
2018-01-11 1:21 ` Rodrigo Vivi
2018-01-11 18:23 ` [PATCH v3] " Oscar Mateo
2018-01-09 23:28 ` [PATCH 12/27] drm/i915/icl: Add Indirect Context Offset for Gen11 Paulo Zanoni
` (18 subsequent siblings)
19 siblings, 2 replies; 118+ messages in thread
From: Paulo Zanoni @ 2018-01-09 23:28 UTC (permalink / raw)
To: intel-gfx; +Cc: Rodrigo Vivi
From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
The current size may be bigger than the correct one, this needs to be
revisited later.
v2: Rebase.
Acked-by: Ben Widawsky <benjamin.widawsky@intel.com>
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
---
drivers/gpu/drm/i915/intel_engine_cs.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/drivers/gpu/drm/i915/intel_engine_cs.c b/drivers/gpu/drm/i915/intel_engine_cs.c
index e88b2fd44724..a373bcbd85d8 100644
--- a/drivers/gpu/drm/i915/intel_engine_cs.c
+++ b/drivers/gpu/drm/i915/intel_engine_cs.c
@@ -181,6 +181,8 @@ __intel_engine_context_size(struct drm_i915_private *dev_priv, u8 class)
switch (INTEL_GEN(dev_priv)) {
default:
MISSING_CASE(INTEL_GEN(dev_priv));
+ case 11:
+ /* TODO: Make sure this is correct. */
case 10:
return GEN10_LR_CONTEXT_RENDER_SIZE;
case 9:
--
2.14.3
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
* Re: [PATCH 11/27] drm/i915/icl: Gen11 render context size
2018-01-09 23:28 ` [PATCH 11/27] drm/i915/icl: Gen11 render context size Paulo Zanoni
@ 2018-01-11 1:21 ` Rodrigo Vivi
2018-01-11 18:20 ` Oscar Mateo
2018-01-11 18:23 ` [PATCH v3] " Oscar Mateo
1 sibling, 1 reply; 118+ messages in thread
From: Rodrigo Vivi @ 2018-01-11 1:21 UTC (permalink / raw)
To: Paulo Zanoni; +Cc: intel-gfx
On Tue, Jan 09, 2018 at 11:28:19PM +0000, Paulo Zanoni wrote:
> From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>
> The current size may be bigger than the correct one, this needs to be
> revisited later.
I don't believe this is true anymore. When this was written initially CNL had a higher value.
Higher values are ok, but smaller can be problematic if I understood correctly.
So we might need to check the accurate number.
Oscar has a good method for that, iirc ;)
>
> v2: Rebase.
>
> Acked-by: Ben Widawsky <benjamin.widawsky@intel.com>
> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
> ---
> drivers/gpu/drm/i915/intel_engine_cs.c | 2 ++
> 1 file changed, 2 insertions(+)
>
> diff --git a/drivers/gpu/drm/i915/intel_engine_cs.c b/drivers/gpu/drm/i915/intel_engine_cs.c
> index e88b2fd44724..a373bcbd85d8 100644
> --- a/drivers/gpu/drm/i915/intel_engine_cs.c
> +++ b/drivers/gpu/drm/i915/intel_engine_cs.c
> @@ -181,6 +181,8 @@ __intel_engine_context_size(struct drm_i915_private *dev_priv, u8 class)
> switch (INTEL_GEN(dev_priv)) {
> default:
> MISSING_CASE(INTEL_GEN(dev_priv));
> + case 11:
> + /* TODO: Make sure this is correct. */
> case 10:
> return GEN10_LR_CONTEXT_RENDER_SIZE;
> case 9:
> --
> 2.14.3
>
* Re: [PATCH 11/27] drm/i915/icl: Gen11 render context size
2018-01-11 1:21 ` Rodrigo Vivi
@ 2018-01-11 18:20 ` Oscar Mateo
0 siblings, 0 replies; 118+ messages in thread
From: Oscar Mateo @ 2018-01-11 18:20 UTC (permalink / raw)
To: Rodrigo Vivi, Paulo Zanoni; +Cc: intel-gfx
On 01/10/2018 05:21 PM, Rodrigo Vivi wrote:
> On Tue, Jan 09, 2018 at 11:28:19PM +0000, Paulo Zanoni wrote:
>> From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>>
>> The current size may be bigger than the correct one, this needs to be
>> revisited later.
> I don't believe this is true anymore. When this was written initially CNL had a higher value.
>
> Higher values are ok, but smaller can be problematic if I understood correctly.
>
> So we might need to check the accurate number.
>
> Oscar has a good method for that, iirc ;)
My method says 14 pages. Seems a bit low, so I went ahead and tested it.
Everything seems to work fine, so I'll send a new patch.
>> v2: Rebase.
>>
>> Acked-by: Ben Widawsky <benjamin.widawsky@intel.com>
>> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
>> ---
>> drivers/gpu/drm/i915/intel_engine_cs.c | 2 ++
>> 1 file changed, 2 insertions(+)
>>
>> diff --git a/drivers/gpu/drm/i915/intel_engine_cs.c b/drivers/gpu/drm/i915/intel_engine_cs.c
>> index e88b2fd44724..a373bcbd85d8 100644
>> --- a/drivers/gpu/drm/i915/intel_engine_cs.c
>> +++ b/drivers/gpu/drm/i915/intel_engine_cs.c
>> @@ -181,6 +181,8 @@ __intel_engine_context_size(struct drm_i915_private *dev_priv, u8 class)
>> switch (INTEL_GEN(dev_priv)) {
>> default:
>> MISSING_CASE(INTEL_GEN(dev_priv));
>> + case 11:
>> + /* TODO: Make sure this is correct. */
>> case 10:
>> return GEN10_LR_CONTEXT_RENDER_SIZE;
>> case 9:
>> --
>> 2.14.3
>>
* [PATCH v3] drm/i915/icl: Gen11 render context size
2018-01-09 23:28 ` [PATCH 11/27] drm/i915/icl: Gen11 render context size Paulo Zanoni
2018-01-11 1:21 ` Rodrigo Vivi
@ 2018-01-11 18:23 ` Oscar Mateo
2018-01-11 19:40 ` Rodrigo Vivi
2018-01-11 22:55 ` [PATCH 1/2] drm/i915: Return a default RCS " Oscar Mateo
1 sibling, 2 replies; 118+ messages in thread
From: Oscar Mateo @ 2018-01-11 18:23 UTC (permalink / raw)
To: intel-gfx
From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Gen11 removes the Resource Streamer, which frees up a big chunk of
the context image. BSpec indicates 12538 DWORDs (13 pages), plus
one page for PPHWSP.
Please notice that, when looking at the BSpec context image table,
the right filter has to be applied (e.g. "IcelakeLP") as some rows
are excluded for specific GENs. Also, some rows apply per-subslice
(for the calculation above, we have supposed 8 subslices which is
the maximum SKU we expect).
v2: Rebase.
v3: Use the right size as per the BSpec.
BSpec: 18907
Acked-by: Ben Widawsky <benjamin.widawsky@intel.com>
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Signed-off-by: Oscar Mateo <oscar.mateo@intel.com>
---
drivers/gpu/drm/i915/intel_engine_cs.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/drivers/gpu/drm/i915/intel_engine_cs.c b/drivers/gpu/drm/i915/intel_engine_cs.c
index e88b2fd..79b7d36 100644
--- a/drivers/gpu/drm/i915/intel_engine_cs.c
+++ b/drivers/gpu/drm/i915/intel_engine_cs.c
@@ -41,6 +41,7 @@
#define GEN8_LR_CONTEXT_RENDER_SIZE (20 * PAGE_SIZE)
#define GEN9_LR_CONTEXT_RENDER_SIZE (22 * PAGE_SIZE)
#define GEN10_LR_CONTEXT_RENDER_SIZE (18 * PAGE_SIZE)
+#define GEN11_LR_CONTEXT_RENDER_SIZE (14 * PAGE_SIZE)
#define GEN8_LR_CONTEXT_OTHER_SIZE ( 2 * PAGE_SIZE)
@@ -181,6 +182,8 @@ struct engine_info {
switch (INTEL_GEN(dev_priv)) {
default:
MISSING_CASE(INTEL_GEN(dev_priv));
+ case 11:
+ return GEN11_LR_CONTEXT_RENDER_SIZE;
case 10:
return GEN10_LR_CONTEXT_RENDER_SIZE;
case 9:
--
1.9.1
* Re: [PATCH v3] drm/i915/icl: Gen11 render context size
2018-01-11 18:23 ` [PATCH v3] " Oscar Mateo
@ 2018-01-11 19:40 ` Rodrigo Vivi
2018-01-11 22:53 ` Oscar Mateo
2018-01-11 22:55 ` [PATCH 1/2] drm/i915: Return a default RCS " Oscar Mateo
1 sibling, 1 reply; 118+ messages in thread
From: Rodrigo Vivi @ 2018-01-11 19:40 UTC (permalink / raw)
To: Oscar Mateo; +Cc: intel-gfx
On Thu, Jan 11, 2018 at 06:23:20PM +0000, Oscar Mateo wrote:
> From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>
> Gen11 removes the Resource Streamer, which frees up a big chunk of
> the context image. BSpec indicates 12538 DWORDs (13 pages), plus
> one page for PPHWSP.
>
> Please notice that, when looking at the BSpec context image table,
> the right filter has to be applied (e.g. "IcelakeLP") as some rows
> are excluded for specific GENs. Also, some rows apply per-subslice
> (for the calculation above, we have supposed 8 subslices which is
> the maximum SKU we expect).
>
> v2: Rebase.
> v3: Use the right size as per the BSpec.
>
> BSpec: 18907
>
> Acked-by: Ben Widawsky <benjamin.widawsky@intel.com>
> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> Signed-off-by: Oscar Mateo <oscar.mateo@intel.com>
> ---
> drivers/gpu/drm/i915/intel_engine_cs.c | 3 +++
> 1 file changed, 3 insertions(+)
>
> diff --git a/drivers/gpu/drm/i915/intel_engine_cs.c b/drivers/gpu/drm/i915/intel_engine_cs.c
> index e88b2fd..79b7d36 100644
> --- a/drivers/gpu/drm/i915/intel_engine_cs.c
> +++ b/drivers/gpu/drm/i915/intel_engine_cs.c
> @@ -41,6 +41,7 @@
> #define GEN8_LR_CONTEXT_RENDER_SIZE (20 * PAGE_SIZE)
> #define GEN9_LR_CONTEXT_RENDER_SIZE (22 * PAGE_SIZE)
> #define GEN10_LR_CONTEXT_RENDER_SIZE (18 * PAGE_SIZE)
> +#define GEN11_LR_CONTEXT_RENDER_SIZE (14 * PAGE_SIZE)
>
> #define GEN8_LR_CONTEXT_OTHER_SIZE ( 2 * PAGE_SIZE)
>
> @@ -181,6 +182,8 @@ struct engine_info {
> switch (INTEL_GEN(dev_priv)) {
> default:
> MISSING_CASE(INTEL_GEN(dev_priv));
I believe this default is getting dangerous as we decrease the size here.
I believe the safest one for the missing case is the largest one, whatever that is.
> + case 11:
> + return GEN11_LR_CONTEXT_RENDER_SIZE;
> case 10:
> return GEN10_LR_CONTEXT_RENDER_SIZE;
> case 9:
> --
> 1.9.1
>
* Re: [PATCH v3] drm/i915/icl: Gen11 render context size
2018-01-11 19:40 ` Rodrigo Vivi
@ 2018-01-11 22:53 ` Oscar Mateo
0 siblings, 0 replies; 118+ messages in thread
From: Oscar Mateo @ 2018-01-11 22:53 UTC (permalink / raw)
To: Rodrigo Vivi; +Cc: intel-gfx
On 01/11/2018 11:40 AM, Rodrigo Vivi wrote:
> On Thu, Jan 11, 2018 at 06:23:20PM +0000, Oscar Mateo wrote:
>> From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>>
>> Gen11 removes the Resource Streamer, which frees up a big chunk of
>> the context image. BSpec indicates 12538 DWORDs (13 pages), plus
>> one page for PPHWSP.
>>
>> Please notice that, when looking at the BSpec context image table,
>> the right filter has to be applied (e.g. "IcelakeLP") as some rows
>> are excluded for specific GENs. Also, some rows apply per-subslice
>> (for the calculation above, we have supposed 8 subslices which is
>> the maximum SKU we expect).
>>
>> v2: Rebase.
>> v3: Use the right size as per the BSpec.
>>
>> BSpec: 18907
>>
>> Acked-by: Ben Widawsky <benjamin.widawsky@intel.com>
>> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>> Signed-off-by: Oscar Mateo <oscar.mateo@intel.com>
>> ---
>> drivers/gpu/drm/i915/intel_engine_cs.c | 3 +++
>> 1 file changed, 3 insertions(+)
>>
>> diff --git a/drivers/gpu/drm/i915/intel_engine_cs.c b/drivers/gpu/drm/i915/intel_engine_cs.c
>> index e88b2fd..79b7d36 100644
>> --- a/drivers/gpu/drm/i915/intel_engine_cs.c
>> +++ b/drivers/gpu/drm/i915/intel_engine_cs.c
>> @@ -41,6 +41,7 @@
>> #define GEN8_LR_CONTEXT_RENDER_SIZE (20 * PAGE_SIZE)
>> #define GEN9_LR_CONTEXT_RENDER_SIZE (22 * PAGE_SIZE)
>> #define GEN10_LR_CONTEXT_RENDER_SIZE (18 * PAGE_SIZE)
>> +#define GEN11_LR_CONTEXT_RENDER_SIZE (14 * PAGE_SIZE)
>>
>> #define GEN8_LR_CONTEXT_OTHER_SIZE ( 2 * PAGE_SIZE)
>>
>> @@ -181,6 +182,8 @@ struct engine_info {
>> switch (INTEL_GEN(dev_priv)) {
>> default:
>> MISSING_CASE(INTEL_GEN(dev_priv));
> I believe this default is getting dangerous as we decrease the size here.
> I believe the safest one for the missing case is the largest one, whatever that is.
That's a very sensible idea. I don't know whether to put the default case
in the middle of the other ones or change the numerical order, so I'll
create a new define for the default case...
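As a rough illustration of the idea (a minimal sketch, not the actual i915
code): the per-GEN page counts are the ones from the defines in this thread,
while every sketch_* name and the 4 KiB SKETCH_PAGE_SIZE are made up for the
example. An unknown GEN falls back to the largest known size rather than to
whatever the most recent GEN uses.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption for the sketch: 4 KiB pages. */
#define SKETCH_PAGE_SIZE 4096u

/* Render context sizes in pages, as listed in the defines in this thread. */
struct sketch_ctx_size {
	int gen;
	size_t pages;
};

static const struct sketch_ctx_size sketch_sizes[] = {
	{ 8, 20 },
	{ 9, 22 },
	{ 10, 18 },
	{ 11, 14 },
};

#define SKETCH_NUM_SIZES (sizeof(sketch_sizes) / sizeof(sketch_sizes[0]))

/* Largest page count in the table: the safe answer for an unknown GEN. */
static size_t sketch_largest_known_pages(void)
{
	size_t largest = 0;
	size_t i;

	for (i = 0; i < SKETCH_NUM_SIZES; i++)
		if (sketch_sizes[i].pages > largest)
			largest = sketch_sizes[i].pages;

	assert(largest > 0);
	return largest;
}

/* Size in bytes for a known GEN, or the largest known size otherwise. */
static size_t sketch_render_context_size(int gen)
{
	size_t i;

	for (i = 0; i < SKETCH_NUM_SIZES; i++)
		if (sketch_sizes[i].gen == gen)
			return sketch_sizes[i].pages * SKETCH_PAGE_SIZE;

	return sketch_largest_known_pages() * SKETCH_PAGE_SIZE;
}
```

Computing the maximum from the table is only one option; a fixed define
works just as well as long as it is kept in sync with the largest entry.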
>> + case 11:
>> + return GEN11_LR_CONTEXT_RENDER_SIZE;
>> case 10:
>> return GEN10_LR_CONTEXT_RENDER_SIZE;
>> case 9:
>> --
>> 1.9.1
>>
* [PATCH 1/2] drm/i915: Return a default RCS context size
2018-01-11 18:23 ` [PATCH v3] " Oscar Mateo
2018-01-11 19:40 ` Rodrigo Vivi
@ 2018-01-11 22:55 ` Oscar Mateo
2018-01-11 22:55 ` [PATCH 2/2 v4] drm/i915/icl: Gen11 render " Oscar Mateo
2018-01-11 23:08 ` [PATCH 1/2] drm/i915: Return a default RCS " Daniele Ceraolo Spurio
1 sibling, 2 replies; 118+ messages in thread
From: Oscar Mateo @ 2018-01-11 22:55 UTC (permalink / raw)
To: intel-gfx
Instead of returning whatever size the latest GEN used. This is because
context sizes for new GENs can go up or down, but the only safe thing to
do for missing cases is to use the largest known one, whatever that is.
Suggested-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Oscar Mateo <oscar.mateo@intel.com>
Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
---
drivers/gpu/drm/i915/intel_engine_cs.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/drivers/gpu/drm/i915/intel_engine_cs.c b/drivers/gpu/drm/i915/intel_engine_cs.c
index e88b2fd..db758c5 100644
--- a/drivers/gpu/drm/i915/intel_engine_cs.c
+++ b/drivers/gpu/drm/i915/intel_engine_cs.c
@@ -38,6 +38,7 @@
*/
#define HSW_CXT_TOTAL_SIZE (17 * PAGE_SIZE)
+#define DEFAULT_LR_CONTEXT_RENDER_SIZE (22 * PAGE_SIZE)
#define GEN8_LR_CONTEXT_RENDER_SIZE (20 * PAGE_SIZE)
#define GEN9_LR_CONTEXT_RENDER_SIZE (22 * PAGE_SIZE)
#define GEN10_LR_CONTEXT_RENDER_SIZE (18 * PAGE_SIZE)
@@ -181,6 +182,7 @@ struct engine_info {
switch (INTEL_GEN(dev_priv)) {
default:
MISSING_CASE(INTEL_GEN(dev_priv));
+ return DEFAULT_LR_CONTEXT_RENDER_SIZE;
case 10:
return GEN10_LR_CONTEXT_RENDER_SIZE;
case 9:
--
1.9.1
* [PATCH 2/2 v4] drm/i915/icl: Gen11 render context size
2018-01-11 22:55 ` [PATCH 1/2] drm/i915: Return a default RCS " Oscar Mateo
@ 2018-01-11 22:55 ` Oscar Mateo
2018-01-12 0:01 ` Daniele Ceraolo Spurio
2018-01-11 23:08 ` [PATCH 1/2] drm/i915: Return a default RCS " Daniele Ceraolo Spurio
1 sibling, 1 reply; 118+ messages in thread
From: Oscar Mateo @ 2018-01-11 22:55 UTC (permalink / raw)
To: intel-gfx; +Cc: Rodrigo Vivi
From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Gen11 removes the Resource Streamer, which frees up a big chunk of
the context image. BSpec indicates 12538 DWORDs (13 pages), plus
one page for PPHWSP.
Please notice that, when looking at the BSpec context image table,
the right filter has to be applied as some rows are excluded for
specific GENs. Also, some rows apply per-subslice (for the
calculation above, we have supposed I915_MAX_SUBSLICES = 8).
v2: Rebase.
v3: Use the right size as per the BSpec.
v4:
- Rebased on top of the default context size (Rodrigo)
- Clarify in the commit message where the subslice calculation
comes from.
BSpec: 18907
Acked-by: Ben Widawsky <benjamin.widawsky@intel.com>
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Signed-off-by: Oscar Mateo <oscar.mateo@intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
---
drivers/gpu/drm/i915/intel_engine_cs.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/drivers/gpu/drm/i915/intel_engine_cs.c b/drivers/gpu/drm/i915/intel_engine_cs.c
index db758c5..1e7bf40 100644
--- a/drivers/gpu/drm/i915/intel_engine_cs.c
+++ b/drivers/gpu/drm/i915/intel_engine_cs.c
@@ -42,6 +42,7 @@
#define GEN8_LR_CONTEXT_RENDER_SIZE (20 * PAGE_SIZE)
#define GEN9_LR_CONTEXT_RENDER_SIZE (22 * PAGE_SIZE)
#define GEN10_LR_CONTEXT_RENDER_SIZE (18 * PAGE_SIZE)
+#define GEN11_LR_CONTEXT_RENDER_SIZE (14 * PAGE_SIZE)
#define GEN8_LR_CONTEXT_OTHER_SIZE ( 2 * PAGE_SIZE)
@@ -183,6 +184,8 @@ struct engine_info {
default:
MISSING_CASE(INTEL_GEN(dev_priv));
return DEFAULT_LR_CONTEXT_RENDER_SIZE;
+ case 11:
+ return GEN11_LR_CONTEXT_RENDER_SIZE;
case 10:
return GEN10_LR_CONTEXT_RENDER_SIZE;
case 9:
--
1.9.1
* Re: [PATCH 2/2 v4] drm/i915/icl: Gen11 render context size
2018-01-11 22:55 ` [PATCH 2/2 v4] drm/i915/icl: Gen11 render " Oscar Mateo
@ 2018-01-12 0:01 ` Daniele Ceraolo Spurio
0 siblings, 0 replies; 118+ messages in thread
From: Daniele Ceraolo Spurio @ 2018-01-12 0:01 UTC (permalink / raw)
To: Oscar Mateo, intel-gfx; +Cc: Rodrigo Vivi
On 11/01/18 14:55, Oscar Mateo wrote:
> From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>
> Gen11 removes the Resource Streamer, which frees up a big chunk of
> the context image. BSpec indicates 12538 DWORDs (13 pages), plus
> one page for PPHWSP.
>
This is actually 12544 dwords according to the specs (I've already
confirmed it with Oscar via IM). Still 13 pages anyway.
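For reference, the arithmetic can be checked with a small sketch. It
assumes 4-byte DWORDs and 4 KiB pages, and the sketch_* names are made up
for the example; it is not code from the driver. Both DWORD counts round up
to 13 pages, and the extra page for the PPHWSP gives the 14 pages used in
the define.

```c
#include <assert.h>
#include <stddef.h>

/* Assumptions for the sketch: 4 KiB pages and 4-byte DWORDs. */
#define SKETCH_PAGE_SIZE  4096u
#define SKETCH_DWORD_SIZE 4u

/* Number of whole pages needed to hold the given number of DWORDs. */
static size_t sketch_dwords_to_pages(size_t dwords)
{
	size_t bytes;

	assert(dwords > 0);
	bytes = dwords * SKETCH_DWORD_SIZE;

	/* Round up to the next page boundary. */
	return (bytes + SKETCH_PAGE_SIZE - 1) / SKETCH_PAGE_SIZE;
}

/* Context image pages plus one page for the PPHWSP. */
static size_t sketch_render_context_pages(size_t dwords)
{
	return sketch_dwords_to_pages(dwords) + 1;
}
```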
Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Thanks,
Daniele
> Please notice that, when looking at the BSpec context image table,
> the right filter has to be applied as some rows are excluded for
> specific GENs. Also, some rows apply per-subslice (for the
> calculation above, we have supposed I915_MAX_SUBSLICES = 8).
>
> v2: Rebase.
> v3: Use the right size as per the BSpec.
> v4:
> - Rebased on top of the default context size (Rodrigo)
> - Clarify in the commit message where the subslice calculation
> comes from.
>
> BSpec: 18907
>
> Acked-by: Ben Widawsky <benjamin.widawsky@intel.com>
> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> Signed-off-by: Oscar Mateo <oscar.mateo@intel.com>
> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
> Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
> ---
> drivers/gpu/drm/i915/intel_engine_cs.c | 3 +++
> 1 file changed, 3 insertions(+)
>
> diff --git a/drivers/gpu/drm/i915/intel_engine_cs.c b/drivers/gpu/drm/i915/intel_engine_cs.c
> index db758c5..1e7bf40 100644
> --- a/drivers/gpu/drm/i915/intel_engine_cs.c
> +++ b/drivers/gpu/drm/i915/intel_engine_cs.c
> @@ -42,6 +42,7 @@
> #define GEN8_LR_CONTEXT_RENDER_SIZE (20 * PAGE_SIZE)
> #define GEN9_LR_CONTEXT_RENDER_SIZE (22 * PAGE_SIZE)
> #define GEN10_LR_CONTEXT_RENDER_SIZE (18 * PAGE_SIZE)
> +#define GEN11_LR_CONTEXT_RENDER_SIZE (14 * PAGE_SIZE)
>
> #define GEN8_LR_CONTEXT_OTHER_SIZE ( 2 * PAGE_SIZE)
>
> @@ -183,6 +184,8 @@ struct engine_info {
> default:
> MISSING_CASE(INTEL_GEN(dev_priv));
> return DEFAULT_LR_CONTEXT_RENDER_SIZE;
> + case 11:
> + return GEN11_LR_CONTEXT_RENDER_SIZE;
> case 10:
> return GEN10_LR_CONTEXT_RENDER_SIZE;
> case 9:
>
* Re: [PATCH 1/2] drm/i915: Return a default RCS context size
2018-01-11 22:55 ` [PATCH 1/2] drm/i915: Return a default RCS " Oscar Mateo
2018-01-11 22:55 ` [PATCH 2/2 v4] drm/i915/icl: Gen11 render " Oscar Mateo
@ 2018-01-11 23:08 ` Daniele Ceraolo Spurio
1 sibling, 0 replies; 118+ messages in thread
From: Daniele Ceraolo Spurio @ 2018-01-11 23:08 UTC (permalink / raw)
To: Oscar Mateo, intel-gfx
On 11/01/18 14:55, Oscar Mateo wrote:
> Instead of returning whatever size the latest GEN used. This is because
> context sizes for new GENs can go up or down, but the only safe thing to
> do for missing cases is to use the largest known one, whatever that is.
>
> Suggested-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
> Signed-off-by: Oscar Mateo <oscar.mateo@intel.com>
> Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
> ---
> drivers/gpu/drm/i915/intel_engine_cs.c | 2 ++
> 1 file changed, 2 insertions(+)
>
> diff --git a/drivers/gpu/drm/i915/intel_engine_cs.c b/drivers/gpu/drm/i915/intel_engine_cs.c
> index e88b2fd..db758c5 100644
> --- a/drivers/gpu/drm/i915/intel_engine_cs.c
> +++ b/drivers/gpu/drm/i915/intel_engine_cs.c
> @@ -38,6 +38,7 @@
> */
> #define HSW_CXT_TOTAL_SIZE (17 * PAGE_SIZE)
>
Could use a comment here with the explanation in the commit message, but
it is relatively clear anyway, so:
Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Thanks,
Daniele
> +#define DEFAULT_LR_CONTEXT_RENDER_SIZE (22 * PAGE_SIZE)
> #define GEN8_LR_CONTEXT_RENDER_SIZE (20 * PAGE_SIZE)
> #define GEN9_LR_CONTEXT_RENDER_SIZE (22 * PAGE_SIZE)
> #define GEN10_LR_CONTEXT_RENDER_SIZE (18 * PAGE_SIZE)
> @@ -181,6 +182,7 @@ struct engine_info {
> switch (INTEL_GEN(dev_priv)) {
> default:
> MISSING_CASE(INTEL_GEN(dev_priv));
> + return DEFAULT_LR_CONTEXT_RENDER_SIZE;
> case 10:
> return GEN10_LR_CONTEXT_RENDER_SIZE;
> case 9:
>
* [PATCH 12/27] drm/i915/icl: Add Indirect Context Offset for Gen11
2018-01-09 23:28 ` [PATCH 10/27] drm/i915/icl: Enhanced execution list support Paulo Zanoni
2018-01-09 23:28 ` [PATCH 11/27] drm/i915/icl: Gen11 render context size Paulo Zanoni
@ 2018-01-09 23:28 ` Paulo Zanoni
2018-01-10 23:44 ` Oscar Mateo
2018-01-25 1:06 ` [PATCH v2 " Michel Thierry
2018-01-09 23:28 ` [PATCH 13/27] drm/i915/icl: Gen11 forcewake support Paulo Zanoni
` (17 subsequent siblings)
19 siblings, 2 replies; 118+ messages in thread
From: Paulo Zanoni @ 2018-01-09 23:28 UTC (permalink / raw)
To: intel-gfx; +Cc: Rodrigo Vivi
From: Michel Thierry <michel.thierry@intel.com>
v2: rebased to intel_lr_indirect_ctx_offset
Signed-off-by: Michel Thierry <michel.thierry@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
---
drivers/gpu/drm/i915/intel_lrc.c | 5 +++++
1 file changed, 5 insertions(+)
diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index 3c6f587fa903..dab988f20833 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -204,6 +204,7 @@
#define GEN8_CTX_RCS_INDIRECT_CTX_OFFSET_DEFAULT 0x17
#define GEN9_CTX_RCS_INDIRECT_CTX_OFFSET_DEFAULT 0x26
#define GEN10_CTX_RCS_INDIRECT_CTX_OFFSET_DEFAULT 0x19
+#define GEN11_CTX_RCS_INDIRECT_CTX_OFFSET_DEFAULT 0x1A
/* Typical size of the average request (2 pipecontrols and a MI_BB) */
#define EXECLISTS_REQUEST_SIZE 64 /* bytes */
@@ -2148,6 +2149,10 @@ static u32 intel_lr_indirect_ctx_offset(struct intel_engine_cs *engine)
default:
MISSING_CASE(INTEL_GEN(engine->i915));
/* fall through */
+ case 11:
+ indirect_ctx_offset =
+ GEN11_CTX_RCS_INDIRECT_CTX_OFFSET_DEFAULT;
+ break;
case 10:
indirect_ctx_offset =
GEN10_CTX_RCS_INDIRECT_CTX_OFFSET_DEFAULT;
--
2.14.3
* Re: [PATCH 12/27] drm/i915/icl: Add Indirect Context Offset for Gen11
2018-01-09 23:28 ` [PATCH 12/27] drm/i915/icl: Add Indirect Context Offset for Gen11 Paulo Zanoni
@ 2018-01-10 23:44 ` Oscar Mateo
2018-01-25 1:06 ` [PATCH v2 " Michel Thierry
1 sibling, 0 replies; 118+ messages in thread
From: Oscar Mateo @ 2018-01-10 23:44 UTC (permalink / raw)
To: Paulo Zanoni, intel-gfx; +Cc: Rodrigo Vivi
On 01/09/2018 03:28 PM, Paulo Zanoni wrote:
> From: Michel Thierry <michel.thierry@intel.com>
>
> v2: rebased to intel_lr_indirect_ctx_offset
>
> Signed-off-by: Michel Thierry <michel.thierry@intel.com>
> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
> Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
> ---
> drivers/gpu/drm/i915/intel_lrc.c | 5 +++++
> 1 file changed, 5 insertions(+)
We can add the BSpec tag to the commit message:
Bspec: 11740
with that:
Reviewed-by: Oscar Mateo <oscar.mateo@intel.com>
> diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
> index 3c6f587fa903..dab988f20833 100644
> --- a/drivers/gpu/drm/i915/intel_lrc.c
> +++ b/drivers/gpu/drm/i915/intel_lrc.c
> @@ -204,6 +204,7 @@
> #define GEN8_CTX_RCS_INDIRECT_CTX_OFFSET_DEFAULT 0x17
> #define GEN9_CTX_RCS_INDIRECT_CTX_OFFSET_DEFAULT 0x26
> #define GEN10_CTX_RCS_INDIRECT_CTX_OFFSET_DEFAULT 0x19
> +#define GEN11_CTX_RCS_INDIRECT_CTX_OFFSET_DEFAULT 0x1A
>
> /* Typical size of the average request (2 pipecontrols and a MI_BB) */
> #define EXECLISTS_REQUEST_SIZE 64 /* bytes */
> @@ -2148,6 +2149,10 @@ static u32 intel_lr_indirect_ctx_offset(struct intel_engine_cs *engine)
> default:
> MISSING_CASE(INTEL_GEN(engine->i915));
> /* fall through */
> + case 11:
> + indirect_ctx_offset =
> + GEN11_CTX_RCS_INDIRECT_CTX_OFFSET_DEFAULT;
> + break;
> case 10:
> indirect_ctx_offset =
> GEN10_CTX_RCS_INDIRECT_CTX_OFFSET_DEFAULT;
* [PATCH v2 12/27] drm/i915/icl: Add Indirect Context Offset for Gen11
2018-01-09 23:28 ` [PATCH 12/27] drm/i915/icl: Add Indirect Context Offset for Gen11 Paulo Zanoni
2018-01-10 23:44 ` Oscar Mateo
@ 2018-01-25 1:06 ` Michel Thierry
1 sibling, 0 replies; 118+ messages in thread
From: Michel Thierry @ 2018-01-25 1:06 UTC (permalink / raw)
To: intel-gfx; +Cc: paulo.r.zanoni
v2: rebased to intel_lr_indirect_ctx_offset
v3: rebase, move define to intel_lrc_reg.h
BSpec: 11740
Signed-off-by: Michel Thierry <michel.thierry@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Reviewed-by: Oscar Mateo <oscar.mateo@intel.com>
---
drivers/gpu/drm/i915/intel_lrc.c | 4 ++++
drivers/gpu/drm/i915/intel_lrc_reg.h | 1 +
2 files changed, 5 insertions(+)
diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index 4eb409dc9dd1..ecc07cc2ffc4 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -2426,6 +2426,10 @@ static u32 intel_lr_indirect_ctx_offset(struct intel_engine_cs *engine)
default:
MISSING_CASE(INTEL_GEN(engine->i915));
/* fall through */
+ case 11:
+ indirect_ctx_offset =
+ GEN11_CTX_RCS_INDIRECT_CTX_OFFSET_DEFAULT;
+ break;
case 10:
indirect_ctx_offset =
GEN10_CTX_RCS_INDIRECT_CTX_OFFSET_DEFAULT;
diff --git a/drivers/gpu/drm/i915/intel_lrc_reg.h b/drivers/gpu/drm/i915/intel_lrc_reg.h
index a53336e2fc97..169a2239d6c7 100644
--- a/drivers/gpu/drm/i915/intel_lrc_reg.h
+++ b/drivers/gpu/drm/i915/intel_lrc_reg.h
@@ -63,5 +63,6 @@
#define GEN8_CTX_RCS_INDIRECT_CTX_OFFSET_DEFAULT 0x17
#define GEN9_CTX_RCS_INDIRECT_CTX_OFFSET_DEFAULT 0x26
#define GEN10_CTX_RCS_INDIRECT_CTX_OFFSET_DEFAULT 0x19
+#define GEN11_CTX_RCS_INDIRECT_CTX_OFFSET_DEFAULT 0x1A
#endif /* _INTEL_LRC_REG_H_ */
--
2.16.1
* [PATCH 13/27] drm/i915/icl: Gen11 forcewake support
2018-01-09 23:28 ` [PATCH 10/27] drm/i915/icl: Enhanced execution list support Paulo Zanoni
2018-01-09 23:28 ` [PATCH 11/27] drm/i915/icl: Gen11 render context size Paulo Zanoni
2018-01-09 23:28 ` [PATCH 12/27] drm/i915/icl: Add Indirect Context Offset for Gen11 Paulo Zanoni
@ 2018-01-09 23:28 ` Paulo Zanoni
2018-02-01 0:52 ` [PATCH v10] " Michel Thierry
2018-01-09 23:28 ` [PATCH 14/27] drm/i915/icl: Set graphics mode register for gen11 Paulo Zanoni
` (16 subsequent siblings)
19 siblings, 1 reply; 118+ messages in thread
From: Paulo Zanoni @ 2018-01-09 23:28 UTC (permalink / raw)
To: intel-gfx; +Cc: Paulo Zanoni
From: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
The main difference from previous GENs is that, starting from Gen11,
each VCS and VECS engine has its own power well, which only exists
if the related engine exists in the HW.
The fallback forcewake request workaround is only needed on gen9
according to the HSDES WA entry (1604254524), so we can go back to using
the simpler fw_domains_get/put functions.
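To illustrate the per-engine domains described above, here is a small
sketch (an illustration under stated assumptions, not the code in this
patch). All sketch_* names are hypothetical; the bit order follows the
forcewake_domain_names[] array (render, blitter, media, vdbox0-3,
vebox0-1), and render and blitter are simply assumed to always be present.
The set of domains is built from the engines that exist, and a table entry
marked "all" is translated to that set at lookup time.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_MAX_VCS  4
#define SKETCH_MAX_VECS 2

/* Bit positions follow the order of forcewake_domain_names[]. */
#define SKETCH_FW_RENDER   (1u << 0)
#define SKETCH_FW_BLITTER  (1u << 1)
#define SKETCH_FW_MEDIA    (1u << 2)	/* listed for completeness, unused here */
#define SKETCH_FW_VDBOX(n) (1u << (3 + (n)))
#define SKETCH_FW_VEBOX(n) (1u << (3 + SKETCH_MAX_VCS + (n)))
#define SKETCH_FW_ALL      ((1u << (3 + SKETCH_MAX_VCS + SKETCH_MAX_VECS)) - 1)

/*
 * Build the mask of domains that exist on a given SKU. Bit n of
 * vdbox_mask/vebox_mask says whether engine n is present.
 */
static uint32_t sketch_available_domains(uint32_t vdbox_mask,
					 uint32_t vebox_mask)
{
	/* Assumption: render and blitter are present on every SKU. */
	uint32_t domains = SKETCH_FW_RENDER | SKETCH_FW_BLITTER;
	int i;

	assert(vdbox_mask < (1u << SKETCH_MAX_VCS));
	assert(vebox_mask < (1u << SKETCH_MAX_VECS));

	for (i = 0; i < SKETCH_MAX_VCS; i++)
		if (vdbox_mask & (1u << i))
			domains |= SKETCH_FW_VDBOX(i);

	for (i = 0; i < SKETCH_MAX_VECS; i++)
		if (vebox_mask & (1u << i))
			domains |= SKETCH_FW_VEBOX(i);

	return domains;
}

/* Translate an "all domains" table entry to the domains that exist. */
static uint32_t sketch_resolve_domains(uint32_t entry_domains,
				       uint32_t available)
{
	if (entry_domains == SKETCH_FW_ALL)
		return available;

	return entry_domains;
}
```

With VDBOX 0 and 2 plus VEBOX 0 present, an "all" entry resolves to render,
blitter, vdbox0, vdbox2 and vebox0, so no domain without a power well is
ever requested.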
BSpec: 18331
v2: fix fwtable, use array to test shadow tables, create new
accessors to avoid check on every access (Tvrtko)
v3 (from Paulo): Rebase.
v4:
- Range 09400-097FF should be FORCEWAKE_ALL (Daniele)
- Use the BIT macro for forcewake domains (Daniele)
- Add a comment about the range ordering (Oscar)
- Updated commit message (Oscar)
v5: Rebased
v6: Use I915_MAX_VCS/VECS (Michal)
v7: translate FORCEWAKE_ALL to available domains
v8: rebase, add clarification on fallback ack in commit message.
v9: fix rebase issue, change check in fw_domains_init from IS_GEN11
to GEN >= 11
Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Paulo Zanoni <paulo.r.zanoni@intel.com>
Acked-by: Michel Thierry <michel.thierry@intel.com>
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Signed-off-by: Oscar Mateo <oscar.mateo@intel.com>
---
drivers/gpu/drm/i915/i915_reg.h | 4 +
drivers/gpu/drm/i915/intel_uncore.c | 148 +++++++++++++++++++++++++-
drivers/gpu/drm/i915/intel_uncore.h | 27 ++++-
drivers/gpu/drm/i915/selftests/intel_uncore.c | 30 ++++--
4 files changed, 191 insertions(+), 18 deletions(-)
diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
index f843ac205313..f383ee5cc592 100644
--- a/drivers/gpu/drm/i915/i915_reg.h
+++ b/drivers/gpu/drm/i915/i915_reg.h
@@ -7895,9 +7895,13 @@ enum {
#define VLV_GTLC_PW_RENDER_STATUS_MASK (1 << 7)
#define FORCEWAKE_MT _MMIO(0xa188) /* multi-threaded */
#define FORCEWAKE_MEDIA_GEN9 _MMIO(0xa270)
+#define FORCEWAKE_MEDIA_VDBOX_GEN11(n) _MMIO(0xa540 + (n) * 4)
+#define FORCEWAKE_MEDIA_VEBOX_GEN11(n) _MMIO(0xa560 + (n) * 4)
#define FORCEWAKE_RENDER_GEN9 _MMIO(0xa278)
#define FORCEWAKE_BLITTER_GEN9 _MMIO(0xa188)
#define FORCEWAKE_ACK_MEDIA_GEN9 _MMIO(0x0D88)
+#define FORCEWAKE_ACK_MEDIA_VDBOX_GEN11(n) _MMIO(0x0D50 + (n) * 4)
+#define FORCEWAKE_ACK_MEDIA_VEBOX_GEN11(n) _MMIO(0x0D70 + (n) * 4)
#define FORCEWAKE_ACK_RENDER_GEN9 _MMIO(0x0D84)
#define FORCEWAKE_ACK_BLITTER_GEN9 _MMIO(0x130044)
#define FORCEWAKE_KERNEL BIT(0)
diff --git a/drivers/gpu/drm/i915/intel_uncore.c b/drivers/gpu/drm/i915/intel_uncore.c
index 1c524ed1e1da..44f0c5ab58e3 100644
--- a/drivers/gpu/drm/i915/intel_uncore.c
+++ b/drivers/gpu/drm/i915/intel_uncore.c
@@ -37,6 +37,12 @@ static const char * const forcewake_domain_names[] = {
"render",
"blitter",
"media",
+ "vdbox0",
+ "vdbox1",
+ "vdbox2",
+ "vdbox3",
+ "vebox0",
+ "vebox1",
};
const char *
@@ -773,6 +779,8 @@ void assert_forcewakes_active(struct drm_i915_private *dev_priv,
/* We give fast paths for the really cool registers */
#define NEEDS_FORCE_WAKE(reg) ((reg) < 0x40000)
+#define GEN11_NEEDS_FORCE_WAKE(reg) \
+ ((reg) < 0x40000 || ((reg) >= 0x1c0000 && (reg) < 0x1dc000))
#define __gen6_reg_read_fw_domains(offset) \
({ \
@@ -826,6 +834,14 @@ find_fw_domain(struct drm_i915_private *dev_priv, u32 offset)
if (!entry)
return 0;
+ /*
+ * The list of FW domains depends on the SKU in gen11+ so we
+ * can't determine it statically. We use FORCEWAKE_ALL and
+ * translate it here to the list of available domains.
+ */
+ if (entry->domains == FORCEWAKE_ALL)
+ return dev_priv->uncore.fw_domains;
+
WARN(entry->domains & ~dev_priv->uncore.fw_domains,
"Uninitialized forcewake domain(s) 0x%x accessed at 0x%x\n",
entry->domains & ~dev_priv->uncore.fw_domains, offset);
@@ -860,6 +876,14 @@ static const struct intel_forcewake_range __vlv_fw_ranges[] = {
__fwd; \
})
+#define __gen11_fwtable_reg_read_fw_domains(offset) \
+({ \
+ enum forcewake_domains __fwd = 0; \
+ if (GEN11_NEEDS_FORCE_WAKE((offset))) \
+ __fwd = find_fw_domain(dev_priv, offset); \
+ __fwd; \
+})
+
/* *Must* be sorted by offset! See intel_shadow_table_check(). */
static const i915_reg_t gen8_shadowed_regs[] = {
RING_TAIL(RENDER_RING_BASE), /* 0x2000 (base) */
@@ -871,6 +895,20 @@ static const i915_reg_t gen8_shadowed_regs[] = {
/* TODO: Other registers are not yet used */
};
+static const i915_reg_t gen11_shadowed_regs[] = {
+ RING_TAIL(RENDER_RING_BASE), /* 0x2000 (base) */
+ GEN6_RPNSWREQ, /* 0xA008 */
+ GEN6_RC_VIDEO_FREQ, /* 0xA00C */
+ RING_TAIL(BLT_RING_BASE), /* 0x22000 (base) */
+ RING_TAIL(GEN11_BSD_RING_BASE), /* 0x1C0000 (base) */
+ RING_TAIL(GEN11_BSD2_RING_BASE), /* 0x1C4000 (base) */
+ RING_TAIL(GEN11_VEBOX_RING_BASE), /* 0x1C8000 (base) */
+ RING_TAIL(GEN11_BSD3_RING_BASE), /* 0x1D0000 (base) */
+ RING_TAIL(GEN11_BSD4_RING_BASE), /* 0x1D4000 (base) */
+ RING_TAIL(GEN11_VEBOX2_RING_BASE), /* 0x1D8000 (base) */
+ /* TODO: Other registers are not yet used */
+};
+
static int mmio_reg_cmp(u32 key, const i915_reg_t *reg)
{
u32 offset = i915_mmio_reg_offset(*reg);
@@ -891,6 +929,14 @@ static bool is_gen8_shadowed(u32 offset)
mmio_reg_cmp);
}
+static bool is_gen11_shadowed(u32 offset)
+{
+ const i915_reg_t *regs = gen11_shadowed_regs;
+
+ return BSEARCH(offset, regs, ARRAY_SIZE(gen11_shadowed_regs),
+ mmio_reg_cmp);
+}
+
#define __gen8_reg_write_fw_domains(offset) \
({ \
enum forcewake_domains __fwd; \
@@ -929,6 +975,14 @@ static const struct intel_forcewake_range __chv_fw_ranges[] = {
__fwd; \
})
+#define __gen11_fwtable_reg_write_fw_domains(offset) \
+({ \
+ enum forcewake_domains __fwd = 0; \
+ if (GEN11_NEEDS_FORCE_WAKE((offset)) && !is_gen11_shadowed(offset)) \
+ __fwd = find_fw_domain(dev_priv, offset); \
+ __fwd; \
+})
+
/* *Must* be sorted by offset ranges! See intel_fw_table_check(). */
static const struct intel_forcewake_range __gen9_fw_ranges[] = {
GEN_FW_RANGE(0x0, 0xaff, FORCEWAKE_BLITTER),
@@ -965,6 +1019,40 @@ static const struct intel_forcewake_range __gen9_fw_ranges[] = {
GEN_FW_RANGE(0x30000, 0x3ffff, FORCEWAKE_MEDIA),
};
+/* *Must* be sorted by offset ranges! See intel_fw_table_check(). */
+static const struct intel_forcewake_range __gen11_fw_ranges[] = {
+ GEN_FW_RANGE(0x0, 0xaff, FORCEWAKE_BLITTER),
+ GEN_FW_RANGE(0xb00, 0x1fff, 0), /* uncore range */
+ GEN_FW_RANGE(0x2000, 0x26ff, FORCEWAKE_RENDER),
+ GEN_FW_RANGE(0x2700, 0x2fff, FORCEWAKE_BLITTER),
+ GEN_FW_RANGE(0x3000, 0x3fff, FORCEWAKE_RENDER),
+ GEN_FW_RANGE(0x4000, 0x51ff, FORCEWAKE_BLITTER),
+ GEN_FW_RANGE(0x5200, 0x7fff, FORCEWAKE_RENDER),
+ GEN_FW_RANGE(0x8000, 0x813f, FORCEWAKE_BLITTER),
+ GEN_FW_RANGE(0x8140, 0x815f, FORCEWAKE_RENDER),
+ GEN_FW_RANGE(0x8160, 0x82ff, FORCEWAKE_BLITTER),
+ GEN_FW_RANGE(0x8300, 0x84ff, FORCEWAKE_RENDER),
+ GEN_FW_RANGE(0x8500, 0x8bff, FORCEWAKE_BLITTER),
+ GEN_FW_RANGE(0x8c00, 0x8cff, FORCEWAKE_RENDER),
+ GEN_FW_RANGE(0x8d00, 0x93ff, FORCEWAKE_BLITTER),
+ GEN_FW_RANGE(0x9400, 0x97ff, FORCEWAKE_ALL),
+ GEN_FW_RANGE(0x9800, 0xafff, FORCEWAKE_BLITTER),
+ GEN_FW_RANGE(0xb000, 0xb47f, FORCEWAKE_RENDER),
+ GEN_FW_RANGE(0xb480, 0xdfff, FORCEWAKE_BLITTER),
+ GEN_FW_RANGE(0xe000, 0xe8ff, FORCEWAKE_RENDER),
+ GEN_FW_RANGE(0xe900, 0x243ff, FORCEWAKE_BLITTER),
+ GEN_FW_RANGE(0x24400, 0x247ff, FORCEWAKE_RENDER),
+ GEN_FW_RANGE(0x24800, 0x3ffff, FORCEWAKE_BLITTER),
+ GEN_FW_RANGE(0x40000, 0x1bffff, 0),
+ GEN_FW_RANGE(0x1c0000, 0x1c3fff, FORCEWAKE_MEDIA_VDBOX0),
+ GEN_FW_RANGE(0x1c4000, 0x1c7fff, FORCEWAKE_MEDIA_VDBOX1),
+ GEN_FW_RANGE(0x1c8000, 0x1cbfff, FORCEWAKE_MEDIA_VEBOX0),
+ GEN_FW_RANGE(0x1cc000, 0x1cffff, FORCEWAKE_BLITTER),
+ GEN_FW_RANGE(0x1d0000, 0x1d3fff, FORCEWAKE_MEDIA_VDBOX2),
+ GEN_FW_RANGE(0x1d4000, 0x1d7fff, FORCEWAKE_MEDIA_VDBOX3),
+ GEN_FW_RANGE(0x1d8000, 0x1dbfff, FORCEWAKE_MEDIA_VEBOX1)
+};
+
static void
ilk_dummy_write(struct drm_i915_private *dev_priv)
{
@@ -1095,7 +1183,12 @@ func##_read##x(struct drm_i915_private *dev_priv, i915_reg_t reg, bool trace) {
}
#define __gen6_read(x) __gen_read(gen6, x)
#define __fwtable_read(x) __gen_read(fwtable, x)
+#define __gen11_fwtable_read(x) __gen_read(gen11_fwtable, x)
+__gen11_fwtable_read(8)
+__gen11_fwtable_read(16)
+__gen11_fwtable_read(32)
+__gen11_fwtable_read(64)
__fwtable_read(8)
__fwtable_read(16)
__fwtable_read(32)
@@ -1105,6 +1198,7 @@ __gen6_read(16)
__gen6_read(32)
__gen6_read(64)
+#undef __gen11_fwtable_read
#undef __fwtable_read
#undef __gen6_read
#undef GEN6_READ_FOOTER
@@ -1181,7 +1275,11 @@ func##_write##x(struct drm_i915_private *dev_priv, i915_reg_t reg, u##x val, boo
}
#define __gen8_write(x) __gen_write(gen8, x)
#define __fwtable_write(x) __gen_write(fwtable, x)
+#define __gen11_fwtable_write(x) __gen_write(gen11_fwtable, x)
+__gen11_fwtable_write(8)
+__gen11_fwtable_write(16)
+__gen11_fwtable_write(32)
__fwtable_write(8)
__fwtable_write(16)
__fwtable_write(32)
@@ -1192,6 +1290,7 @@ __gen6_write(8)
__gen6_write(16)
__gen6_write(32)
+#undef __gen11_fwtable_write
#undef __fwtable_write
#undef __gen8_write
#undef __gen6_write
@@ -1240,6 +1339,13 @@ static void fw_domain_init(struct drm_i915_private *dev_priv,
BUILD_BUG_ON(FORCEWAKE_RENDER != (1 << FW_DOMAIN_ID_RENDER));
BUILD_BUG_ON(FORCEWAKE_BLITTER != (1 << FW_DOMAIN_ID_BLITTER));
BUILD_BUG_ON(FORCEWAKE_MEDIA != (1 << FW_DOMAIN_ID_MEDIA));
+ BUILD_BUG_ON(FORCEWAKE_MEDIA_VDBOX0 != (1 << FW_DOMAIN_ID_MEDIA_VDBOX0));
+ BUILD_BUG_ON(FORCEWAKE_MEDIA_VDBOX1 != (1 << FW_DOMAIN_ID_MEDIA_VDBOX1));
+ BUILD_BUG_ON(FORCEWAKE_MEDIA_VDBOX2 != (1 << FW_DOMAIN_ID_MEDIA_VDBOX2));
+ BUILD_BUG_ON(FORCEWAKE_MEDIA_VDBOX3 != (1 << FW_DOMAIN_ID_MEDIA_VDBOX3));
+ BUILD_BUG_ON(FORCEWAKE_MEDIA_VEBOX0 != (1 << FW_DOMAIN_ID_MEDIA_VEBOX0));
+ BUILD_BUG_ON(FORCEWAKE_MEDIA_VEBOX1 != (1 << FW_DOMAIN_ID_MEDIA_VEBOX1));
+
d->mask = BIT(domain_id);
@@ -1267,7 +1373,33 @@ static void intel_uncore_fw_domains_init(struct drm_i915_private *dev_priv)
dev_priv->uncore.fw_clear = _MASKED_BIT_DISABLE(FORCEWAKE_KERNEL);
}
- if (INTEL_GEN(dev_priv) >= 9) {
+ if (INTEL_GEN(dev_priv) >= 11) {
+ int i;
+ dev_priv->uncore.funcs.force_wake_get = fw_domains_get;
+ dev_priv->uncore.funcs.force_wake_put = fw_domains_put;
+ fw_domain_init(dev_priv, FW_DOMAIN_ID_RENDER,
+ FORCEWAKE_RENDER_GEN9,
+ FORCEWAKE_ACK_RENDER_GEN9);
+ fw_domain_init(dev_priv, FW_DOMAIN_ID_BLITTER,
+ FORCEWAKE_BLITTER_GEN9,
+ FORCEWAKE_ACK_BLITTER_GEN9);
+ for (i = 0; i < I915_MAX_VCS; i++) {
+ if (!HAS_ENGINE(dev_priv, _VCS(i)))
+ continue;
+
+ fw_domain_init(dev_priv, FW_DOMAIN_ID_MEDIA_VDBOX0 + i,
+ FORCEWAKE_MEDIA_VDBOX_GEN11(i),
+ FORCEWAKE_ACK_MEDIA_VDBOX_GEN11(i));
+ }
+ for (i = 0; i < I915_MAX_VECS; i++) {
+ if (!HAS_ENGINE(dev_priv, _VECS(i)))
+ continue;
+
+ fw_domain_init(dev_priv, FW_DOMAIN_ID_MEDIA_VEBOX0 + i,
+ FORCEWAKE_MEDIA_VEBOX_GEN11(i),
+ FORCEWAKE_ACK_MEDIA_VEBOX_GEN11(i));
+ }
+ } else if (IS_GEN9(dev_priv) || IS_GEN10(dev_priv)) {
dev_priv->uncore.funcs.force_wake_get =
fw_domains_get_with_fallback;
dev_priv->uncore.funcs.force_wake_put = fw_domains_put;
@@ -1422,10 +1554,14 @@ void intel_uncore_init(struct drm_i915_private *dev_priv)
ASSIGN_WRITE_MMIO_VFUNCS(dev_priv, gen8);
ASSIGN_READ_MMIO_VFUNCS(dev_priv, gen6);
}
- } else {
+ } else if (IS_GEN(dev_priv, 9, 10)) {
ASSIGN_FW_DOMAINS_TABLE(__gen9_fw_ranges);
ASSIGN_WRITE_MMIO_VFUNCS(dev_priv, fwtable);
ASSIGN_READ_MMIO_VFUNCS(dev_priv, fwtable);
+ } else {
+ ASSIGN_FW_DOMAINS_TABLE(__gen11_fw_ranges);
+ ASSIGN_WRITE_MMIO_VFUNCS(dev_priv, gen11_fwtable);
+ ASSIGN_READ_MMIO_VFUNCS(dev_priv, gen11_fwtable);
}
iosf_mbi_register_pmic_bus_access_notifier(
@@ -1985,7 +2121,9 @@ intel_uncore_forcewake_for_read(struct drm_i915_private *dev_priv,
u32 offset = i915_mmio_reg_offset(reg);
enum forcewake_domains fw_domains;
- if (HAS_FWTABLE(dev_priv)) {
+ if (INTEL_GEN(dev_priv) >= 11) {
+ fw_domains = __gen11_fwtable_reg_read_fw_domains(offset);
+ } else if (HAS_FWTABLE(dev_priv)) {
fw_domains = __fwtable_reg_read_fw_domains(offset);
} else if (INTEL_GEN(dev_priv) >= 6) {
fw_domains = __gen6_reg_read_fw_domains(offset);
@@ -2006,7 +2144,9 @@ intel_uncore_forcewake_for_write(struct drm_i915_private *dev_priv,
u32 offset = i915_mmio_reg_offset(reg);
enum forcewake_domains fw_domains;
- if (HAS_FWTABLE(dev_priv) && !IS_VALLEYVIEW(dev_priv)) {
+ if (INTEL_GEN(dev_priv) >= 11) {
+ fw_domains = __gen11_fwtable_reg_write_fw_domains(offset);
+ } else if (HAS_FWTABLE(dev_priv) && !IS_VALLEYVIEW(dev_priv)) {
fw_domains = __fwtable_reg_write_fw_domains(offset);
} else if (IS_GEN8(dev_priv)) {
fw_domains = __gen8_reg_write_fw_domains(offset);
diff --git a/drivers/gpu/drm/i915/intel_uncore.h b/drivers/gpu/drm/i915/intel_uncore.h
index bed019ef000f..9e8330c5808e 100644
--- a/drivers/gpu/drm/i915/intel_uncore.h
+++ b/drivers/gpu/drm/i915/intel_uncore.h
@@ -37,17 +37,36 @@ enum forcewake_domain_id {
FW_DOMAIN_ID_RENDER = 0,
FW_DOMAIN_ID_BLITTER,
FW_DOMAIN_ID_MEDIA,
+ FW_DOMAIN_ID_MEDIA_VDBOX0,
+ FW_DOMAIN_ID_MEDIA_VDBOX1,
+ FW_DOMAIN_ID_MEDIA_VDBOX2,
+ FW_DOMAIN_ID_MEDIA_VDBOX3,
+ FW_DOMAIN_ID_MEDIA_VEBOX0,
+ FW_DOMAIN_ID_MEDIA_VEBOX1,
FW_DOMAIN_ID_COUNT
};
enum forcewake_domains {
- FORCEWAKE_RENDER = BIT(FW_DOMAIN_ID_RENDER),
- FORCEWAKE_BLITTER = BIT(FW_DOMAIN_ID_BLITTER),
- FORCEWAKE_MEDIA = BIT(FW_DOMAIN_ID_MEDIA),
+ FORCEWAKE_RENDER = BIT(FW_DOMAIN_ID_RENDER),
+ FORCEWAKE_BLITTER = BIT(FW_DOMAIN_ID_BLITTER),
+ FORCEWAKE_MEDIA = BIT(FW_DOMAIN_ID_MEDIA),
+ FORCEWAKE_MEDIA_VDBOX0 = BIT(FW_DOMAIN_ID_MEDIA_VDBOX0),
+ FORCEWAKE_MEDIA_VDBOX1 = BIT(FW_DOMAIN_ID_MEDIA_VDBOX1),
+ FORCEWAKE_MEDIA_VDBOX2 = BIT(FW_DOMAIN_ID_MEDIA_VDBOX2),
+ FORCEWAKE_MEDIA_VDBOX3 = BIT(FW_DOMAIN_ID_MEDIA_VDBOX3),
+ FORCEWAKE_MEDIA_VEBOX0 = BIT(FW_DOMAIN_ID_MEDIA_VEBOX0),
+ FORCEWAKE_MEDIA_VEBOX1 = BIT(FW_DOMAIN_ID_MEDIA_VEBOX1),
+
FORCEWAKE_ALL = (FORCEWAKE_RENDER |
FORCEWAKE_BLITTER |
- FORCEWAKE_MEDIA)
+ FORCEWAKE_MEDIA |
+ FORCEWAKE_MEDIA_VDBOX0 |
+ FORCEWAKE_MEDIA_VDBOX1 |
+ FORCEWAKE_MEDIA_VDBOX2 |
+ FORCEWAKE_MEDIA_VDBOX3 |
+ FORCEWAKE_MEDIA_VEBOX0 |
+ FORCEWAKE_MEDIA_VEBOX1)
};
struct intel_uncore_funcs {
diff --git a/drivers/gpu/drm/i915/selftests/intel_uncore.c b/drivers/gpu/drm/i915/selftests/intel_uncore.c
index 2f6367643171..df1b5076fb8b 100644
--- a/drivers/gpu/drm/i915/selftests/intel_uncore.c
+++ b/drivers/gpu/drm/i915/selftests/intel_uncore.c
@@ -61,20 +61,30 @@ static int intel_fw_table_check(const struct intel_forcewake_range *ranges,
static int intel_shadow_table_check(void)
{
- const i915_reg_t *reg = gen8_shadowed_regs;
- unsigned int i;
+ struct {
+ const i915_reg_t *regs;
+ unsigned int size;
+ } reg_lists[] = {
+ { gen8_shadowed_regs, ARRAY_SIZE(gen8_shadowed_regs) },
+ { gen11_shadowed_regs, ARRAY_SIZE(gen11_shadowed_regs) },
+ };
+ const i915_reg_t *reg;
+ unsigned int i, j;
s32 prev;
- for (i = 0, prev = -1; i < ARRAY_SIZE(gen8_shadowed_regs); i++, reg++) {
- u32 offset = i915_mmio_reg_offset(*reg);
+ for (j = 0; j < ARRAY_SIZE(reg_lists); ++j) {
+ reg = reg_lists[j].regs;
+ for (i = 0, prev = -1; i < reg_lists[j].size; i++, reg++) {
+ u32 offset = i915_mmio_reg_offset(*reg);
- if (prev >= (s32)offset) {
- pr_err("%s: entry[%d]:(%x) is before previous (%x)\n",
- __func__, i, offset, prev);
- return -EINVAL;
- }
+ if (prev >= (s32)offset) {
+ pr_err("%s: entry[%d]:(%x) is before previous (%x)\n",
+ __func__, i, offset, prev);
+ return -EINVAL;
+ }
- prev = offset;
+ prev = offset;
+ }
}
return 0;
--
2.14.3
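
For readers following the thread, the two properties this patch relies on for the shadowed-register tables (strictly ascending offsets, and a binary search over them) can be sketched in isolation. This is a minimal standalone sketch, not the driver code: the table holds plain integer offsets with illustrative values, and the names table_is_sorted, is_shadowed, shadowed_example and unsorted_example are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define COUNT(a) (sizeof(a) / sizeof((a)[0]))

/* Illustrative offsets only; the driver builds its table from i915_reg_t. */
static const uint32_t shadowed_example[] = {
	0x2030, 0xa008, 0xa00c, 0x22030, 0x1c0030, 0x1c4030,
};

/* Deliberately out of order, to show what the selftest would reject. */
static const uint32_t unsorted_example[] = { 0xa00c, 0xa008 };

/* True when every entry is strictly greater than the one before it. */
static bool table_is_sorted(const uint32_t *regs, size_t n)
{
	for (size_t i = 1; i < n; i++) {
		if (regs[i - 1] >= regs[i])
			return false;
	}
	return true;
}

/* Binary search; only meaningful on a table that passes table_is_sorted(). */
static bool is_shadowed(const uint32_t *regs, size_t n, uint32_t offset)
{
	size_t lo = 0, hi = n;

	while (lo < hi) {
		size_t mid = lo + (hi - lo) / 2;

		if (regs[mid] == offset)
			return true;
		if (regs[mid] < offset)
			lo = mid + 1;
		else
			hi = mid;
	}
	return false;
}
```

The sort check matters because a binary search on an unsorted table can silently miss entries, which is presumably why the selftest walks every shadow table rather than only the gen8 one.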
* [PATCH v10] drm/i915/icl: Gen11 forcewake support
2018-01-09 23:28 ` [PATCH 13/27] drm/i915/icl: Gen11 forcewake support Paulo Zanoni
@ 2018-02-01 0:52 ` Michel Thierry
2018-02-01 10:25 ` Tvrtko Ursulin
` (3 more replies)
0 siblings, 4 replies; 118+ messages in thread
From: Michel Thierry @ 2018-02-01 0:52 UTC (permalink / raw)
To: intel-gfx; +Cc: Paulo Zanoni
From: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
The main difference from previous GENs is that, starting with Gen11,
each VCS and VECS engine has its own power well, which only exists
if the related engine exists in the HW.
The fallback forcewake request workaround is only needed on gen9
according to the HSDES WA entry (1604254524), so we can go back to using
the simpler fw_domains_get/put functions.
BSpec: 18331
v2: fix fwtable, use array to test shadow tables, create new
accessors to avoid check on every access (Tvrtko)
v3 (from Paulo): Rebase.
v4:
- Range 09400-097FF should be FORCEWAKE_ALL (Daniele)
- Use the BIT macro for forcewake domains (Daniele)
- Add a comment about the range ordering (Oscar)
- Updated commit message (Oscar)
v5: Rebased
v6: Use I915_MAX_VCS/VECS (Michal)
v7: translate FORCEWAKE_ALL to available domains
v8: rebase, add clarification on fallback ack in commit message.
v9: fix rebase issue, change check in fw_domains_init from IS_GEN11
to GEN >= 11
v10: Generate is_genX_shadowed with a macro (Daniele)
Include gen11_fw_ranges in the selftest (Michel)
Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Paulo Zanoni <paulo.r.zanoni@intel.com>
Acked-by: Michel Thierry <michel.thierry@intel.com>
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Signed-off-by: Oscar Mateo <oscar.mateo@intel.com>
Signed-off-by: Michel Thierry <michel.thierry@intel.com>
---
drivers/gpu/drm/i915/i915_reg.h | 4 +
drivers/gpu/drm/i915/intel_uncore.c | 155 ++++++++++++++++++++++++--
drivers/gpu/drm/i915/intel_uncore.h | 27 ++++-
drivers/gpu/drm/i915/selftests/intel_uncore.c | 31 ++++--
4 files changed, 193 insertions(+), 24 deletions(-)
diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
index d29e8a0e2ca3..eaca12292ffe 100644
--- a/drivers/gpu/drm/i915/i915_reg.h
+++ b/drivers/gpu/drm/i915/i915_reg.h
@@ -8015,9 +8015,13 @@ enum {
#define VLV_GTLC_PW_RENDER_STATUS_MASK (1 << 7)
#define FORCEWAKE_MT _MMIO(0xa188) /* multi-threaded */
#define FORCEWAKE_MEDIA_GEN9 _MMIO(0xa270)
+#define FORCEWAKE_MEDIA_VDBOX_GEN11(n) _MMIO(0xa540 + (n) * 4)
+#define FORCEWAKE_MEDIA_VEBOX_GEN11(n) _MMIO(0xa560 + (n) * 4)
#define FORCEWAKE_RENDER_GEN9 _MMIO(0xa278)
#define FORCEWAKE_BLITTER_GEN9 _MMIO(0xa188)
#define FORCEWAKE_ACK_MEDIA_GEN9 _MMIO(0x0D88)
+#define FORCEWAKE_ACK_MEDIA_VDBOX_GEN11(n) _MMIO(0x0D50 + (n) * 4)
+#define FORCEWAKE_ACK_MEDIA_VEBOX_GEN11(n) _MMIO(0x0D70 + (n) * 4)
#define FORCEWAKE_ACK_RENDER_GEN9 _MMIO(0x0D84)
#define FORCEWAKE_ACK_BLITTER_GEN9 _MMIO(0x130044)
#define FORCEWAKE_KERNEL BIT(0)
diff --git a/drivers/gpu/drm/i915/intel_uncore.c b/drivers/gpu/drm/i915/intel_uncore.c
index 164dbb8cfa36..c1953043604b 100644
--- a/drivers/gpu/drm/i915/intel_uncore.c
+++ b/drivers/gpu/drm/i915/intel_uncore.c
@@ -37,6 +37,12 @@ static const char * const forcewake_domain_names[] = {
"render",
"blitter",
"media",
+ "vdbox0",
+ "vdbox1",
+ "vdbox2",
+ "vdbox3",
+ "vebox0",
+ "vebox1",
};
const char *
@@ -773,6 +779,8 @@ void assert_forcewakes_active(struct drm_i915_private *dev_priv,
/* We give fast paths for the really cool registers */
#define NEEDS_FORCE_WAKE(reg) ((reg) < 0x40000)
+#define GEN11_NEEDS_FORCE_WAKE(reg) \
+ ((reg) < 0x40000 || ((reg) >= 0x1c0000 && (reg) < 0x1dc000))
#define __gen6_reg_read_fw_domains(offset) \
({ \
@@ -826,6 +834,14 @@ find_fw_domain(struct drm_i915_private *dev_priv, u32 offset)
if (!entry)
return 0;
+ /*
+ * The list of FW domains depends on the SKU in gen11+ so we
+ * can't determine it statically. We use FORCEWAKE_ALL and
+ * translate it here to the list of available domains.
+ */
+ if (entry->domains == FORCEWAKE_ALL)
+ return dev_priv->uncore.fw_domains;
+
WARN(entry->domains & ~dev_priv->uncore.fw_domains,
"Uninitialized forcewake domain(s) 0x%x accessed at 0x%x\n",
entry->domains & ~dev_priv->uncore.fw_domains, offset);
@@ -860,6 +876,14 @@ static const struct intel_forcewake_range __vlv_fw_ranges[] = {
__fwd; \
})
+#define __gen11_fwtable_reg_read_fw_domains(offset) \
+({ \
+ enum forcewake_domains __fwd = 0; \
+ if (GEN11_NEEDS_FORCE_WAKE((offset))) \
+ __fwd = find_fw_domain(dev_priv, offset); \
+ __fwd; \
+})
+
/* *Must* be sorted by offset! See intel_shadow_table_check(). */
static const i915_reg_t gen8_shadowed_regs[] = {
RING_TAIL(RENDER_RING_BASE), /* 0x2000 (base) */
@@ -871,6 +895,20 @@ static const i915_reg_t gen8_shadowed_regs[] = {
/* TODO: Other registers are not yet used */
};
+static const i915_reg_t gen11_shadowed_regs[] = {
+ RING_TAIL(RENDER_RING_BASE), /* 0x2000 (base) */
+ GEN6_RPNSWREQ, /* 0xA008 */
+ GEN6_RC_VIDEO_FREQ, /* 0xA00C */
+ RING_TAIL(BLT_RING_BASE), /* 0x22000 (base) */
+ RING_TAIL(GEN11_BSD_RING_BASE), /* 0x1C0000 (base) */
+ RING_TAIL(GEN11_BSD2_RING_BASE), /* 0x1C4000 (base) */
+ RING_TAIL(GEN11_VEBOX_RING_BASE), /* 0x1C8000 (base) */
+ RING_TAIL(GEN11_BSD3_RING_BASE), /* 0x1D0000 (base) */
+ RING_TAIL(GEN11_BSD4_RING_BASE), /* 0x1D4000 (base) */
+ RING_TAIL(GEN11_VEBOX2_RING_BASE), /* 0x1D8000 (base) */
+ /* TODO: Other registers are not yet used */
+};
+
static int mmio_reg_cmp(u32 key, const i915_reg_t *reg)
{
u32 offset = i915_mmio_reg_offset(*reg);
@@ -883,14 +921,17 @@ static int mmio_reg_cmp(u32 key, const i915_reg_t *reg)
return 0;
}
-static bool is_gen8_shadowed(u32 offset)
-{
- const i915_reg_t *regs = gen8_shadowed_regs;
-
- return BSEARCH(offset, regs, ARRAY_SIZE(gen8_shadowed_regs),
- mmio_reg_cmp);
+#define __is_genX_shadowed(x) \
+static bool is_gen##x##_shadowed(u32 offset) \
+{ \
+ const i915_reg_t *regs = gen##x##_shadowed_regs; \
+ return BSEARCH(offset, regs, ARRAY_SIZE(gen##x##_shadowed_regs), \
+ mmio_reg_cmp); \
}
+__is_genX_shadowed(8)
+__is_genX_shadowed(11)
+
#define __gen8_reg_write_fw_domains(offset) \
({ \
enum forcewake_domains __fwd; \
@@ -929,6 +970,14 @@ static const struct intel_forcewake_range __chv_fw_ranges[] = {
__fwd; \
})
+#define __gen11_fwtable_reg_write_fw_domains(offset) \
+({ \
+ enum forcewake_domains __fwd = 0; \
+ if (GEN11_NEEDS_FORCE_WAKE((offset)) && !is_gen11_shadowed(offset)) \
+ __fwd = find_fw_domain(dev_priv, offset); \
+ __fwd; \
+})
+
/* *Must* be sorted by offset ranges! See intel_fw_table_check(). */
static const struct intel_forcewake_range __gen9_fw_ranges[] = {
GEN_FW_RANGE(0x0, 0xaff, FORCEWAKE_BLITTER),
@@ -965,6 +1014,40 @@ static const struct intel_forcewake_range __gen9_fw_ranges[] = {
GEN_FW_RANGE(0x30000, 0x3ffff, FORCEWAKE_MEDIA),
};
+/* *Must* be sorted by offset ranges! See intel_fw_table_check(). */
+static const struct intel_forcewake_range __gen11_fw_ranges[] = {
+ GEN_FW_RANGE(0x0, 0xaff, FORCEWAKE_BLITTER),
+ GEN_FW_RANGE(0xb00, 0x1fff, 0), /* uncore range */
+ GEN_FW_RANGE(0x2000, 0x26ff, FORCEWAKE_RENDER),
+ GEN_FW_RANGE(0x2700, 0x2fff, FORCEWAKE_BLITTER),
+ GEN_FW_RANGE(0x3000, 0x3fff, FORCEWAKE_RENDER),
+ GEN_FW_RANGE(0x4000, 0x51ff, FORCEWAKE_BLITTER),
+ GEN_FW_RANGE(0x5200, 0x7fff, FORCEWAKE_RENDER),
+ GEN_FW_RANGE(0x8000, 0x813f, FORCEWAKE_BLITTER),
+ GEN_FW_RANGE(0x8140, 0x815f, FORCEWAKE_RENDER),
+ GEN_FW_RANGE(0x8160, 0x82ff, FORCEWAKE_BLITTER),
+ GEN_FW_RANGE(0x8300, 0x84ff, FORCEWAKE_RENDER),
+ GEN_FW_RANGE(0x8500, 0x8bff, FORCEWAKE_BLITTER),
+ GEN_FW_RANGE(0x8c00, 0x8cff, FORCEWAKE_RENDER),
+ GEN_FW_RANGE(0x8d00, 0x93ff, FORCEWAKE_BLITTER),
+ GEN_FW_RANGE(0x9400, 0x97ff, FORCEWAKE_ALL),
+ GEN_FW_RANGE(0x9800, 0xafff, FORCEWAKE_BLITTER),
+ GEN_FW_RANGE(0xb000, 0xb47f, FORCEWAKE_RENDER),
+ GEN_FW_RANGE(0xb480, 0xdfff, FORCEWAKE_BLITTER),
+ GEN_FW_RANGE(0xe000, 0xe8ff, FORCEWAKE_RENDER),
+ GEN_FW_RANGE(0xe900, 0x243ff, FORCEWAKE_BLITTER),
+ GEN_FW_RANGE(0x24400, 0x247ff, FORCEWAKE_RENDER),
+ GEN_FW_RANGE(0x24800, 0x3ffff, FORCEWAKE_BLITTER),
+ GEN_FW_RANGE(0x40000, 0x1bffff, 0),
+ GEN_FW_RANGE(0x1c0000, 0x1c3fff, FORCEWAKE_MEDIA_VDBOX0),
+ GEN_FW_RANGE(0x1c4000, 0x1c7fff, FORCEWAKE_MEDIA_VDBOX1),
+ GEN_FW_RANGE(0x1c8000, 0x1cbfff, FORCEWAKE_MEDIA_VEBOX0),
+ GEN_FW_RANGE(0x1cc000, 0x1cffff, FORCEWAKE_BLITTER),
+ GEN_FW_RANGE(0x1d0000, 0x1d3fff, FORCEWAKE_MEDIA_VDBOX2),
+ GEN_FW_RANGE(0x1d4000, 0x1d7fff, FORCEWAKE_MEDIA_VDBOX3),
+ GEN_FW_RANGE(0x1d8000, 0x1dbfff, FORCEWAKE_MEDIA_VEBOX1)
+};
+
static void
ilk_dummy_write(struct drm_i915_private *dev_priv)
{
@@ -1095,7 +1178,12 @@ func##_read##x(struct drm_i915_private *dev_priv, i915_reg_t reg, bool trace) {
}
#define __gen6_read(x) __gen_read(gen6, x)
#define __fwtable_read(x) __gen_read(fwtable, x)
+#define __gen11_fwtable_read(x) __gen_read(gen11_fwtable, x)
+__gen11_fwtable_read(8)
+__gen11_fwtable_read(16)
+__gen11_fwtable_read(32)
+__gen11_fwtable_read(64)
__fwtable_read(8)
__fwtable_read(16)
__fwtable_read(32)
@@ -1105,6 +1193,7 @@ __gen6_read(16)
__gen6_read(32)
__gen6_read(64)
+#undef __gen11_fwtable_read
#undef __fwtable_read
#undef __gen6_read
#undef GEN6_READ_FOOTER
@@ -1181,7 +1270,11 @@ func##_write##x(struct drm_i915_private *dev_priv, i915_reg_t reg, u##x val, boo
}
#define __gen8_write(x) __gen_write(gen8, x)
#define __fwtable_write(x) __gen_write(fwtable, x)
+#define __gen11_fwtable_write(x) __gen_write(gen11_fwtable, x)
+__gen11_fwtable_write(8)
+__gen11_fwtable_write(16)
+__gen11_fwtable_write(32)
__fwtable_write(8)
__fwtable_write(16)
__fwtable_write(32)
@@ -1192,6 +1285,7 @@ __gen6_write(8)
__gen6_write(16)
__gen6_write(32)
+#undef __gen11_fwtable_write
#undef __fwtable_write
#undef __gen8_write
#undef __gen6_write
@@ -1240,6 +1334,13 @@ static void fw_domain_init(struct drm_i915_private *dev_priv,
BUILD_BUG_ON(FORCEWAKE_RENDER != (1 << FW_DOMAIN_ID_RENDER));
BUILD_BUG_ON(FORCEWAKE_BLITTER != (1 << FW_DOMAIN_ID_BLITTER));
BUILD_BUG_ON(FORCEWAKE_MEDIA != (1 << FW_DOMAIN_ID_MEDIA));
+ BUILD_BUG_ON(FORCEWAKE_MEDIA_VDBOX0 != (1 << FW_DOMAIN_ID_MEDIA_VDBOX0));
+ BUILD_BUG_ON(FORCEWAKE_MEDIA_VDBOX1 != (1 << FW_DOMAIN_ID_MEDIA_VDBOX1));
+ BUILD_BUG_ON(FORCEWAKE_MEDIA_VDBOX2 != (1 << FW_DOMAIN_ID_MEDIA_VDBOX2));
+ BUILD_BUG_ON(FORCEWAKE_MEDIA_VDBOX3 != (1 << FW_DOMAIN_ID_MEDIA_VDBOX3));
+ BUILD_BUG_ON(FORCEWAKE_MEDIA_VEBOX0 != (1 << FW_DOMAIN_ID_MEDIA_VEBOX0));
+ BUILD_BUG_ON(FORCEWAKE_MEDIA_VEBOX1 != (1 << FW_DOMAIN_ID_MEDIA_VEBOX1));
+
d->mask = BIT(domain_id);
@@ -1267,7 +1368,34 @@ static void intel_uncore_fw_domains_init(struct drm_i915_private *dev_priv)
dev_priv->uncore.fw_clear = _MASKED_BIT_DISABLE(FORCEWAKE_KERNEL);
}
- if (INTEL_GEN(dev_priv) >= 9) {
+ if (INTEL_GEN(dev_priv) >= 11) {
+ int i;
+
+ dev_priv->uncore.funcs.force_wake_get = fw_domains_get;
+ dev_priv->uncore.funcs.force_wake_put = fw_domains_put;
+ fw_domain_init(dev_priv, FW_DOMAIN_ID_RENDER,
+ FORCEWAKE_RENDER_GEN9,
+ FORCEWAKE_ACK_RENDER_GEN9);
+ fw_domain_init(dev_priv, FW_DOMAIN_ID_BLITTER,
+ FORCEWAKE_BLITTER_GEN9,
+ FORCEWAKE_ACK_BLITTER_GEN9);
+ for (i = 0; i < I915_MAX_VCS; i++) {
+ if (!HAS_ENGINE(dev_priv, _VCS(i)))
+ continue;
+
+ fw_domain_init(dev_priv, FW_DOMAIN_ID_MEDIA_VDBOX0 + i,
+ FORCEWAKE_MEDIA_VDBOX_GEN11(i),
+ FORCEWAKE_ACK_MEDIA_VDBOX_GEN11(i));
+ }
+ for (i = 0; i < I915_MAX_VECS; i++) {
+ if (!HAS_ENGINE(dev_priv, _VECS(i)))
+ continue;
+
+ fw_domain_init(dev_priv, FW_DOMAIN_ID_MEDIA_VEBOX0 + i,
+ FORCEWAKE_MEDIA_VEBOX_GEN11(i),
+ FORCEWAKE_ACK_MEDIA_VEBOX_GEN11(i));
+ }
+ } else if (IS_GEN9(dev_priv) || IS_GEN10(dev_priv)) {
dev_priv->uncore.funcs.force_wake_get =
fw_domains_get_with_fallback;
dev_priv->uncore.funcs.force_wake_put = fw_domains_put;
@@ -1422,10 +1549,14 @@ void intel_uncore_init(struct drm_i915_private *dev_priv)
ASSIGN_WRITE_MMIO_VFUNCS(dev_priv, gen8);
ASSIGN_READ_MMIO_VFUNCS(dev_priv, gen6);
}
- } else {
+ } else if (IS_GEN(dev_priv, 9, 10)) {
ASSIGN_FW_DOMAINS_TABLE(__gen9_fw_ranges);
ASSIGN_WRITE_MMIO_VFUNCS(dev_priv, fwtable);
ASSIGN_READ_MMIO_VFUNCS(dev_priv, fwtable);
+ } else {
+ ASSIGN_FW_DOMAINS_TABLE(__gen11_fw_ranges);
+ ASSIGN_WRITE_MMIO_VFUNCS(dev_priv, gen11_fwtable);
+ ASSIGN_READ_MMIO_VFUNCS(dev_priv, gen11_fwtable);
}
iosf_mbi_register_pmic_bus_access_notifier(
@@ -1985,7 +2116,9 @@ intel_uncore_forcewake_for_read(struct drm_i915_private *dev_priv,
u32 offset = i915_mmio_reg_offset(reg);
enum forcewake_domains fw_domains;
- if (HAS_FWTABLE(dev_priv)) {
+ if (INTEL_GEN(dev_priv) >= 11) {
+ fw_domains = __gen11_fwtable_reg_read_fw_domains(offset);
+ } else if (HAS_FWTABLE(dev_priv)) {
fw_domains = __fwtable_reg_read_fw_domains(offset);
} else if (INTEL_GEN(dev_priv) >= 6) {
fw_domains = __gen6_reg_read_fw_domains(offset);
@@ -2006,7 +2139,9 @@ intel_uncore_forcewake_for_write(struct drm_i915_private *dev_priv,
u32 offset = i915_mmio_reg_offset(reg);
enum forcewake_domains fw_domains;
- if (HAS_FWTABLE(dev_priv) && !IS_VALLEYVIEW(dev_priv)) {
+ if (INTEL_GEN(dev_priv) >= 11) {
+ fw_domains = __gen11_fwtable_reg_write_fw_domains(offset);
+ } else if (HAS_FWTABLE(dev_priv) && !IS_VALLEYVIEW(dev_priv)) {
fw_domains = __fwtable_reg_write_fw_domains(offset);
} else if (IS_GEN8(dev_priv)) {
fw_domains = __gen8_reg_write_fw_domains(offset);
diff --git a/drivers/gpu/drm/i915/intel_uncore.h b/drivers/gpu/drm/i915/intel_uncore.h
index bed019ef000f..9e8330c5808e 100644
--- a/drivers/gpu/drm/i915/intel_uncore.h
+++ b/drivers/gpu/drm/i915/intel_uncore.h
@@ -37,17 +37,36 @@ enum forcewake_domain_id {
FW_DOMAIN_ID_RENDER = 0,
FW_DOMAIN_ID_BLITTER,
FW_DOMAIN_ID_MEDIA,
+ FW_DOMAIN_ID_MEDIA_VDBOX0,
+ FW_DOMAIN_ID_MEDIA_VDBOX1,
+ FW_DOMAIN_ID_MEDIA_VDBOX2,
+ FW_DOMAIN_ID_MEDIA_VDBOX3,
+ FW_DOMAIN_ID_MEDIA_VEBOX0,
+ FW_DOMAIN_ID_MEDIA_VEBOX1,
FW_DOMAIN_ID_COUNT
};
enum forcewake_domains {
- FORCEWAKE_RENDER = BIT(FW_DOMAIN_ID_RENDER),
- FORCEWAKE_BLITTER = BIT(FW_DOMAIN_ID_BLITTER),
- FORCEWAKE_MEDIA = BIT(FW_DOMAIN_ID_MEDIA),
+ FORCEWAKE_RENDER = BIT(FW_DOMAIN_ID_RENDER),
+ FORCEWAKE_BLITTER = BIT(FW_DOMAIN_ID_BLITTER),
+ FORCEWAKE_MEDIA = BIT(FW_DOMAIN_ID_MEDIA),
+ FORCEWAKE_MEDIA_VDBOX0 = BIT(FW_DOMAIN_ID_MEDIA_VDBOX0),
+ FORCEWAKE_MEDIA_VDBOX1 = BIT(FW_DOMAIN_ID_MEDIA_VDBOX1),
+ FORCEWAKE_MEDIA_VDBOX2 = BIT(FW_DOMAIN_ID_MEDIA_VDBOX2),
+ FORCEWAKE_MEDIA_VDBOX3 = BIT(FW_DOMAIN_ID_MEDIA_VDBOX3),
+ FORCEWAKE_MEDIA_VEBOX0 = BIT(FW_DOMAIN_ID_MEDIA_VEBOX0),
+ FORCEWAKE_MEDIA_VEBOX1 = BIT(FW_DOMAIN_ID_MEDIA_VEBOX1),
+
FORCEWAKE_ALL = (FORCEWAKE_RENDER |
FORCEWAKE_BLITTER |
- FORCEWAKE_MEDIA)
+ FORCEWAKE_MEDIA |
+ FORCEWAKE_MEDIA_VDBOX0 |
+ FORCEWAKE_MEDIA_VDBOX1 |
+ FORCEWAKE_MEDIA_VDBOX2 |
+ FORCEWAKE_MEDIA_VDBOX3 |
+ FORCEWAKE_MEDIA_VEBOX0 |
+ FORCEWAKE_MEDIA_VEBOX1)
};
struct intel_uncore_funcs {
diff --git a/drivers/gpu/drm/i915/selftests/intel_uncore.c b/drivers/gpu/drm/i915/selftests/intel_uncore.c
index 2f6367643171..f76f2597df5c 100644
--- a/drivers/gpu/drm/i915/selftests/intel_uncore.c
+++ b/drivers/gpu/drm/i915/selftests/intel_uncore.c
@@ -61,20 +61,30 @@ static int intel_fw_table_check(const struct intel_forcewake_range *ranges,
static int intel_shadow_table_check(void)
{
- const i915_reg_t *reg = gen8_shadowed_regs;
- unsigned int i;
+ struct {
+ const i915_reg_t *regs;
+ unsigned int size;
+ } reg_lists[] = {
+ { gen8_shadowed_regs, ARRAY_SIZE(gen8_shadowed_regs) },
+ { gen11_shadowed_regs, ARRAY_SIZE(gen11_shadowed_regs) },
+ };
+ const i915_reg_t *reg;
+ unsigned int i, j;
s32 prev;
- for (i = 0, prev = -1; i < ARRAY_SIZE(gen8_shadowed_regs); i++, reg++) {
- u32 offset = i915_mmio_reg_offset(*reg);
+ for (j = 0; j < ARRAY_SIZE(reg_lists); ++j) {
+ reg = reg_lists[j].regs;
+ for (i = 0, prev = -1; i < reg_lists[j].size; i++, reg++) {
+ u32 offset = i915_mmio_reg_offset(*reg);
- if (prev >= (s32)offset) {
- pr_err("%s: entry[%d]:(%x) is before previous (%x)\n",
- __func__, i, offset, prev);
- return -EINVAL;
- }
+ if (prev >= (s32)offset) {
+ pr_err("%s: entry[%d]:(%x) is before previous (%x)\n",
+ __func__, i, offset, prev);
+ return -EINVAL;
+ }
- prev = offset;
+ prev = offset;
+ }
}
return 0;
@@ -90,6 +100,7 @@ int intel_uncore_mock_selftests(void)
{ __vlv_fw_ranges, ARRAY_SIZE(__vlv_fw_ranges), false },
{ __chv_fw_ranges, ARRAY_SIZE(__chv_fw_ranges), false },
{ __gen9_fw_ranges, ARRAY_SIZE(__gen9_fw_ranges), true },
+ { __gen11_fw_ranges, ARRAY_SIZE(__gen11_fw_ranges), true },
};
int err, i;
--
2.16.1
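
The FORCEWAKE_ALL translation described in the v7 changelog entry can be illustrated outside the driver. The following is a rough sketch under stated assumptions: the bit positions mirror the forcewake_domain_id ordering in the patch, the table is a small excerpt of the gen11 ranges, and the names fw_range, FW_* and find_fw_domains are hypothetical stand-ins. It also returns the table value unchanged for non-ALL entries and omits the WARN the driver issues for uninitialized domains.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Bit positions follow the domain ID order in the patch (0..8). */
enum {
	FW_RENDER  = 1u << 0,
	FW_BLITTER = 1u << 1,
	FW_VDBOX0  = 1u << 3,
	FW_VDBOX1  = 1u << 4,
	FW_ALL     = 0x1ff,	/* all nine domain bits set */
};

struct fw_range {
	uint32_t start;
	uint32_t end;		/* inclusive */
	unsigned int domains;
};

/* Excerpt of the gen11 table; must stay sorted by offset. */
static const struct fw_range ranges[] = {
	{ 0x2000,   0x26ff,   FW_RENDER },
	{ 0x9400,   0x97ff,   FW_ALL },
	{ 0x40000,  0x1bffff, 0 },
	{ 0x1c0000, 0x1c3fff, FW_VDBOX0 },
	{ 0x1c4000, 0x1c7fff, FW_VDBOX1 },
};

/*
 * Look up the domains needed for an offset. `available` is the set of
 * domains actually initialized on this SKU; FW_ALL is translated to it.
 */
static unsigned int find_fw_domains(uint32_t offset, unsigned int available)
{
	size_t lo = 0, hi = sizeof(ranges) / sizeof(ranges[0]);

	while (lo < hi) {
		size_t mid = lo + (hi - lo) / 2;

		if (offset < ranges[mid].start)
			hi = mid;
		else if (offset > ranges[mid].end)
			lo = mid + 1;
		else if (ranges[mid].domains == FW_ALL)
			return available;
		else
			return ranges[mid].domains;
	}
	return 0;	/* offset not covered by any range */
}
```

With a SKU exposing only render, blitter and vdbox0, a lookup in the 0x9400-0x97ff range yields exactly those three domains rather than the full static mask, which is the behaviour the comment in find_fw_domain() describes.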
* Re: [PATCH v10] drm/i915/icl: Gen11 forcewake support
2018-02-01 0:52 ` [PATCH v10] " Michel Thierry
@ 2018-02-01 10:25 ` Tvrtko Ursulin
2018-02-01 16:02 ` Michel Thierry
2018-02-01 16:08 ` [PATCH v11] " Michel Thierry
` (2 subsequent siblings)
3 siblings, 1 reply; 118+ messages in thread
From: Tvrtko Ursulin @ 2018-02-01 10:25 UTC (permalink / raw)
To: Michel Thierry, intel-gfx; +Cc: Paulo Zanoni
On 01/02/2018 00:52, Michel Thierry wrote:
> From: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
>
> The main difference from previous GENs is that, starting with Gen11,
> each VCS and VECS engine has its own power well, which only exists
> if the related engine exists in the HW.
> The fallback forcewake request workaround is only needed on gen9
> according to the HSDES WA entry (1604254524), so we can go back to using
> the simpler fw_domains_get/put functions.
>
> BSpec: 18331
>
> v2: fix fwtable, use array to test shadow tables, create new
> accessors to avoid check on every access (Tvrtko)
> v3 (from Paulo): Rebase.
> v4:
> - Range 09400-097FF should be FORCEWAKE_ALL (Daniele)
> - Use the BIT macro for forcewake domains (Daniele)
> - Add a comment about the range ordering (Oscar)
> - Updated commit message (Oscar)
> v5: Rebased
> v6: Use I915_MAX_VCS/VECS (Michal)
> v7: translate FORCEWAKE_ALL to available domains
> v8: rebase, add clarification on fallback ack in commit message.
> v9: fix rebase issue, change check in fw_domains_init from IS_GEN11
> to GEN >= 11
> v10: Generate is_genX_shadowed with a macro (Daniele)
> Include gen11_fw_ranges in the selftest (Michel)
>
> Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> Cc: Paulo Zanoni <paulo.r.zanoni@intel.com>
> Acked-by: Michel Thierry <michel.thierry@intel.com>
> Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
> Signed-off-by: Oscar Mateo <oscar.mateo@intel.com>
> Signed-off-by: Michel Thierry <michel.thierry@intel.com>
> ---
>
> drivers/gpu/drm/i915/i915_reg.h | 4 +
> drivers/gpu/drm/i915/intel_uncore.c | 155 ++++++++++++++++++++++++--
> drivers/gpu/drm/i915/intel_uncore.h | 27 ++++-
> drivers/gpu/drm/i915/selftests/intel_uncore.c | 31 ++++--
> 4 files changed, 193 insertions(+), 24 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
> index d29e8a0e2ca3..eaca12292ffe 100644
> --- a/drivers/gpu/drm/i915/i915_reg.h
> +++ b/drivers/gpu/drm/i915/i915_reg.h
> @@ -8015,9 +8015,13 @@ enum {
> #define VLV_GTLC_PW_RENDER_STATUS_MASK (1 << 7)
> #define FORCEWAKE_MT _MMIO(0xa188) /* multi-threaded */
> #define FORCEWAKE_MEDIA_GEN9 _MMIO(0xa270)
> +#define FORCEWAKE_MEDIA_VDBOX_GEN11(n) _MMIO(0xa540 + (n) * 4)
> +#define FORCEWAKE_MEDIA_VEBOX_GEN11(n) _MMIO(0xa560 + (n) * 4)
> #define FORCEWAKE_RENDER_GEN9 _MMIO(0xa278)
> #define FORCEWAKE_BLITTER_GEN9 _MMIO(0xa188)
> #define FORCEWAKE_ACK_MEDIA_GEN9 _MMIO(0x0D88)
> +#define FORCEWAKE_ACK_MEDIA_VDBOX_GEN11(n) _MMIO(0x0D50 + (n) * 4)
> +#define FORCEWAKE_ACK_MEDIA_VEBOX_GEN11(n) _MMIO(0x0D70 + (n) * 4)
> #define FORCEWAKE_ACK_RENDER_GEN9 _MMIO(0x0D84)
> #define FORCEWAKE_ACK_BLITTER_GEN9 _MMIO(0x130044)
> #define FORCEWAKE_KERNEL BIT(0)
> diff --git a/drivers/gpu/drm/i915/intel_uncore.c b/drivers/gpu/drm/i915/intel_uncore.c
> index 164dbb8cfa36..c1953043604b 100644
> --- a/drivers/gpu/drm/i915/intel_uncore.c
> +++ b/drivers/gpu/drm/i915/intel_uncore.c
> @@ -37,6 +37,12 @@ static const char * const forcewake_domain_names[] = {
> "render",
> "blitter",
> "media",
> + "vdbox0",
> + "vdbox1",
> + "vdbox2",
> + "vdbox3",
> + "vebox0",
> + "vebox1",
> };
>
> const char *
> @@ -773,6 +779,8 @@ void assert_forcewakes_active(struct drm_i915_private *dev_priv,
>
> /* We give fast paths for the really cool registers */
> #define NEEDS_FORCE_WAKE(reg) ((reg) < 0x40000)
> +#define GEN11_NEEDS_FORCE_WAKE(reg) \
> + ((reg) < 0x40000 || ((reg) >= 0x1c0000 && (reg) < 0x1dc000))
Nitpick - I'd perhaps at least have a blank line between the two
defines, or even move the GEN11 one lower in the file, to just before
the first GEN11-specific code appears.
>
> #define __gen6_reg_read_fw_domains(offset) \
> ({ \
> @@ -826,6 +834,14 @@ find_fw_domain(struct drm_i915_private *dev_priv, u32 offset)
> if (!entry)
> return 0;
>
> + /*
> + * The list of FW domains depends on the SKU in gen11+ so we
> + * can't determine it statically. We use FORCEWAKE_ALL and
> + * translate it here to the list of available domains.
> + */
> + if (entry->domains == FORCEWAKE_ALL)
> + return dev_priv->uncore.fw_domains;
> +
> WARN(entry->domains & ~dev_priv->uncore.fw_domains,
> "Uninitialized forcewake domain(s) 0x%x accessed at 0x%x\n",
> entry->domains & ~dev_priv->uncore.fw_domains, offset);
> @@ -860,6 +876,14 @@ static const struct intel_forcewake_range __vlv_fw_ranges[] = {
> __fwd; \
> })
>
> +#define __gen11_fwtable_reg_read_fw_domains(offset) \
> +({ \
> + enum forcewake_domains __fwd = 0; \
> + if (GEN11_NEEDS_FORCE_WAKE((offset))) \
> + __fwd = find_fw_domain(dev_priv, offset); \
> + __fwd; \
> +})
> +
> /* *Must* be sorted by offset! See intel_shadow_table_check(). */
> static const i915_reg_t gen8_shadowed_regs[] = {
> RING_TAIL(RENDER_RING_BASE), /* 0x2000 (base) */
> @@ -871,6 +895,20 @@ static const i915_reg_t gen8_shadowed_regs[] = {
> /* TODO: Other registers are not yet used */
> };
>
> +static const i915_reg_t gen11_shadowed_regs[] = {
> + RING_TAIL(RENDER_RING_BASE), /* 0x2000 (base) */
> + GEN6_RPNSWREQ, /* 0xA008 */
> + GEN6_RC_VIDEO_FREQ, /* 0xA00C */
> + RING_TAIL(BLT_RING_BASE), /* 0x22000 (base) */
> + RING_TAIL(GEN11_BSD_RING_BASE), /* 0x1C0000 (base) */
> + RING_TAIL(GEN11_BSD2_RING_BASE), /* 0x1C4000 (base) */
> + RING_TAIL(GEN11_VEBOX_RING_BASE), /* 0x1C8000 (base) */
> + RING_TAIL(GEN11_BSD3_RING_BASE), /* 0x1D0000 (base) */
> + RING_TAIL(GEN11_BSD4_RING_BASE), /* 0x1D4000 (base) */
> + RING_TAIL(GEN11_VEBOX2_RING_BASE), /* 0x1D8000 (base) */
> + /* TODO: Other registers are not yet used */
> +};
> +
> static int mmio_reg_cmp(u32 key, const i915_reg_t *reg)
> {
> u32 offset = i915_mmio_reg_offset(*reg);
> @@ -883,14 +921,17 @@ static int mmio_reg_cmp(u32 key, const i915_reg_t *reg)
> return 0;
> }
>
> -static bool is_gen8_shadowed(u32 offset)
> -{
> - const i915_reg_t *regs = gen8_shadowed_regs;
> -
> - return BSEARCH(offset, regs, ARRAY_SIZE(gen8_shadowed_regs),
> - mmio_reg_cmp);
> +#define __is_genX_shadowed(x) \
> +static bool is_gen##x##_shadowed(u32 offset) \
> +{ \
> + const i915_reg_t *regs = gen##x##_shadowed_regs; \
> + return BSEARCH(offset, regs, ARRAY_SIZE(gen##x##_shadowed_regs), \
> + mmio_reg_cmp); \
> }
>
> +__is_genX_shadowed(8)
> +__is_genX_shadowed(11)
> +
> #define __gen8_reg_write_fw_domains(offset) \
> ({ \
> enum forcewake_domains __fwd; \
> @@ -929,6 +970,14 @@ static const struct intel_forcewake_range __chv_fw_ranges[] = {
> __fwd; \
> })
>
> +#define __gen11_fwtable_reg_write_fw_domains(offset) \
> +({ \
> + enum forcewake_domains __fwd = 0; \
> + if (GEN11_NEEDS_FORCE_WAKE((offset)) && !is_gen11_shadowed(offset)) \
> + __fwd = find_fw_domain(dev_priv, offset); \
> + __fwd; \
> +})
> +
> /* *Must* be sorted by offset ranges! See intel_fw_table_check(). */
> static const struct intel_forcewake_range __gen9_fw_ranges[] = {
> GEN_FW_RANGE(0x0, 0xaff, FORCEWAKE_BLITTER),
> @@ -965,6 +1014,40 @@ static const struct intel_forcewake_range __gen9_fw_ranges[] = {
> GEN_FW_RANGE(0x30000, 0x3ffff, FORCEWAKE_MEDIA),
> };
>
> +/* *Must* be sorted by offset ranges! See intel_fw_table_check(). */
> +static const struct intel_forcewake_range __gen11_fw_ranges[] = {
> + GEN_FW_RANGE(0x0, 0xaff, FORCEWAKE_BLITTER),
> + GEN_FW_RANGE(0xb00, 0x1fff, 0), /* uncore range */
> + GEN_FW_RANGE(0x2000, 0x26ff, FORCEWAKE_RENDER),
> + GEN_FW_RANGE(0x2700, 0x2fff, FORCEWAKE_BLITTER),
> + GEN_FW_RANGE(0x3000, 0x3fff, FORCEWAKE_RENDER),
> + GEN_FW_RANGE(0x4000, 0x51ff, FORCEWAKE_BLITTER),
> + GEN_FW_RANGE(0x5200, 0x7fff, FORCEWAKE_RENDER),
> + GEN_FW_RANGE(0x8000, 0x813f, FORCEWAKE_BLITTER),
> + GEN_FW_RANGE(0x8140, 0x815f, FORCEWAKE_RENDER),
> + GEN_FW_RANGE(0x8160, 0x82ff, FORCEWAKE_BLITTER),
> + GEN_FW_RANGE(0x8300, 0x84ff, FORCEWAKE_RENDER),
> + GEN_FW_RANGE(0x8500, 0x8bff, FORCEWAKE_BLITTER),
> + GEN_FW_RANGE(0x8c00, 0x8cff, FORCEWAKE_RENDER),
> + GEN_FW_RANGE(0x8d00, 0x93ff, FORCEWAKE_BLITTER),
> + GEN_FW_RANGE(0x9400, 0x97ff, FORCEWAKE_ALL),
> + GEN_FW_RANGE(0x9800, 0xafff, FORCEWAKE_BLITTER),
> + GEN_FW_RANGE(0xb000, 0xb47f, FORCEWAKE_RENDER),
> + GEN_FW_RANGE(0xb480, 0xdfff, FORCEWAKE_BLITTER),
> + GEN_FW_RANGE(0xe000, 0xe8ff, FORCEWAKE_RENDER),
> + GEN_FW_RANGE(0xe900, 0x243ff, FORCEWAKE_BLITTER),
> + GEN_FW_RANGE(0x24400, 0x247ff, FORCEWAKE_RENDER),
> + GEN_FW_RANGE(0x24800, 0x3ffff, FORCEWAKE_BLITTER),
> + GEN_FW_RANGE(0x40000, 0x1bffff, 0),
> + GEN_FW_RANGE(0x1c0000, 0x1c3fff, FORCEWAKE_MEDIA_VDBOX0),
> + GEN_FW_RANGE(0x1c4000, 0x1c7fff, FORCEWAKE_MEDIA_VDBOX1),
> + GEN_FW_RANGE(0x1c8000, 0x1cbfff, FORCEWAKE_MEDIA_VEBOX0),
> + GEN_FW_RANGE(0x1cc000, 0x1cffff, FORCEWAKE_BLITTER),
> + GEN_FW_RANGE(0x1d0000, 0x1d3fff, FORCEWAKE_MEDIA_VDBOX2),
> + GEN_FW_RANGE(0x1d4000, 0x1d7fff, FORCEWAKE_MEDIA_VDBOX3),
> + GEN_FW_RANGE(0x1d8000, 0x1dbfff, FORCEWAKE_MEDIA_VEBOX1)
> +};
> +
> static void
> ilk_dummy_write(struct drm_i915_private *dev_priv)
> {
> @@ -1095,7 +1178,12 @@ func##_read##x(struct drm_i915_private *dev_priv, i915_reg_t reg, bool trace) {
> }
> #define __gen6_read(x) __gen_read(gen6, x)
> #define __fwtable_read(x) __gen_read(fwtable, x)
> +#define __gen11_fwtable_read(x) __gen_read(gen11_fwtable, x)
>
> +__gen11_fwtable_read(8)
> +__gen11_fwtable_read(16)
> +__gen11_fwtable_read(32)
> +__gen11_fwtable_read(64)
> __fwtable_read(8)
> __fwtable_read(16)
> __fwtable_read(32)
> @@ -1105,6 +1193,7 @@ __gen6_read(16)
> __gen6_read(32)
> __gen6_read(64)
>
> +#undef __gen11_fwtable_read
> #undef __fwtable_read
> #undef __gen6_read
> #undef GEN6_READ_FOOTER
> @@ -1181,7 +1270,11 @@ func##_write##x(struct drm_i915_private *dev_priv, i915_reg_t reg, u##x val, boo
> }
> #define __gen8_write(x) __gen_write(gen8, x)
> #define __fwtable_write(x) __gen_write(fwtable, x)
> +#define __gen11_fwtable_write(x) __gen_write(gen11_fwtable, x)
>
> +__gen11_fwtable_write(8)
> +__gen11_fwtable_write(16)
> +__gen11_fwtable_write(32)
> __fwtable_write(8)
> __fwtable_write(16)
> __fwtable_write(32)
> @@ -1192,6 +1285,7 @@ __gen6_write(8)
> __gen6_write(16)
> __gen6_write(32)
>
> +#undef __gen11_fwtable_write
> #undef __fwtable_write
> #undef __gen8_write
> #undef __gen6_write
> @@ -1240,6 +1334,13 @@ static void fw_domain_init(struct drm_i915_private *dev_priv,
> BUILD_BUG_ON(FORCEWAKE_RENDER != (1 << FW_DOMAIN_ID_RENDER));
> BUILD_BUG_ON(FORCEWAKE_BLITTER != (1 << FW_DOMAIN_ID_BLITTER));
> BUILD_BUG_ON(FORCEWAKE_MEDIA != (1 << FW_DOMAIN_ID_MEDIA));
> + BUILD_BUG_ON(FORCEWAKE_MEDIA_VDBOX0 != (1 << FW_DOMAIN_ID_MEDIA_VDBOX0));
> + BUILD_BUG_ON(FORCEWAKE_MEDIA_VDBOX1 != (1 << FW_DOMAIN_ID_MEDIA_VDBOX1));
> + BUILD_BUG_ON(FORCEWAKE_MEDIA_VDBOX2 != (1 << FW_DOMAIN_ID_MEDIA_VDBOX2));
> + BUILD_BUG_ON(FORCEWAKE_MEDIA_VDBOX3 != (1 << FW_DOMAIN_ID_MEDIA_VDBOX3));
> + BUILD_BUG_ON(FORCEWAKE_MEDIA_VEBOX0 != (1 << FW_DOMAIN_ID_MEDIA_VEBOX0));
> + BUILD_BUG_ON(FORCEWAKE_MEDIA_VEBOX1 != (1 << FW_DOMAIN_ID_MEDIA_VEBOX1));
> +
>
> d->mask = BIT(domain_id);
>
> @@ -1267,7 +1368,34 @@ static void intel_uncore_fw_domains_init(struct drm_i915_private *dev_priv)
> dev_priv->uncore.fw_clear = _MASKED_BIT_DISABLE(FORCEWAKE_KERNEL);
> }
>
> - if (INTEL_GEN(dev_priv) >= 9) {
> + if (INTEL_GEN(dev_priv) >= 11) {
> + int i;
> +
> + dev_priv->uncore.funcs.force_wake_get = fw_domains_get;
> + dev_priv->uncore.funcs.force_wake_put = fw_domains_put;
> + fw_domain_init(dev_priv, FW_DOMAIN_ID_RENDER,
> + FORCEWAKE_RENDER_GEN9,
> + FORCEWAKE_ACK_RENDER_GEN9);
> + fw_domain_init(dev_priv, FW_DOMAIN_ID_BLITTER,
> + FORCEWAKE_BLITTER_GEN9,
> + FORCEWAKE_ACK_BLITTER_GEN9);
> + for (i = 0; i < I915_MAX_VCS; i++) {
> + if (!HAS_ENGINE(dev_priv, _VCS(i)))
> + continue;
> +
> + fw_domain_init(dev_priv, FW_DOMAIN_ID_MEDIA_VDBOX0 + i,
> + FORCEWAKE_MEDIA_VDBOX_GEN11(i),
> + FORCEWAKE_ACK_MEDIA_VDBOX_GEN11(i));
> + }
> + for (i = 0; i < I915_MAX_VECS; i++) {
> + if (!HAS_ENGINE(dev_priv, _VECS(i)))
> + continue;
> +
> + fw_domain_init(dev_priv, FW_DOMAIN_ID_MEDIA_VEBOX0 + i,
> + FORCEWAKE_MEDIA_VEBOX_GEN11(i),
> + FORCEWAKE_ACK_MEDIA_VEBOX_GEN11(i));
> + }
> + } else if (IS_GEN9(dev_priv) || IS_GEN10(dev_priv)) {
> dev_priv->uncore.funcs.force_wake_get =
> fw_domains_get_with_fallback;
> dev_priv->uncore.funcs.force_wake_put = fw_domains_put;
> @@ -1422,10 +1549,14 @@ void intel_uncore_init(struct drm_i915_private *dev_priv)
> ASSIGN_WRITE_MMIO_VFUNCS(dev_priv, gen8);
> ASSIGN_READ_MMIO_VFUNCS(dev_priv, gen6);
> }
> - } else {
> + } else if (IS_GEN(dev_priv, 9, 10)) {
> ASSIGN_FW_DOMAINS_TABLE(__gen9_fw_ranges);
> ASSIGN_WRITE_MMIO_VFUNCS(dev_priv, fwtable);
> ASSIGN_READ_MMIO_VFUNCS(dev_priv, fwtable);
> + } else {
> + ASSIGN_FW_DOMAINS_TABLE(__gen11_fw_ranges);
> + ASSIGN_WRITE_MMIO_VFUNCS(dev_priv, gen11_fwtable);
> + ASSIGN_READ_MMIO_VFUNCS(dev_priv, gen11_fwtable);
> }
>
> iosf_mbi_register_pmic_bus_access_notifier(
> @@ -1985,7 +2116,9 @@ intel_uncore_forcewake_for_read(struct drm_i915_private *dev_priv,
> u32 offset = i915_mmio_reg_offset(reg);
> enum forcewake_domains fw_domains;
>
> - if (HAS_FWTABLE(dev_priv)) {
> + if (INTEL_GEN(dev_priv) >= 11) {
> + fw_domains = __gen11_fwtable_reg_read_fw_domains(offset);
> + } else if (HAS_FWTABLE(dev_priv)) {
> fw_domains = __fwtable_reg_read_fw_domains(offset);
> } else if (INTEL_GEN(dev_priv) >= 6) {
> fw_domains = __gen6_reg_read_fw_domains(offset);
> @@ -2006,7 +2139,9 @@ intel_uncore_forcewake_for_write(struct drm_i915_private *dev_priv,
> u32 offset = i915_mmio_reg_offset(reg);
> enum forcewake_domains fw_domains;
>
> - if (HAS_FWTABLE(dev_priv) && !IS_VALLEYVIEW(dev_priv)) {
> + if (INTEL_GEN(dev_priv) >= 11) {
> + fw_domains = __gen11_fwtable_reg_write_fw_domains(offset);
> + } else if (HAS_FWTABLE(dev_priv) && !IS_VALLEYVIEW(dev_priv)) {
> fw_domains = __fwtable_reg_write_fw_domains(offset);
> } else if (IS_GEN8(dev_priv)) {
> fw_domains = __gen8_reg_write_fw_domains(offset);
> diff --git a/drivers/gpu/drm/i915/intel_uncore.h b/drivers/gpu/drm/i915/intel_uncore.h
> index bed019ef000f..9e8330c5808e 100644
> --- a/drivers/gpu/drm/i915/intel_uncore.h
> +++ b/drivers/gpu/drm/i915/intel_uncore.h
> @@ -37,17 +37,36 @@ enum forcewake_domain_id {
> FW_DOMAIN_ID_RENDER = 0,
> FW_DOMAIN_ID_BLITTER,
> FW_DOMAIN_ID_MEDIA,
> + FW_DOMAIN_ID_MEDIA_VDBOX0,
> + FW_DOMAIN_ID_MEDIA_VDBOX1,
> + FW_DOMAIN_ID_MEDIA_VDBOX2,
> + FW_DOMAIN_ID_MEDIA_VDBOX3,
> + FW_DOMAIN_ID_MEDIA_VEBOX0,
> + FW_DOMAIN_ID_MEDIA_VEBOX1,
>
> FW_DOMAIN_ID_COUNT
> };
>
> enum forcewake_domains {
> - FORCEWAKE_RENDER = BIT(FW_DOMAIN_ID_RENDER),
> - FORCEWAKE_BLITTER = BIT(FW_DOMAIN_ID_BLITTER),
> - FORCEWAKE_MEDIA = BIT(FW_DOMAIN_ID_MEDIA),
> + FORCEWAKE_RENDER = BIT(FW_DOMAIN_ID_RENDER),
> + FORCEWAKE_BLITTER = BIT(FW_DOMAIN_ID_BLITTER),
> + FORCEWAKE_MEDIA = BIT(FW_DOMAIN_ID_MEDIA),
> + FORCEWAKE_MEDIA_VDBOX0 = BIT(FW_DOMAIN_ID_MEDIA_VDBOX0),
> + FORCEWAKE_MEDIA_VDBOX1 = BIT(FW_DOMAIN_ID_MEDIA_VDBOX1),
> + FORCEWAKE_MEDIA_VDBOX2 = BIT(FW_DOMAIN_ID_MEDIA_VDBOX2),
> + FORCEWAKE_MEDIA_VDBOX3 = BIT(FW_DOMAIN_ID_MEDIA_VDBOX3),
> + FORCEWAKE_MEDIA_VEBOX0 = BIT(FW_DOMAIN_ID_MEDIA_VEBOX0),
> + FORCEWAKE_MEDIA_VEBOX1 = BIT(FW_DOMAIN_ID_MEDIA_VEBOX1),
> +
> FORCEWAKE_ALL = (FORCEWAKE_RENDER |
> FORCEWAKE_BLITTER |
> - FORCEWAKE_MEDIA)
> + FORCEWAKE_MEDIA |
> + FORCEWAKE_MEDIA_VDBOX0 |
> + FORCEWAKE_MEDIA_VDBOX1 |
> + FORCEWAKE_MEDIA_VDBOX2 |
> + FORCEWAKE_MEDIA_VDBOX3 |
> + FORCEWAKE_MEDIA_VEBOX0 |
> + FORCEWAKE_MEDIA_VEBOX1)
If I am not confused, this could be simplified as:
FORCEWAKE_ALL = BIT(FW_DOMAIN_ID_COUNT) - 1;
> };
>
> struct intel_uncore_funcs {
> diff --git a/drivers/gpu/drm/i915/selftests/intel_uncore.c b/drivers/gpu/drm/i915/selftests/intel_uncore.c
> index 2f6367643171..f76f2597df5c 100644
> --- a/drivers/gpu/drm/i915/selftests/intel_uncore.c
> +++ b/drivers/gpu/drm/i915/selftests/intel_uncore.c
> @@ -61,20 +61,30 @@ static int intel_fw_table_check(const struct intel_forcewake_range *ranges,
>
> static int intel_shadow_table_check(void)
> {
> - const i915_reg_t *reg = gen8_shadowed_regs;
> - unsigned int i;
> + struct {
> + const i915_reg_t *regs;
> + unsigned int size;
> + } reg_lists[] = {
> + { gen8_shadowed_regs, ARRAY_SIZE(gen8_shadowed_regs) },
> + { gen11_shadowed_regs, ARRAY_SIZE(gen11_shadowed_regs) },
> + };
> + const i915_reg_t *reg;
> + unsigned int i, j;
> s32 prev;
>
> - for (i = 0, prev = -1; i < ARRAY_SIZE(gen8_shadowed_regs); i++, reg++) {
> - u32 offset = i915_mmio_reg_offset(*reg);
> + for (j = 0; j < ARRAY_SIZE(reg_lists); ++j) {
> + reg = reg_lists[j].regs;
> + for (i = 0, prev = -1; i < reg_lists[j].size; i++, reg++) {
> + u32 offset = i915_mmio_reg_offset(*reg);
>
> - if (prev >= (s32)offset) {
> - pr_err("%s: entry[%d]:(%x) is before previous (%x)\n",
> - __func__, i, offset, prev);
> - return -EINVAL;
> - }
> + if (prev >= (s32)offset) {
> + pr_err("%s: entry[%d]:(%x) is before previous (%x)\n",
> + __func__, i, offset, prev);
> + return -EINVAL;
> + }
>
> - prev = offset;
> + prev = offset;
> + }
> }
>
> return 0;
> @@ -90,6 +100,7 @@ int intel_uncore_mock_selftests(void)
> { __vlv_fw_ranges, ARRAY_SIZE(__vlv_fw_ranges), false },
> { __chv_fw_ranges, ARRAY_SIZE(__chv_fw_ranges), false },
> { __gen9_fw_ranges, ARRAY_SIZE(__gen9_fw_ranges), true },
> + { __gen11_fw_ranges, ARRAY_SIZE(__gen11_fw_ranges), true },
> };
> int err, i;
>
>
I haven't checked the ranges, but the code looks good. With or without
the nitpicks:
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Regards,
Tvrtko
* Re: [PATCH v10] drm/i915/icl: Gen11 forcewake support
2018-02-01 10:25 ` Tvrtko Ursulin
@ 2018-02-01 16:02 ` Michel Thierry
0 siblings, 0 replies; 118+ messages in thread
From: Michel Thierry @ 2018-02-01 16:02 UTC (permalink / raw)
To: Tvrtko Ursulin, intel-gfx; +Cc: Paulo Zanoni
On 2/1/2018 2:25 AM, Tvrtko Ursulin wrote:
>
> On 01/02/2018 00:52, Michel Thierry wrote:
>> From: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
>>
>> The main difference with previous GENs is that starting from Gen11
>> each VCS and VECS engine has its own power well, which only exists
>> if the related engine exists in the HW.
>> The fallback forcewake request workaround is only needed on gen9
>> according to the HSDES WA entry (1604254524), so we can go back to using
>> the simpler fw_domains_get/put functions.
>>
>> BSpec: 18331
>>
>> v2: fix fwtable, use array to test shadow tables, create new
>> accessors to avoid check on every access (Tvrtko)
>> v3 (from Paulo): Rebase.
>> v4:
>> - Range 09400-097FF should be FORCEWAKE_ALL (Daniele)
>> - Use the BIT macro for forcewake domains (Daniele)
>> - Add a comment about the range ordering (Oscar)
>> - Updated commit message (Oscar)
>> v5: Rebased
>> v6: Use I915_MAX_VCS/VECS (Michal)
>> v7: translate FORCEWAKE_ALL to available domains
>> v8: rebase, add clarification on fallback ack in commit message.
>> v9: fix rebase issue, change check in fw_domains_init from IS_GEN11
>> to GEN >= 11
>> v10: Generate is_genX_shadowed with a macro (Daniele)
>> Include gen11_fw_ranges in the selftest (Michel)
>>
>> Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
>> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>> Cc: Paulo Zanoni <paulo.r.zanoni@intel.com>
>> Acked-by: Michel Thierry <michel.thierry@intel.com>
>> Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
>> Signed-off-by: Oscar Mateo <oscar.mateo@intel.com>
>> Signed-off-by: Michel Thierry <michel.thierry@intel.com>
>> ---
>>
>> drivers/gpu/drm/i915/i915_reg.h | 4 +
>> drivers/gpu/drm/i915/intel_uncore.c | 155
>> ++++++++++++++++++++++++--
>> drivers/gpu/drm/i915/intel_uncore.h | 27 ++++-
>> drivers/gpu/drm/i915/selftests/intel_uncore.c | 31 ++++--
>> 4 files changed, 193 insertions(+), 24 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/i915/i915_reg.h
>> b/drivers/gpu/drm/i915/i915_reg.h
>> index d29e8a0e2ca3..eaca12292ffe 100644
>> --- a/drivers/gpu/drm/i915/i915_reg.h
>> +++ b/drivers/gpu/drm/i915/i915_reg.h
>> @@ -8015,9 +8015,13 @@ enum {
>> #define VLV_GTLC_PW_RENDER_STATUS_MASK (1 << 7)
>> #define FORCEWAKE_MT _MMIO(0xa188) /* multi-threaded */
>> #define FORCEWAKE_MEDIA_GEN9 _MMIO(0xa270)
>> +#define FORCEWAKE_MEDIA_VDBOX_GEN11(n) _MMIO(0xa540 + (n) * 4)
>> +#define FORCEWAKE_MEDIA_VEBOX_GEN11(n) _MMIO(0xa560 + (n) * 4)
>> #define FORCEWAKE_RENDER_GEN9 _MMIO(0xa278)
>> #define FORCEWAKE_BLITTER_GEN9 _MMIO(0xa188)
>> #define FORCEWAKE_ACK_MEDIA_GEN9 _MMIO(0x0D88)
>> +#define FORCEWAKE_ACK_MEDIA_VDBOX_GEN11(n) _MMIO(0x0D50 + (n) * 4)
>> +#define FORCEWAKE_ACK_MEDIA_VEBOX_GEN11(n) _MMIO(0x0D70 + (n) * 4)
>> #define FORCEWAKE_ACK_RENDER_GEN9 _MMIO(0x0D84)
>> #define FORCEWAKE_ACK_BLITTER_GEN9 _MMIO(0x130044)
>> #define FORCEWAKE_KERNEL BIT(0)
>> diff --git a/drivers/gpu/drm/i915/intel_uncore.c
>> b/drivers/gpu/drm/i915/intel_uncore.c
>> index 164dbb8cfa36..c1953043604b 100644
>> --- a/drivers/gpu/drm/i915/intel_uncore.c
>> +++ b/drivers/gpu/drm/i915/intel_uncore.c
>> @@ -37,6 +37,12 @@ static const char * const forcewake_domain_names[] = {
>> "render",
>> "blitter",
>> "media",
>> + "vdbox0",
>> + "vdbox1",
>> + "vdbox2",
>> + "vdbox3",
>> + "vebox0",
>> + "vebox1",
>> };
>> const char *
>> @@ -773,6 +779,8 @@ void assert_forcewakes_active(struct
>> drm_i915_private *dev_priv,
>> /* We give fast paths for the really cool registers */
>> #define NEEDS_FORCE_WAKE(reg) ((reg) < 0x40000)
>> +#define GEN11_NEEDS_FORCE_WAKE(reg) \
>> + ((reg) < 0x40000 || ((reg) >= 0x1c0000 && (reg) < 0x1dc000))
>
> Nitpick - I'd perhaps at least have a blank line between the two
> defines, or even move the GEN11 one lower in the file, to just before
> the first GEN11-specific code appears.
>
I'd go for a new blank line; it makes it obvious that something changed
between gens.
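For context, here is a minimal sketch of what the second define changes.
The two macros are copied from the patch as posted;
check_needs_force_wake() and the sample offsets are illustrative only
and are not part of the driver:

```c
#include <assert.h>

/* Copied from the patch: pre-gen11 fast-path test. */
#define NEEDS_FORCE_WAKE(reg) ((reg) < 0x40000)

/* Copied from the patch: gen11 adds the per-engine media range. */
#define GEN11_NEEDS_FORCE_WAKE(reg) \
	((reg) < 0x40000 || ((reg) >= 0x1c0000 && (reg) < 0x1dc000))

/* Illustrative helper (not in the driver): exercise both predicates. */
void check_needs_force_wake(void)
{
	/* Below 0x40000 both macros agree that forcewake may be needed. */
	assert(NEEDS_FORCE_WAKE(0x2000));
	assert(GEN11_NEEDS_FORCE_WAKE(0x2000));

	/* 0x40000 is outside both ranges. */
	assert(!NEEDS_FORCE_WAKE(0x40000));
	assert(!GEN11_NEEDS_FORCE_WAKE(0x40000));

	/* The VDBOX/VEBOX window is only covered by the gen11 macro. */
	assert(!NEEDS_FORCE_WAKE(0x1c0000));
	assert(GEN11_NEEDS_FORCE_WAKE(0x1c0000));
	assert(GEN11_NEEDS_FORCE_WAKE(0x1dbfff));

	/* The window is half-open: 0x1dc000 is already outside it. */
	assert(!GEN11_NEEDS_FORCE_WAKE(0x1dc000));
}
```

The window 0x1c0000-0x1dbfff matches the last seven entries of
__gen11_fw_ranges in the quoted patch.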
>> #define __gen6_reg_read_fw_domains(offset) \
>> ({ \
>> @@ -826,6 +834,14 @@ find_fw_domain(struct drm_i915_private *dev_priv,
>> u32 offset)
>> if (!entry)
>> return 0;
>> + /*
>> + * The list of FW domains depends on the SKU in gen11+ so we
>> + * can't determine it statically. We use FORCEWAKE_ALL and
>> + * translate it here to the list of available domains.
>> + */
>> + if (entry->domains == FORCEWAKE_ALL)
>> + return dev_priv->uncore.fw_domains;
>> +
>> WARN(entry->domains & ~dev_priv->uncore.fw_domains,
>> "Uninitialized forcewake domain(s) 0x%x accessed at 0x%x\n",
>> entry->domains & ~dev_priv->uncore.fw_domains, offset);
>> @@ -860,6 +876,14 @@ static const struct intel_forcewake_range
>> __vlv_fw_ranges[] = {
>> __fwd; \
>> })
>> +#define __gen11_fwtable_reg_read_fw_domains(offset) \
>> +({ \
>> + enum forcewake_domains __fwd = 0; \
>> + if (GEN11_NEEDS_FORCE_WAKE((offset))) \
>> + __fwd = find_fw_domain(dev_priv, offset); \
>> + __fwd; \
>> +})
>> +
>> /* *Must* be sorted by offset! See intel_shadow_table_check(). */
>> static const i915_reg_t gen8_shadowed_regs[] = {
>> RING_TAIL(RENDER_RING_BASE), /* 0x2000 (base) */
>> @@ -871,6 +895,20 @@ static const i915_reg_t gen8_shadowed_regs[] = {
>> /* TODO: Other registers are not yet used */
>> };
>> +static const i915_reg_t gen11_shadowed_regs[] = {
>> + RING_TAIL(RENDER_RING_BASE), /* 0x2000 (base) */
>> + GEN6_RPNSWREQ, /* 0xA008 */
>> + GEN6_RC_VIDEO_FREQ, /* 0xA00C */
>> + RING_TAIL(BLT_RING_BASE), /* 0x22000 (base) */
>> + RING_TAIL(GEN11_BSD_RING_BASE), /* 0x1C0000 (base) */
>> + RING_TAIL(GEN11_BSD2_RING_BASE), /* 0x1C4000 (base) */
>> + RING_TAIL(GEN11_VEBOX_RING_BASE), /* 0x1C8000 (base) */
>> + RING_TAIL(GEN11_BSD3_RING_BASE), /* 0x1D0000 (base) */
>> + RING_TAIL(GEN11_BSD4_RING_BASE), /* 0x1D4000 (base) */
>> + RING_TAIL(GEN11_VEBOX2_RING_BASE), /* 0x1D8000 (base) */
>> + /* TODO: Other registers are not yet used */
>> +};
>> +
>> static int mmio_reg_cmp(u32 key, const i915_reg_t *reg)
>> {
>> u32 offset = i915_mmio_reg_offset(*reg);
>> @@ -883,14 +921,17 @@ static int mmio_reg_cmp(u32 key, const
>> i915_reg_t *reg)
>> return 0;
>> }
>> -static bool is_gen8_shadowed(u32 offset)
>> -{
>> - const i915_reg_t *regs = gen8_shadowed_regs;
>> -
>> - return BSEARCH(offset, regs, ARRAY_SIZE(gen8_shadowed_regs),
>> - mmio_reg_cmp);
>> +#define __is_genX_shadowed(x) \
>> +static bool is_gen##x##_shadowed(u32 offset) \
>> +{ \
>> + const i915_reg_t *regs = gen##x##_shadowed_regs; \
>> + return BSEARCH(offset, regs, ARRAY_SIZE(gen##x##_shadowed_regs), \
>> + mmio_reg_cmp); \
>> }
>> +__is_genX_shadowed(8)
>> +__is_genX_shadowed(11)
>> +
>> #define __gen8_reg_write_fw_domains(offset) \
>> ({ \
>> enum forcewake_domains __fwd; \
>> @@ -929,6 +970,14 @@ static const struct intel_forcewake_range
>> __chv_fw_ranges[] = {
>> __fwd; \
>> })
>> +#define __gen11_fwtable_reg_write_fw_domains(offset) \
>> +({ \
>> + enum forcewake_domains __fwd = 0; \
>> + if (GEN11_NEEDS_FORCE_WAKE((offset)) &&
>> !is_gen11_shadowed(offset)) \
>> + __fwd = find_fw_domain(dev_priv, offset); \
>> + __fwd; \
>> +})
>> +
>> /* *Must* be sorted by offset ranges! See intel_fw_table_check(). */
>> static const struct intel_forcewake_range __gen9_fw_ranges[] = {
>> GEN_FW_RANGE(0x0, 0xaff, FORCEWAKE_BLITTER),
>> @@ -965,6 +1014,40 @@ static const struct intel_forcewake_range
>> __gen9_fw_ranges[] = {
>> GEN_FW_RANGE(0x30000, 0x3ffff, FORCEWAKE_MEDIA),
>> };
>> +/* *Must* be sorted by offset ranges! See intel_fw_table_check(). */
>> +static const struct intel_forcewake_range __gen11_fw_ranges[] = {
>> + GEN_FW_RANGE(0x0, 0xaff, FORCEWAKE_BLITTER),
>> + GEN_FW_RANGE(0xb00, 0x1fff, 0), /* uncore range */
>> + GEN_FW_RANGE(0x2000, 0x26ff, FORCEWAKE_RENDER),
>> + GEN_FW_RANGE(0x2700, 0x2fff, FORCEWAKE_BLITTER),
>> + GEN_FW_RANGE(0x3000, 0x3fff, FORCEWAKE_RENDER),
>> + GEN_FW_RANGE(0x4000, 0x51ff, FORCEWAKE_BLITTER),
>> + GEN_FW_RANGE(0x5200, 0x7fff, FORCEWAKE_RENDER),
>> + GEN_FW_RANGE(0x8000, 0x813f, FORCEWAKE_BLITTER),
>> + GEN_FW_RANGE(0x8140, 0x815f, FORCEWAKE_RENDER),
>> + GEN_FW_RANGE(0x8160, 0x82ff, FORCEWAKE_BLITTER),
>> + GEN_FW_RANGE(0x8300, 0x84ff, FORCEWAKE_RENDER),
>> + GEN_FW_RANGE(0x8500, 0x8bff, FORCEWAKE_BLITTER),
>> + GEN_FW_RANGE(0x8c00, 0x8cff, FORCEWAKE_RENDER),
>> + GEN_FW_RANGE(0x8d00, 0x93ff, FORCEWAKE_BLITTER),
>> + GEN_FW_RANGE(0x9400, 0x97ff, FORCEWAKE_ALL),
>> + GEN_FW_RANGE(0x9800, 0xafff, FORCEWAKE_BLITTER),
>> + GEN_FW_RANGE(0xb000, 0xb47f, FORCEWAKE_RENDER),
>> + GEN_FW_RANGE(0xb480, 0xdfff, FORCEWAKE_BLITTER),
>> + GEN_FW_RANGE(0xe000, 0xe8ff, FORCEWAKE_RENDER),
>> + GEN_FW_RANGE(0xe900, 0x243ff, FORCEWAKE_BLITTER),
>> + GEN_FW_RANGE(0x24400, 0x247ff, FORCEWAKE_RENDER),
>> + GEN_FW_RANGE(0x24800, 0x3ffff, FORCEWAKE_BLITTER),
>> + GEN_FW_RANGE(0x40000, 0x1bffff, 0),
>> + GEN_FW_RANGE(0x1c0000, 0x1c3fff, FORCEWAKE_MEDIA_VDBOX0),
>> + GEN_FW_RANGE(0x1c4000, 0x1c7fff, FORCEWAKE_MEDIA_VDBOX1),
>> + GEN_FW_RANGE(0x1c8000, 0x1cbfff, FORCEWAKE_MEDIA_VEBOX0),
>> + GEN_FW_RANGE(0x1cc000, 0x1cffff, FORCEWAKE_BLITTER),
>> + GEN_FW_RANGE(0x1d0000, 0x1d3fff, FORCEWAKE_MEDIA_VDBOX2),
>> + GEN_FW_RANGE(0x1d4000, 0x1d7fff, FORCEWAKE_MEDIA_VDBOX3),
>> + GEN_FW_RANGE(0x1d8000, 0x1dbfff, FORCEWAKE_MEDIA_VEBOX1)
>> +};
>> +
>> static void
>> ilk_dummy_write(struct drm_i915_private *dev_priv)
>> {
>> @@ -1095,7 +1178,12 @@ func##_read##x(struct drm_i915_private
>> *dev_priv, i915_reg_t reg, bool trace) {
>> }
>> #define __gen6_read(x) __gen_read(gen6, x)
>> #define __fwtable_read(x) __gen_read(fwtable, x)
>> +#define __gen11_fwtable_read(x) __gen_read(gen11_fwtable, x)
>> +__gen11_fwtable_read(8)
>> +__gen11_fwtable_read(16)
>> +__gen11_fwtable_read(32)
>> +__gen11_fwtable_read(64)
>> __fwtable_read(8)
>> __fwtable_read(16)
>> __fwtable_read(32)
>> @@ -1105,6 +1193,7 @@ __gen6_read(16)
>> __gen6_read(32)
>> __gen6_read(64)
>> +#undef __gen11_fwtable_read
>> #undef __fwtable_read
>> #undef __gen6_read
>> #undef GEN6_READ_FOOTER
>> @@ -1181,7 +1270,11 @@ func##_write##x(struct drm_i915_private
>> *dev_priv, i915_reg_t reg, u##x val, boo
>> }
>> #define __gen8_write(x) __gen_write(gen8, x)
>> #define __fwtable_write(x) __gen_write(fwtable, x)
>> +#define __gen11_fwtable_write(x) __gen_write(gen11_fwtable, x)
>> +__gen11_fwtable_write(8)
>> +__gen11_fwtable_write(16)
>> +__gen11_fwtable_write(32)
>> __fwtable_write(8)
>> __fwtable_write(16)
>> __fwtable_write(32)
>> @@ -1192,6 +1285,7 @@ __gen6_write(8)
>> __gen6_write(16)
>> __gen6_write(32)
>> +#undef __gen11_fwtable_write
>> #undef __fwtable_write
>> #undef __gen8_write
>> #undef __gen6_write
>> @@ -1240,6 +1334,13 @@ static void fw_domain_init(struct
>> drm_i915_private *dev_priv,
>> BUILD_BUG_ON(FORCEWAKE_RENDER != (1 << FW_DOMAIN_ID_RENDER));
>> BUILD_BUG_ON(FORCEWAKE_BLITTER != (1 << FW_DOMAIN_ID_BLITTER));
>> BUILD_BUG_ON(FORCEWAKE_MEDIA != (1 << FW_DOMAIN_ID_MEDIA));
>> + BUILD_BUG_ON(FORCEWAKE_MEDIA_VDBOX0 != (1 <<
>> FW_DOMAIN_ID_MEDIA_VDBOX0));
>> + BUILD_BUG_ON(FORCEWAKE_MEDIA_VDBOX1 != (1 <<
>> FW_DOMAIN_ID_MEDIA_VDBOX1));
>> + BUILD_BUG_ON(FORCEWAKE_MEDIA_VDBOX2 != (1 <<
>> FW_DOMAIN_ID_MEDIA_VDBOX2));
>> + BUILD_BUG_ON(FORCEWAKE_MEDIA_VDBOX3 != (1 <<
>> FW_DOMAIN_ID_MEDIA_VDBOX3));
>> + BUILD_BUG_ON(FORCEWAKE_MEDIA_VEBOX0 != (1 <<
>> FW_DOMAIN_ID_MEDIA_VEBOX0));
>> + BUILD_BUG_ON(FORCEWAKE_MEDIA_VEBOX1 != (1 <<
>> FW_DOMAIN_ID_MEDIA_VEBOX1));
>> +
>> d->mask = BIT(domain_id);
>> @@ -1267,7 +1368,34 @@ static void intel_uncore_fw_domains_init(struct
>> drm_i915_private *dev_priv)
>> dev_priv->uncore.fw_clear =
>> _MASKED_BIT_DISABLE(FORCEWAKE_KERNEL);
>> }
>> - if (INTEL_GEN(dev_priv) >= 9) {
>> + if (INTEL_GEN(dev_priv) >= 11) {
>> + int i;
>> +
>> + dev_priv->uncore.funcs.force_wake_get = fw_domains_get;
>> + dev_priv->uncore.funcs.force_wake_put = fw_domains_put;
>> + fw_domain_init(dev_priv, FW_DOMAIN_ID_RENDER,
>> + FORCEWAKE_RENDER_GEN9,
>> + FORCEWAKE_ACK_RENDER_GEN9);
>> + fw_domain_init(dev_priv, FW_DOMAIN_ID_BLITTER,
>> + FORCEWAKE_BLITTER_GEN9,
>> + FORCEWAKE_ACK_BLITTER_GEN9);
>> + for (i = 0; i < I915_MAX_VCS; i++) {
>> + if (!HAS_ENGINE(dev_priv, _VCS(i)))
>> + continue;
>> +
>> + fw_domain_init(dev_priv, FW_DOMAIN_ID_MEDIA_VDBOX0 + i,
>> + FORCEWAKE_MEDIA_VDBOX_GEN11(i),
>> + FORCEWAKE_ACK_MEDIA_VDBOX_GEN11(i));
>> + }
>> + for (i = 0; i < I915_MAX_VECS; i++) {
>> + if (!HAS_ENGINE(dev_priv, _VECS(i)))
>> + continue;
>> +
>> + fw_domain_init(dev_priv, FW_DOMAIN_ID_MEDIA_VEBOX0 + i,
>> + FORCEWAKE_MEDIA_VEBOX_GEN11(i),
>> + FORCEWAKE_ACK_MEDIA_VEBOX_GEN11(i));
>> + }
>> + } else if (IS_GEN9(dev_priv) || IS_GEN10(dev_priv)) {
>> dev_priv->uncore.funcs.force_wake_get =
>> fw_domains_get_with_fallback;
>> dev_priv->uncore.funcs.force_wake_put = fw_domains_put;
>> @@ -1422,10 +1549,14 @@ void intel_uncore_init(struct drm_i915_private
>> *dev_priv)
>> ASSIGN_WRITE_MMIO_VFUNCS(dev_priv, gen8);
>> ASSIGN_READ_MMIO_VFUNCS(dev_priv, gen6);
>> }
>> - } else {
>> + } else if (IS_GEN(dev_priv, 9, 10)) {
>> ASSIGN_FW_DOMAINS_TABLE(__gen9_fw_ranges);
>> ASSIGN_WRITE_MMIO_VFUNCS(dev_priv, fwtable);
>> ASSIGN_READ_MMIO_VFUNCS(dev_priv, fwtable);
>> + } else {
>> + ASSIGN_FW_DOMAINS_TABLE(__gen11_fw_ranges);
>> + ASSIGN_WRITE_MMIO_VFUNCS(dev_priv, gen11_fwtable);
>> + ASSIGN_READ_MMIO_VFUNCS(dev_priv, gen11_fwtable);
>> }
>> iosf_mbi_register_pmic_bus_access_notifier(
>> @@ -1985,7 +2116,9 @@ intel_uncore_forcewake_for_read(struct
>> drm_i915_private *dev_priv,
>> u32 offset = i915_mmio_reg_offset(reg);
>> enum forcewake_domains fw_domains;
>> - if (HAS_FWTABLE(dev_priv)) {
>> + if (INTEL_GEN(dev_priv) >= 11) {
>> + fw_domains = __gen11_fwtable_reg_read_fw_domains(offset);
>> + } else if (HAS_FWTABLE(dev_priv)) {
>> fw_domains = __fwtable_reg_read_fw_domains(offset);
>> } else if (INTEL_GEN(dev_priv) >= 6) {
>> fw_domains = __gen6_reg_read_fw_domains(offset);
>> @@ -2006,7 +2139,9 @@ intel_uncore_forcewake_for_write(struct
>> drm_i915_private *dev_priv,
>> u32 offset = i915_mmio_reg_offset(reg);
>> enum forcewake_domains fw_domains;
>> - if (HAS_FWTABLE(dev_priv) && !IS_VALLEYVIEW(dev_priv)) {
>> + if (INTEL_GEN(dev_priv) >= 11) {
>> + fw_domains = __gen11_fwtable_reg_write_fw_domains(offset);
>> + } else if (HAS_FWTABLE(dev_priv) && !IS_VALLEYVIEW(dev_priv)) {
>> fw_domains = __fwtable_reg_write_fw_domains(offset);
>> } else if (IS_GEN8(dev_priv)) {
>> fw_domains = __gen8_reg_write_fw_domains(offset);
>> diff --git a/drivers/gpu/drm/i915/intel_uncore.h
>> b/drivers/gpu/drm/i915/intel_uncore.h
>> index bed019ef000f..9e8330c5808e 100644
>> --- a/drivers/gpu/drm/i915/intel_uncore.h
>> +++ b/drivers/gpu/drm/i915/intel_uncore.h
>> @@ -37,17 +37,36 @@ enum forcewake_domain_id {
>> FW_DOMAIN_ID_RENDER = 0,
>> FW_DOMAIN_ID_BLITTER,
>> FW_DOMAIN_ID_MEDIA,
>> + FW_DOMAIN_ID_MEDIA_VDBOX0,
>> + FW_DOMAIN_ID_MEDIA_VDBOX1,
>> + FW_DOMAIN_ID_MEDIA_VDBOX2,
>> + FW_DOMAIN_ID_MEDIA_VDBOX3,
>> + FW_DOMAIN_ID_MEDIA_VEBOX0,
>> + FW_DOMAIN_ID_MEDIA_VEBOX1,
>> FW_DOMAIN_ID_COUNT
>> };
>> enum forcewake_domains {
>> - FORCEWAKE_RENDER = BIT(FW_DOMAIN_ID_RENDER),
>> - FORCEWAKE_BLITTER = BIT(FW_DOMAIN_ID_BLITTER),
>> - FORCEWAKE_MEDIA = BIT(FW_DOMAIN_ID_MEDIA),
>> + FORCEWAKE_RENDER = BIT(FW_DOMAIN_ID_RENDER),
>> + FORCEWAKE_BLITTER = BIT(FW_DOMAIN_ID_BLITTER),
>> + FORCEWAKE_MEDIA = BIT(FW_DOMAIN_ID_MEDIA),
>> + FORCEWAKE_MEDIA_VDBOX0 = BIT(FW_DOMAIN_ID_MEDIA_VDBOX0),
>> + FORCEWAKE_MEDIA_VDBOX1 = BIT(FW_DOMAIN_ID_MEDIA_VDBOX1),
>> + FORCEWAKE_MEDIA_VDBOX2 = BIT(FW_DOMAIN_ID_MEDIA_VDBOX2),
>> + FORCEWAKE_MEDIA_VDBOX3 = BIT(FW_DOMAIN_ID_MEDIA_VDBOX3),
>> + FORCEWAKE_MEDIA_VEBOX0 = BIT(FW_DOMAIN_ID_MEDIA_VEBOX0),
>> + FORCEWAKE_MEDIA_VEBOX1 = BIT(FW_DOMAIN_ID_MEDIA_VEBOX1),
>> +
>> FORCEWAKE_ALL = (FORCEWAKE_RENDER |
>> FORCEWAKE_BLITTER |
>> - FORCEWAKE_MEDIA)
>> + FORCEWAKE_MEDIA |
>> + FORCEWAKE_MEDIA_VDBOX0 |
>> + FORCEWAKE_MEDIA_VDBOX1 |
>> + FORCEWAKE_MEDIA_VDBOX2 |
>> + FORCEWAKE_MEDIA_VDBOX3 |
>> + FORCEWAKE_MEDIA_VEBOX0 |
>> + FORCEWAKE_MEDIA_VEBOX1)
>
> If I am not confused, this could be simplified as:
>
> FORCEWAKE_ALL = BIT(FW_DOMAIN_ID_COUNT) - 1;
>
Yes, that's the same (and looks much better).
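A quick way to convince oneself of the equivalence, as a standalone
sketch. The enum values mirror the quoted hunk; BIT() is a local
stand-in for the kernel macro, and FORCEWAKE_ALL_LONG and
check_forcewake_all() are hypothetical names used only for this
comparison:

```c
#include <assert.h>

/* Local stand-in for the kernel's BIT() macro. */
#define BIT(n) (1u << (n))

enum forcewake_domain_id {
	FW_DOMAIN_ID_RENDER = 0,
	FW_DOMAIN_ID_BLITTER,
	FW_DOMAIN_ID_MEDIA,
	FW_DOMAIN_ID_MEDIA_VDBOX0,
	FW_DOMAIN_ID_MEDIA_VDBOX1,
	FW_DOMAIN_ID_MEDIA_VDBOX2,
	FW_DOMAIN_ID_MEDIA_VDBOX3,
	FW_DOMAIN_ID_MEDIA_VEBOX0,
	FW_DOMAIN_ID_MEDIA_VEBOX1,

	FW_DOMAIN_ID_COUNT
};

enum forcewake_domains {
	FORCEWAKE_RENDER	= BIT(FW_DOMAIN_ID_RENDER),
	FORCEWAKE_BLITTER	= BIT(FW_DOMAIN_ID_BLITTER),
	FORCEWAKE_MEDIA		= BIT(FW_DOMAIN_ID_MEDIA),
	FORCEWAKE_MEDIA_VDBOX0	= BIT(FW_DOMAIN_ID_MEDIA_VDBOX0),
	FORCEWAKE_MEDIA_VDBOX1	= BIT(FW_DOMAIN_ID_MEDIA_VDBOX1),
	FORCEWAKE_MEDIA_VDBOX2	= BIT(FW_DOMAIN_ID_MEDIA_VDBOX2),
	FORCEWAKE_MEDIA_VDBOX3	= BIT(FW_DOMAIN_ID_MEDIA_VDBOX3),
	FORCEWAKE_MEDIA_VEBOX0	= BIT(FW_DOMAIN_ID_MEDIA_VEBOX0),
	FORCEWAKE_MEDIA_VEBOX1	= BIT(FW_DOMAIN_ID_MEDIA_VEBOX1),

	/* Long form, as posted in v10 (name is hypothetical). */
	FORCEWAKE_ALL_LONG = (FORCEWAKE_RENDER |
			      FORCEWAKE_BLITTER |
			      FORCEWAKE_MEDIA |
			      FORCEWAKE_MEDIA_VDBOX0 |
			      FORCEWAKE_MEDIA_VDBOX1 |
			      FORCEWAKE_MEDIA_VDBOX2 |
			      FORCEWAKE_MEDIA_VDBOX3 |
			      FORCEWAKE_MEDIA_VEBOX0 |
			      FORCEWAKE_MEDIA_VEBOX1),

	/* Short form suggested in review. */
	FORCEWAKE_ALL = BIT(FW_DOMAIN_ID_COUNT) - 1
};

/* Illustrative helper: both spellings must yield the same mask. */
void check_forcewake_all(void)
{
	assert(FORCEWAKE_ALL == FORCEWAKE_ALL_LONG);
}
```

The short form relies on the domain ids being dense and starting at 0,
which holds for the enum in this patch; it also picks up any domain
added before FW_DOMAIN_ID_COUNT without further edits.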
>> };
>> struct intel_uncore_funcs {
>> diff --git a/drivers/gpu/drm/i915/selftests/intel_uncore.c
>> b/drivers/gpu/drm/i915/selftests/intel_uncore.c
>> index 2f6367643171..f76f2597df5c 100644
>> --- a/drivers/gpu/drm/i915/selftests/intel_uncore.c
>> +++ b/drivers/gpu/drm/i915/selftests/intel_uncore.c
>> @@ -61,20 +61,30 @@ static int intel_fw_table_check(const struct
>> intel_forcewake_range *ranges,
>> static int intel_shadow_table_check(void)
>> {
>> - const i915_reg_t *reg = gen8_shadowed_regs;
>> - unsigned int i;
>> + struct {
>> + const i915_reg_t *regs;
>> + unsigned int size;
>> + } reg_lists[] = {
>> + { gen8_shadowed_regs, ARRAY_SIZE(gen8_shadowed_regs) },
>> + { gen11_shadowed_regs, ARRAY_SIZE(gen11_shadowed_regs) },
>> + };
>> + const i915_reg_t *reg;
>> + unsigned int i, j;
>> s32 prev;
>> - for (i = 0, prev = -1; i < ARRAY_SIZE(gen8_shadowed_regs); i++,
>> reg++) {
>> - u32 offset = i915_mmio_reg_offset(*reg);
>> + for (j = 0; j < ARRAY_SIZE(reg_lists); ++j) {
>> + reg = reg_lists[j].regs;
>> + for (i = 0, prev = -1; i < reg_lists[j].size; i++, reg++) {
>> + u32 offset = i915_mmio_reg_offset(*reg);
>> - if (prev >= (s32)offset) {
>> - pr_err("%s: entry[%d]:(%x) is before previous (%x)\n",
>> - __func__, i, offset, prev);
>> - return -EINVAL;
>> - }
>> + if (prev >= (s32)offset) {
>> + pr_err("%s: entry[%d]:(%x) is before previous (%x)\n",
>> + __func__, i, offset, prev);
>> + return -EINVAL;
>> + }
>> - prev = offset;
>> + prev = offset;
>> + }
>> }
>> return 0;
>> @@ -90,6 +100,7 @@ int intel_uncore_mock_selftests(void)
>> { __vlv_fw_ranges, ARRAY_SIZE(__vlv_fw_ranges), false },
>> { __chv_fw_ranges, ARRAY_SIZE(__chv_fw_ranges), false },
>> { __gen9_fw_ranges, ARRAY_SIZE(__gen9_fw_ranges), true },
>> + { __gen11_fw_ranges, ARRAY_SIZE(__gen11_fw_ranges), true },
>> };
>> int err, i;
>>
>
> I haven't checked the ranges, but the code looks good. With or without
> the nitpicks:
>
Both are fair improvements, I'll resend it with these changes.
Thanks!
> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>
> Regards,
>
> Tvrtko
* [PATCH v11] drm/i915/icl: Gen11 forcewake support
2018-02-01 0:52 ` [PATCH v10] " Michel Thierry
2018-02-01 10:25 ` Tvrtko Ursulin
@ 2018-02-01 16:08 ` Michel Thierry
2018-02-03 20:26 ` [PATCH v10] " kbuild test robot
2018-02-03 21:43 ` kbuild test robot
3 siblings, 0 replies; 118+ messages in thread
From: Michel Thierry @ 2018-02-01 16:08 UTC (permalink / raw)
To: intel-gfx; +Cc: Paulo Zanoni
From: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
The main difference with previous GENs is that starting from Gen11
each VCS and VECS engine has its own power well, which only exists
if the related engine exists in the HW.
The fallback forcewake request workaround is only needed on gen9
according to the HSDES WA entry (1604254524), so we can go back to using
the simpler fw_domains_get/put functions.
BSpec: 18331
v2: fix fwtable, use array to test shadow tables, create new
accessors to avoid check on every access (Tvrtko)
v3 (from Paulo): Rebase.
v4:
- Range 09400-097FF should be FORCEWAKE_ALL (Daniele)
- Use the BIT macro for forcewake domains (Daniele)
- Add a comment about the range ordering (Oscar)
- Updated commit message (Oscar)
v5: Rebased
v6: Use I915_MAX_VCS/VECS (Michal)
v7: translate FORCEWAKE_ALL to available domains
v8: rebase, add clarification on fallback ack in commit message.
v9: fix rebase issue, change check in fw_domains_init from IS_GEN11
to GEN >= 11
v10: Generate is_genX_shadowed with a macro (Daniele)
Include gen11_fw_ranges in the selftest (Michel)
v11: Simplify FORCEWAKE_ALL, new line between NEEDS_FORCEWAKEs (Tvrtko)
Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Paulo Zanoni <paulo.r.zanoni@intel.com>
Acked-by: Michel Thierry <michel.thierry@intel.com>
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Signed-off-by: Oscar Mateo <oscar.mateo@intel.com>
Signed-off-by: Michel Thierry <michel.thierry@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
---
drivers/gpu/drm/i915/i915_reg.h | 4 +
drivers/gpu/drm/i915/intel_uncore.c | 157 ++++++++++++++++++++++++--
drivers/gpu/drm/i915/intel_uncore.h | 23 +++-
drivers/gpu/drm/i915/selftests/intel_uncore.c | 31 +++--
4 files changed, 189 insertions(+), 26 deletions(-)
diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
index d29e8a0e2ca3..eaca12292ffe 100644
--- a/drivers/gpu/drm/i915/i915_reg.h
+++ b/drivers/gpu/drm/i915/i915_reg.h
@@ -8015,9 +8015,13 @@ enum {
#define VLV_GTLC_PW_RENDER_STATUS_MASK (1 << 7)
#define FORCEWAKE_MT _MMIO(0xa188) /* multi-threaded */
#define FORCEWAKE_MEDIA_GEN9 _MMIO(0xa270)
+#define FORCEWAKE_MEDIA_VDBOX_GEN11(n) _MMIO(0xa540 + (n) * 4)
+#define FORCEWAKE_MEDIA_VEBOX_GEN11(n) _MMIO(0xa560 + (n) * 4)
#define FORCEWAKE_RENDER_GEN9 _MMIO(0xa278)
#define FORCEWAKE_BLITTER_GEN9 _MMIO(0xa188)
#define FORCEWAKE_ACK_MEDIA_GEN9 _MMIO(0x0D88)
+#define FORCEWAKE_ACK_MEDIA_VDBOX_GEN11(n) _MMIO(0x0D50 + (n) * 4)
+#define FORCEWAKE_ACK_MEDIA_VEBOX_GEN11(n) _MMIO(0x0D70 + (n) * 4)
#define FORCEWAKE_ACK_RENDER_GEN9 _MMIO(0x0D84)
#define FORCEWAKE_ACK_BLITTER_GEN9 _MMIO(0x130044)
#define FORCEWAKE_KERNEL BIT(0)
diff --git a/drivers/gpu/drm/i915/intel_uncore.c b/drivers/gpu/drm/i915/intel_uncore.c
index 164dbb8cfa36..abe3e2d44a25 100644
--- a/drivers/gpu/drm/i915/intel_uncore.c
+++ b/drivers/gpu/drm/i915/intel_uncore.c
@@ -37,6 +37,12 @@ static const char * const forcewake_domain_names[] = {
"render",
"blitter",
"media",
+ "vdbox0",
+ "vdbox1",
+ "vdbox2",
+ "vdbox3",
+ "vebox0",
+ "vebox1",
};
const char *
@@ -774,6 +780,9 @@ void assert_forcewakes_active(struct drm_i915_private *dev_priv,
/* We give fast paths for the really cool registers */
#define NEEDS_FORCE_WAKE(reg) ((reg) < 0x40000)
+#define GEN11_NEEDS_FORCE_WAKE(reg) \
+ ((reg) < 0x40000 || ((reg) >= 0x1c0000 && (reg) < 0x1dc000))
+
#define __gen6_reg_read_fw_domains(offset) \
({ \
enum forcewake_domains __fwd; \
@@ -826,6 +835,14 @@ find_fw_domain(struct drm_i915_private *dev_priv, u32 offset)
if (!entry)
return 0;
+ /*
+ * The list of FW domains depends on the SKU in gen11+ so we
+ * can't determine it statically. We use FORCEWAKE_ALL and
+ * translate it here to the list of available domains.
+ */
+ if (entry->domains == FORCEWAKE_ALL)
+ return dev_priv->uncore.fw_domains;
+
WARN(entry->domains & ~dev_priv->uncore.fw_domains,
"Uninitialized forcewake domain(s) 0x%x accessed at 0x%x\n",
entry->domains & ~dev_priv->uncore.fw_domains, offset);
@@ -860,6 +877,14 @@ static const struct intel_forcewake_range __vlv_fw_ranges[] = {
__fwd; \
})
+#define __gen11_fwtable_reg_read_fw_domains(offset) \
+({ \
+ enum forcewake_domains __fwd = 0; \
+ if (GEN11_NEEDS_FORCE_WAKE((offset))) \
+ __fwd = find_fw_domain(dev_priv, offset); \
+ __fwd; \
+})
+
/* *Must* be sorted by offset! See intel_shadow_table_check(). */
static const i915_reg_t gen8_shadowed_regs[] = {
RING_TAIL(RENDER_RING_BASE), /* 0x2000 (base) */
@@ -871,6 +896,20 @@ static const i915_reg_t gen8_shadowed_regs[] = {
/* TODO: Other registers are not yet used */
};
+static const i915_reg_t gen11_shadowed_regs[] = {
+ RING_TAIL(RENDER_RING_BASE), /* 0x2000 (base) */
+ GEN6_RPNSWREQ, /* 0xA008 */
+ GEN6_RC_VIDEO_FREQ, /* 0xA00C */
+ RING_TAIL(BLT_RING_BASE), /* 0x22000 (base) */
+ RING_TAIL(GEN11_BSD_RING_BASE), /* 0x1C0000 (base) */
+ RING_TAIL(GEN11_BSD2_RING_BASE), /* 0x1C4000 (base) */
+ RING_TAIL(GEN11_VEBOX_RING_BASE), /* 0x1C8000 (base) */
+ RING_TAIL(GEN11_BSD3_RING_BASE), /* 0x1D0000 (base) */
+ RING_TAIL(GEN11_BSD4_RING_BASE), /* 0x1D4000 (base) */
+ RING_TAIL(GEN11_VEBOX2_RING_BASE), /* 0x1D8000 (base) */
+ /* TODO: Other registers are not yet used */
+};
+
static int mmio_reg_cmp(u32 key, const i915_reg_t *reg)
{
u32 offset = i915_mmio_reg_offset(*reg);
@@ -883,14 +922,17 @@ static int mmio_reg_cmp(u32 key, const i915_reg_t *reg)
return 0;
}
-static bool is_gen8_shadowed(u32 offset)
-{
- const i915_reg_t *regs = gen8_shadowed_regs;
-
- return BSEARCH(offset, regs, ARRAY_SIZE(gen8_shadowed_regs),
- mmio_reg_cmp);
+#define __is_genX_shadowed(x) \
+static bool is_gen##x##_shadowed(u32 offset) \
+{ \
+ const i915_reg_t *regs = gen##x##_shadowed_regs; \
+ return BSEARCH(offset, regs, ARRAY_SIZE(gen##x##_shadowed_regs), \
+ mmio_reg_cmp); \
}
+__is_genX_shadowed(8)
+__is_genX_shadowed(11)
+
#define __gen8_reg_write_fw_domains(offset) \
({ \
enum forcewake_domains __fwd; \
@@ -929,6 +971,14 @@ static const struct intel_forcewake_range __chv_fw_ranges[] = {
__fwd; \
})
+#define __gen11_fwtable_reg_write_fw_domains(offset) \
+({ \
+ enum forcewake_domains __fwd = 0; \
+ if (GEN11_NEEDS_FORCE_WAKE((offset)) && !is_gen11_shadowed(offset)) \
+ __fwd = find_fw_domain(dev_priv, offset); \
+ __fwd; \
+})
+
/* *Must* be sorted by offset ranges! See intel_fw_table_check(). */
static const struct intel_forcewake_range __gen9_fw_ranges[] = {
GEN_FW_RANGE(0x0, 0xaff, FORCEWAKE_BLITTER),
@@ -965,6 +1015,40 @@ static const struct intel_forcewake_range __gen9_fw_ranges[] = {
GEN_FW_RANGE(0x30000, 0x3ffff, FORCEWAKE_MEDIA),
};
+/* *Must* be sorted by offset ranges! See intel_fw_table_check(). */
+static const struct intel_forcewake_range __gen11_fw_ranges[] = {
+ GEN_FW_RANGE(0x0, 0xaff, FORCEWAKE_BLITTER),
+ GEN_FW_RANGE(0xb00, 0x1fff, 0), /* uncore range */
+ GEN_FW_RANGE(0x2000, 0x26ff, FORCEWAKE_RENDER),
+ GEN_FW_RANGE(0x2700, 0x2fff, FORCEWAKE_BLITTER),
+ GEN_FW_RANGE(0x3000, 0x3fff, FORCEWAKE_RENDER),
+ GEN_FW_RANGE(0x4000, 0x51ff, FORCEWAKE_BLITTER),
+ GEN_FW_RANGE(0x5200, 0x7fff, FORCEWAKE_RENDER),
+ GEN_FW_RANGE(0x8000, 0x813f, FORCEWAKE_BLITTER),
+ GEN_FW_RANGE(0x8140, 0x815f, FORCEWAKE_RENDER),
+ GEN_FW_RANGE(0x8160, 0x82ff, FORCEWAKE_BLITTER),
+ GEN_FW_RANGE(0x8300, 0x84ff, FORCEWAKE_RENDER),
+ GEN_FW_RANGE(0x8500, 0x8bff, FORCEWAKE_BLITTER),
+ GEN_FW_RANGE(0x8c00, 0x8cff, FORCEWAKE_RENDER),
+ GEN_FW_RANGE(0x8d00, 0x93ff, FORCEWAKE_BLITTER),
+ GEN_FW_RANGE(0x9400, 0x97ff, FORCEWAKE_ALL),
+ GEN_FW_RANGE(0x9800, 0xafff, FORCEWAKE_BLITTER),
+ GEN_FW_RANGE(0xb000, 0xb47f, FORCEWAKE_RENDER),
+ GEN_FW_RANGE(0xb480, 0xdfff, FORCEWAKE_BLITTER),
+ GEN_FW_RANGE(0xe000, 0xe8ff, FORCEWAKE_RENDER),
+ GEN_FW_RANGE(0xe900, 0x243ff, FORCEWAKE_BLITTER),
+ GEN_FW_RANGE(0x24400, 0x247ff, FORCEWAKE_RENDER),
+ GEN_FW_RANGE(0x24800, 0x3ffff, FORCEWAKE_BLITTER),
+ GEN_FW_RANGE(0x40000, 0x1bffff, 0),
+ GEN_FW_RANGE(0x1c0000, 0x1c3fff, FORCEWAKE_MEDIA_VDBOX0),
+ GEN_FW_RANGE(0x1c4000, 0x1c7fff, FORCEWAKE_MEDIA_VDBOX1),
+ GEN_FW_RANGE(0x1c8000, 0x1cbfff, FORCEWAKE_MEDIA_VEBOX0),
+ GEN_FW_RANGE(0x1cc000, 0x1cffff, FORCEWAKE_BLITTER),
+ GEN_FW_RANGE(0x1d0000, 0x1d3fff, FORCEWAKE_MEDIA_VDBOX2),
+ GEN_FW_RANGE(0x1d4000, 0x1d7fff, FORCEWAKE_MEDIA_VDBOX3),
+ GEN_FW_RANGE(0x1d8000, 0x1dbfff, FORCEWAKE_MEDIA_VEBOX1)
+};
+
static void
ilk_dummy_write(struct drm_i915_private *dev_priv)
{
@@ -1095,7 +1179,12 @@ func##_read##x(struct drm_i915_private *dev_priv, i915_reg_t reg, bool trace) {
}
#define __gen6_read(x) __gen_read(gen6, x)
#define __fwtable_read(x) __gen_read(fwtable, x)
+#define __gen11_fwtable_read(x) __gen_read(gen11_fwtable, x)
+__gen11_fwtable_read(8)
+__gen11_fwtable_read(16)
+__gen11_fwtable_read(32)
+__gen11_fwtable_read(64)
__fwtable_read(8)
__fwtable_read(16)
__fwtable_read(32)
@@ -1105,6 +1194,7 @@ __gen6_read(16)
__gen6_read(32)
__gen6_read(64)
+#undef __gen11_fwtable_read
#undef __fwtable_read
#undef __gen6_read
#undef GEN6_READ_FOOTER
@@ -1181,7 +1271,11 @@ func##_write##x(struct drm_i915_private *dev_priv, i915_reg_t reg, u##x val, boo
}
#define __gen8_write(x) __gen_write(gen8, x)
#define __fwtable_write(x) __gen_write(fwtable, x)
+#define __gen11_fwtable_write(x) __gen_write(gen11_fwtable, x)
+__gen11_fwtable_write(8)
+__gen11_fwtable_write(16)
+__gen11_fwtable_write(32)
__fwtable_write(8)
__fwtable_write(16)
__fwtable_write(32)
@@ -1192,6 +1286,7 @@ __gen6_write(8)
__gen6_write(16)
__gen6_write(32)
+#undef __gen11_fwtable_write
#undef __fwtable_write
#undef __gen8_write
#undef __gen6_write
@@ -1240,6 +1335,13 @@ static void fw_domain_init(struct drm_i915_private *dev_priv,
BUILD_BUG_ON(FORCEWAKE_RENDER != (1 << FW_DOMAIN_ID_RENDER));
BUILD_BUG_ON(FORCEWAKE_BLITTER != (1 << FW_DOMAIN_ID_BLITTER));
BUILD_BUG_ON(FORCEWAKE_MEDIA != (1 << FW_DOMAIN_ID_MEDIA));
+ BUILD_BUG_ON(FORCEWAKE_MEDIA_VDBOX0 != (1 << FW_DOMAIN_ID_MEDIA_VDBOX0));
+ BUILD_BUG_ON(FORCEWAKE_MEDIA_VDBOX1 != (1 << FW_DOMAIN_ID_MEDIA_VDBOX1));
+ BUILD_BUG_ON(FORCEWAKE_MEDIA_VDBOX2 != (1 << FW_DOMAIN_ID_MEDIA_VDBOX2));
+ BUILD_BUG_ON(FORCEWAKE_MEDIA_VDBOX3 != (1 << FW_DOMAIN_ID_MEDIA_VDBOX3));
+ BUILD_BUG_ON(FORCEWAKE_MEDIA_VEBOX0 != (1 << FW_DOMAIN_ID_MEDIA_VEBOX0));
+ BUILD_BUG_ON(FORCEWAKE_MEDIA_VEBOX1 != (1 << FW_DOMAIN_ID_MEDIA_VEBOX1));
+
d->mask = BIT(domain_id);
@@ -1267,7 +1369,34 @@ static void intel_uncore_fw_domains_init(struct drm_i915_private *dev_priv)
dev_priv->uncore.fw_clear = _MASKED_BIT_DISABLE(FORCEWAKE_KERNEL);
}
- if (INTEL_GEN(dev_priv) >= 9) {
+ if (INTEL_GEN(dev_priv) >= 11) {
+ int i;
+
+ dev_priv->uncore.funcs.force_wake_get = fw_domains_get;
+ dev_priv->uncore.funcs.force_wake_put = fw_domains_put;
+ fw_domain_init(dev_priv, FW_DOMAIN_ID_RENDER,
+ FORCEWAKE_RENDER_GEN9,
+ FORCEWAKE_ACK_RENDER_GEN9);
+ fw_domain_init(dev_priv, FW_DOMAIN_ID_BLITTER,
+ FORCEWAKE_BLITTER_GEN9,
+ FORCEWAKE_ACK_BLITTER_GEN9);
+ for (i = 0; i < I915_MAX_VCS; i++) {
+ if (!HAS_ENGINE(dev_priv, _VCS(i)))
+ continue;
+
+ fw_domain_init(dev_priv, FW_DOMAIN_ID_MEDIA_VDBOX0 + i,
+ FORCEWAKE_MEDIA_VDBOX_GEN11(i),
+ FORCEWAKE_ACK_MEDIA_VDBOX_GEN11(i));
+ }
+ for (i = 0; i < I915_MAX_VECS; i++) {
+ if (!HAS_ENGINE(dev_priv, _VECS(i)))
+ continue;
+
+ fw_domain_init(dev_priv, FW_DOMAIN_ID_MEDIA_VEBOX0 + i,
+ FORCEWAKE_MEDIA_VEBOX_GEN11(i),
+ FORCEWAKE_ACK_MEDIA_VEBOX_GEN11(i));
+ }
+ } else if (IS_GEN9(dev_priv) || IS_GEN10(dev_priv)) {
dev_priv->uncore.funcs.force_wake_get =
fw_domains_get_with_fallback;
dev_priv->uncore.funcs.force_wake_put = fw_domains_put;
@@ -1422,10 +1551,14 @@ void intel_uncore_init(struct drm_i915_private *dev_priv)
ASSIGN_WRITE_MMIO_VFUNCS(dev_priv, gen8);
ASSIGN_READ_MMIO_VFUNCS(dev_priv, gen6);
}
- } else {
+ } else if (IS_GEN(dev_priv, 9, 10)) {
ASSIGN_FW_DOMAINS_TABLE(__gen9_fw_ranges);
ASSIGN_WRITE_MMIO_VFUNCS(dev_priv, fwtable);
ASSIGN_READ_MMIO_VFUNCS(dev_priv, fwtable);
+ } else {
+ ASSIGN_FW_DOMAINS_TABLE(__gen11_fw_ranges);
+ ASSIGN_WRITE_MMIO_VFUNCS(dev_priv, gen11_fwtable);
+ ASSIGN_READ_MMIO_VFUNCS(dev_priv, gen11_fwtable);
}
iosf_mbi_register_pmic_bus_access_notifier(
@@ -1985,7 +2118,9 @@ intel_uncore_forcewake_for_read(struct drm_i915_private *dev_priv,
u32 offset = i915_mmio_reg_offset(reg);
enum forcewake_domains fw_domains;
- if (HAS_FWTABLE(dev_priv)) {
+ if (INTEL_GEN(dev_priv) >= 11) {
+ fw_domains = __gen11_fwtable_reg_read_fw_domains(offset);
+ } else if (HAS_FWTABLE(dev_priv)) {
fw_domains = __fwtable_reg_read_fw_domains(offset);
} else if (INTEL_GEN(dev_priv) >= 6) {
fw_domains = __gen6_reg_read_fw_domains(offset);
@@ -2006,7 +2141,9 @@ intel_uncore_forcewake_for_write(struct drm_i915_private *dev_priv,
u32 offset = i915_mmio_reg_offset(reg);
enum forcewake_domains fw_domains;
- if (HAS_FWTABLE(dev_priv) && !IS_VALLEYVIEW(dev_priv)) {
+ if (INTEL_GEN(dev_priv) >= 11) {
+ fw_domains = __gen11_fwtable_reg_write_fw_domains(offset);
+ } else if (HAS_FWTABLE(dev_priv) && !IS_VALLEYVIEW(dev_priv)) {
fw_domains = __fwtable_reg_write_fw_domains(offset);
} else if (IS_GEN8(dev_priv)) {
fw_domains = __gen8_reg_write_fw_domains(offset);
diff --git a/drivers/gpu/drm/i915/intel_uncore.h b/drivers/gpu/drm/i915/intel_uncore.h
index bed019ef000f..703da58d7dcd 100644
--- a/drivers/gpu/drm/i915/intel_uncore.h
+++ b/drivers/gpu/drm/i915/intel_uncore.h
@@ -37,17 +37,28 @@ enum forcewake_domain_id {
FW_DOMAIN_ID_RENDER = 0,
FW_DOMAIN_ID_BLITTER,
FW_DOMAIN_ID_MEDIA,
+ FW_DOMAIN_ID_MEDIA_VDBOX0,
+ FW_DOMAIN_ID_MEDIA_VDBOX1,
+ FW_DOMAIN_ID_MEDIA_VDBOX2,
+ FW_DOMAIN_ID_MEDIA_VDBOX3,
+ FW_DOMAIN_ID_MEDIA_VEBOX0,
+ FW_DOMAIN_ID_MEDIA_VEBOX1,
FW_DOMAIN_ID_COUNT
};
enum forcewake_domains {
- FORCEWAKE_RENDER = BIT(FW_DOMAIN_ID_RENDER),
- FORCEWAKE_BLITTER = BIT(FW_DOMAIN_ID_BLITTER),
- FORCEWAKE_MEDIA = BIT(FW_DOMAIN_ID_MEDIA),
- FORCEWAKE_ALL = (FORCEWAKE_RENDER |
- FORCEWAKE_BLITTER |
- FORCEWAKE_MEDIA)
+ FORCEWAKE_RENDER = BIT(FW_DOMAIN_ID_RENDER),
+ FORCEWAKE_BLITTER = BIT(FW_DOMAIN_ID_BLITTER),
+ FORCEWAKE_MEDIA = BIT(FW_DOMAIN_ID_MEDIA),
+ FORCEWAKE_MEDIA_VDBOX0 = BIT(FW_DOMAIN_ID_MEDIA_VDBOX0),
+ FORCEWAKE_MEDIA_VDBOX1 = BIT(FW_DOMAIN_ID_MEDIA_VDBOX1),
+ FORCEWAKE_MEDIA_VDBOX2 = BIT(FW_DOMAIN_ID_MEDIA_VDBOX2),
+ FORCEWAKE_MEDIA_VDBOX3 = BIT(FW_DOMAIN_ID_MEDIA_VDBOX3),
+ FORCEWAKE_MEDIA_VEBOX0 = BIT(FW_DOMAIN_ID_MEDIA_VEBOX0),
+ FORCEWAKE_MEDIA_VEBOX1 = BIT(FW_DOMAIN_ID_MEDIA_VEBOX1),
+
+ FORCEWAKE_ALL = BIT(FW_DOMAIN_ID_COUNT) - 1
};
struct intel_uncore_funcs {
diff --git a/drivers/gpu/drm/i915/selftests/intel_uncore.c b/drivers/gpu/drm/i915/selftests/intel_uncore.c
index 2f6367643171..f76f2597df5c 100644
--- a/drivers/gpu/drm/i915/selftests/intel_uncore.c
+++ b/drivers/gpu/drm/i915/selftests/intel_uncore.c
@@ -61,20 +61,30 @@ static int intel_fw_table_check(const struct intel_forcewake_range *ranges,
static int intel_shadow_table_check(void)
{
- const i915_reg_t *reg = gen8_shadowed_regs;
- unsigned int i;
+ struct {
+ const i915_reg_t *regs;
+ unsigned int size;
+ } reg_lists[] = {
+ { gen8_shadowed_regs, ARRAY_SIZE(gen8_shadowed_regs) },
+ { gen11_shadowed_regs, ARRAY_SIZE(gen11_shadowed_regs) },
+ };
+ const i915_reg_t *reg;
+ unsigned int i, j;
s32 prev;
- for (i = 0, prev = -1; i < ARRAY_SIZE(gen8_shadowed_regs); i++, reg++) {
- u32 offset = i915_mmio_reg_offset(*reg);
+ for (j = 0; j < ARRAY_SIZE(reg_lists); ++j) {
+ reg = reg_lists[j].regs;
+ for (i = 0, prev = -1; i < reg_lists[j].size; i++, reg++) {
+ u32 offset = i915_mmio_reg_offset(*reg);
- if (prev >= (s32)offset) {
- pr_err("%s: entry[%d]:(%x) is before previous (%x)\n",
- __func__, i, offset, prev);
- return -EINVAL;
- }
+ if (prev >= (s32)offset) {
+ pr_err("%s: entry[%d]:(%x) is before previous (%x)\n",
+ __func__, i, offset, prev);
+ return -EINVAL;
+ }
- prev = offset;
+ prev = offset;
+ }
}
return 0;
@@ -90,6 +100,7 @@ int intel_uncore_mock_selftests(void)
{ __vlv_fw_ranges, ARRAY_SIZE(__vlv_fw_ranges), false },
{ __chv_fw_ranges, ARRAY_SIZE(__chv_fw_ranges), false },
{ __gen9_fw_ranges, ARRAY_SIZE(__gen9_fw_ranges), true },
+ { __gen11_fw_ranges, ARRAY_SIZE(__gen11_fw_ranges), true },
};
int err, i;
--
2.16.1
* Re: [PATCH v10] drm/i915/icl: Gen11 forcewake support
2018-02-01 0:52 ` [PATCH v10] " Michel Thierry
2018-02-01 10:25 ` Tvrtko Ursulin
2018-02-01 16:08 ` [PATCH v11] " Michel Thierry
@ 2018-02-03 20:26 ` kbuild test robot
2018-02-03 21:43 ` kbuild test robot
3 siblings, 0 replies; 118+ messages in thread
From: kbuild test robot @ 2018-02-03 20:26 UTC (permalink / raw)
To: Michel Thierry; +Cc: intel-gfx, kbuild-all, Paulo Zanoni
[-- Attachment #1: Type: text/plain, Size: 7585 bytes --]
Hi Daniele,
Thank you for the patch! Yet something to improve:
[auto build test ERROR on drm-intel/for-linux-next]
[also build test ERROR on next-20180202]
[cannot apply to v4.15]
[if your patch is applied to the wrong git tree, please drop us a note to help improve the system]
url: https://github.com/0day-ci/linux/commits/Michel-Thierry/drm-i915-icl-Gen11-forcewake-support/20180204-034751
base: git://anongit.freedesktop.org/drm-intel for-linux-next
config: i386-randconfig-x019-201805 (attached as .config)
compiler: gcc-7 (Debian 7.2.0-12) 7.2.1 20171025
reproduce:
# save the attached .config to linux build tree
make ARCH=i386
All error/warnings (new ones prefixed by >>):
In file included from drivers/gpu/drm/i915/i915_drv.h:56:0,
from drivers/gpu/drm/i915/intel_uncore.c:24:
>> drivers/gpu/drm/i915/intel_uncore.c:903:12: error: 'GEN11_BSD_RING_BASE' undeclared here (not in a function); did you mean 'GEN6_BSD_RING_BASE'?
RING_TAIL(GEN11_BSD_RING_BASE), /* 0x1C0000 (base) */
^
drivers/gpu/drm/i915/i915_reg.h:123:47: note: in definition of macro '_MMIO'
#define _MMIO(r) ((const i915_reg_t){ .reg = (r) })
^
>> drivers/gpu/drm/i915/intel_uncore.c:903:2: note: in expansion of macro 'RING_TAIL'
RING_TAIL(GEN11_BSD_RING_BASE), /* 0x1C0000 (base) */
^~~~~~~~~
>> drivers/gpu/drm/i915/intel_uncore.c:904:12: error: 'GEN11_BSD2_RING_BASE' undeclared here (not in a function); did you mean 'GEN11_BSD_RING_BASE'?
RING_TAIL(GEN11_BSD2_RING_BASE), /* 0x1C4000 (base) */
^
drivers/gpu/drm/i915/i915_reg.h:123:47: note: in definition of macro '_MMIO'
#define _MMIO(r) ((const i915_reg_t){ .reg = (r) })
^
drivers/gpu/drm/i915/intel_uncore.c:904:2: note: in expansion of macro 'RING_TAIL'
RING_TAIL(GEN11_BSD2_RING_BASE), /* 0x1C4000 (base) */
^~~~~~~~~
>> drivers/gpu/drm/i915/intel_uncore.c:905:12: error: 'GEN11_VEBOX_RING_BASE' undeclared here (not in a function); did you mean 'GEN11_BSD_RING_BASE'?
RING_TAIL(GEN11_VEBOX_RING_BASE), /* 0x1C8000 (base) */
^
drivers/gpu/drm/i915/i915_reg.h:123:47: note: in definition of macro '_MMIO'
#define _MMIO(r) ((const i915_reg_t){ .reg = (r) })
^
drivers/gpu/drm/i915/intel_uncore.c:905:2: note: in expansion of macro 'RING_TAIL'
RING_TAIL(GEN11_VEBOX_RING_BASE), /* 0x1C8000 (base) */
^~~~~~~~~
>> drivers/gpu/drm/i915/intel_uncore.c:906:12: error: 'GEN11_BSD3_RING_BASE' undeclared here (not in a function); did you mean 'GEN11_BSD2_RING_BASE'?
RING_TAIL(GEN11_BSD3_RING_BASE), /* 0x1D0000 (base) */
^
drivers/gpu/drm/i915/i915_reg.h:123:47: note: in definition of macro '_MMIO'
#define _MMIO(r) ((const i915_reg_t){ .reg = (r) })
^
drivers/gpu/drm/i915/intel_uncore.c:906:2: note: in expansion of macro 'RING_TAIL'
RING_TAIL(GEN11_BSD3_RING_BASE), /* 0x1D0000 (base) */
^~~~~~~~~
>> drivers/gpu/drm/i915/intel_uncore.c:907:12: error: 'GEN11_BSD4_RING_BASE' undeclared here (not in a function); did you mean 'GEN11_BSD3_RING_BASE'?
RING_TAIL(GEN11_BSD4_RING_BASE), /* 0x1D4000 (base) */
^
drivers/gpu/drm/i915/i915_reg.h:123:47: note: in definition of macro '_MMIO'
#define _MMIO(r) ((const i915_reg_t){ .reg = (r) })
^
drivers/gpu/drm/i915/intel_uncore.c:907:2: note: in expansion of macro 'RING_TAIL'
RING_TAIL(GEN11_BSD4_RING_BASE), /* 0x1D4000 (base) */
^~~~~~~~~
>> drivers/gpu/drm/i915/intel_uncore.c:908:12: error: 'GEN11_VEBOX2_RING_BASE' undeclared here (not in a function); did you mean 'GEN11_VEBOX_RING_BASE'?
RING_TAIL(GEN11_VEBOX2_RING_BASE), /* 0x1D8000 (base) */
^
drivers/gpu/drm/i915/i915_reg.h:123:47: note: in definition of macro '_MMIO'
#define _MMIO(r) ((const i915_reg_t){ .reg = (r) })
^
drivers/gpu/drm/i915/intel_uncore.c:908:2: note: in expansion of macro 'RING_TAIL'
RING_TAIL(GEN11_VEBOX2_RING_BASE), /* 0x1D8000 (base) */
^~~~~~~~~
drivers/gpu/drm/i915/intel_uncore.c: In function 'intel_uncore_fw_domains_init':
>> drivers/gpu/drm/i915/intel_uncore.c:1382:19: error: 'I915_MAX_VCS' undeclared (first use in this function); did you mean 'I915_MAP_WC'?
for (i = 0; i < I915_MAX_VCS; i++) {
^~~~~~~~~~~~
I915_MAP_WC
drivers/gpu/drm/i915/intel_uncore.c:1382:19: note: each undeclared identifier is reported only once for each function it appears in
>> drivers/gpu/drm/i915/intel_uncore.c:1382:17: warning: comparison between pointer and integer
for (i = 0; i < I915_MAX_VCS; i++) {
^
>> drivers/gpu/drm/i915/intel_uncore.c:1390:19: error: 'I915_MAX_VECS' undeclared (first use in this function); did you mean 'I915_MAX_VCS'?
for (i = 0; i < I915_MAX_VECS; i++) {
^~~~~~~~~~~~~
I915_MAX_VCS
drivers/gpu/drm/i915/intel_uncore.c:1390:17: warning: comparison between pointer and integer
for (i = 0; i < I915_MAX_VECS; i++) {
^
In file included from include/linux/kernel.h:11:0,
from include/asm-generic/bug.h:18,
from arch/x86/include/asm/bug.h:82,
from include/linux/bug.h:5,
from include/linux/mmdebug.h:5,
from include/linux/gfp.h:5,
from include/linux/slab.h:15,
from include/linux/io-mapping.h:22,
from drivers/gpu/drm/i915/i915_drv.h:36,
from drivers/gpu/drm/i915/intel_uncore.c:24:
>> drivers/gpu/drm/i915/intel_uncore.c:1391:30: error: implicit declaration of function '_VECS'; did you mean '_VCS'? [-Werror=implicit-function-declaration]
if (!HAS_ENGINE(dev_priv, _VECS(i)))
^
include/linux/bitops.h:7:28: note: in definition of macro 'BIT'
#define BIT(nr) (1UL << (nr))
^~
>> drivers/gpu/drm/i915/i915_drv.h:2723:35: note: in expansion of macro 'ENGINE_MASK'
(!!((dev_priv)->info.ring_mask & ENGINE_MASK(id)))
^~~~~~~~~~~
>> drivers/gpu/drm/i915/intel_uncore.c:1391:9: note: in expansion of macro 'HAS_ENGINE'
if (!HAS_ENGINE(dev_priv, _VECS(i)))
^~~~~~~~~~
cc1: some warnings being treated as errors
vim +903 drivers/gpu/drm/i915/intel_uncore.c
897
898 static const i915_reg_t gen11_shadowed_regs[] = {
899 RING_TAIL(RENDER_RING_BASE), /* 0x2000 (base) */
900 GEN6_RPNSWREQ, /* 0xA008 */
901 GEN6_RC_VIDEO_FREQ, /* 0xA00C */
902 RING_TAIL(BLT_RING_BASE), /* 0x22000 (base) */
> 903 RING_TAIL(GEN11_BSD_RING_BASE), /* 0x1C0000 (base) */
> 904 RING_TAIL(GEN11_BSD2_RING_BASE), /* 0x1C4000 (base) */
> 905 RING_TAIL(GEN11_VEBOX_RING_BASE), /* 0x1C8000 (base) */
> 906 RING_TAIL(GEN11_BSD3_RING_BASE), /* 0x1D0000 (base) */
> 907 RING_TAIL(GEN11_BSD4_RING_BASE), /* 0x1D4000 (base) */
> 908 RING_TAIL(GEN11_VEBOX2_RING_BASE), /* 0x1D8000 (base) */
909 /* TODO: Other registers are not yet used */
910 };
911
---
0-DAY kernel test infrastructure Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all Intel Corporation
[-- Attachment #2: .config.gz --]
[-- Type: application/gzip, Size: 32271 bytes --]
* Re: [PATCH v10] drm/i915/icl: Gen11 forcewake support
2018-02-01 0:52 ` [PATCH v10] " Michel Thierry
` (2 preceding siblings ...)
2018-02-03 20:26 ` [PATCH v10] " kbuild test robot
@ 2018-02-03 21:43 ` kbuild test robot
3 siblings, 0 replies; 118+ messages in thread
From: kbuild test robot @ 2018-02-03 21:43 UTC (permalink / raw)
To: Michel Thierry; +Cc: intel-gfx, kbuild-all, Paulo Zanoni
[-- Attachment #1: Type: text/plain, Size: 7169 bytes --]
Hi Daniele,
Thank you for the patch! Yet something to improve:
[auto build test ERROR on drm-intel/for-linux-next]
[also build test ERROR on next-20180202]
[cannot apply to v4.15]
[if your patch is applied to the wrong git tree, please drop us a note to help improve the system]
url: https://github.com/0day-ci/linux/commits/Michel-Thierry/drm-i915-icl-Gen11-forcewake-support/20180204-034751
base: git://anongit.freedesktop.org/drm-intel for-linux-next
config: x86_64-randconfig-u0-02040445 (attached as .config)
compiler: gcc-5 (Debian 5.5.0-3) 5.4.1 20171010
reproduce:
# save the attached .config to linux build tree
make ARCH=x86_64
All errors (new ones prefixed by >>):
In file included from drivers/gpu//drm/i915/i915_drv.h:56:0,
from drivers/gpu//drm/i915/intel_uncore.c:24:
drivers/gpu//drm/i915/intel_uncore.c:903:12: error: 'GEN11_BSD_RING_BASE' undeclared here (not in a function)
RING_TAIL(GEN11_BSD_RING_BASE), /* 0x1C0000 (base) */
^
drivers/gpu//drm/i915/i915_reg.h:123:47: note: in definition of macro '_MMIO'
#define _MMIO(r) ((const i915_reg_t){ .reg = (r) })
^
drivers/gpu//drm/i915/intel_uncore.c:903:2: note: in expansion of macro 'RING_TAIL'
RING_TAIL(GEN11_BSD_RING_BASE), /* 0x1C0000 (base) */
^
>> drivers/gpu//drm/i915/intel_uncore.c:904:12: error: 'GEN11_BSD2_RING_BASE' undeclared here (not in a function)
RING_TAIL(GEN11_BSD2_RING_BASE), /* 0x1C4000 (base) */
^
drivers/gpu//drm/i915/i915_reg.h:123:47: note: in definition of macro '_MMIO'
#define _MMIO(r) ((const i915_reg_t){ .reg = (r) })
^
drivers/gpu//drm/i915/intel_uncore.c:904:2: note: in expansion of macro 'RING_TAIL'
RING_TAIL(GEN11_BSD2_RING_BASE), /* 0x1C4000 (base) */
^
>> drivers/gpu//drm/i915/intel_uncore.c:905:12: error: 'GEN11_VEBOX_RING_BASE' undeclared here (not in a function)
RING_TAIL(GEN11_VEBOX_RING_BASE), /* 0x1C8000 (base) */
^
drivers/gpu//drm/i915/i915_reg.h:123:47: note: in definition of macro '_MMIO'
#define _MMIO(r) ((const i915_reg_t){ .reg = (r) })
^
drivers/gpu//drm/i915/intel_uncore.c:905:2: note: in expansion of macro 'RING_TAIL'
RING_TAIL(GEN11_VEBOX_RING_BASE), /* 0x1C8000 (base) */
^
>> drivers/gpu//drm/i915/intel_uncore.c:906:12: error: 'GEN11_BSD3_RING_BASE' undeclared here (not in a function)
RING_TAIL(GEN11_BSD3_RING_BASE), /* 0x1D0000 (base) */
^
drivers/gpu//drm/i915/i915_reg.h:123:47: note: in definition of macro '_MMIO'
#define _MMIO(r) ((const i915_reg_t){ .reg = (r) })
^
drivers/gpu//drm/i915/intel_uncore.c:906:2: note: in expansion of macro 'RING_TAIL'
RING_TAIL(GEN11_BSD3_RING_BASE), /* 0x1D0000 (base) */
^
>> drivers/gpu//drm/i915/intel_uncore.c:907:12: error: 'GEN11_BSD4_RING_BASE' undeclared here (not in a function)
RING_TAIL(GEN11_BSD4_RING_BASE), /* 0x1D4000 (base) */
^
drivers/gpu//drm/i915/i915_reg.h:123:47: note: in definition of macro '_MMIO'
#define _MMIO(r) ((const i915_reg_t){ .reg = (r) })
^
drivers/gpu//drm/i915/intel_uncore.c:907:2: note: in expansion of macro 'RING_TAIL'
RING_TAIL(GEN11_BSD4_RING_BASE), /* 0x1D4000 (base) */
^
>> drivers/gpu//drm/i915/intel_uncore.c:908:12: error: 'GEN11_VEBOX2_RING_BASE' undeclared here (not in a function)
RING_TAIL(GEN11_VEBOX2_RING_BASE), /* 0x1D8000 (base) */
^
drivers/gpu//drm/i915/i915_reg.h:123:47: note: in definition of macro '_MMIO'
#define _MMIO(r) ((const i915_reg_t){ .reg = (r) })
^
drivers/gpu//drm/i915/intel_uncore.c:908:2: note: in expansion of macro 'RING_TAIL'
RING_TAIL(GEN11_VEBOX2_RING_BASE), /* 0x1D8000 (base) */
^
drivers/gpu//drm/i915/intel_uncore.c: In function 'intel_uncore_fw_domains_init':
>> drivers/gpu//drm/i915/intel_uncore.c:1382:19: error: 'I915_MAX_VCS' undeclared (first use in this function)
for (i = 0; i < I915_MAX_VCS; i++) {
^
drivers/gpu//drm/i915/intel_uncore.c:1382:19: note: each undeclared identifier is reported only once for each function it appears in
drivers/gpu//drm/i915/intel_uncore.c:1382:17: warning: comparison between pointer and integer
for (i = 0; i < I915_MAX_VCS; i++) {
^
>> drivers/gpu//drm/i915/intel_uncore.c:1390:19: error: 'I915_MAX_VECS' undeclared (first use in this function)
for (i = 0; i < I915_MAX_VECS; i++) {
^
drivers/gpu//drm/i915/intel_uncore.c:1390:17: warning: comparison between pointer and integer
for (i = 0; i < I915_MAX_VECS; i++) {
^
In file included from include/linux/kernel.h:11:0,
from include/asm-generic/bug.h:18,
from arch/x86/include/asm/bug.h:82,
from include/linux/bug.h:5,
from include/linux/mmdebug.h:5,
from include/linux/gfp.h:5,
from include/linux/slab.h:15,
from include/linux/io-mapping.h:22,
from drivers/gpu//drm/i915/i915_drv.h:36,
from drivers/gpu//drm/i915/intel_uncore.c:24:
drivers/gpu//drm/i915/intel_uncore.c:1391:30: error: implicit declaration of function '_VECS' [-Werror=implicit-function-declaration]
if (!HAS_ENGINE(dev_priv, _VECS(i)))
^
include/linux/bitops.h:7:28: note: in definition of macro 'BIT'
#define BIT(nr) (1UL << (nr))
^
drivers/gpu//drm/i915/i915_drv.h:2723:35: note: in expansion of macro 'ENGINE_MASK'
(!!((dev_priv)->info.ring_mask & ENGINE_MASK(id)))
^
drivers/gpu//drm/i915/intel_uncore.c:1391:9: note: in expansion of macro 'HAS_ENGINE'
if (!HAS_ENGINE(dev_priv, _VECS(i)))
^
cc1: some warnings being treated as errors
vim +/GEN11_BSD2_RING_BASE +904 drivers/gpu//drm/i915/intel_uncore.c
897
898 static const i915_reg_t gen11_shadowed_regs[] = {
899 RING_TAIL(RENDER_RING_BASE), /* 0x2000 (base) */
900 GEN6_RPNSWREQ, /* 0xA008 */
901 GEN6_RC_VIDEO_FREQ, /* 0xA00C */
902 RING_TAIL(BLT_RING_BASE), /* 0x22000 (base) */
> 903 RING_TAIL(GEN11_BSD_RING_BASE), /* 0x1C0000 (base) */
> 904 RING_TAIL(GEN11_BSD2_RING_BASE), /* 0x1C4000 (base) */
> 905 RING_TAIL(GEN11_VEBOX_RING_BASE), /* 0x1C8000 (base) */
> 906 RING_TAIL(GEN11_BSD3_RING_BASE), /* 0x1D0000 (base) */
> 907 RING_TAIL(GEN11_BSD4_RING_BASE), /* 0x1D4000 (base) */
> 908 RING_TAIL(GEN11_VEBOX2_RING_BASE), /* 0x1D8000 (base) */
909 /* TODO: Other registers are not yet used */
910 };
911
---
0-DAY kernel test infrastructure Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all Intel Corporation
* [PATCH 14/27] drm/i915/icl: Set graphics mode register for gen11
2018-01-09 23:28 ` [PATCH 10/27] drm/i915/icl: Enhanced execution list support Paulo Zanoni
` (2 preceding siblings ...)
2018-01-09 23:28 ` [PATCH 13/27] drm/i915/icl: Gen11 forcewake support Paulo Zanoni
@ 2018-01-09 23:28 ` Paulo Zanoni
2018-01-10 13:40 ` Arkadiusz Hiler
` (2 more replies)
2018-01-09 23:28 ` [PATCH 15/27] drm/i915/icl: new context descriptor support Paulo Zanoni
` (15 subsequent siblings)
19 siblings, 3 replies; 118+ messages in thread
From: Paulo Zanoni @ 2018-01-09 23:28 UTC (permalink / raw)
To: intel-gfx; +Cc: Paulo Zanoni
From: kgardine <kelvin.gardiner@intel.com>
This patch clears a single bit. The bit is 0 by default and is expected to stay
cleared. Explicitly clearing the bit in this patch is intended to indicate that
some thinking has occurred: we want this bit cleared and are not just
accepting the default value.
v2 (from Paulo): fix indentation.
v3 (from Paulo): rebase.
Signed-off-by: kgardine <kelvin.gardiner@intel.com>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
---
drivers/gpu/drm/i915/i915_reg.h | 2 ++
drivers/gpu/drm/i915/intel_lrc.c | 10 ++++++++--
2 files changed, 10 insertions(+), 2 deletions(-)
diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
index f383ee5cc592..a16a8a2b17b4 100644
--- a/drivers/gpu/drm/i915/i915_reg.h
+++ b/drivers/gpu/drm/i915/i915_reg.h
@@ -2597,6 +2597,8 @@ enum i915_power_well_id {
#define GFX_FORWARD_VBLANK_ALWAYS (1<<5)
#define GFX_FORWARD_VBLANK_COND (2<<5)
+#define GEN11_GFX_DISABLE_LEGACY_MODE (1<<3)
+
#define VLV_DISPLAY_BASE 0x180000
#define VLV_MIPI_BASE VLV_DISPLAY_BASE
#define BXT_MIPI_BASE 0x60000
diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index dab988f20833..d435a9982d0b 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -1500,8 +1500,14 @@ static void enable_execlists(struct intel_engine_cs *engine)
struct drm_i915_private *dev_priv = engine->i915;
I915_WRITE(RING_HWSTAM(engine->mmio_base), 0xffffffff);
- I915_WRITE(RING_MODE_GEN7(engine),
- _MASKED_BIT_ENABLE(GFX_RUN_LIST_ENABLE));
+
+ if (IS_GEN11(dev_priv))
+ I915_WRITE(RING_MODE_GEN7(engine),
+ _MASKED_BIT_DISABLE(GEN11_GFX_DISABLE_LEGACY_MODE));
+ else
+ I915_WRITE(RING_MODE_GEN7(engine),
+ _MASKED_BIT_ENABLE(GFX_RUN_LIST_ENABLE));
+
I915_WRITE(RING_HWS_PGA(engine->mmio_base),
engine->status_page.ggtt_offset);
POSTING_READ(RING_HWS_PGA(engine->mmio_base));
--
2.14.3
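Editorial sketch, not part of the patch: the write above goes through the
driver's masked-bit helpers. As far as we understand those helpers, the upper
16 bits of the written value select which of the lower 16 bits the hardware
updates, so a "disable" write touches only the named bit. The standalone model
below assumes that layout; masked_field(), masked_bit_enable(),
masked_bit_disable() and apply_masked_write() are hypothetical stand-ins for
illustration, not the kernel macros themselves.

```c
#include <assert.h>
#include <stdint.h>

/* Bit name and position taken from the patch above. */
#define GEN11_GFX_DISABLE_LEGACY_MODE (1u << 3)

/* Hypothetical model of a masked write: high half = write-enable mask. */
static uint32_t masked_field(uint32_t mask, uint32_t value)
{
	/* Both halves must fit in 16 bits, and value must lie within mask. */
	assert(mask <= 0xffffu && (value & ~mask) == 0);
	return (mask << 16) | value;
}

static uint32_t masked_bit_enable(uint32_t bit)
{
	return masked_field(bit, bit);
}

static uint32_t masked_bit_disable(uint32_t bit)
{
	return masked_field(bit, 0);
}

/* Assumed hardware behaviour: only bits selected by the mask change. */
static uint32_t apply_masked_write(uint32_t reg, uint32_t write)
{
	uint32_t mask = write >> 16;
	uint32_t value = write & 0xffffu;

	return (reg & ~mask) | (value & mask);
}
```

Under these assumptions, the Gen11 branch leaves every other mode bit as it was
and forces only bit 3 to zero, which is why the write is harmless even though
the bit already defaults to 0.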
* Re: [PATCH 14/27] drm/i915/icl: Set graphics mode register for gen11
2018-01-09 23:28 ` [PATCH 14/27] drm/i915/icl: Set graphics mode register for gen11 Paulo Zanoni
@ 2018-01-10 13:40 ` Arkadiusz Hiler
2018-01-11 19:32 ` Daniele Ceraolo Spurio
2018-01-19 19:30 ` [PATCH v3] " Kelvin Gardiner
2 siblings, 0 replies; 118+ messages in thread
From: Arkadiusz Hiler @ 2018-01-10 13:40 UTC (permalink / raw)
To: Paulo Zanoni; +Cc: intel-gfx
On Tue, Jan 09, 2018 at 09:28:22PM -0200, Paulo Zanoni wrote:
> From: kgardine <kelvin.gardiner@intel.com>
Please fix this to use Kelvin's full name when pushing.
Both here and in the s-o-b line.
It may trigger this rule causing a rejection somewhere up the merge
chain:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/Documentation/process/submitting-patches.rst#n460
> This patch clears a single bit. The bit is 0 by default and is expected to stay
> cleared. Explicitly clearing the bit in this patch is intended to indicate that
> some thinking has occurred: we want this bit cleared and are not just
> accepting the default value.
>
> v2 (from Paulo): fix indentation.
> v3 (from Paulo): rebase.
>
> Signed-off-by: kgardine <kelvin.gardiner@intel.com>
> Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
> ---
> drivers/gpu/drm/i915/i915_reg.h | 2 ++
> drivers/gpu/drm/i915/intel_lrc.c | 10 ++++++++--
> 2 files changed, 10 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
> index f383ee5cc592..a16a8a2b17b4 100644
> --- a/drivers/gpu/drm/i915/i915_reg.h
> +++ b/drivers/gpu/drm/i915/i915_reg.h
> @@ -2597,6 +2597,8 @@ enum i915_power_well_id {
> #define GFX_FORWARD_VBLANK_ALWAYS (1<<5)
> #define GFX_FORWARD_VBLANK_COND (2<<5)
>
> +#define GEN11_GFX_DISABLE_LEGACY_MODE (1<<3)
> +
> #define VLV_DISPLAY_BASE 0x180000
> #define VLV_MIPI_BASE VLV_DISPLAY_BASE
> #define BXT_MIPI_BASE 0x60000
> diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
> index dab988f20833..d435a9982d0b 100644
> --- a/drivers/gpu/drm/i915/intel_lrc.c
> +++ b/drivers/gpu/drm/i915/intel_lrc.c
> @@ -1500,8 +1500,14 @@ static void enable_execlists(struct intel_engine_cs *engine)
> struct drm_i915_private *dev_priv = engine->i915;
>
> I915_WRITE(RING_HWSTAM(engine->mmio_base), 0xffffffff);
> - I915_WRITE(RING_MODE_GEN7(engine),
> - _MASKED_BIT_ENABLE(GFX_RUN_LIST_ENABLE));
> +
> + if (IS_GEN11(dev_priv))
> + I915_WRITE(RING_MODE_GEN7(engine),
> + _MASKED_BIT_DISABLE(GEN11_GFX_DISABLE_LEGACY_MODE));
> + else
> + I915_WRITE(RING_MODE_GEN7(engine),
> + _MASKED_BIT_ENABLE(GFX_RUN_LIST_ENABLE));
> +
> I915_WRITE(RING_HWS_PGA(engine->mmio_base),
> engine->status_page.ggtt_offset);
> POSTING_READ(RING_HWS_PGA(engine->mmio_base));
> --
> 2.14.3
>
* Re: [PATCH 14/27] drm/i915/icl: Set graphics mode register for gen11
2018-01-09 23:28 ` [PATCH 14/27] drm/i915/icl: Set graphics mode register for gen11 Paulo Zanoni
2018-01-10 13:40 ` Arkadiusz Hiler
@ 2018-01-11 19:32 ` Daniele Ceraolo Spurio
2018-01-19 19:30 ` [PATCH v3] " Kelvin Gardiner
2 siblings, 0 replies; 118+ messages in thread
From: Daniele Ceraolo Spurio @ 2018-01-11 19:32 UTC (permalink / raw)
To: Paulo Zanoni, intel-gfx
On 09/01/18 15:28, Paulo Zanoni wrote:
> From: kgardine <kelvin.gardiner@intel.com>
>
> This patch clears a single bit. The bit is 0 by default and is expected to stay
> cleared. Explicitly clearing the bit in this patch is intended to indicate that
> some thinking has occurred: we want this bit cleared and are not just
> accepting the default value.
>
> v2 (from Paulo): fix indentation.
> v3 (from Paulo): rebase.
>
> Signed-off-by: kgardine <kelvin.gardiner@intel.com>
> Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
> ---
> drivers/gpu/drm/i915/i915_reg.h | 2 ++
> drivers/gpu/drm/i915/intel_lrc.c | 10 ++++++++--
> 2 files changed, 10 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
> index f383ee5cc592..a16a8a2b17b4 100644
> --- a/drivers/gpu/drm/i915/i915_reg.h
> +++ b/drivers/gpu/drm/i915/i915_reg.h
> @@ -2597,6 +2597,8 @@ enum i915_power_well_id {
> #define GFX_FORWARD_VBLANK_ALWAYS (1<<5)
> #define GFX_FORWARD_VBLANK_COND (2<<5)
>
> +#define GEN11_GFX_DISABLE_LEGACY_MODE (1<<3)
> +
> #define VLV_DISPLAY_BASE 0x180000
> #define VLV_MIPI_BASE VLV_DISPLAY_BASE
> #define BXT_MIPI_BASE 0x60000
> diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
> index dab988f20833..d435a9982d0b 100644
> --- a/drivers/gpu/drm/i915/intel_lrc.c
> +++ b/drivers/gpu/drm/i915/intel_lrc.c
> @@ -1500,8 +1500,14 @@ static void enable_execlists(struct intel_engine_cs *engine)
> struct drm_i915_private *dev_priv = engine->i915;
>
> I915_WRITE(RING_HWSTAM(engine->mmio_base), 0xffffffff);
> - I915_WRITE(RING_MODE_GEN7(engine),
> - _MASKED_BIT_ENABLE(GFX_RUN_LIST_ENABLE));
> +
> + if (IS_GEN11(dev_priv))
INTEL_GEN >= 11? I'd expect this to be valid going forward instead of
flipping back to the old value settings. Also we could use a comment,
because the bit name is not very clear. Something like:
/*
* Make sure we're not enabling the new 12-deep CSB
* FIFO, as that requires slightly updated handling
* in the ctx switch irq. Since we're currently
* using only 2 elements of the enhanced execlists, the
* deeper FIFO is not needed and it's not worth adding
* more statements to the irq handler to support it.
*/
Daniele
> + I915_WRITE(RING_MODE_GEN7(engine),
> + _MASKED_BIT_DISABLE(GEN11_GFX_DISABLE_LEGACY_MODE));
> + else
> + I915_WRITE(RING_MODE_GEN7(engine),
> + _MASKED_BIT_ENABLE(GFX_RUN_LIST_ENABLE));
> +
> I915_WRITE(RING_HWS_PGA(engine->mmio_base),
> engine->status_page.ggtt_offset);
> POSTING_READ(RING_HWS_PGA(engine->mmio_base));
>
* [PATCH v3] drm/i915/icl: Set graphics mode register for gen11
2018-01-09 23:28 ` [PATCH 14/27] drm/i915/icl: Set graphics mode register for gen11 Paulo Zanoni
2018-01-10 13:40 ` Arkadiusz Hiler
2018-01-11 19:32 ` Daniele Ceraolo Spurio
@ 2018-01-19 19:30 ` Kelvin Gardiner
2018-01-19 22:46 ` Daniele Ceraolo Spurio
2 siblings, 1 reply; 118+ messages in thread
From: Kelvin Gardiner @ 2018-01-19 19:30 UTC (permalink / raw)
To: intel-gfx; +Cc: Paulo Zanoni
This patch clears a single bit. The bit is 0 by default and is expected to stay
cleared. Explicitly clearing the bit in this patch is intended to indicate that
some thinking has occurred: we want this bit cleared and are not just
accepting the default value.
v2 (from Paulo): fix indentation.
v3: Changed GEN check to >= 11. Corrected author name.
Signed-off-by: Kelvin Gardiner <kelvin.gardiner@intel.com>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
---
drivers/gpu/drm/i915/i915_reg.h | 2 ++
drivers/gpu/drm/i915/intel_lrc.c | 18 ++++++++++++++++--
2 files changed, 18 insertions(+), 2 deletions(-)
diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
index 73c9c36..057f90e 100644
--- a/drivers/gpu/drm/i915/i915_reg.h
+++ b/drivers/gpu/drm/i915/i915_reg.h
@@ -2604,6 +2604,8 @@ enum i915_power_well_id {
#define GFX_FORWARD_VBLANK_ALWAYS (1<<5)
#define GFX_FORWARD_VBLANK_COND (2<<5)
+#define GEN11_GFX_DISABLE_LEGACY_MODE (1<<3)
+
#define VLV_DISPLAY_BASE 0x180000
#define VLV_MIPI_BASE VLV_DISPLAY_BASE
#define BXT_MIPI_BASE 0x60000
diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index dab988f..d4cc5c9 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -1500,8 +1500,22 @@ static void enable_execlists(struct intel_engine_cs *engine)
struct drm_i915_private *dev_priv = engine->i915;
I915_WRITE(RING_HWSTAM(engine->mmio_base), 0xffffffff);
- I915_WRITE(RING_MODE_GEN7(engine),
- _MASKED_BIT_ENABLE(GFX_RUN_LIST_ENABLE));
+
+ /*
+ * Make sure we're not enabling the new 12-deep CSB
+ * FIFO, as that requires slightly updated handling
+ * in the ctx switch irq. Since we're currently
+ * using only 2 elements of the enhanced execlists, the
+ * deeper FIFO is not needed and it's not worth adding
+ * more statements to the irq handler to support it.
+ */
+ if (INTEL_GEN(dev_priv) >= 11)
+ I915_WRITE(RING_MODE_GEN7(engine),
+ _MASKED_BIT_DISABLE(GEN11_GFX_DISABLE_LEGACY_MODE));
+ else
+ I915_WRITE(RING_MODE_GEN7(engine),
+ _MASKED_BIT_ENABLE(GFX_RUN_LIST_ENABLE));
+
I915_WRITE(RING_HWS_PGA(engine->mmio_base),
engine->status_page.ggtt_offset);
POSTING_READ(RING_HWS_PGA(engine->mmio_base));
--
1.9.1
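Editorial sketch, not part of the patch: the v3 change replaces the exact
Gen11 check with ">= 11", so later generations keep the new behaviour instead
of falling back to setting the run-list enable bit. A minimal model of that
selection follows. The helper name ring_mode_write_value() is hypothetical,
the value used for GFX_RUN_LIST_ENABLE is an assumption (its definition is not
shown in this thread), and the masked-write layout (mask in the upper 16 bits)
is assumed as well.

```c
#include <assert.h>
#include <stdint.h>

#define GEN11_GFX_DISABLE_LEGACY_MODE (1u << 3)  /* from this patch */
#define GFX_RUN_LIST_ENABLE           (1u << 15) /* assumed, not shown in this thread */

/* Hypothetical helper: value to write to the mode register for a generation. */
static uint32_t ring_mode_write_value(int gen)
{
	assert(gen >= 8); /* this sketch only models execlists-era hardware */

	if (gen >= 11)
		/* Mask selects bit 3, value is 0: clear "disable legacy mode". */
		return GEN11_GFX_DISABLE_LEGACY_MODE << 16;

	/* Mask and value both set: enable run lists. */
	return (GFX_RUN_LIST_ENABLE << 16) | GFX_RUN_LIST_ENABLE;
}
```

With a range check, a hypothetical Gen12 takes the same branch as Gen11, which
is the behaviour Daniele asked for in review.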
* Re: [PATCH v3] drm/i915/icl: Set graphics mode register for gen11
2018-01-19 19:30 ` [PATCH v3] " Kelvin Gardiner
@ 2018-01-19 22:46 ` Daniele Ceraolo Spurio
0 siblings, 0 replies; 118+ messages in thread
From: Daniele Ceraolo Spurio @ 2018-01-19 22:46 UTC (permalink / raw)
To: Kelvin Gardiner, intel-gfx; +Cc: Paulo Zanoni
On 19/01/18 11:30, Kelvin Gardiner wrote:
> This patch clears a single bit. The bit is 0 by default and is expected to stay
> cleared. Explicitly clearing the bit in this patch is intended to indicate that
> some thinking has occurred: we want this bit cleared and are not just
> accepting the default value.
>
We also stop setting GFX_RUN_LIST_ENABLE, which is correct since that
bit is gone. Code is self-explanatory but could still use a mention in
the commit message IMHO.
> v2 (from Paulo): fix indentation.
> v3: Changed GEN check to >= 11. Corrected author name.
>
> Signed-off-by: Kelvin Gardiner <kelvin.gardiner@intel.com>
> Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Thanks,
Daniele
> ---
> drivers/gpu/drm/i915/i915_reg.h | 2 ++
> drivers/gpu/drm/i915/intel_lrc.c | 18 ++++++++++++++++--
> 2 files changed, 18 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
> index 73c9c36..057f90e 100644
> --- a/drivers/gpu/drm/i915/i915_reg.h
> +++ b/drivers/gpu/drm/i915/i915_reg.h
> @@ -2604,6 +2604,8 @@ enum i915_power_well_id {
> #define GFX_FORWARD_VBLANK_ALWAYS (1<<5)
> #define GFX_FORWARD_VBLANK_COND (2<<5)
>
> +#define GEN11_GFX_DISABLE_LEGACY_MODE (1<<3)
> +
> #define VLV_DISPLAY_BASE 0x180000
> #define VLV_MIPI_BASE VLV_DISPLAY_BASE
> #define BXT_MIPI_BASE 0x60000
> diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
> index dab988f..d4cc5c9 100644
> --- a/drivers/gpu/drm/i915/intel_lrc.c
> +++ b/drivers/gpu/drm/i915/intel_lrc.c
> @@ -1500,8 +1500,22 @@ static void enable_execlists(struct intel_engine_cs *engine)
> struct drm_i915_private *dev_priv = engine->i915;
>
> I915_WRITE(RING_HWSTAM(engine->mmio_base), 0xffffffff);
> - I915_WRITE(RING_MODE_GEN7(engine),
> - _MASKED_BIT_ENABLE(GFX_RUN_LIST_ENABLE));
> +
> + /*
> + * Make sure we're not enabling the new 12-deep CSB
> + * FIFO, as that requires slightly updated handling
> + * in the ctx switch irq. Since we're currently
> + * using only 2 elements of the enhanced execlists, the
> + * deeper FIFO is not needed and it's not worth adding
> + * more statements to the irq handler to support it.
> + */
> + if (INTEL_GEN(dev_priv) >= 11)
> + I915_WRITE(RING_MODE_GEN7(engine),
> + _MASKED_BIT_DISABLE(GEN11_GFX_DISABLE_LEGACY_MODE));
> + else
> + I915_WRITE(RING_MODE_GEN7(engine),
> + _MASKED_BIT_ENABLE(GFX_RUN_LIST_ENABLE));
> +
> I915_WRITE(RING_HWS_PGA(engine->mmio_base),
> engine->status_page.ggtt_offset);
> POSTING_READ(RING_HWS_PGA(engine->mmio_base));
>
* [PATCH 15/27] drm/i915/icl: new context descriptor support
2018-01-09 23:28 ` [PATCH 10/27] drm/i915/icl: Enhanced execution list support Paulo Zanoni
` (3 preceding siblings ...)
2018-01-09 23:28 ` [PATCH 14/27] drm/i915/icl: Set graphics mode register for gen11 Paulo Zanoni
@ 2018-01-09 23:28 ` Paulo Zanoni
2018-01-09 23:28 ` [PATCH 16/27] drm/i915/icl: Check for fused-off VDBOX and VEBOX instances Paulo Zanoni
` (14 subsequent siblings)
19 siblings, 0 replies; 118+ messages in thread
From: Paulo Zanoni @ 2018-01-09 23:28 UTC (permalink / raw)
To: intel-gfx
From: "Ceraolo Spurio, Daniele" <daniele.ceraolospurio@intel.com>
Starting from Gen11 the context descriptor format has been updated in
the HW. The hw_id field has been considerably reduced in size and engine
class and instance fields have been added.
There is a slight naming clash because the field that we call
hw_id is actually called SW Context ID in the specs for Gen11+.
With the current size of the hw_id field we can have a maximum of 2k
contexts at any time, but we could use the sw_counter field (which is sw
defined) to increase that because the HW requirement is that
engine_id + sw id + sw_counter is a unique number.
GuC uses a similar method to support more contexts but does its tracking
at lrc level. To avoid doing an implementation that will need to be
reworked once GuC support lands, defer it for now and mark it as TODO.
v2: rebased, add documentation, fix GEN11_ENGINE_INSTANCE_SHIFT
v3: rebased, bring back lost code from i915_gem_context.c
Cc: Oscar Mateo <oscar.mateo@intel.com>
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
---
drivers/gpu/drm/i915/i915_drv.h | 1 +
drivers/gpu/drm/i915/i915_gem_context.c | 11 +++++++++--
drivers/gpu/drm/i915/i915_reg.h | 4 ++++
drivers/gpu/drm/i915/intel_lrc.c | 31 ++++++++++++++++++++++++++++++-
4 files changed, 44 insertions(+), 3 deletions(-)
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index bcd8301456f7..2635e73e0ca5 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -2083,6 +2083,7 @@ struct drm_i915_private {
*/
struct ida hw_ida;
#define MAX_CONTEXT_HW_ID (1<<21) /* exclusive */
+#define GEN11_MAX_CONTEXT_HW_ID (1<<11) /* exclusive */
} contexts;
u32 fdi_rx_config;
diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
index 648e7536ff51..dbc50b9e18c9 100644
--- a/drivers/gpu/drm/i915/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/i915_gem_context.c
@@ -211,9 +211,15 @@ static void context_close(struct i915_gem_context *ctx)
static int assign_hw_id(struct drm_i915_private *dev_priv, unsigned *out)
{
int ret;
+ unsigned int max;
+
+ if (INTEL_GEN(dev_priv) >= 11)
+ max = GEN11_MAX_CONTEXT_HW_ID;
+ else
+ max = MAX_CONTEXT_HW_ID;
ret = ida_simple_get(&dev_priv->contexts.hw_ida,
- 0, MAX_CONTEXT_HW_ID, GFP_KERNEL);
+ 0, max, GFP_KERNEL);
if (ret < 0) {
/* Contexts are only released when no longer active.
* Flush any pending retires to hopefully release some
@@ -221,7 +227,7 @@ static int assign_hw_id(struct drm_i915_private *dev_priv, unsigned *out)
*/
i915_gem_retire_requests(dev_priv);
ret = ida_simple_get(&dev_priv->contexts.hw_ida,
- 0, MAX_CONTEXT_HW_ID, GFP_KERNEL);
+ 0, max, GFP_KERNEL);
if (ret < 0)
return ret;
}
@@ -462,6 +468,7 @@ int i915_gem_contexts_init(struct drm_i915_private *dev_priv)
/* Using the simple ida interface, the max is limited by sizeof(int) */
BUILD_BUG_ON(MAX_CONTEXT_HW_ID > INT_MAX);
+ BUILD_BUG_ON(GEN11_MAX_CONTEXT_HW_ID > INT_MAX);
ida_init(&dev_priv->contexts.hw_ida);
/* lowest priority; idle task */
diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
index a16a8a2b17b4..84a36302066f 100644
--- a/drivers/gpu/drm/i915/i915_reg.h
+++ b/drivers/gpu/drm/i915/i915_reg.h
@@ -3844,6 +3844,10 @@ enum {
#define GEN8_CTX_ID_SHIFT 32
#define GEN8_CTX_ID_WIDTH 21
+#define GEN11_SW_CTX_ID_SHIFT 37
+#define GEN11_SW_CTX_ID_WIDTH 11
+#define GEN11_ENGINE_CLASS_SHIFT 61
+#define GEN11_ENGINE_INSTANCE_SHIFT 48
#define CHV_CLK_CTL1 _MMIO(0x101100)
#define VLV_CLK_CTL2 _MMIO(0x101104)
diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index d435a9982d0b..d527a79c872c 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -237,6 +237,18 @@ static void execlists_init_reg_state(u32 *reg_state,
* bits 32-52: ctx ID, a globally unique tag
* bits 53-54: mbz, reserved for use by hardware
* bits 55-63: group ID, currently unused and set to 0
+ *
+ * Starting from Gen11, the upper dword of the descriptor has a new format:
+ *
+ * bits 32-36: reserved
+ * bits 37-47: SW context ID (ctx->hw_id)
+ * bits 48-53: engine instance
+ * bit 54: mbz, reserved for use by hardware
+ * bits 55-60: SW counter
+ * bits 61-63: engine class
+ *
+ * engine info, SW context ID and SW counter need to form a unique number
+ * (Context ID) per lrc.
*/
static void
intel_lr_context_descriptor_update(struct i915_gem_context *ctx,
@@ -246,11 +258,28 @@ intel_lr_context_descriptor_update(struct i915_gem_context *ctx,
u64 desc;
BUILD_BUG_ON(MAX_CONTEXT_HW_ID > (1<<GEN8_CTX_ID_WIDTH));
+ BUILD_BUG_ON(GEN11_MAX_CONTEXT_HW_ID > (1<<GEN11_SW_CTX_ID_WIDTH));
desc = ctx->desc_template; /* bits 0-11 */
desc |= i915_ggtt_offset(ce->state) + LRC_HEADER_PAGES * PAGE_SIZE;
/* bits 12-31 */
- desc |= (u64)ctx->hw_id << GEN8_CTX_ID_SHIFT; /* bits 32-52 */
+
+ if (INTEL_GEN(ctx->i915) >= 11) {
+ desc |= (u64)engine->class << GEN11_ENGINE_CLASS_SHIFT;
+ /* bits 61-63 */
+
+ /*
+ * TODO: use SW counter (bits 60-55) to support more CTXs by
+ * combining it with the SW context ID field?
+ */
+
+ desc |= (u64)engine->instance << GEN11_ENGINE_INSTANCE_SHIFT;
+ /* bits 53-48 */
+ desc |= (u64)ctx->hw_id << GEN11_SW_CTX_ID_SHIFT;
+ /* bits 37-47 */
+ } else {
+ desc |= (u64)ctx->hw_id << GEN8_CTX_ID_SHIFT; /* bits 32-52 */
+ }
ce->lrc_desc = desc;
}
--
2.14.3
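Editorial sketch, not part of the patch: the Gen11 layout documented above can
be exercised outside the driver by packing the three fields into a 64-bit
value and reading them back. The shifts and the SW context ID width are the
ones the patch defines; gen11_desc_upper() and desc_field() are hypothetical
helpers, and the 6-bit instance and 3-bit class widths are inferred from the
bit ranges in the comment.

```c
#include <assert.h>
#include <stdint.h>

/* Shifts and width as defined in the patch above. */
#define GEN11_SW_CTX_ID_SHIFT        37
#define GEN11_SW_CTX_ID_WIDTH        11
#define GEN11_ENGINE_INSTANCE_SHIFT  48
#define GEN11_ENGINE_CLASS_SHIFT     61

/* Hypothetical helper: pack the Gen11 fields of the descriptor's upper dword. */
static uint64_t gen11_desc_upper(uint32_t sw_ctx_id, uint32_t engine_class,
				 uint32_t engine_instance)
{
	assert(sw_ctx_id < (1u << GEN11_SW_CTX_ID_WIDTH)); /* bits 37-47 */
	assert(engine_instance < (1u << 6));                 /* bits 48-53 */
	assert(engine_class < (1u << 3));                    /* bits 61-63 */

	return ((uint64_t)engine_class << GEN11_ENGINE_CLASS_SHIFT) |
	       ((uint64_t)engine_instance << GEN11_ENGINE_INSTANCE_SHIFT) |
	       ((uint64_t)sw_ctx_id << GEN11_SW_CTX_ID_SHIFT);
}

/* Hypothetical helper: read a field of the given width back out. */
static uint32_t desc_field(uint64_t desc, unsigned int shift, unsigned int width)
{
	return (uint32_t)((desc >> shift) & ((1ull << width) - 1));
}
```

The 11-bit SW context ID allows 2048 values. If the 6-bit SW counter (bits
55-60) were folded in as the TODO suggests, the combined 17 bits would allow
131072 distinct values per engine; how that would interact with GuC tracking
is left open in the patch.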
* [PATCH 16/27] drm/i915/icl: Check for fused-off VDBOX and VEBOX instances
2018-01-09 23:28 ` [PATCH 10/27] drm/i915/icl: Enhanced execution list support Paulo Zanoni
` (4 preceding siblings ...)
2018-01-09 23:28 ` [PATCH 15/27] drm/i915/icl: new context descriptor support Paulo Zanoni
@ 2018-01-09 23:28 ` Paulo Zanoni
2018-01-10 9:36 ` Chris Wilson
2018-01-10 23:03 ` [PATCH v8] " Oscar Mateo
2018-01-09 23:28 ` [PATCH 17/27] drm/i915/icl: Enable the extra video decode and enhancement boxes for Icelake 11 Paulo Zanoni
` (13 subsequent siblings)
19 siblings, 2 replies; 118+ messages in thread
From: Paulo Zanoni @ 2018-01-09 23:28 UTC (permalink / raw)
To: intel-gfx; +Cc: Paulo Zanoni, Rodrigo Vivi
From: Oscar Mateo <oscar.mateo@intel.com>
In Gen11, the Video Decode engines (aka VDBOX, aka VCS, aka BSD) and the
Video Enhancement engines (aka VEBOX, aka VECS) could be fused off. Also,
each VDBOX and VEBOX has its own power well, which only exists if the related
engine exists in the HW.
Unfortunately, we have a Catch-22 situation going on: we need to read an
MMIO register with the fuse info, but we cannot fully enable MMIO until
we read it (since we need the real engines to initialize the forcewake
domains). We work around this problem by reading the fuse after the MMIO
is partially ready, but before we initialize forcewake.
Bspec: 20680
v2: We were shifting incorrectly for vebox disable (Vinay)
v3: Assert mmio is ready and warn if we have attempted to initialize
forcewake for fused-off engines (Paulo)
v4:
- Use INTEL_GEN in new code (Tvrtko)
- Shorter local variable (Tvrtko, Michal)
- Keep "if (!...) continue" style (Tvrtko)
- No unnecessary BUG_ON (Tvrtko)
- WARN_ON and cleanup if wrong mask (Tvrtko, Michal)
- Use I915_READ_FW (Michal)
- Use I915_MAX_VCS/VECS macros (Michal)
v5: Rebased by Rodrigo fixing conflicts on top of:
commit 33def1ff7b0 ("drm/i915: Simplify intel_engines_init")
v6: Fix v5. Remove info->num_rings. (by Oscar)
v7: Rebase (Rodrigo).
Cc: Paulo Zanoni <paulo.r.zanoni@intel.com>
Cc: Vinay Belgaumkar <vinay.belgaumkar@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
Signed-off-by: Oscar Mateo <oscar.mateo@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
---
drivers/gpu/drm/i915/i915_drv.c | 2 ++
drivers/gpu/drm/i915/i915_drv.h | 1 +
drivers/gpu/drm/i915/i915_reg.h | 5 +++
drivers/gpu/drm/i915/intel_device_info.c | 53 ++++++++++++++++++++++++++++++++
drivers/gpu/drm/i915/intel_device_info.h | 4 +++
5 files changed, 65 insertions(+)
diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
index 6c8da9d20c33..60aa09410d94 100644
--- a/drivers/gpu/drm/i915/i915_drv.c
+++ b/drivers/gpu/drm/i915/i915_drv.c
@@ -1018,6 +1018,8 @@ static int i915_driver_init_mmio(struct drm_i915_private *dev_priv)
if (ret < 0)
goto err_bridge;
+ intel_device_info_fused_off_engines(dev_priv);
+
intel_uncore_init(dev_priv);
intel_uc_init_mmio(dev_priv);
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 2635e73e0ca5..aa4f2b178d97 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -3418,6 +3418,7 @@ void i915_unreserve_fence(struct drm_i915_fence_reg *fence);
void i915_gem_revoke_fences(struct drm_i915_private *dev_priv);
void i915_gem_restore_fences(struct drm_i915_private *dev_priv);
+void intel_device_info_fused_off_engines(struct drm_i915_private *dev_priv);
void i915_gem_detect_bit_6_swizzle(struct drm_i915_private *dev_priv);
void i915_gem_object_do_bit_17_swizzle(struct drm_i915_gem_object *obj,
struct sg_table *pages);
diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
index 84a36302066f..c9b62502ce69 100644
--- a/drivers/gpu/drm/i915/i915_reg.h
+++ b/drivers/gpu/drm/i915/i915_reg.h
@@ -2804,6 +2804,11 @@ enum i915_power_well_id {
#define GEN10_EU_DISABLE3 _MMIO(0x9140)
#define GEN10_EU_DIS_SS_MASK 0xff
+#define GEN11_GT_VEBOX_VDBOX_DISABLE _MMIO(0x9140)
+#define GEN11_GT_VDBOX_DISABLE_MASK 0xff
+#define GEN11_GT_VEBOX_DISABLE_SHIFT 16
+#define GEN11_GT_VEBOX_DISABLE_MASK (0xff << GEN11_GT_VEBOX_DISABLE_SHIFT)
+
#define GEN6_BSD_SLEEP_PSMI_CONTROL _MMIO(0x12050)
#define GEN6_BSD_SLEEP_MSG_DISABLE (1 << 0)
#define GEN6_BSD_SLEEP_FLUSH_DISABLE (1 << 2)
diff --git a/drivers/gpu/drm/i915/intel_device_info.c b/drivers/gpu/drm/i915/intel_device_info.c
index 25448e38ee76..3316470363a0 100644
--- a/drivers/gpu/drm/i915/intel_device_info.c
+++ b/drivers/gpu/drm/i915/intel_device_info.c
@@ -589,3 +589,56 @@ void intel_device_info_runtime_init(struct intel_device_info *info)
/* Initialize command stream timestamp frequency */
info->cs_timestamp_frequency_khz = read_timestamp_frequency(dev_priv);
}
+
+/*
+ * Determine which engines are fused off in our particular hardware.
+ *
+ * This function needs to be called after the MMIO has been setup (as we need
+ * to read registers) but before uncore init (because the powerwell for the
+ * fused off engines doesn't exist, so we cannot initialize forcewake for them)
+ */
+void intel_device_info_fused_off_engines(struct drm_i915_private *dev_priv)
+{
+ struct intel_device_info *info = mkwrite_device_info(dev_priv);
+ u32 media_fuse;
+ int i;
+
+ if (INTEL_GEN(dev_priv) < 11)
+ return;
+
+ GEM_BUG_ON(!dev_priv->regs);
+
+ media_fuse = I915_READ_FW(GEN11_GT_VEBOX_VDBOX_DISABLE);
+
+ info->vdbox_disable = media_fuse & GEN11_GT_VDBOX_DISABLE_MASK;
+ info->vebox_disable = (media_fuse & GEN11_GT_VEBOX_DISABLE_MASK) >>
+ GEN11_GT_VEBOX_DISABLE_SHIFT;
+
+ DRM_DEBUG_DRIVER("vdbox disable: %04x\n", info->vdbox_disable);
+ for (i = 0; i < I915_MAX_VCS; i++) {
+ if (!HAS_ENGINE(dev_priv, _VCS(i)))
+ continue;
+
+ if (!(BIT(i) & info->vdbox_disable))
+ continue;
+
+ info->ring_mask &= ~ENGINE_MASK(_VCS(i));
+ WARN_ON(dev_priv->uncore.fw_domains &
+ BIT(FW_DOMAIN_ID_MEDIA_VDBOX0 + i));
+ DRM_DEBUG_DRIVER("vcs%u fused off\n", i);
+ }
+
+ DRM_DEBUG_DRIVER("vebox disable: %04x\n", info->vebox_disable);
+ for (i = 0; i < I915_MAX_VECS; i++) {
+ if (!HAS_ENGINE(dev_priv, _VECS(i)))
+ continue;
+
+ if (!(BIT(i) & info->vebox_disable))
+ continue;
+
+ info->ring_mask &= ~ENGINE_MASK(_VECS(i));
+ WARN_ON(dev_priv->uncore.fw_domains &
+ BIT(FW_DOMAIN_ID_MEDIA_VEBOX0 + i));
+ DRM_DEBUG_DRIVER("vecs%u fused off\n", i);
+ }
+}
diff --git a/drivers/gpu/drm/i915/intel_device_info.h b/drivers/gpu/drm/i915/intel_device_info.h
index 980893a9e5e9..65951d6ccb18 100644
--- a/drivers/gpu/drm/i915/intel_device_info.h
+++ b/drivers/gpu/drm/i915/intel_device_info.h
@@ -163,6 +163,10 @@ struct intel_device_info {
u32 cs_timestamp_frequency_khz;
+ /* Fused-off engine info */
+ u8 vdbox_disable;
+ u8 vebox_disable;
+
struct color_luts {
u16 degamma_lut_size;
u16 gamma_lut_size;
--
2.14.3
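Editorial sketch, not part of the patch: the fuse decoding can be modelled as a
pure function from (engine mask, fuse register value) to a pruned engine mask.
The register field layout is the one the patch defines. The engine-mask bit
positions, MAX_VCS, MAX_VECS and prune_fused_off() are hypothetical and chosen
only for illustration; the real driver derives them from its engine IDs.

```c
#include <assert.h>
#include <stdint.h>

/* Field layout as defined in the patch above. */
#define GEN11_GT_VDBOX_DISABLE_MASK   0xffu
#define GEN11_GT_VEBOX_DISABLE_SHIFT  16
#define GEN11_GT_VEBOX_DISABLE_MASK   (0xffu << GEN11_GT_VEBOX_DISABLE_SHIFT)

/* Hypothetical engine-mask layout, for illustration only. */
#define MAX_VCS      4
#define MAX_VECS     2
#define VCS_BIT(i)   (1u << (2 + (i)))           /* bits 2..5 */
#define VECS_BIT(i)  (1u << (2 + MAX_VCS + (i))) /* bits 6..7 */

/* Drop every VCS/VECS engine whose disable bit is set in the fuse value. */
static uint32_t prune_fused_off(uint32_t ring_mask, uint32_t media_fuse)
{
	uint32_t vdbox_disable = media_fuse & GEN11_GT_VDBOX_DISABLE_MASK;
	uint32_t vebox_disable = (media_fuse & GEN11_GT_VEBOX_DISABLE_MASK) >>
				 GEN11_GT_VEBOX_DISABLE_SHIFT;
	int i;

	for (i = 0; i < MAX_VCS; i++)
		if (vdbox_disable & (1u << i))
			ring_mask &= ~VCS_BIT(i);

	for (i = 0; i < MAX_VECS; i++)
		if (vebox_disable & (1u << i))
			ring_mask &= ~VECS_BIT(i);

	return ring_mask;
}
```

Keeping the decode free of side effects makes the ordering constraint in the
commit message easier to see: the pruned mask has to exist before anything
that iterates over engines, such as forcewake domain setup, consumes it.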
* Re: [PATCH 16/27] drm/i915/icl: Check for fused-off VDBOX and VEBOX instances
2018-01-09 23:28 ` [PATCH 16/27] drm/i915/icl: Check for fused-off VDBOX and VEBOX instances Paulo Zanoni
@ 2018-01-10 9:36 ` Chris Wilson
2018-01-10 19:25 ` Oscar Mateo
2018-01-10 23:03 ` [PATCH v8] " Oscar Mateo
1 sibling, 1 reply; 118+ messages in thread
From: Chris Wilson @ 2018-01-10 9:36 UTC (permalink / raw)
To: intel-gfx; +Cc: Paulo Zanoni, Rodrigo Vivi
Quoting Paulo Zanoni (2018-01-09 23:28:24)
> From: Oscar Mateo <oscar.mateo@intel.com>
>
> In Gen11, the Video Decode engines (aka VDBOX, aka VCS, aka BSD) and the
> Video Enhancement engines (aka VEBOX, aka VECS) could be fused off. Also,
> each VDBOX and VEBOX has its own power well, which only exists if the related
> engine exists in the HW.
>
> Unfortunately, we have a Catch-22 situation going on: we need to read an
> MMIO register with the fuse info, but we cannot fully enable MMIO until
> we read it (since we need the real engines to initialize the forcewake
> domains). We work around this problem by reading the fuse after the MMIO
> is partially ready, but before we initialize forcewake.
>
> Bspec: 20680
>
> v2: We were shifting incorrectly for vebox disable (Vinay)
>
> v3: Assert mmio is ready and warn if we have attempted to initialize
> forcewake for fused-off engines (Paulo)
>
> v4:
> - Use INTEL_GEN in new code (Tvrtko)
> - Shorter local variable (Tvrtko, Michal)
> - Keep "if (!...) continue" style (Tvrtko)
> - No unnecessary BUG_ON (Tvrtko)
> - WARN_ON and cleanup if wrong mask (Tvrtko, Michal)
> - Use I915_READ_FW (Michal)
> - Use I915_MAX_VCS/VECS macros (Michal)
>
> v5: Rebased by Rodrigo fixing conflicts on top of:
> commit 33def1ff7b0 ("drm/i915: Simplify intel_engines_init")
>
> v6: Fix v5. Remove info->num_rings. (by Oscar)
>
> v7: Rebase (Rodrigo).
>
> Cc: Paulo Zanoni <paulo.r.zanoni@intel.com>
> Cc: Vinay Belgaumkar <vinay.belgaumkar@intel.com>
> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
> Signed-off-by: Oscar Mateo <oscar.mateo@intel.com>
> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
> ---
> drivers/gpu/drm/i915/i915_drv.c | 2 ++
> drivers/gpu/drm/i915/i915_drv.h | 1 +
> drivers/gpu/drm/i915/i915_reg.h | 5 +++
> drivers/gpu/drm/i915/intel_device_info.c | 53 ++++++++++++++++++++++++++++++++
> drivers/gpu/drm/i915/intel_device_info.h | 4 +++
> 5 files changed, 65 insertions(+)
>
> diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
> index 6c8da9d20c33..60aa09410d94 100644
> --- a/drivers/gpu/drm/i915/i915_drv.c
> +++ b/drivers/gpu/drm/i915/i915_drv.c
> @@ -1018,6 +1018,8 @@ static int i915_driver_init_mmio(struct drm_i915_private *dev_priv)
> if (ret < 0)
> goto err_bridge;
>
> + intel_device_info_fused_off_engines(dev_priv);
intel_device_info_init_mmio();
> +
> intel_uncore_init(dev_priv);
>
> intel_uc_init_mmio(dev_priv);
> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> index 2635e73e0ca5..aa4f2b178d97 100644
> --- a/drivers/gpu/drm/i915/i915_drv.h
> +++ b/drivers/gpu/drm/i915/i915_drv.h
> @@ -3418,6 +3418,7 @@ void i915_unreserve_fence(struct drm_i915_fence_reg *fence);
> void i915_gem_revoke_fences(struct drm_i915_private *dev_priv);
> void i915_gem_restore_fences(struct drm_i915_private *dev_priv);
>
> +void intel_device_info_fused_off_engines(struct drm_i915_private *dev_priv);
> void i915_gem_detect_bit_6_swizzle(struct drm_i915_private *dev_priv);
> void i915_gem_object_do_bit_17_swizzle(struct drm_i915_gem_object *obj,
> struct sg_table *pages);
> diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
> index 84a36302066f..c9b62502ce69 100644
> --- a/drivers/gpu/drm/i915/i915_reg.h
> +++ b/drivers/gpu/drm/i915/i915_reg.h
> @@ -2804,6 +2804,11 @@ enum i915_power_well_id {
> #define GEN10_EU_DISABLE3 _MMIO(0x9140)
> #define GEN10_EU_DIS_SS_MASK 0xff
>
> +#define GEN11_GT_VEBOX_VDBOX_DISABLE _MMIO(0x9140)
> +#define GEN11_GT_VDBOX_DISABLE_MASK 0xff
> +#define GEN11_GT_VEBOX_DISABLE_SHIFT 16
> +#define GEN11_GT_VEBOX_DISABLE_MASK (0xff << GEN11_GT_VEBOX_DISABLE_SHIFT)
> +
> #define GEN6_BSD_SLEEP_PSMI_CONTROL _MMIO(0x12050)
> #define GEN6_BSD_SLEEP_MSG_DISABLE (1 << 0)
> #define GEN6_BSD_SLEEP_FLUSH_DISABLE (1 << 2)
> diff --git a/drivers/gpu/drm/i915/intel_device_info.c b/drivers/gpu/drm/i915/intel_device_info.c
> index 25448e38ee76..3316470363a0 100644
> --- a/drivers/gpu/drm/i915/intel_device_info.c
> +++ b/drivers/gpu/drm/i915/intel_device_info.c
> @@ -589,3 +589,56 @@ void intel_device_info_runtime_init(struct intel_device_info *info)
> /* Initialize command stream timestamp frequency */
> info->cs_timestamp_frequency_khz = read_timestamp_frequency(dev_priv);
> }
> +
> +/*
> + * Determine which engines are fused off in our particular hardware.
> + *
> + * This function needs to be called after the MMIO has been setup (as we need
> + * to read registers) but before uncore init (because the powerwell for the
> + * fused off engines doesn't exist, so we cannot initialize forcewake for them)
> + */
> +void intel_device_info_fused_off_engines(struct drm_i915_private *dev_priv)
> +{
> + struct intel_device_info *info = mkwrite_device_info(dev_priv);
> + u32 media_fuse;
> + int i;
> +
> + if (INTEL_GEN(dev_priv) < 11)
> + return;
> +
> + GEM_BUG_ON(!dev_priv->regs);
> +
> + media_fuse = I915_READ_FW(GEN11_GT_VEBOX_VDBOX_DISABLE);
> +
> + info->vdbox_disable = media_fuse & GEN11_GT_VDBOX_DISABLE_MASK;
> + info->vebox_disable = (media_fuse & GEN11_GT_VEBOX_DISABLE_MASK) >>
> + GEN11_GT_VEBOX_DISABLE_SHIFT;
We don't need to keep these (just locals will do), the permanent
information is in info->ring_mask.
^ permalink raw reply [flat|nested] 118+ messages in thread
* Re: [PATCH 16/27] drm/i915/icl: Check for fused-off VDBOX and VEBOX instances
2018-01-10 9:36 ` Chris Wilson
@ 2018-01-10 19:25 ` Oscar Mateo
2018-01-10 19:32 ` Chris Wilson
0 siblings, 1 reply; 118+ messages in thread
From: Oscar Mateo @ 2018-01-10 19:25 UTC (permalink / raw)
To: Chris Wilson, Paulo Zanoni, intel-gfx; +Cc: Rodrigo Vivi
On 01/10/2018 01:36 AM, Chris Wilson wrote:
> Quoting Paulo Zanoni (2018-01-09 23:28:24)
>> From: Oscar Mateo <oscar.mateo@intel.com>
>>
>> In Gen11, the Video Decode engines (aka VDBOX, aka VCS, aka BSD) and the
>> Video Enhancement engines (aka VEBOX, aka VECS) could be fused off. Also,
>> each VDBOX and VEBOX has its own power well, which only exist if the related
>> engine exists in the HW.
>>
>> Unfortunately, we have a Catch-22 situation going on: we need to read an
>> MMIO register with the fuse info, but we cannot fully enable MMIO until
>> we read it (since we need the real engines to initialize the forcewake
>> domains). We workaround this problem by reading the fuse after the MMIO
>> is partially ready, but before we initialize forcewake.
>>
>> Bspec: 20680
>>
>> v2: We were shifting incorrectly for vebox disable (Vinay)
>>
>> v3: Assert mmio is ready and warn if we have attempted to initialize
>> forcewake for fused-off engines (Paulo)
>>
>> v4:
>> - Use INTEL_GEN in new code (Tvrtko)
>> - Shorter local variable (Tvrtko, Michal)
>> - Keep "if (!...) continue" style (Tvrtko)
>> - No unnecessary BUG_ON (Tvrtko)
>> - WARN_ON and cleanup if wrong mask (Tvrtko, Michal)
>> - Use I915_READ_FW (Michal)
>> - Use I915_MAX_VCS/VECS macros (Michal)
>>
>> v5: Rebased by Rodrigo fixing conflicts on top of:
>> commit 33def1ff7b0 ("drm/i915: Simplify intel_engines_init")
>>
>> v6: Fix v5. Remove info->num_rings. (by Oscar)
>>
>> v7: Rebase (Rodrigo).
>>
>> Cc: Paulo Zanoni <paulo.r.zanoni@intel.com>
>> Cc: Vinay Belgaumkar <vinay.belgaumkar@intel.com>
>> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>> Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
>> Signed-off-by: Oscar Mateo <oscar.mateo@intel.com>
>> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
>> ---
>> drivers/gpu/drm/i915/i915_drv.c | 2 ++
>> drivers/gpu/drm/i915/i915_drv.h | 1 +
>> drivers/gpu/drm/i915/i915_reg.h | 5 +++
>> drivers/gpu/drm/i915/intel_device_info.c | 53 ++++++++++++++++++++++++++++++++
>> drivers/gpu/drm/i915/intel_device_info.h | 4 +++
>> 5 files changed, 65 insertions(+)
>>
>> diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
>> index 6c8da9d20c33..60aa09410d94 100644
>> --- a/drivers/gpu/drm/i915/i915_drv.c
>> +++ b/drivers/gpu/drm/i915/i915_drv.c
>> @@ -1018,6 +1018,8 @@ static int i915_driver_init_mmio(struct drm_i915_private *dev_priv)
>> if (ret < 0)
>> goto err_bridge;
>>
>> + intel_device_info_fused_off_engines(dev_priv);
> intel_device_info_init_mmio();
>
>> +
>> intel_uncore_init(dev_priv);
>>
>> intel_uc_init_mmio(dev_priv);
>> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
>> index 2635e73e0ca5..aa4f2b178d97 100644
>> --- a/drivers/gpu/drm/i915/i915_drv.h
>> +++ b/drivers/gpu/drm/i915/i915_drv.h
>> @@ -3418,6 +3418,7 @@ void i915_unreserve_fence(struct drm_i915_fence_reg *fence);
>> void i915_gem_revoke_fences(struct drm_i915_private *dev_priv);
>> void i915_gem_restore_fences(struct drm_i915_private *dev_priv);
>>
>> +void intel_device_info_fused_off_engines(struct drm_i915_private *dev_priv);
>> void i915_gem_detect_bit_6_swizzle(struct drm_i915_private *dev_priv);
>> void i915_gem_object_do_bit_17_swizzle(struct drm_i915_gem_object *obj,
>> struct sg_table *pages);
>> diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
>> index 84a36302066f..c9b62502ce69 100644
>> --- a/drivers/gpu/drm/i915/i915_reg.h
>> +++ b/drivers/gpu/drm/i915/i915_reg.h
>> @@ -2804,6 +2804,11 @@ enum i915_power_well_id {
>> #define GEN10_EU_DISABLE3 _MMIO(0x9140)
>> #define GEN10_EU_DIS_SS_MASK 0xff
>>
>> +#define GEN11_GT_VEBOX_VDBOX_DISABLE _MMIO(0x9140)
>> +#define GEN11_GT_VDBOX_DISABLE_MASK 0xff
>> +#define GEN11_GT_VEBOX_DISABLE_SHIFT 16
>> +#define GEN11_GT_VEBOX_DISABLE_MASK (0xff << GEN11_GT_VEBOX_DISABLE_SHIFT)
>> +
>> #define GEN6_BSD_SLEEP_PSMI_CONTROL _MMIO(0x12050)
>> #define GEN6_BSD_SLEEP_MSG_DISABLE (1 << 0)
>> #define GEN6_BSD_SLEEP_FLUSH_DISABLE (1 << 2)
>> diff --git a/drivers/gpu/drm/i915/intel_device_info.c b/drivers/gpu/drm/i915/intel_device_info.c
>> index 25448e38ee76..3316470363a0 100644
>> --- a/drivers/gpu/drm/i915/intel_device_info.c
>> +++ b/drivers/gpu/drm/i915/intel_device_info.c
>> @@ -589,3 +589,56 @@ void intel_device_info_runtime_init(struct intel_device_info *info)
>> /* Initialize command stream timestamp frequency */
>> info->cs_timestamp_frequency_khz = read_timestamp_frequency(dev_priv);
>> }
>> +
>> +/*
>> + * Determine which engines are fused off in our particular hardware.
>> + *
>> + * This function needs to be called after the MMIO has been setup (as we need
>> + * to read registers) but before uncore init (because the powerwell for the
>> + * fused off engines doesn't exist, so we cannot initialize forcewake for them)
>> + */
>> +void intel_device_info_fused_off_engines(struct drm_i915_private *dev_priv)
>> +{
>> + struct intel_device_info *info = mkwrite_device_info(dev_priv);
>> + u32 media_fuse;
>> + int i;
>> +
>> + if (INTEL_GEN(dev_priv) < 11)
>> + return;
>> +
>> + GEM_BUG_ON(!dev_priv->regs);
>> +
>> + media_fuse = I915_READ_FW(GEN11_GT_VEBOX_VDBOX_DISABLE);
>> +
>> + info->vdbox_disable = media_fuse & GEN11_GT_VDBOX_DISABLE_MASK;
>> + info->vebox_disable = (media_fuse & GEN11_GT_VEBOX_DISABLE_MASK) >>
>> + GEN11_GT_VEBOX_DISABLE_SHIFT;
> We don't need to keep these (just locals will do), the permanent
> information is in info->ring_mask.
There are subsequent patches that pass this info to GuC, that's why I
was keeping them. I could retrieve the information back from
info->ring_mask, but it's a pity since we already have it in the right
format here.
^ permalink raw reply [flat|nested] 118+ messages in thread
* Re: [PATCH 16/27] drm/i915/icl: Check for fused-off VDBOX and VEBOX instances
2018-01-10 19:25 ` Oscar Mateo
@ 2018-01-10 19:32 ` Chris Wilson
2018-01-10 19:33 ` Chris Wilson
0 siblings, 1 reply; 118+ messages in thread
From: Chris Wilson @ 2018-01-10 19:32 UTC (permalink / raw)
To: Oscar Mateo, Paulo Zanoni, intel-gfx; +Cc: Rodrigo Vivi
Quoting Oscar Mateo (2018-01-10 19:25:39)
>
>
> On 01/10/2018 01:36 AM, Chris Wilson wrote:
> >> +/*
> >> + * Determine which engines are fused off in our particular hardware.
> >> + *
> >> + * This function needs to be called after the MMIO has been setup (as we need
> >> + * to read registers) but before uncore init (because the powerwell for the
> >> + * fused off engines doesn't exist, so we cannot initialize forcewake for them)
> >> + */
> >> +void intel_device_info_fused_off_engines(struct drm_i915_private *dev_priv)
> >> +{
> >> + struct intel_device_info *info = mkwrite_device_info(dev_priv);
> >> + u32 media_fuse;
> >> + int i;
> >> +
> >> + if (INTEL_GEN(dev_priv) < 11)
> >> + return;
> >> +
> >> + GEM_BUG_ON(!dev_priv->regs);
> >> +
> >> + media_fuse = I915_READ_FW(GEN11_GT_VEBOX_VDBOX_DISABLE);
> >> +
> >> + info->vdbox_disable = media_fuse & GEN11_GT_VDBOX_DISABLE_MASK;
> >> + info->vebox_disable = (media_fuse & GEN11_GT_VEBOX_DISABLE_MASK) >>
> >> + GEN11_GT_VEBOX_DISABLE_SHIFT;
> > We don't need to keep these (just locals will do), the permanent
> > information is in info->ring_mask.
>
> There are subsequent patches that pass this info to GuC, that's why I
> was keeping them. I could retrieve the information back from
> info->ring_mask, but it's a pity since we already have it in the right
> format here.
If there's a use, sure. You can always add it later along with the user
if the patches are separated into a different series. Just nothing in
this patch justified keeping them.
-Chris
^ permalink raw reply [flat|nested] 118+ messages in thread
* Re: [PATCH 16/27] drm/i915/icl: Check for fused-off VDBOX and VEBOX instances
2018-01-10 19:32 ` Chris Wilson
@ 2018-01-10 19:33 ` Chris Wilson
2018-01-10 23:02 ` Oscar Mateo
0 siblings, 1 reply; 118+ messages in thread
From: Chris Wilson @ 2018-01-10 19:33 UTC (permalink / raw)
To: Oscar Mateo, Paulo Zanoni, intel-gfx; +Cc: Rodrigo Vivi
Quoting Chris Wilson (2018-01-10 19:32:09)
> Quoting Oscar Mateo (2018-01-10 19:25:39)
> >
> >
> > On 01/10/2018 01:36 AM, Chris Wilson wrote:
> > >> +/*
> > >> + * Determine which engines are fused off in our particular hardware.
> > >> + *
> > >> + * This function needs to be called after the MMIO has been setup (as we need
> > >> + * to read registers) but before uncore init (because the powerwell for the
> > >> + * fused off engines doesn't exist, so we cannot initialize forcewake for them)
> > >> + */
> > >> +void intel_device_info_fused_off_engines(struct drm_i915_private *dev_priv)
> > >> +{
> > >> + struct intel_device_info *info = mkwrite_device_info(dev_priv);
> > >> + u32 media_fuse;
> > >> + int i;
> > >> +
> > >> + if (INTEL_GEN(dev_priv) < 11)
> > >> + return;
> > >> +
> > >> + GEM_BUG_ON(!dev_priv->regs);
> > >> +
> > >> + media_fuse = I915_READ_FW(GEN11_GT_VEBOX_VDBOX_DISABLE);
> > >> +
> > >> + info->vdbox_disable = media_fuse & GEN11_GT_VDBOX_DISABLE_MASK;
> > >> + info->vebox_disable = (media_fuse & GEN11_GT_VEBOX_DISABLE_MASK) >>
> > >> + GEN11_GT_VEBOX_DISABLE_SHIFT;
> > > We don't need to keep these (just locals will do), the permanent
> > > information is in info->ring_mask.
> >
> > There are subsequent patches that pass this info to GuC, that's why I
> > was keeping them. I could retrieve the information back from
> > info->ring_mask, but it's a pity since we already have it in the right
> > format here.
>
> If there's a use, sure. You can always add it later along with the user
> if the patches are separated into a different series. Just nothing in
> this patch justified keeping them.
The counter argument is that if there is only a single use case, reading
the registers again isn't an issue, especially if, as you say, the
register contents are exactly what the guc wants to be told.
-Chris
^ permalink raw reply [flat|nested] 118+ messages in thread
* Re: [PATCH 16/27] drm/i915/icl: Check for fused-off VDBOX and VEBOX instances
2018-01-10 19:33 ` Chris Wilson
@ 2018-01-10 23:02 ` Oscar Mateo
0 siblings, 0 replies; 118+ messages in thread
From: Oscar Mateo @ 2018-01-10 23:02 UTC (permalink / raw)
To: Chris Wilson, Paulo Zanoni, intel-gfx; +Cc: Rodrigo Vivi
On 01/10/2018 11:33 AM, Chris Wilson wrote:
> Quoting Chris Wilson (2018-01-10 19:32:09)
>> Quoting Oscar Mateo (2018-01-10 19:25:39)
>>>
>>> On 01/10/2018 01:36 AM, Chris Wilson wrote:
>>>>> +/*
>>>>> + * Determine which engines are fused off in our particular hardware.
>>>>> + *
>>>>> + * This function needs to be called after the MMIO has been setup (as we need
>>>>> + * to read registers) but before uncore init (because the powerwell for the
>>>>> + * fused off engines doesn't exist, so we cannot initialize forcewake for them)
>>>>> + */
>>>>> +void intel_device_info_fused_off_engines(struct drm_i915_private *dev_priv)
>>>>> +{
>>>>> + struct intel_device_info *info = mkwrite_device_info(dev_priv);
>>>>> + u32 media_fuse;
>>>>> + int i;
>>>>> +
>>>>> + if (INTEL_GEN(dev_priv) < 11)
>>>>> + return;
>>>>> +
>>>>> + GEM_BUG_ON(!dev_priv->regs);
>>>>> +
>>>>> + media_fuse = I915_READ_FW(GEN11_GT_VEBOX_VDBOX_DISABLE);
>>>>> +
>>>>> + info->vdbox_disable = media_fuse & GEN11_GT_VDBOX_DISABLE_MASK;
>>>>> + info->vebox_disable = (media_fuse & GEN11_GT_VEBOX_DISABLE_MASK) >>
>>>>> + GEN11_GT_VEBOX_DISABLE_SHIFT;
>>>> We don't need to keep these (just locals will do), the permanent
>>>> information is in info->ring_mask.
>>> There are subsequent patches that pass this info to GuC, that's why I
>>> was keeping them. I could retrieve the information back from
>>> info->ring_mask, but it's a pity since we already have it in the right
>>> format here.
>> If there's a use, sure. You can always add it later along with the user
>> if the patches are separated into a different series. Just nothing in
>> this patch justified keeping them.
> The counter argument is that if there is only a single use case, reading
> the registers again isn't an issue, especially if, as you say, the
> register contents are exactly what the guc wants to be told.
> -Chris
That's fair enough. I'll resend.
^ permalink raw reply [flat|nested] 118+ messages in thread
* [PATCH v8] drm/i915/icl: Check for fused-off VDBOX and VEBOX instances
2018-01-09 23:28 ` [PATCH 16/27] drm/i915/icl: Check for fused-off VDBOX and VEBOX instances Paulo Zanoni
2018-01-10 9:36 ` Chris Wilson
@ 2018-01-10 23:03 ` Oscar Mateo
1 sibling, 0 replies; 118+ messages in thread
From: Oscar Mateo @ 2018-01-10 23:03 UTC (permalink / raw)
To: intel-gfx; +Cc: Paulo Zanoni, Rodrigo Vivi
In Gen11, the Video Decode engines (aka VDBOX, aka VCS, aka BSD) and the
Video Enhancement engines (aka VEBOX, aka VECS) could be fused off. Also,
each VDBOX and VEBOX has its own power well, which only exists if the related
engine exists in the HW.
Unfortunately, we have a Catch-22 situation going on: we need to read an
MMIO register with the fuse info, but we cannot fully enable MMIO until
we read it (since we need the real engines to initialize the forcewake
domains). We work around this problem by reading the fuse after the MMIO
is partially ready, but before we initialize forcewake.
Bspec: 20680
v2: We were shifting incorrectly for vebox disable (Vinay)
v3: Assert mmio is ready and warn if we have attempted to initialize
forcewake for fused-off engines (Paulo)
v4:
- Use INTEL_GEN in new code (Tvrtko)
- Shorter local variable (Tvrtko, Michal)
- Keep "if (!...) continue" style (Tvrtko)
- No unnecessary BUG_ON (Tvrtko)
- WARN_ON and cleanup if wrong mask (Tvrtko, Michal)
- Use I915_READ_FW (Michal)
- Use I915_MAX_VCS/VECS macros (Michal)
v5: Rebased by Rodrigo fixing conflicts on top of:
commit 33def1ff7b0 ("drm/i915: Simplify intel_engines_init")
v6: Fix v5. Remove info->num_rings. (by Oscar)
v7: Rebase (Rodrigo).
v8:
- s/intel_device_info_fused_off_engines/intel_device_info_init_mmio (Chris)
- Make vdbox_disable & vebox_disable local variables (Chris)
Cc: Paulo Zanoni <paulo.r.zanoni@intel.com>
Cc: Vinay Belgaumkar <vinay.belgaumkar@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Oscar Mateo <oscar.mateo@intel.com>
---
drivers/gpu/drm/i915/i915_drv.c | 2 ++
drivers/gpu/drm/i915/i915_drv.h | 1 +
drivers/gpu/drm/i915/i915_reg.h | 5 +++
drivers/gpu/drm/i915/intel_device_info.c | 54 ++++++++++++++++++++++++++++++++
4 files changed, 62 insertions(+)
diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
index 6c8da9d..fc2c1f3 100644
--- a/drivers/gpu/drm/i915/i915_drv.c
+++ b/drivers/gpu/drm/i915/i915_drv.c
@@ -1018,6 +1018,8 @@ static int i915_driver_init_mmio(struct drm_i915_private *dev_priv)
if (ret < 0)
goto err_bridge;
+ intel_device_info_init_mmio(dev_priv);
+
intel_uncore_init(dev_priv);
intel_uc_init_mmio(dev_priv);
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 2635e73..f85a047 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -3418,6 +3418,7 @@ struct drm_i915_fence_reg *
void i915_gem_revoke_fences(struct drm_i915_private *dev_priv);
void i915_gem_restore_fences(struct drm_i915_private *dev_priv);
+void intel_device_info_init_mmio(struct drm_i915_private *dev_priv);
void i915_gem_detect_bit_6_swizzle(struct drm_i915_private *dev_priv);
void i915_gem_object_do_bit_17_swizzle(struct drm_i915_gem_object *obj,
struct sg_table *pages);
diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
index 84a3630..c9b6250 100644
--- a/drivers/gpu/drm/i915/i915_reg.h
+++ b/drivers/gpu/drm/i915/i915_reg.h
@@ -2804,6 +2804,11 @@ enum i915_power_well_id {
#define GEN10_EU_DISABLE3 _MMIO(0x9140)
#define GEN10_EU_DIS_SS_MASK 0xff
+#define GEN11_GT_VEBOX_VDBOX_DISABLE _MMIO(0x9140)
+#define GEN11_GT_VDBOX_DISABLE_MASK 0xff
+#define GEN11_GT_VEBOX_DISABLE_SHIFT 16
+#define GEN11_GT_VEBOX_DISABLE_MASK (0xff << GEN11_GT_VEBOX_DISABLE_SHIFT)
+
#define GEN6_BSD_SLEEP_PSMI_CONTROL _MMIO(0x12050)
#define GEN6_BSD_SLEEP_MSG_DISABLE (1 << 0)
#define GEN6_BSD_SLEEP_FLUSH_DISABLE (1 << 2)
diff --git a/drivers/gpu/drm/i915/intel_device_info.c b/drivers/gpu/drm/i915/intel_device_info.c
index 25448e3..4aa4ee4 100644
--- a/drivers/gpu/drm/i915/intel_device_info.c
+++ b/drivers/gpu/drm/i915/intel_device_info.c
@@ -589,3 +589,57 @@ void intel_device_info_runtime_init(struct intel_device_info *info)
/* Initialize command stream timestamp frequency */
info->cs_timestamp_frequency_khz = read_timestamp_frequency(dev_priv);
}
+
+/*
+ * Determine which engines are fused off in our particular hardware.
+ *
+ * This function needs to be called after the MMIO has been setup (as we need
+ * to read registers) but before uncore init (because the powerwell for the
+ * fused off engines doesn't exist, so we cannot initialize forcewake for them)
+ */
+void intel_device_info_init_mmio(struct drm_i915_private *dev_priv)
+{
+ struct intel_device_info *info = mkwrite_device_info(dev_priv);
+ u8 vdbox_disable, vebox_disable;
+ u32 media_fuse;
+ int i;
+
+ if (INTEL_GEN(dev_priv) < 11)
+ return;
+
+ GEM_BUG_ON(!dev_priv->regs);
+
+ media_fuse = I915_READ_FW(GEN11_GT_VEBOX_VDBOX_DISABLE);
+
+ vdbox_disable = media_fuse & GEN11_GT_VDBOX_DISABLE_MASK;
+ vebox_disable = (media_fuse & GEN11_GT_VEBOX_DISABLE_MASK) >>
+ GEN11_GT_VEBOX_DISABLE_SHIFT;
+
+ DRM_DEBUG_DRIVER("vdbox disable: %04x\n", vdbox_disable);
+ for (i = 0; i < I915_MAX_VCS; i++) {
+ if (!HAS_ENGINE(dev_priv, _VCS(i)))
+ continue;
+
+ if (!(BIT(i) & vdbox_disable))
+ continue;
+
+ info->ring_mask &= ~ENGINE_MASK(_VCS(i));
+ WARN_ON(dev_priv->uncore.fw_domains &
+ BIT(FW_DOMAIN_ID_MEDIA_VDBOX0 + i));
+ DRM_DEBUG_DRIVER("vcs%u fused off\n", i);
+ }
+
+ DRM_DEBUG_DRIVER("vebox disable: %04x\n", vebox_disable);
+ for (i = 0; i < I915_MAX_VECS; i++) {
+ if (!HAS_ENGINE(dev_priv, _VECS(i)))
+ continue;
+
+ if (!(BIT(i) & vebox_disable))
+ continue;
+
+ info->ring_mask &= ~ENGINE_MASK(_VECS(i));
+ WARN_ON(dev_priv->uncore.fw_domains &
+ BIT(FW_DOMAIN_ID_MEDIA_VEBOX0 + i));
+ DRM_DEBUG_DRIVER("vecs%u fused off\n", i);
+ }
+}
--
1.9.1
^ permalink raw reply related [flat|nested] 118+ messages in thread
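The decode step in the v8 patch can be tried outside the driver. The following is a minimal standalone sketch, not the driver code: the field layout (VDBOX disable bits in bits 0-7, VEBOX disable bits starting at bit 16) is taken from the defines added to i915_reg.h above, while the engine-mask bit positions (VCS_BIT, VECS_BIT) and all function names are assumptions made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Field layout as in the GEN11_GT_*_DISABLE_* defines from the patch. */
#define VDBOX_DISABLE_MASK   0xffu
#define VEBOX_DISABLE_SHIFT  16
#define VEBOX_DISABLE_MASK   (0xffu << VEBOX_DISABLE_SHIFT)

/* Hypothetical engine-mask layout, for illustration only. */
#define MAX_VCS   4
#define MAX_VECS  2
#define VCS_BIT(i)   (1u << (2 + (i)))            /* assumed bits 2..5 */
#define VECS_BIT(i)  (1u << (2 + MAX_VCS + (i)))  /* assumed bits 6..7 */

/* Low byte of the fuse value: one disable bit per VDBOX instance. */
static uint8_t fuse_vdbox_disable(uint32_t media_fuse)
{
	return media_fuse & VDBOX_DISABLE_MASK;
}

/* Byte at bit 16 of the fuse value: one disable bit per VEBOX instance. */
static uint8_t fuse_vebox_disable(uint32_t media_fuse)
{
	return (media_fuse & VEBOX_DISABLE_MASK) >> VEBOX_DISABLE_SHIFT;
}

/* True if instance i is marked as fused off in an 8-bit disable field. */
static int instance_fused_off(uint8_t disable, unsigned int i)
{
	assert(i < 8);
	return (disable >> i) & 1u;
}

/*
 * Clear the mask bit of every fused-off engine. Engines that were never
 * in the mask stay absent, so clearing their bit is a no-op.
 */
static uint32_t prune_fused_off(uint32_t engine_mask, uint32_t media_fuse)
{
	uint8_t vdbox_disable = fuse_vdbox_disable(media_fuse);
	uint8_t vebox_disable = fuse_vebox_disable(media_fuse);
	unsigned int i;

	for (i = 0; i < MAX_VCS; i++)
		if (instance_fused_off(vdbox_disable, i))
			engine_mask &= ~VCS_BIT(i);

	for (i = 0; i < MAX_VECS; i++)
		if (instance_fused_off(vebox_disable, i))
			engine_mask &= ~VECS_BIT(i);

	return engine_mask;
}
```

Both disable fields are kept as locals inside prune_fused_off(), in line with the v8 changelog: only the pruned mask outlives the call.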
* [PATCH 17/27] drm/i915/icl: Enable the extra video decode and enhancement boxes for Icelake 11
2018-01-09 23:28 ` [PATCH 10/27] drm/i915/icl: Enhanced execution list support Paulo Zanoni
` (5 preceding siblings ...)
2018-01-09 23:28 ` [PATCH 16/27] drm/i915/icl: Check for fused-off VDBOX and VEBOX instances Paulo Zanoni
@ 2018-01-09 23:28 ` Paulo Zanoni
2018-01-09 23:28 ` [PATCH 18/27] drm/i915/icl: Update subslice define for ICL 11 Paulo Zanoni
` (12 subsequent siblings)
19 siblings, 0 replies; 118+ messages in thread
From: Paulo Zanoni @ 2018-01-09 23:28 UTC (permalink / raw)
To: intel-gfx
From: Oscar Mateo <oscar.mateo@intel.com>
Icelake 11 has one vebox and two vdboxes (0 and 2).
Bspec: 21140
v2: Split out in two (Daniele)
Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Signed-off-by: Oscar Mateo <oscar.mateo@intel.com>
---
drivers/gpu/drm/i915/i915_pci.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/drivers/gpu/drm/i915/i915_pci.c b/drivers/gpu/drm/i915/i915_pci.c
index 0a807bb44583..8984cdb98ac3 100644
--- a/drivers/gpu/drm/i915/i915_pci.c
+++ b/drivers/gpu/drm/i915/i915_pci.c
@@ -590,6 +590,7 @@ static const struct intel_device_info intel_icelake_11_info = {
.platform = INTEL_ICELAKE,
.is_alpha_support = 1,
.has_resource_streamer = 0,
+ .ring_mask = RENDER_RING | BLT_RING | VEBOX_RING | BSD_RING | BSD3_RING,
};
/*
--
2.14.3
^ permalink raw reply related [flat|nested] 118+ messages in thread
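For readers unfamiliar with how a ring mask encodes "one vebox and two vdboxes (0 and 2)", here is a small standalone sketch. The engine ids and their ordering are assumptions chosen for illustration and are not copied from the driver; only the set of engines matches the commit message above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed engine ids; the ordering is illustrative, not the driver's. */
enum engine_id { RCS, BCS, VCS0, VCS1, VCS2, VCS3, VECS0, VECS1, NUM_ENGINES };

#define ENGINE_BIT(id) (1u << (id))

/* True if the given engine is present in the mask. */
static bool has_engine(uint32_t ring_mask, enum engine_id id)
{
	assert(id < NUM_ENGINES);
	return (ring_mask & ENGINE_BIT(id)) != 0;
}

/* Number of engines present in the mask. */
static unsigned int count_engines(uint32_t ring_mask)
{
	unsigned int n = 0;

	for (; ring_mask; ring_mask >>= 1)
		n += ring_mask & 1u;

	return n;
}

/* Render, blitter, one vebox, and vdbox instances 0 and 2. */
static uint32_t icl11_ring_mask(void)
{
	return ENGINE_BIT(RCS) | ENGINE_BIT(BCS) | ENGINE_BIT(VECS0) |
	       ENGINE_BIT(VCS0) | ENGINE_BIT(VCS2);
}
```

The point of the sketch is that the mask can be sparse: vdbox instance 1 is simply absent, so any loop over instances has to test each bit rather than assume the instances are contiguous.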
* [PATCH 18/27] drm/i915/icl: Update subslice define for ICL 11
2018-01-09 23:28 ` [PATCH 10/27] drm/i915/icl: Enhanced execution list support Paulo Zanoni
` (6 preceding siblings ...)
2018-01-09 23:28 ` [PATCH 17/27] drm/i915/icl: Enable the extra video decode and enhancement boxes for Icelake 11 Paulo Zanoni
@ 2018-01-09 23:28 ` Paulo Zanoni
2018-01-11 0:06 ` Oscar Mateo
2018-01-11 18:25 ` [PATCH v2] " Oscar Mateo
2018-01-09 23:28 ` [PATCH 19/27] drm/i915/icl: Added ICL 11 slice, subslice and EU fuse detection Paulo Zanoni
` (11 subsequent siblings)
19 siblings, 2 replies; 118+ messages in thread
From: Paulo Zanoni @ 2018-01-09 23:28 UTC (permalink / raw)
To: intel-gfx
From: Kelvin Gardiner <kelvin.gardiner@intel.com>
ICL 11 has a greater number of maximum subslices. This patch updates the
subslice max define to reflect this.
Bspec: 21139
Reviewed-by: Oscar Mateo <oscar.mateo@intel.com>
Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Signed-off-by: Kelvin Gardiner <kelvin.gardiner@intel.com>
---
drivers/gpu/drm/i915/intel_ringbuffer.h | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h
index 2a8823166a0b..029093a54cd3 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.h
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.h
@@ -82,7 +82,7 @@ hangcheck_action_to_str(const enum intel_engine_hangcheck_action a)
}
#define I915_MAX_SLICES 3
-#define I915_MAX_SUBSLICES 3
+#define I915_MAX_SUBSLICES 8
#define instdone_slice_mask(dev_priv__) \
(INTEL_GEN(dev_priv__) == 7 ? \
--
2.14.3
^ permalink raw reply related [flat|nested] 118+ messages in thread
* Re: [PATCH 18/27] drm/i915/icl: Update subslice define for ICL 11
2018-01-09 23:28 ` [PATCH 18/27] drm/i915/icl: Update subslice define for ICL 11 Paulo Zanoni
@ 2018-01-11 0:06 ` Oscar Mateo
2018-01-11 18:25 ` [PATCH v2] " Oscar Mateo
1 sibling, 0 replies; 118+ messages in thread
From: Oscar Mateo @ 2018-01-11 0:06 UTC (permalink / raw)
To: Paulo Zanoni, intel-gfx
On 01/09/2018 03:28 PM, Paulo Zanoni wrote:
> From: Kelvin Gardiner <kelvin.gardiner@intel.com>
>
> ICL 11 has a greater number of maximum subslices. This patch updates the
> subslice max define to reflect this.
>
> Bspec: 21139
>
> Reviewed-by: Oscar Mateo <oscar.mateo@intel.com>
> Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
> Signed-off-by: Kelvin Gardiner <kelvin.gardiner@intel.com>
Hmmm... my r-b does not stand. This also needs a GEN11 update to all the
fields in GEN8_MCR_SELECTOR (and a "Bspec: 21108" tag)
> ---
> drivers/gpu/drm/i915/intel_ringbuffer.h | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h
> index 2a8823166a0b..029093a54cd3 100644
> --- a/drivers/gpu/drm/i915/intel_ringbuffer.h
> +++ b/drivers/gpu/drm/i915/intel_ringbuffer.h
> @@ -82,7 +82,7 @@ hangcheck_action_to_str(const enum intel_engine_hangcheck_action a)
> }
>
> #define I915_MAX_SLICES 3
> -#define I915_MAX_SUBSLICES 3
> +#define I915_MAX_SUBSLICES 8
>
> #define instdone_slice_mask(dev_priv__) \
> (INTEL_GEN(dev_priv__) == 7 ? \
^ permalink raw reply [flat|nested] 118+ messages in thread
* [PATCH v2] drm/i915/icl: Update subslice define for ICL 11
2018-01-09 23:28 ` [PATCH 18/27] drm/i915/icl: Update subslice define for ICL 11 Paulo Zanoni
2018-01-11 0:06 ` Oscar Mateo
@ 2018-01-11 18:25 ` Oscar Mateo
2018-02-08 16:35 ` Lionel Landwerlin
2018-02-09 18:00 ` [PATCH v3] " Oscar Mateo
1 sibling, 2 replies; 118+ messages in thread
From: Oscar Mateo @ 2018-01-11 18:25 UTC (permalink / raw)
To: intel-gfx
From: Kelvin Gardiner <kelvin.gardiner@intel.com>
ICL 11 has a higher maximum number of subslices. This patch updates the
subslice max define and the MCR selector fields to reflect this.
v2: GEN11 updates to MCR_SELECTOR (Oscar)
Bspec: 21139
Bspec: 21108
Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> (v1)
Signed-off-by: Kelvin Gardiner <kelvin.gardiner@intel.com>
Signed-off-by: Oscar Mateo <oscar.mateo@intel.com>
---
drivers/gpu/drm/i915/i915_reg.h | 4 ++++
drivers/gpu/drm/i915/intel_engine_cs.c | 22 ++++++++++++++++++----
drivers/gpu/drm/i915/intel_ringbuffer.h | 2 +-
3 files changed, 23 insertions(+), 5 deletions(-)
diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
index c9b6250..c79ca5b 100644
--- a/drivers/gpu/drm/i915/i915_reg.h
+++ b/drivers/gpu/drm/i915/i915_reg.h
@@ -2444,6 +2444,10 @@ enum i915_power_well_id {
#define GEN8_MCR_SLICE_MASK GEN8_MCR_SLICE(3)
#define GEN8_MCR_SUBSLICE(subslice) (((subslice) & 3) << 24)
#define GEN8_MCR_SUBSLICE_MASK GEN8_MCR_SUBSLICE(3)
+#define GEN11_MCR_SLICE(slice) (((slice) & 0xf) << 27)
+#define GEN11_MCR_SLICE_MASK GEN8_MCR_SLICE(0xf)
+#define GEN11_MCR_SUBSLICE(subslice) (((subslice) & 0x7) << 24)
+#define GEN11_MCR_SUBSLICE_MASK GEN8_MCR_SUBSLICE(0x7)
#define RING_IPEIR(base) _MMIO((base)+0x64)
#define RING_IPEHR(base) _MMIO((base)+0x68)
/*
diff --git a/drivers/gpu/drm/i915/intel_engine_cs.c b/drivers/gpu/drm/i915/intel_engine_cs.c
index a373bcb..8c0da94 100644
--- a/drivers/gpu/drm/i915/intel_engine_cs.c
+++ b/drivers/gpu/drm/i915/intel_engine_cs.c
@@ -775,10 +775,24 @@ const char *i915_cache_level_str(struct drm_i915_private *i915, int type)
read_subslice_reg(struct drm_i915_private *dev_priv, int slice,
int subslice, i915_reg_t reg)
{
+ uint32_t mcr_slice_subslice_mask;
+ uint32_t mcr_slice_subslice_select;
uint32_t mcr;
uint32_t ret;
enum forcewake_domains fw_domains;
+ if (INTEL_GEN(dev_priv) >= 11) {
+ mcr_slice_subslice_mask = GEN11_MCR_SLICE_MASK |
+ GEN11_MCR_SUBSLICE_MASK;
+ mcr_slice_subslice_select = GEN11_MCR_SLICE(slice) |
+ GEN11_MCR_SUBSLICE(subslice);
+ } else {
+ mcr_slice_subslice_mask = GEN8_MCR_SLICE_MASK |
+ GEN8_MCR_SUBSLICE_MASK;
+ mcr_slice_subslice_select = GEN8_MCR_SLICE(slice) |
+ GEN8_MCR_SUBSLICE(subslice);
+ }
+
fw_domains = intel_uncore_forcewake_for_reg(dev_priv, reg,
FW_REG_READ);
fw_domains |= intel_uncore_forcewake_for_reg(dev_priv,
@@ -793,14 +807,14 @@ const char *i915_cache_level_str(struct drm_i915_private *i915, int type)
* The HW expects the slice and sublice selectors to be reset to 0
* after reading out the registers.
*/
- WARN_ON_ONCE(mcr & (GEN8_MCR_SLICE_MASK | GEN8_MCR_SUBSLICE_MASK));
- mcr &= ~(GEN8_MCR_SLICE_MASK | GEN8_MCR_SUBSLICE_MASK);
- mcr |= GEN8_MCR_SLICE(slice) | GEN8_MCR_SUBSLICE(subslice);
+ WARN_ON_ONCE(mcr & mcr_slice_subslice_mask);
+ mcr &= ~mcr_slice_subslice_mask;
+ mcr |= mcr_slice_subslice_select;
I915_WRITE_FW(GEN8_MCR_SELECTOR, mcr);
ret = I915_READ_FW(reg);
- mcr &= ~(GEN8_MCR_SLICE_MASK | GEN8_MCR_SUBSLICE_MASK);
+ mcr &= ~mcr_slice_subslice_mask;
I915_WRITE_FW(GEN8_MCR_SELECTOR, mcr);
intel_uncore_forcewake_put__locked(dev_priv, fw_domains);
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h
index 2a88231..029093a 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.h
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.h
@@ -82,7 +82,7 @@ enum intel_engine_hangcheck_action {
}
#define I915_MAX_SLICES 3
-#define I915_MAX_SUBSLICES 3
+#define I915_MAX_SUBSLICES 8
#define instdone_slice_mask(dev_priv__) \
(INTEL_GEN(dev_priv__) == 7 ? \
--
1.9.1
^ permalink raw reply related [flat|nested] 118+ messages in thread
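The GEN11 mask defines in the v2 hunk are spelled through the GEN8 helper macros, which keep only two bits of their argument. The standalone sketch below shows what each spelling evaluates to and what that means for the select/reset sequence in read_subslice_reg(). It covers the subslice field only, since the shift used by GEN8_MCR_SLICE is not visible in the hunk. The two helper macros are copied from the patch (with unsigned literals added); the SUBSLICE_MASK_VIA_* names and both functions are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Subslice helpers as in the i915_reg.h hunk (unsigned literals added). */
#define GEN8_MCR_SUBSLICE(subslice)   (((subslice) & 3u) << 24)
#define GEN11_MCR_SUBSLICE(subslice)  (((subslice) & 0x7u) << 24)

/* Mask built with the GEN8 helper: the argument is truncated to two bits. */
#define SUBSLICE_MASK_VIA_GEN8   GEN8_MCR_SUBSLICE(0x7)
/* Mask built with the GEN11 helper: covers the full three-bit field. */
#define SUBSLICE_MASK_VIA_GEN11  GEN11_MCR_SUBSLICE(0x7)

/* Select a subslice: clear the field with the given mask, then OR in the id. */
static uint32_t mcr_select_subslice(uint32_t mcr, uint32_t mask,
				    unsigned int subslice)
{
	assert(subslice < 8);
	mcr &= ~mask;
	mcr |= GEN11_MCR_SUBSLICE(subslice);
	return mcr;
}

/* Reset the selector field after the read, using the given mask. */
static uint32_t mcr_clear_subslice(uint32_t mcr, uint32_t mask)
{
	return mcr & ~mask;
}
```

With the GEN8-built mask, selecting any of subslices 4 to 7 and then resetting leaves bit 26 set in the selector value, whereas the GEN11-built mask returns it to zero. That matters here because the surrounding code expects the selectors to read back as 0 on the next call.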
* Re: [PATCH v2] drm/i915/icl: Update subslice define for ICL 11
2018-01-11 18:25 ` [PATCH v2] " Oscar Mateo
@ 2018-02-08 16:35 ` Lionel Landwerlin
2018-02-09 17:44 ` Oscar Mateo
2018-02-09 18:00 ` [PATCH v3] " Oscar Mateo
1 sibling, 1 reply; 118+ messages in thread
From: Lionel Landwerlin @ 2018-02-08 16:35 UTC (permalink / raw)
To: Oscar Mateo, intel-gfx
On 11/01/18 18:25, Oscar Mateo wrote:
> From: Kelvin Gardiner <kelvin.gardiner@intel.com>
>
> ICL 11 has a greater number of maximum subslices. This patch
> reflects this.
>
> v2: GEN11 updates to MCR_SELECTOR (Oscar)
>
> Bspec: 21139
> BSpec: 21108
>
> Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> (v1)
> Signed-off-by: Kelvin Gardiner <kelvin.gardiner@intel.com>
> Signed-off-by: Oscar Mateo <oscar.mateo@intel.com>
> ---
> drivers/gpu/drm/i915/i915_reg.h | 4 ++++
> drivers/gpu/drm/i915/intel_engine_cs.c | 22 ++++++++++++++++++----
> drivers/gpu/drm/i915/intel_ringbuffer.h | 2 +-
> 3 files changed, 23 insertions(+), 5 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
> index c9b6250..c79ca5b 100644
> --- a/drivers/gpu/drm/i915/i915_reg.h
> +++ b/drivers/gpu/drm/i915/i915_reg.h
> @@ -2444,6 +2444,10 @@ enum i915_power_well_id {
> #define GEN8_MCR_SLICE_MASK GEN8_MCR_SLICE(3)
> #define GEN8_MCR_SUBSLICE(subslice) (((subslice) & 3) << 24)
> #define GEN8_MCR_SUBSLICE_MASK GEN8_MCR_SUBSLICE(3)
> +#define GEN11_MCR_SLICE(slice) (((slice) & 0xf) << 27)
> +#define GEN11_MCR_SLICE_MASK GEN8_MCR_SLICE(0xf)
I think you got that line above wrong, should be GEN11_MCR_SLICE(0xf)
> +#define GEN11_MCR_SUBSLICE(subslice) (((subslice) & 0x7) << 24)
> +#define GEN11_MCR_SUBSLICE_MASK GEN8_MCR_SUBSLICE(0x7)
Same issue on the line above : GEN11_MCR_SUBSLICE(0x7)
Otherwise, looks good.
> #define RING_IPEIR(base) _MMIO((base)+0x64)
> #define RING_IPEHR(base) _MMIO((base)+0x68)
> /*
> diff --git a/drivers/gpu/drm/i915/intel_engine_cs.c b/drivers/gpu/drm/i915/intel_engine_cs.c
> index a373bcb..8c0da94 100644
> --- a/drivers/gpu/drm/i915/intel_engine_cs.c
> +++ b/drivers/gpu/drm/i915/intel_engine_cs.c
> @@ -775,10 +775,24 @@ const char *i915_cache_level_str(struct drm_i915_private *i915, int type)
> read_subslice_reg(struct drm_i915_private *dev_priv, int slice,
> int subslice, i915_reg_t reg)
> {
> + uint32_t mcr_slice_subslice_mask;
> + uint32_t mcr_slice_subslice_select;
> uint32_t mcr;
> uint32_t ret;
> enum forcewake_domains fw_domains;
>
> + if (INTEL_GEN(dev_priv) >= 11) {
> + mcr_slice_subslice_mask = GEN11_MCR_SLICE_MASK |
> + GEN11_MCR_SUBSLICE_MASK;
> + mcr_slice_subslice_select = GEN11_MCR_SLICE(slice) |
> + GEN11_MCR_SUBSLICE(subslice);
> + } else {
> + mcr_slice_subslice_mask = GEN8_MCR_SLICE_MASK |
> + GEN8_MCR_SUBSLICE_MASK;
> + mcr_slice_subslice_select = GEN8_MCR_SLICE(slice) |
> + GEN8_MCR_SUBSLICE(subslice);
> + }
> +
> fw_domains = intel_uncore_forcewake_for_reg(dev_priv, reg,
> FW_REG_READ);
> fw_domains |= intel_uncore_forcewake_for_reg(dev_priv,
> @@ -793,14 +807,14 @@ const char *i915_cache_level_str(struct drm_i915_private *i915, int type)
> * The HW expects the slice and sublice selectors to be reset to 0
> * after reading out the registers.
> */
> - WARN_ON_ONCE(mcr & (GEN8_MCR_SLICE_MASK | GEN8_MCR_SUBSLICE_MASK));
> - mcr &= ~(GEN8_MCR_SLICE_MASK | GEN8_MCR_SUBSLICE_MASK);
> - mcr |= GEN8_MCR_SLICE(slice) | GEN8_MCR_SUBSLICE(subslice);
> + WARN_ON_ONCE(mcr & mcr_slice_subslice_mask);
> + mcr &= ~mcr_slice_subslice_mask;
> + mcr |= mcr_slice_subslice_select;
> I915_WRITE_FW(GEN8_MCR_SELECTOR, mcr);
>
> ret = I915_READ_FW(reg);
>
> - mcr &= ~(GEN8_MCR_SLICE_MASK | GEN8_MCR_SUBSLICE_MASK);
> + mcr &= ~mcr_slice_subslice_mask;
> I915_WRITE_FW(GEN8_MCR_SELECTOR, mcr);
>
> intel_uncore_forcewake_put__locked(dev_priv, fw_domains);
> diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h
> index 2a88231..029093a 100644
> --- a/drivers/gpu/drm/i915/intel_ringbuffer.h
> +++ b/drivers/gpu/drm/i915/intel_ringbuffer.h
> @@ -82,7 +82,7 @@ enum intel_engine_hangcheck_action {
> }
>
> #define I915_MAX_SLICES 3
> -#define I915_MAX_SUBSLICES 3
> +#define I915_MAX_SUBSLICES 8
>
> #define instdone_slice_mask(dev_priv__) \
> (INTEL_GEN(dev_priv__) == 7 ? \
^ permalink raw reply [flat|nested] 118+ messages in thread
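The effect of the copy-paste slip flagged in the review above can be checked outside the driver. The sketch below is an illustration, not code from the series. The GEN11 helpers are taken from the v2 patch; the GEN8_MCR_SLICE() definition (2 bits starting at bit 26) is recalled from i915_reg.h and is not quoted in this thread, so treat it as an assumption. The V2_ and FIXED_ names and mcr_bits_left_behind() are hypothetical.

```c
#include <stdint.h>

/* Assumption: GEN8 slice field as recalled from i915_reg.h (not in this thread). */
#define GEN8_MCR_SLICE(slice)		(((slice) & 3) << 26)
#define GEN8_MCR_SUBSLICE(subslice)	(((subslice) & 3) << 24)

/* Gen11 field helpers as proposed in the patch. */
#define GEN11_MCR_SLICE(slice)		(((slice) & 0xf) << 27)
#define GEN11_MCR_SUBSLICE(subslice)	(((subslice) & 0x7) << 24)

/* Hypothetical names: the v2 masks, built from the GEN8 helpers by mistake. */
#define V2_GEN11_MCR_SLICE_MASK		GEN8_MCR_SLICE(0xf)
#define V2_GEN11_MCR_SUBSLICE_MASK	GEN8_MCR_SUBSLICE(0x7)

/* Hypothetical names: the masks as suggested in the review. */
#define FIXED_GEN11_MCR_SLICE_MASK	GEN11_MCR_SLICE(0xf)
#define FIXED_GEN11_MCR_SUBSLICE_MASK	GEN11_MCR_SUBSLICE(0x7)

/* Return the selector bits that a given mask would fail to clear. */
static inline uint32_t mcr_bits_left_behind(uint32_t select, uint32_t mask)
{
	return select & ~mask;
}
```

Under these assumptions the v2 masks cover only bits 24-27, so a Gen11 selection of slice 2 or higher would leave bits set after the reset step, while the masks suggested in the review cover bits 24-30.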
* Re: [PATCH v2] drm/i915/icl: Update subslice define for ICL 11
2018-02-08 16:35 ` Lionel Landwerlin
@ 2018-02-09 17:44 ` Oscar Mateo
2018-02-09 17:48 ` Lionel Landwerlin
0 siblings, 1 reply; 118+ messages in thread
From: Oscar Mateo @ 2018-02-09 17:44 UTC (permalink / raw)
To: Lionel Landwerlin, intel-gfx
On 02/08/2018 08:35 AM, Lionel Landwerlin wrote:
> On 11/01/18 18:25, Oscar Mateo wrote:
>> From: Kelvin Gardiner <kelvin.gardiner@intel.com>
>>
>> ICL 11 has a greater number of maximum subslices. This patch
>> reflects this.
>>
>> v2: GEN11 updates to MCR_SELECTOR (Oscar)
>>
>> Bspec: 21139
>> BSpec: 21108
>>
>> Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
>> (v1)
>> Signed-off-by: Kelvin Gardiner <kelvin.gardiner@intel.com>
>> Signed-off-by: Oscar Mateo <oscar.mateo@intel.com>
>> ---
>> drivers/gpu/drm/i915/i915_reg.h | 4 ++++
>> drivers/gpu/drm/i915/intel_engine_cs.c | 22 ++++++++++++++++++----
>> drivers/gpu/drm/i915/intel_ringbuffer.h | 2 +-
>> 3 files changed, 23 insertions(+), 5 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/i915/i915_reg.h
>> b/drivers/gpu/drm/i915/i915_reg.h
>> index c9b6250..c79ca5b 100644
>> --- a/drivers/gpu/drm/i915/i915_reg.h
>> +++ b/drivers/gpu/drm/i915/i915_reg.h
>> @@ -2444,6 +2444,10 @@ enum i915_power_well_id {
>> #define GEN8_MCR_SLICE_MASK GEN8_MCR_SLICE(3)
>> #define GEN8_MCR_SUBSLICE(subslice) (((subslice) & 3) << 24)
>> #define GEN8_MCR_SUBSLICE_MASK GEN8_MCR_SUBSLICE(3)
>> +#define GEN11_MCR_SLICE(slice) (((slice) & 0xf) << 27)
>> +#define GEN11_MCR_SLICE_MASK GEN8_MCR_SLICE(0xf)
> I think you got that line above wrong, should be GEN11_MCR_SLICE(0xf)
>> +#define GEN11_MCR_SUBSLICE(subslice) (((subslice) & 0x7) << 24)
>> +#define GEN11_MCR_SUBSLICE_MASK GEN8_MCR_SUBSLICE(0x7)
>
> Same issue on the line above : GEN11_MCR_SUBSLICE(0x7)
>
> Otherwise, looks good.
Oops, thank you for spotting this. Do I have your r-b for the fixed
version then?
>
>> #define RING_IPEIR(base) _MMIO((base)+0x64)
>> #define RING_IPEHR(base) _MMIO((base)+0x68)
>> /*
>> diff --git a/drivers/gpu/drm/i915/intel_engine_cs.c
>> b/drivers/gpu/drm/i915/intel_engine_cs.c
>> index a373bcb..8c0da94 100644
>> --- a/drivers/gpu/drm/i915/intel_engine_cs.c
>> +++ b/drivers/gpu/drm/i915/intel_engine_cs.c
>> @@ -775,10 +775,24 @@ const char *i915_cache_level_str(struct
>> drm_i915_private *i915, int type)
>> read_subslice_reg(struct drm_i915_private *dev_priv, int slice,
>> int subslice, i915_reg_t reg)
>> {
>> + uint32_t mcr_slice_subslice_mask;
>> + uint32_t mcr_slice_subslice_select;
>> uint32_t mcr;
>> uint32_t ret;
>> enum forcewake_domains fw_domains;
>> + if (INTEL_GEN(dev_priv) >= 11) {
>> + mcr_slice_subslice_mask = GEN11_MCR_SLICE_MASK |
>> + GEN11_MCR_SUBSLICE_MASK;
>> + mcr_slice_subslice_select = GEN11_MCR_SLICE(slice) |
>> + GEN11_MCR_SUBSLICE(subslice);
>> + } else {
>> + mcr_slice_subslice_mask = GEN8_MCR_SLICE_MASK |
>> + GEN8_MCR_SUBSLICE_MASK;
>> + mcr_slice_subslice_select = GEN8_MCR_SLICE(slice) |
>> + GEN8_MCR_SUBSLICE(subslice);
>> + }
>> +
>> fw_domains = intel_uncore_forcewake_for_reg(dev_priv, reg,
>> FW_REG_READ);
>> fw_domains |= intel_uncore_forcewake_for_reg(dev_priv,
>> @@ -793,14 +807,14 @@ const char *i915_cache_level_str(struct
>> drm_i915_private *i915, int type)
>> * The HW expects the slice and sublice selectors to be reset to 0
>> * after reading out the registers.
>> */
>> - WARN_ON_ONCE(mcr & (GEN8_MCR_SLICE_MASK | GEN8_MCR_SUBSLICE_MASK));
>> - mcr &= ~(GEN8_MCR_SLICE_MASK | GEN8_MCR_SUBSLICE_MASK);
>> - mcr |= GEN8_MCR_SLICE(slice) | GEN8_MCR_SUBSLICE(subslice);
>> + WARN_ON_ONCE(mcr & mcr_slice_subslice_mask);
>> + mcr &= ~mcr_slice_subslice_mask;
>> + mcr |= mcr_slice_subslice_select;
>> I915_WRITE_FW(GEN8_MCR_SELECTOR, mcr);
>> ret = I915_READ_FW(reg);
>> - mcr &= ~(GEN8_MCR_SLICE_MASK | GEN8_MCR_SUBSLICE_MASK);
>> + mcr &= ~mcr_slice_subslice_mask;
>> I915_WRITE_FW(GEN8_MCR_SELECTOR, mcr);
>> intel_uncore_forcewake_put__locked(dev_priv, fw_domains);
>> diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h
>> b/drivers/gpu/drm/i915/intel_ringbuffer.h
>> index 2a88231..029093a 100644
>> --- a/drivers/gpu/drm/i915/intel_ringbuffer.h
>> +++ b/drivers/gpu/drm/i915/intel_ringbuffer.h
>> @@ -82,7 +82,7 @@ enum intel_engine_hangcheck_action {
>> }
>> #define I915_MAX_SLICES 3
>> -#define I915_MAX_SUBSLICES 3
>> +#define I915_MAX_SUBSLICES 8
>> #define instdone_slice_mask(dev_priv__) \
>> (INTEL_GEN(dev_priv__) == 7 ? \
>
>
^ permalink raw reply [flat|nested] 118+ messages in thread
* Re: [PATCH v2] drm/i915/icl: Update subslice define for ICL 11
2018-02-09 17:44 ` Oscar Mateo
@ 2018-02-09 17:48 ` Lionel Landwerlin
0 siblings, 0 replies; 118+ messages in thread
From: Lionel Landwerlin @ 2018-02-09 17:48 UTC (permalink / raw)
To: Oscar Mateo, intel-gfx
On 09/02/18 17:44, Oscar Mateo wrote:
>
>
> On 02/08/2018 08:35 AM, Lionel Landwerlin wrote:
>> On 11/01/18 18:25, Oscar Mateo wrote:
>>> From: Kelvin Gardiner <kelvin.gardiner@intel.com>
>>>
>>> ICL 11 has a greater number of maximum subslices. This patch
>>> reflects this.
>>>
>>> v2: GEN11 updates to MCR_SELECTOR (Oscar)
>>>
>>> Bspec: 21139
>>> BSpec: 21108
>>>
>>> Reviewed-by: Daniele Ceraolo Spurio
>>> <daniele.ceraolospurio@intel.com> (v1)
>>> Signed-off-by: Kelvin Gardiner <kelvin.gardiner@intel.com>
>>> Signed-off-by: Oscar Mateo <oscar.mateo@intel.com>
>>> ---
>>> drivers/gpu/drm/i915/i915_reg.h | 4 ++++
>>> drivers/gpu/drm/i915/intel_engine_cs.c | 22 ++++++++++++++++++----
>>> drivers/gpu/drm/i915/intel_ringbuffer.h | 2 +-
>>> 3 files changed, 23 insertions(+), 5 deletions(-)
>>>
>>> diff --git a/drivers/gpu/drm/i915/i915_reg.h
>>> b/drivers/gpu/drm/i915/i915_reg.h
>>> index c9b6250..c79ca5b 100644
>>> --- a/drivers/gpu/drm/i915/i915_reg.h
>>> +++ b/drivers/gpu/drm/i915/i915_reg.h
>>> @@ -2444,6 +2444,10 @@ enum i915_power_well_id {
>>> #define GEN8_MCR_SLICE_MASK GEN8_MCR_SLICE(3)
>>> #define GEN8_MCR_SUBSLICE(subslice) (((subslice) & 3) << 24)
>>> #define GEN8_MCR_SUBSLICE_MASK GEN8_MCR_SUBSLICE(3)
>>> +#define GEN11_MCR_SLICE(slice) (((slice) & 0xf) << 27)
>>> +#define GEN11_MCR_SLICE_MASK GEN8_MCR_SLICE(0xf)
>> I think you got that line above wrong, should be GEN11_MCR_SLICE(0xf)
>>> +#define GEN11_MCR_SUBSLICE(subslice) (((subslice) & 0x7) << 24)
>>> +#define GEN11_MCR_SUBSLICE_MASK GEN8_MCR_SUBSLICE(0x7)
>>
>> Same issue on the line above : GEN11_MCR_SUBSLICE(0x7)
>>
>> Otherwise, looks good.
>
> Oops, thank you for spotting this. Do I have your r-b for the fixed
> version then?
Sure, with the change, it's :
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
>
>>
>>> #define RING_IPEIR(base) _MMIO((base)+0x64)
>>> #define RING_IPEHR(base) _MMIO((base)+0x68)
>>> /*
>>> diff --git a/drivers/gpu/drm/i915/intel_engine_cs.c
>>> b/drivers/gpu/drm/i915/intel_engine_cs.c
>>> index a373bcb..8c0da94 100644
>>> --- a/drivers/gpu/drm/i915/intel_engine_cs.c
>>> +++ b/drivers/gpu/drm/i915/intel_engine_cs.c
>>> @@ -775,10 +775,24 @@ const char *i915_cache_level_str(struct
>>> drm_i915_private *i915, int type)
>>> read_subslice_reg(struct drm_i915_private *dev_priv, int slice,
>>> int subslice, i915_reg_t reg)
>>> {
>>> + uint32_t mcr_slice_subslice_mask;
>>> + uint32_t mcr_slice_subslice_select;
>>> uint32_t mcr;
>>> uint32_t ret;
>>> enum forcewake_domains fw_domains;
>>> + if (INTEL_GEN(dev_priv) >= 11) {
>>> + mcr_slice_subslice_mask = GEN11_MCR_SLICE_MASK |
>>> + GEN11_MCR_SUBSLICE_MASK;
>>> + mcr_slice_subslice_select = GEN11_MCR_SLICE(slice) |
>>> + GEN11_MCR_SUBSLICE(subslice);
>>> + } else {
>>> + mcr_slice_subslice_mask = GEN8_MCR_SLICE_MASK |
>>> + GEN8_MCR_SUBSLICE_MASK;
>>> + mcr_slice_subslice_select = GEN8_MCR_SLICE(slice) |
>>> + GEN8_MCR_SUBSLICE(subslice);
>>> + }
>>> +
>>> fw_domains = intel_uncore_forcewake_for_reg(dev_priv, reg,
>>> FW_REG_READ);
>>> fw_domains |= intel_uncore_forcewake_for_reg(dev_priv,
>>> @@ -793,14 +807,14 @@ const char *i915_cache_level_str(struct
>>> drm_i915_private *i915, int type)
>>> * The HW expects the slice and sublice selectors to be reset
>>> to 0
>>> * after reading out the registers.
>>> */
>>> - WARN_ON_ONCE(mcr & (GEN8_MCR_SLICE_MASK |
>>> GEN8_MCR_SUBSLICE_MASK));
>>> - mcr &= ~(GEN8_MCR_SLICE_MASK | GEN8_MCR_SUBSLICE_MASK);
>>> - mcr |= GEN8_MCR_SLICE(slice) | GEN8_MCR_SUBSLICE(subslice);
>>> + WARN_ON_ONCE(mcr & mcr_slice_subslice_mask);
>>> + mcr &= ~mcr_slice_subslice_mask;
>>> + mcr |= mcr_slice_subslice_select;
>>> I915_WRITE_FW(GEN8_MCR_SELECTOR, mcr);
>>> ret = I915_READ_FW(reg);
>>> - mcr &= ~(GEN8_MCR_SLICE_MASK | GEN8_MCR_SUBSLICE_MASK);
>>> + mcr &= ~mcr_slice_subslice_mask;
>>> I915_WRITE_FW(GEN8_MCR_SELECTOR, mcr);
>>> intel_uncore_forcewake_put__locked(dev_priv, fw_domains);
>>> diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h
>>> b/drivers/gpu/drm/i915/intel_ringbuffer.h
>>> index 2a88231..029093a 100644
>>> --- a/drivers/gpu/drm/i915/intel_ringbuffer.h
>>> +++ b/drivers/gpu/drm/i915/intel_ringbuffer.h
>>> @@ -82,7 +82,7 @@ enum intel_engine_hangcheck_action {
>>> }
>>> #define I915_MAX_SLICES 3
>>> -#define I915_MAX_SUBSLICES 3
>>> +#define I915_MAX_SUBSLICES 8
>>> #define instdone_slice_mask(dev_priv__) \
>>> (INTEL_GEN(dev_priv__) == 7 ? \
>>
>>
>
>
^ permalink raw reply [flat|nested] 118+ messages in thread
* [PATCH v3] drm/i915/icl: Update subslice define for ICL 11
2018-01-11 18:25 ` [PATCH v2] " Oscar Mateo
2018-02-08 16:35 ` Lionel Landwerlin
@ 2018-02-09 18:00 ` Oscar Mateo
1 sibling, 0 replies; 118+ messages in thread
From: Oscar Mateo @ 2018-02-09 18:00 UTC (permalink / raw)
To: intel-gfx
From: Kelvin Gardiner <kelvin.gardiner@intel.com>
ICL 11 has a greater number of maximum subslices. This patch
reflects this.
v2: GEN11 updates to MCR_SELECTOR (Oscar)
v3: Copypasta error in the new defines (Lionel)
Bspec: 21139
BSpec: 21108
Signed-off-by: Kelvin Gardiner <kelvin.gardiner@intel.com>
Reviewed-by: Oscar Mateo <oscar.mateo@intel.com> (v1)
Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> (v1)
Signed-off-by: Oscar Mateo <oscar.mateo@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
---
drivers/gpu/drm/i915/i915_reg.h | 4 ++++
drivers/gpu/drm/i915/intel_engine_cs.c | 22 ++++++++++++++++++----
drivers/gpu/drm/i915/intel_ringbuffer.h | 2 +-
3 files changed, 23 insertions(+), 5 deletions(-)
diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
index e9c79b5..e149789 100644
--- a/drivers/gpu/drm/i915/i915_reg.h
+++ b/drivers/gpu/drm/i915/i915_reg.h
@@ -2437,6 +2437,10 @@ enum i915_power_well_id {
#define GEN8_MCR_SLICE_MASK GEN8_MCR_SLICE(3)
#define GEN8_MCR_SUBSLICE(subslice) (((subslice) & 3) << 24)
#define GEN8_MCR_SUBSLICE_MASK GEN8_MCR_SUBSLICE(3)
+#define GEN11_MCR_SLICE(slice) (((slice) & 0xf) << 27)
+#define GEN11_MCR_SLICE_MASK GEN11_MCR_SLICE(0xf)
+#define GEN11_MCR_SUBSLICE(subslice) (((subslice) & 0x7) << 24)
+#define GEN11_MCR_SUBSLICE_MASK GEN11_MCR_SUBSLICE(0x7)
#define RING_IPEIR(base) _MMIO((base)+0x64)
#define RING_IPEHR(base) _MMIO((base)+0x68)
/*
diff --git a/drivers/gpu/drm/i915/intel_engine_cs.c b/drivers/gpu/drm/i915/intel_engine_cs.c
index 0ad9184..e265923 100644
--- a/drivers/gpu/drm/i915/intel_engine_cs.c
+++ b/drivers/gpu/drm/i915/intel_engine_cs.c
@@ -736,10 +736,24 @@ const char *i915_cache_level_str(struct drm_i915_private *i915, int type)
read_subslice_reg(struct drm_i915_private *dev_priv, int slice,
int subslice, i915_reg_t reg)
{
+ uint32_t mcr_slice_subslice_mask;
+ uint32_t mcr_slice_subslice_select;
uint32_t mcr;
uint32_t ret;
enum forcewake_domains fw_domains;
+ if (INTEL_GEN(dev_priv) >= 11) {
+ mcr_slice_subslice_mask = GEN11_MCR_SLICE_MASK |
+ GEN11_MCR_SUBSLICE_MASK;
+ mcr_slice_subslice_select = GEN11_MCR_SLICE(slice) |
+ GEN11_MCR_SUBSLICE(subslice);
+ } else {
+ mcr_slice_subslice_mask = GEN8_MCR_SLICE_MASK |
+ GEN8_MCR_SUBSLICE_MASK;
+ mcr_slice_subslice_select = GEN8_MCR_SLICE(slice) |
+ GEN8_MCR_SUBSLICE(subslice);
+ }
+
fw_domains = intel_uncore_forcewake_for_reg(dev_priv, reg,
FW_REG_READ);
fw_domains |= intel_uncore_forcewake_for_reg(dev_priv,
@@ -754,14 +768,14 @@ const char *i915_cache_level_str(struct drm_i915_private *i915, int type)
* The HW expects the slice and sublice selectors to be reset to 0
* after reading out the registers.
*/
- WARN_ON_ONCE(mcr & (GEN8_MCR_SLICE_MASK | GEN8_MCR_SUBSLICE_MASK));
- mcr &= ~(GEN8_MCR_SLICE_MASK | GEN8_MCR_SUBSLICE_MASK);
- mcr |= GEN8_MCR_SLICE(slice) | GEN8_MCR_SUBSLICE(subslice);
+ WARN_ON_ONCE(mcr & mcr_slice_subslice_mask);
+ mcr &= ~mcr_slice_subslice_mask;
+ mcr |= mcr_slice_subslice_select;
I915_WRITE_FW(GEN8_MCR_SELECTOR, mcr);
ret = I915_READ_FW(reg);
- mcr &= ~(GEN8_MCR_SLICE_MASK | GEN8_MCR_SUBSLICE_MASK);
+ mcr &= ~mcr_slice_subslice_mask;
I915_WRITE_FW(GEN8_MCR_SELECTOR, mcr);
intel_uncore_forcewake_put__locked(dev_priv, fw_domains);
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h
index 8f1a4ba..8d988ff 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.h
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.h
@@ -82,7 +82,7 @@ enum intel_engine_hangcheck_action {
}
#define I915_MAX_SLICES 3
-#define I915_MAX_SUBSLICES 3
+#define I915_MAX_SUBSLICES 8
#define instdone_slice_mask(dev_priv__) \
(INTEL_GEN(dev_priv__) == 7 ? \
--
1.9.1
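The select/read/reset sequence in read_subslice_reg() can be modelled without hardware access. What follows is a minimal sketch under stated assumptions, not the driver code: the GEN8_MCR_SLICE() definition is recalled from i915_reg.h rather than quoted in this patch, and struct mcr_steering, mcr_steering_for(), mcr_select() and mcr_reset() are hypothetical helpers that only mirror the mask and select arithmetic of the patch.

```c
#include <stdint.h>

/* Assumption: GEN8 slice field as recalled from i915_reg.h. */
#define GEN8_MCR_SLICE(slice)		(((slice) & 3) << 26)
#define GEN8_MCR_SLICE_MASK		GEN8_MCR_SLICE(3)
#define GEN8_MCR_SUBSLICE(subslice)	(((subslice) & 3) << 24)
#define GEN8_MCR_SUBSLICE_MASK		GEN8_MCR_SUBSLICE(3)
/* Gen11 fields as defined in the v3 patch. */
#define GEN11_MCR_SLICE(slice)		(((slice) & 0xf) << 27)
#define GEN11_MCR_SLICE_MASK		GEN11_MCR_SLICE(0xf)
#define GEN11_MCR_SUBSLICE(subslice)	(((subslice) & 0x7) << 24)
#define GEN11_MCR_SUBSLICE_MASK		GEN11_MCR_SUBSLICE(0x7)

/* Hypothetical container for the two per-generation values. */
struct mcr_steering {
	uint32_t mask;		/* all slice and subslice selector bits */
	uint32_t select;	/* selector bits for the requested unit */
};

/* Pick the mask and selector the same way the patched function does. */
static inline struct mcr_steering mcr_steering_for(int gen, int slice, int subslice)
{
	struct mcr_steering s;

	if (gen >= 11) {
		s.mask = GEN11_MCR_SLICE_MASK | GEN11_MCR_SUBSLICE_MASK;
		s.select = GEN11_MCR_SLICE(slice) | GEN11_MCR_SUBSLICE(subslice);
	} else {
		s.mask = GEN8_MCR_SLICE_MASK | GEN8_MCR_SUBSLICE_MASK;
		s.select = GEN8_MCR_SLICE(slice) | GEN8_MCR_SUBSLICE(subslice);
	}
	return s;
}

/* Value to write to the selector register before the steered read. */
static inline uint32_t mcr_select(uint32_t mcr, struct mcr_steering s)
{
	return (mcr & ~s.mask) | s.select;
}

/* Value to write back afterwards, with the selectors reset to 0. */
static inline uint32_t mcr_reset(uint32_t mcr, struct mcr_steering s)
{
	return mcr & ~s.mask;
}
```

Keeping the mask and the selector in one place per generation is what lets the rest of the sequence stay identical for Gen8 through Gen11; bits outside the mask pass through both steps unchanged.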
^ permalink raw reply related [flat|nested] 118+ messages in thread
* [PATCH 19/27] drm/i915/icl: Added ICL 11 slice, subslice and EU fuse detection
2018-01-09 23:28 ` [PATCH 10/27] drm/i915/icl: Enhanced execution list support Paulo Zanoni
` (7 preceding siblings ...)
2018-01-09 23:28 ` [PATCH 18/27] drm/i915/icl: Update subslice define for ICL 11 Paulo Zanoni
@ 2018-01-09 23:28 ` Paulo Zanoni
2018-01-10 12:02 ` Tvrtko Ursulin
2018-01-09 23:28 ` [PATCH 20/27] drm/i915/icl: Make use of the SW counter field in the new context descriptor Paulo Zanoni
` (10 subsequent siblings)
19 siblings, 1 reply; 118+ messages in thread
From: Paulo Zanoni @ 2018-01-09 23:28 UTC (permalink / raw)
To: intel-gfx
From: Kelvin Gardiner <kelvin.gardiner@intel.com>
This patch adds support for detecting the ICL slice, subslice and EU fuse
settings.
Add addresses for the ICL 11 slice, subslice and EU fuse registers.
These register addresses are the same as on previous platforms, but the
format and/or the meaning of the information is different. Therefore
Gen11 defines for these registers are added.
Bspec: 9731
Bspec: 20643
Bspec: 20673
Signed-off-by: Kelvin Gardiner <kelvin.gardiner@intel.com>
---
drivers/gpu/drm/i915/i915_reg.h | 9 +++++++++
drivers/gpu/drm/i915/intel_device_info.c | 25 ++++++++++++++++++++++++-
2 files changed, 33 insertions(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
index c9b62502ce69..d8b537570b8e 100644
--- a/drivers/gpu/drm/i915/i915_reg.h
+++ b/drivers/gpu/drm/i915/i915_reg.h
@@ -2809,6 +2809,15 @@ enum i915_power_well_id {
#define GEN11_GT_VEBOX_DISABLE_SHIFT 16
#define GEN11_GT_VEBOX_DISABLE_MASK (0xff << GEN11_GT_VEBOX_DISABLE_SHIFT)
+#define GEN11_EU_DISABLE _MMIO(0x9134)
+#define GEN11_EU_DIS_MASK 0xFF
+
+#define GEN11_GT_SLICE_ENABLE _MMIO(0x9138)
+#define GEN11_GT_S_ENA_MASK 0xFF
+
+#define GEN11_GT_SUBSLICE_DISABLE _MMIO(0x913C)
+#define GEN11_GT_SS_DIS_MASK 0xFF
+
#define GEN6_BSD_SLEEP_PSMI_CONTROL _MMIO(0x12050)
#define GEN6_BSD_SLEEP_MSG_DISABLE (1 << 0)
#define GEN6_BSD_SLEEP_FLUSH_DISABLE (1 << 2)
diff --git a/drivers/gpu/drm/i915/intel_device_info.c b/drivers/gpu/drm/i915/intel_device_info.c
index 3316470363a0..895c41ef4abf 100644
--- a/drivers/gpu/drm/i915/intel_device_info.c
+++ b/drivers/gpu/drm/i915/intel_device_info.c
@@ -120,6 +120,27 @@ void intel_device_info_dump(const struct intel_device_info *info,
intel_device_info_dump_flags(info, p);
}
+static void gen11_sseu_info_init(struct drm_i915_private *dev_priv)
+{
+ struct sseu_dev_info *sseu = &mkwrite_device_info(dev_priv)->sseu;
+ int eu_max = 8;
+ u32 eu_disable;
+
+ sseu->slice_mask = I915_READ(GEN11_GT_SLICE_ENABLE) &
+ GEN11_GT_S_ENA_MASK;
+ sseu->subslice_mask = ~(I915_READ(GEN11_GT_SUBSLICE_DISABLE) &
+ GEN11_GT_SS_DIS_MASK);
+ eu_disable = I915_READ(GEN11_EU_DISABLE) & GEN11_EU_DIS_MASK;
+
+ sseu->eu_per_subslice = eu_max - hweight32(eu_disable);
+ sseu->eu_total = sseu->eu_per_subslice * hweight32(sseu->subslice_mask);
+
+ /* ICL has no power gating restrictions. */
+ sseu->has_slice_pg = 1;
+ sseu->has_subslice_pg = 1;
+ sseu->has_eu_pg = 1;
+}
+
static void gen10_sseu_info_init(struct drm_i915_private *dev_priv)
{
struct sseu_dev_info *sseu = &mkwrite_device_info(dev_priv)->sseu;
@@ -583,8 +604,10 @@ void intel_device_info_runtime_init(struct intel_device_info *info)
broadwell_sseu_info_init(dev_priv);
else if (INTEL_GEN(dev_priv) == 9)
gen9_sseu_info_init(dev_priv);
- else if (INTEL_GEN(dev_priv) >= 10)
+ else if (INTEL_GEN(dev_priv) == 10)
gen10_sseu_info_init(dev_priv);
+ else if (INTEL_INFO(dev_priv)->gen >= 11)
+ gen11_sseu_info_init(dev_priv);
/* Initialize command stream timestamp frequency */
info->cs_timestamp_frequency_khz = read_timestamp_frequency(dev_priv);
--
2.14.3
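The arithmetic in gen11_sseu_info_init() can be tried on made-up fuse values. The following is a sketch only: struct sseu_sketch, popcount32() and gen11_decode_fuses() are hypothetical stand-ins for sseu_dev_info, hweight32() and the register reads, and the 8-bit width of the mask fields is an assumption. As in the patch, the EU total is derived from the subslice count alone.

```c
#include <stdint.h>

/* Field masks as defined in the patch. */
#define GEN11_EU_DIS_MASK	0xFF
#define GEN11_GT_S_ENA_MASK	0xFF
#define GEN11_GT_SS_DIS_MASK	0xFF

/* Hypothetical stand-in for the sseu_dev_info fields the patch fills in. */
struct sseu_sketch {
	uint8_t slice_mask;		/* assumption: 8-bit field */
	uint8_t subslice_mask;		/* assumption: 8-bit field */
	uint8_t eu_per_subslice;
	uint16_t eu_total;
};

/* Portable population count, standing in for the kernel's hweight32(). */
static inline unsigned int popcount32(uint32_t v)
{
	unsigned int n = 0;

	while (v) {
		v &= v - 1;	/* clear the lowest set bit */
		n++;
	}
	return n;
}

/* Decode raw fuse register values the way the patch does. */
static inline struct sseu_sketch gen11_decode_fuses(uint32_t slice_enable,
						     uint32_t subslice_disable,
						     uint32_t eu_disable)
{
	const unsigned int eu_max = 8;
	struct sseu_sketch s;

	/* The slice fuse is an enable mask; the other two are disable masks. */
	s.slice_mask = slice_enable & GEN11_GT_S_ENA_MASK;
	s.subslice_mask = (uint8_t)~(subslice_disable & GEN11_GT_SS_DIS_MASK);
	s.eu_per_subslice = eu_max - popcount32(eu_disable & GEN11_EU_DIS_MASK);
	s.eu_total = s.eu_per_subslice * popcount32(s.subslice_mask);
	return s;
}
```

The explicit cast on the inverted subslice mask matters in this sketch: without a narrow destination type, the complement would also set the upper 24 bits and inflate the subslice count.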
^ permalink raw reply related [flat|nested] 118+ messages in thread
* Re: [PATCH 19/27] drm/i915/icl: Added ICL 11 slice, subslice and EU fuse detection
2018-01-09 23:28 ` [PATCH 19/27] drm/i915/icl: Added ICL 11 slice, subslice and EU fuse detection Paulo Zanoni
@ 2018-01-10 12:02 ` Tvrtko Ursulin
0 siblings, 0 replies; 118+ messages in thread
From: Tvrtko Ursulin @ 2018-01-10 12:02 UTC (permalink / raw)
To: Paulo Zanoni, intel-gfx
On 09/01/2018 23:28, Paulo Zanoni wrote:
> From: Kelvin Gardiner <kelvin.gardiner@intel.com>
>
> This patch adds support to detect ICL, slice, subslice and EU fuse
> settings.
>
> Add addresses for ICL 11 slice, subslice and EU fuses registers.
> These register addresses are the same as previous platforms but the
> format and / or the meaning of the information is different. Therefore
> Gen11 defines for these registers are added.
>
> Bspec: 9731
> Bspec: 20643
> Bspec: 20673
>
> Signed-off-by: Kelvin Gardiner <kelvin.gardiner@intel.com>
> ---
> drivers/gpu/drm/i915/i915_reg.h | 9 +++++++++
> drivers/gpu/drm/i915/intel_device_info.c | 25 ++++++++++++++++++++++++-
> 2 files changed, 33 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
> index c9b62502ce69..d8b537570b8e 100644
> --- a/drivers/gpu/drm/i915/i915_reg.h
> +++ b/drivers/gpu/drm/i915/i915_reg.h
> @@ -2809,6 +2809,15 @@ enum i915_power_well_id {
> #define GEN11_GT_VEBOX_DISABLE_SHIFT 16
> #define GEN11_GT_VEBOX_DISABLE_MASK (0xff << GEN11_GT_VEBOX_DISABLE_SHIFT)
>
> +#define GEN11_EU_DISABLE _MMIO(0x9134)
> +#define GEN11_EU_DIS_MASK 0xFF
> +
> +#define GEN11_GT_SLICE_ENABLE _MMIO(0x9138)
> +#define GEN11_GT_S_ENA_MASK 0xFF
> +
> +#define GEN11_GT_SUBSLICE_DISABLE _MMIO(0x913C)
> +#define GEN11_GT_SS_DIS_MASK 0xFF
> +
> #define GEN6_BSD_SLEEP_PSMI_CONTROL _MMIO(0x12050)
> #define GEN6_BSD_SLEEP_MSG_DISABLE (1 << 0)
> #define GEN6_BSD_SLEEP_FLUSH_DISABLE (1 << 2)
> diff --git a/drivers/gpu/drm/i915/intel_device_info.c b/drivers/gpu/drm/i915/intel_device_info.c
> index 3316470363a0..895c41ef4abf 100644
> --- a/drivers/gpu/drm/i915/intel_device_info.c
> +++ b/drivers/gpu/drm/i915/intel_device_info.c
> @@ -120,6 +120,27 @@ void intel_device_info_dump(const struct intel_device_info *info,
> intel_device_info_dump_flags(info, p);
> }
>
> +static void gen11_sseu_info_init(struct drm_i915_private *dev_priv)
> +{
> + struct sseu_dev_info *sseu = &mkwrite_device_info(dev_priv)->sseu;
> + int eu_max = 8;
> + u32 eu_disable;
> +
> + sseu->slice_mask = I915_READ(GEN11_GT_SLICE_ENABLE) &
> + GEN11_GT_S_ENA_MASK;
> + sseu->subslice_mask = ~(I915_READ(GEN11_GT_SUBSLICE_DISABLE) &
> + GEN11_GT_SS_DIS_MASK);
> + eu_disable = I915_READ(GEN11_EU_DISABLE) & GEN11_GT_S_ENA_MASK;
> +
> + sseu->eu_per_subslice = eu_max - hweight32(eu_disable);
> + sseu->eu_total = sseu->eu_per_subslice * hweight32(sseu->subslice_mask);
> +
> + /* ICL has no power gating restrictions. */
> + sseu->has_slice_pg = 1;
> + sseu->has_subslice_pg = 1;
> + sseu->has_eu_pg = 1;
> +}
> +
> static void gen10_sseu_info_init(struct drm_i915_private *dev_priv)
> {
> struct sseu_dev_info *sseu = &mkwrite_device_info(dev_priv)->sseu;
> @@ -583,8 +604,10 @@ void intel_device_info_runtime_init(struct intel_device_info *info)
> broadwell_sseu_info_init(dev_priv);
> else if (INTEL_GEN(dev_priv) == 9)
> gen9_sseu_info_init(dev_priv);
> - else if (INTEL_GEN(dev_priv) >= 10)
> + else if (INTEL_GEN(dev_priv) == 10)
We usually use IS_GEN10, and the == construct is very rare. I suggest
changing it while touching the line.
> gen10_sseu_info_init(dev_priv);
> + else if (INTEL_INFO(dev_priv)->gen >= 11)
INTEL_GEN should be used in new code.
Regards,
Tvrtko
> + gen11_sseu_info_init(dev_priv);
>
> /* Initialize command stream timestamp frequency */
> info->cs_timestamp_frequency_khz = read_timestamp_frequency(dev_priv);
>
^ permalink raw reply [flat|nested] 118+ messages in thread
* [PATCH 20/27] drm/i915/icl: Make use of the SW counter field in the new context descriptor
2018-01-09 23:28 ` [PATCH 10/27] drm/i915/icl: Enhanced execution list support Paulo Zanoni
` (8 preceding siblings ...)
2018-01-09 23:28 ` [PATCH 19/27] drm/i915/icl: Added ICL 11 slice, subslice and EU fuse detection Paulo Zanoni
@ 2018-01-09 23:28 ` Paulo Zanoni
2018-01-11 21:10 ` Daniele Ceraolo Spurio
2018-01-09 23:28 ` [PATCH 21/27] drm/i915/icl: Add reset control register changes Paulo Zanoni
` (9 subsequent siblings)
19 siblings, 1 reply; 118+ messages in thread
From: Paulo Zanoni @ 2018-01-09 23:28 UTC (permalink / raw)
To: intel-gfx; +Cc: Rodrigo Vivi
From: Oscar Mateo <oscar.mateo@intel.com>
The new context descriptor format in Gen11 contains two assignable fields: the
SW Context ID (technically 11 bits, but practically limited to 2032 entries due
to some being reserved for future use by the GuC) and the SW Counter (6 bits).
We don't want to limit ourselves too much in the maximum number of concurrent
contexts we want to allow, so ideally we want to employ every possible bit
available. Unfortunately, a further limitation in the interface with the GuC
means the combination of SW Context ID + SW Counter has to be unique within the
same engine class (as we use the SW Context ID to index in the GuC stage
descriptor pool, and the Engine Class + SW Counter to index in the 2-dimensional
lrc array). This essentially means we need to somehow encode the engine instance.
Since the BSpec allows 6 bits for engine instance, we use the whole SW counter
for this task. If the limitation of 2032 maximum simultaneous contexts is too
restrictive, we can always squeeze things a bit more (3 extra bits for hw_id,
3 bits for instance) and things will still work (Gen11 does not instantiate more
than 8 engines of any class).
Another alternative would be to generate the hw_id per HW context instead of per
GEM context, but that has other problems (e.g. maximum number of user-created
contexts would be variable, no relationship between a GuC principal descriptor
and the proxy descriptor it uses, etc...).
Bspec: 12254
v2:
- Squashed with parts of "Interface changes for GuC fw 22.108" (Daniele)
- Do not apply the 16 reserved entries limitation to the non-GuC path (Joonas)
v3: Rebased by Rodrigo.
v4: Rebased (s/i915_modparams.enable_guc_submission/USES_GUC_SUBMISSION(dev_priv))
Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Michel Thierry <michel.thierry@intel.com>
Signed-off-by: Oscar Mateo <oscar.mateo@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
---
drivers/gpu/drm/i915/i915_drv.h | 11 ++++++++---
drivers/gpu/drm/i915/i915_gem_context.c | 9 ++++++---
drivers/gpu/drm/i915/i915_gem_context.h | 2 ++
drivers/gpu/drm/i915/i915_reg.h | 2 ++
drivers/gpu/drm/i915/intel_lrc.c | 15 ++++++++-------
5 files changed, 26 insertions(+), 13 deletions(-)
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index aa4f2b178d97..3f1d8c0d2b0a 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -2079,11 +2079,16 @@ struct drm_i915_private {
/* The hw wants to have a stable context identifier for the
* lifetime of the context (for OA, PASID, faults, etc).
- * This is limited in execlists to 21 bits.
+ * This is limited in execlists to 21 bits. In enhanced execlist
+ * (GEN11+) this is limited to 11 bits (the SW Context ID field)
+ * but GuC limits it a bit further (11 bits - 16) due to some
+ * entries being reserved for future use (so the firmware only
+ * supports a GuC stage descriptor pool of 2032 entries).
*/
struct ida hw_ida;
-#define MAX_CONTEXT_HW_ID (1<<21) /* exclusive */
-#define GEN11_MAX_CONTEXT_HW_ID (1<<11) /* exclusive */
+#define MAX_CONTEXT_HW_ID (1<<21) /* exclusive */
+#define GEN11_MAX_CONTEXT_HW_ID (1<<11) /* exclusive */
+#define GEN11_MAX_CONTEXT_HW_ID_WITH_GUC (GEN11_MAX_CONTEXT_HW_ID - 16)
} contexts;
u32 fdi_rx_config;
diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
index dbc50b9e18c9..bb5d070083f5 100644
--- a/drivers/gpu/drm/i915/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/i915_gem_context.c
@@ -213,9 +213,12 @@ static int assign_hw_id(struct drm_i915_private *dev_priv, unsigned *out)
int ret;
unsigned int max;
- if (INTEL_GEN(dev_priv) >= 11)
- max = GEN11_MAX_CONTEXT_HW_ID;
- else
+ if (INTEL_GEN(dev_priv) >= 11) {
+ if (USES_GUC_SUBMISSION(dev_priv))
+ max = GEN11_MAX_CONTEXT_HW_ID_WITH_GUC;
+ else
+ max = GEN11_MAX_CONTEXT_HW_ID;
+ } else
max = MAX_CONTEXT_HW_ID;
ret = ida_simple_get(&dev_priv->contexts.hw_ida,
diff --git a/drivers/gpu/drm/i915/i915_gem_context.h b/drivers/gpu/drm/i915/i915_gem_context.h
index 4bfb72f8e1cb..7a39d54e9962 100644
--- a/drivers/gpu/drm/i915/i915_gem_context.h
+++ b/drivers/gpu/drm/i915/i915_gem_context.h
@@ -156,6 +156,8 @@ struct i915_gem_context {
struct intel_ring *ring;
u32 *lrc_reg_state;
u64 lrc_desc;
+ u32 sw_context_id;
+ u32 sw_counter;
int pin_count;
} engine[I915_NUM_ENGINES];
diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
index d8b537570b8e..6d5e2c651580 100644
--- a/drivers/gpu/drm/i915/i915_reg.h
+++ b/drivers/gpu/drm/i915/i915_reg.h
@@ -3860,6 +3860,8 @@ enum {
#define GEN8_CTX_ID_WIDTH 21
#define GEN11_SW_CTX_ID_SHIFT 37
#define GEN11_SW_CTX_ID_WIDTH 11
+#define GEN11_SW_COUNTER_SHIFT 55
+#define GEN11_SW_COUNTER_WIDTH 6
#define GEN11_ENGINE_CLASS_SHIFT 61
#define GEN11_ENGINE_INSTANCE_SHIFT 48
diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index d527a79c872c..edf050de8ffe 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -267,15 +267,11 @@ intel_lr_context_descriptor_update(struct i915_gem_context *ctx,
if (INTEL_GEN(ctx->i915) >= 11) {
desc |= (u64)engine->class << GEN11_ENGINE_CLASS_SHIFT;
/* bits 61-63 */
-
- /*
- * TODO: use SW counter (bits 60-55) to support more CTXs by
- * combining it with the SW context ID field?
- */
-
+ desc |= (u64)ce->sw_counter << GEN11_SW_COUNTER_SHIFT;
+ /* bits 55-60 */
desc |= (u64)engine->instance << GEN11_ENGINE_INSTANCE_SHIFT;
/* bits 53-48 */
- desc |= (u64)ctx->hw_id << GEN11_SW_CTX_ID_SHIFT;
+ desc |= (u64)ce->sw_context_id << GEN11_SW_CTX_ID_SHIFT;
/* bits 37-47 */
} else {
desc |= (u64)ctx->hw_id << GEN8_CTX_ID_SHIFT; /* bits 32-52 */
@@ -2398,6 +2394,11 @@ static int execlists_context_deferred_alloc(struct i915_gem_context *ctx,
ce->ring = ring;
ce->state = vma;
+ if (INTEL_GEN(ctx->i915) >= 11) {
+ ce->sw_context_id = ctx->hw_id;
+ ce->sw_counter = engine->instance;
+ }
+
return 0;
error_ring_free:
--
2.14.3
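For readers following the bit layout, here is a minimal sketch of how the
Gen11 descriptor fields touched by this patch pack together. The shifts and
the two _WIDTH values are copied from the i915_reg.h hunk above. The helper
names gen11_desc_upper_bits() and desc_field(), and the SKETCH_* widths
(6 bits for instance and 3 bits for class, read off the "bits 53-48" and
"bits 61-63" comments), are assumptions made for illustration and are not
driver code.

```c
#include <assert.h>
#include <stdint.h>

/* Copied from the i915_reg.h hunk in this patch. */
#define GEN11_SW_CTX_ID_SHIFT		37
#define GEN11_SW_CTX_ID_WIDTH		11
#define GEN11_ENGINE_INSTANCE_SHIFT	48
#define GEN11_SW_COUNTER_SHIFT		55
#define GEN11_SW_COUNTER_WIDTH		6
#define GEN11_ENGINE_CLASS_SHIFT	61

/* Assumed widths, inferred from the bit-range comments in intel_lrc.c. */
#define SKETCH_ENGINE_INSTANCE_WIDTH	6
#define SKETCH_ENGINE_CLASS_WIDTH	3

/* Hypothetical helper: build only the Gen11-specific upper descriptor bits. */
static uint64_t gen11_desc_upper_bits(uint32_t engine_class,
				      uint32_t engine_instance,
				      uint32_t sw_context_id,
				      uint32_t sw_counter)
{
	uint64_t desc = 0;

	/* Reject values that would spill into a neighbouring field. */
	assert(engine_class < (1u << SKETCH_ENGINE_CLASS_WIDTH));
	assert(engine_instance < (1u << SKETCH_ENGINE_INSTANCE_WIDTH));
	assert(sw_context_id < (1u << GEN11_SW_CTX_ID_WIDTH));
	assert(sw_counter < (1u << GEN11_SW_COUNTER_WIDTH));

	desc |= (uint64_t)engine_class << GEN11_ENGINE_CLASS_SHIFT;	  /* bits 61-63 */
	desc |= (uint64_t)sw_counter << GEN11_SW_COUNTER_SHIFT;	  /* bits 55-60 */
	desc |= (uint64_t)engine_instance << GEN11_ENGINE_INSTANCE_SHIFT; /* bits 48-53 */
	desc |= (uint64_t)sw_context_id << GEN11_SW_CTX_ID_SHIFT;	  /* bits 37-47 */

	return desc;
}

/* Hypothetical helper: read one field back out of a descriptor. */
static uint32_t desc_field(uint64_t desc, unsigned int shift, unsigned int width)
{
	return (uint32_t)((desc >> shift) & ((1ull << width) - 1));
}
```

Since the patch sets sw_counter to engine->instance, two HW contexts that
share a hw_id but run on different instances of the same class end up with
different SW counter values, which is the uniqueness the commit message asks
for.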
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply related [flat|nested] 118+ messages in thread
* Re: [PATCH 20/27] drm/i915/icl: Make use of the SW counter field in the new context descriptor
2018-01-09 23:28 ` [PATCH 20/27] drm/i915/icl: Make use of the SW counter field in the new context descriptor Paulo Zanoni
@ 2018-01-11 21:10 ` Daniele Ceraolo Spurio
2018-01-11 22:37 ` Oscar Mateo
0 siblings, 1 reply; 118+ messages in thread
From: Daniele Ceraolo Spurio @ 2018-01-11 21:10 UTC (permalink / raw)
To: Paulo Zanoni, intel-gfx; +Cc: Rodrigo Vivi
This could potentially be squashed with patch 15, as it doesn't make
much sense to add a TODO there and solve it here. We might also want to
update the comment above intel_lr_context_descriptor_update to remove
the implication that SW context ID == ctx->hw_id (which is still
technically true after this patch but we're preparing for it not to be
anymore).
Thanks,
Daniele
On 09/01/18 15:28, Paulo Zanoni wrote:
> From: Oscar Mateo <oscar.mateo@intel.com>
>
> The new context descriptor format in Gen11 contains two assignable fields: the
> SW Context ID (technically 11 bits, but practically limited to 2032 entries due
> to some being reserved for future use by the GuC) and the SW Counter (6 bits).
>
> We don't want to limit ourselves too much in the maximum number of concurrent
> contexts we want to allow, so ideally we want to employ every possible bit
> available. Unfortunately, a further limitation in the interface with the GuC
> means the combination of SW Context ID + SW Counter has to be unique within the
> same engine class (as we use the SW Context ID to index in the GuC stage
> descriptor pool, and the Engine Class + SW Counter to index in the 2-dimensional
> lrc array). This essentially means we need to somehow encode the engine instance.
>
> Since the BSpec allows 6 bits for engine instance, we use the whole SW counter
> for this task. If the limitation of 2032 maximum simultaneous contexts is too
> restrictive, we can always squeeze things a bit more (3 extra bits for hw_id,
> 3 bits for instance) and things will still work (Gen11 does not instantiate more
> than 8 engines of any class).
>
> Another alternative would be to generate the hw_id per HW context instead of per
> GEM context, but that has other problems (e.g. maximum number of user-created
> contexts would be variable, no relationship between a GuC principal descriptor
> and the proxy descriptor it uses, etc...).
>
> Bspec: 12254
>
> v2:
> - Squashed with parts of "Interface changes for GuC fw 22.108" (Daniele)
> - Do not apply the 16 reserved entries limitation to the non-GuC path (Joonas)
> v3: Rebased by Rodrigo.
> v4: Rebased (s/i915_modparams.enable_guc_submission/USES_GUC_SUBMISSION(dev_priv))
>
> Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
> Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
> Cc: Michel Thierry <michel.thierry@intel.com>
> Signed-off-by: Oscar Mateo <oscar.mateo@intel.com>
> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
> ---
> drivers/gpu/drm/i915/i915_drv.h | 11 ++++++++---
> drivers/gpu/drm/i915/i915_gem_context.c | 9 ++++++---
> drivers/gpu/drm/i915/i915_gem_context.h | 2 ++
> drivers/gpu/drm/i915/i915_reg.h | 2 ++
> drivers/gpu/drm/i915/intel_lrc.c | 15 ++++++++-------
> 5 files changed, 26 insertions(+), 13 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> index aa4f2b178d97..3f1d8c0d2b0a 100644
> --- a/drivers/gpu/drm/i915/i915_drv.h
> +++ b/drivers/gpu/drm/i915/i915_drv.h
> @@ -2079,11 +2079,16 @@ struct drm_i915_private {
>
> /* The hw wants to have a stable context identifier for the
> * lifetime of the context (for OA, PASID, faults, etc).
> - * This is limited in execlists to 21 bits.
> + * This is limited in execlists to 21 bits. In enhanced execlist
> + * (GEN11+) this is limited to 11 bits (the SW Context ID field)
> + * but GuC limits it a bit further (11 bits - 16) due to some
> + * entries being reserved for future use (so the firmware only
> + * supports a GuC stage descriptor pool of 2032 entries).
> */
> struct ida hw_ida;
> -#define MAX_CONTEXT_HW_ID (1<<21) /* exclusive */
> -#define GEN11_MAX_CONTEXT_HW_ID (1<<11) /* exclusive */
> +#define MAX_CONTEXT_HW_ID (1<<21) /* exclusive */
> +#define GEN11_MAX_CONTEXT_HW_ID (1<<11) /* exclusive */
> +#define GEN11_MAX_CONTEXT_HW_ID_WITH_GUC GEN11_MAX_CONTEXT_HW_ID - 16
> } contexts;
>
> u32 fdi_rx_config;
> diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
> index dbc50b9e18c9..bb5d070083f5 100644
> --- a/drivers/gpu/drm/i915/i915_gem_context.c
> +++ b/drivers/gpu/drm/i915/i915_gem_context.c
> @@ -213,9 +213,12 @@ static int assign_hw_id(struct drm_i915_private *dev_priv, unsigned *out)
> int ret;
> unsigned int max;
>
> - if (INTEL_GEN(dev_priv) >= 11)
> - max = GEN11_MAX_CONTEXT_HW_ID;
> - else
> + if (INTEL_GEN(dev_priv) >= 11) {
> + if (USES_GUC_SUBMISSION(dev_priv))
> + max = GEN11_MAX_CONTEXT_HW_ID_WITH_GUC;
> + else
> + max = GEN11_MAX_CONTEXT_HW_ID;
> + } else
> max = MAX_CONTEXT_HW_ID;
>
> ret = ida_simple_get(&dev_priv->contexts.hw_ida,
> diff --git a/drivers/gpu/drm/i915/i915_gem_context.h b/drivers/gpu/drm/i915/i915_gem_context.h
> index 4bfb72f8e1cb..7a39d54e9962 100644
> --- a/drivers/gpu/drm/i915/i915_gem_context.h
> +++ b/drivers/gpu/drm/i915/i915_gem_context.h
> @@ -156,6 +156,8 @@ struct i915_gem_context {
> struct intel_ring *ring;
> u32 *lrc_reg_state;
> u64 lrc_desc;
> + u32 sw_context_id;
> + u32 sw_counter;
> int pin_count;
> } engine[I915_NUM_ENGINES];
>
> diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
> index d8b537570b8e..6d5e2c651580 100644
> --- a/drivers/gpu/drm/i915/i915_reg.h
> +++ b/drivers/gpu/drm/i915/i915_reg.h
> @@ -3860,6 +3860,8 @@ enum {
> #define GEN8_CTX_ID_WIDTH 21
> #define GEN11_SW_CTX_ID_SHIFT 37
> #define GEN11_SW_CTX_ID_WIDTH 11
> +#define GEN11_SW_COUNTER_SHIFT 55
> +#define GEN11_SW_COUNTER_WIDTH 6
> #define GEN11_ENGINE_CLASS_SHIFT 61
> #define GEN11_ENGINE_INSTANCE_SHIFT 48
>
> diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
> index d527a79c872c..edf050de8ffe 100644
> --- a/drivers/gpu/drm/i915/intel_lrc.c
> +++ b/drivers/gpu/drm/i915/intel_lrc.c
> @@ -267,15 +267,11 @@ intel_lr_context_descriptor_update(struct i915_gem_context *ctx,
> if (INTEL_GEN(ctx->i915) >= 11) {
> desc |= (u64)engine->class << GEN11_ENGINE_CLASS_SHIFT;
> /* bits 61-63 */
> -
> - /*
> - * TODO: use SW counter (bits 60-55) to support more CTXs by
> - * combining it with the SW context ID field?
> - */
> -
> + desc |= (u64)ce->sw_counter << GEN11_SW_COUNTER_SHIFT;
> + /* bits 55-60 */
> desc |= (u64)engine->instance << GEN11_ENGINE_INSTANCE_SHIFT;
> /* bits 53-48 */
> - desc |= (u64)ctx->hw_id << GEN11_SW_CTX_ID_SHIFT;
> + desc |= (u64)ce->sw_context_id << GEN11_SW_CTX_ID_SHIFT;
> /* bits 37-47 */
> } else {
> desc |= (u64)ctx->hw_id << GEN8_CTX_ID_SHIFT; /* bits 32-52 */
> @@ -2398,6 +2394,11 @@ static int execlists_context_deferred_alloc(struct i915_gem_context *ctx,
> ce->ring = ring;
> ce->state = vma;
>
> + if (INTEL_GEN(ctx->i915) >= 11) {
> + ce->sw_context_id = ctx->hw_id;
> + ce->sw_counter = engine->instance;
> + }
> +
> return 0;
>
> error_ring_free:
>
^ permalink raw reply [flat|nested] 118+ messages in thread
* Re: [PATCH 20/27] drm/i915/icl: Make use of the SW counter field in the new context descriptor
2018-01-11 21:10 ` Daniele Ceraolo Spurio
@ 2018-01-11 22:37 ` Oscar Mateo
2018-01-11 23:11 ` Daniele Ceraolo Spurio
0 siblings, 1 reply; 118+ messages in thread
From: Oscar Mateo @ 2018-01-11 22:37 UTC (permalink / raw)
To: Daniele Ceraolo Spurio, Paulo Zanoni, intel-gfx; +Cc: Rodrigo Vivi
On 01/11/2018 01:10 PM, Daniele Ceraolo Spurio wrote:
> This could potentially be squashed with patch 15, as it doesn't make
> much sense to add a TODO there and solve it here. We might also want
> to update the comment above intel_lr_context_descriptor_update to
> remove the implication that SW context ID == ctx->hw_id (which is
> still technically true after this patch but we're preparing for it not
> to be anymore).
>
I was actually thinking of a different fate for this patch: leave patch
15 as is (maybe make the TODO more open, like "TODO: decide what to do
with sw_counter"), slap a "drm/i915/icl/guc" prefix on this one and
consider it together with the rest of the GuC patches. At least in the
meantime, while we decide how to go about sw_counter (CCing Tvrtko).
What do you think?
^ permalink raw reply [flat|nested] 118+ messages in thread
* Re: [PATCH 20/27] drm/i915/icl: Make use of the SW counter field in the new context descriptor
2018-01-11 22:37 ` Oscar Mateo
@ 2018-01-11 23:11 ` Daniele Ceraolo Spurio
0 siblings, 0 replies; 118+ messages in thread
From: Daniele Ceraolo Spurio @ 2018-01-11 23:11 UTC (permalink / raw)
To: Oscar Mateo, Paulo Zanoni, intel-gfx; +Cc: Rodrigo Vivi
On 11/01/18 14:37, Oscar Mateo wrote:
>
>
> On 01/11/2018 01:10 PM, Daniele Ceraolo Spurio wrote:
>> This could potentially be squashed with patch 15, as it doesn't make
>> much sense to add a TODO there and solve it here. We might also want
>> to update the comment above intel_lr_context_descriptor_update to
>> remove the implication that SW context ID == ctx->hw_id (which is
>> still technically true after this patch but we're preparing for it not
>> to be anymore).
>>
>
> I was actually thinking of a different fate for this patch: leave patch
> 15 as is (maybe make the TODO more open, like "TODO: decide what to do
> with sw_counter"), slap a "drm/i915/icl/guc" prefix on this one and
> consider it together with the rest of the GuC patches. At least in the
> meantime, while we decide how to go about sw_counter (CCing Tvrtko).
> What do you think?
>
Sounds good to me. This patch doesn't really impact the !GuC path anyway,
since the number of IDs stays the same. I'll wait to see if there is any
more feedback and, if no one complains, I'll send a new revision of patch 15.
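To make the "number of IDs stays the same" point concrete, the limit
selection in assign_hw_id() can be sketched roughly as below. This is only
an illustration: max_context_hw_id() is a hypothetical stand-in, and its gen
and uses_guc_submission parameters replace INTEL_GEN(dev_priv) and
USES_GUC_SUBMISSION(dev_priv). The subtraction in the _WITH_GUC macro is
parenthesized so that it expands safely inside larger expressions.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_CONTEXT_HW_ID			(1 << 21) /* exclusive */
#define GEN11_MAX_CONTEXT_HW_ID			(1 << 11) /* exclusive */
/* 16 entries are reserved for future use by the GuC. */
#define GEN11_MAX_CONTEXT_HW_ID_WITH_GUC	(GEN11_MAX_CONTEXT_HW_ID - 16)

/*
 * Hypothetical stand-in for the branch in assign_hw_id(): returns the
 * exclusive upper bound handed to the ID allocator.
 */
static unsigned int max_context_hw_id(int gen, bool uses_guc_submission)
{
	if (gen >= 11) {
		if (uses_guc_submission)
			return GEN11_MAX_CONTEXT_HW_ID_WITH_GUC;
		return GEN11_MAX_CONTEXT_HW_ID;
	}

	return MAX_CONTEXT_HW_ID;
}
```

Only the Gen11 GuC submission case sees the smaller pool of 2032 IDs. The
Gen11 execlists path keeps all 2048, and pre-Gen11 keeps the 21-bit limit.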
^ permalink raw reply [flat|nested] 118+ messages in thread
* [PATCH 21/27] drm/i915/icl: Add reset control register changes
2018-01-09 23:28 ` [PATCH 10/27] drm/i915/icl: Enhanced execution list support Paulo Zanoni
` (9 preceding siblings ...)
2018-01-09 23:28 ` [PATCH 20/27] drm/i915/icl: Make use of the SW counter field in the new context descriptor Paulo Zanoni
@ 2018-01-09 23:28 ` Paulo Zanoni
2018-01-09 23:28 ` [PATCH 22/27] drm/i915/icl: Add configuring MOCS in new Icelake engines Paulo Zanoni
` (8 subsequent siblings)
19 siblings, 0 replies; 118+ messages in thread
From: Paulo Zanoni @ 2018-01-09 23:28 UTC (permalink / raw)
To: intel-gfx; +Cc: Paulo Zanoni
From: Michel Thierry <michel.thierry@intel.com>
The bits used to reset the different engines/domains have changed in
GEN11. This patch maps the reset engine mask bits to the new bits
in the reset control register.
v2: Use shift-left instead of BIT macro to match the file style (Paulo).
v3: Reuse gen8_reset_engines (Daniele).
v4: Do not call intel_uncore_forcewake_reset after reset. We may be
using the forcewake to read protected registers elsewhere, and those
results may be clobbered by the concurrent dropping of forcewake.
bspec: 19212
Cc: Oscar Mateo <oscar.mateo@intel.com>
Cc: Antonio Argenziano <antonio.argenziano@intel.com>
Cc: Paulo Zanoni <paulo.r.zanoni@intel.com>
Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Acked-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Signed-off-by: Michel Thierry <michel.thierry@intel.com>
---
drivers/gpu/drm/i915/i915_reg.h | 11 ++++++++
drivers/gpu/drm/i915/intel_uncore.c | 53 +++++++++++++++++++++++++++++++++++--
2 files changed, 62 insertions(+), 2 deletions(-)
diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
index 6d5e2c651580..b36550831807 100644
--- a/drivers/gpu/drm/i915/i915_reg.h
+++ b/drivers/gpu/drm/i915/i915_reg.h
@@ -304,6 +304,17 @@ static inline bool i915_mmio_reg_valid(i915_reg_t reg)
#define GEN6_GRDOM_VECS (1 << 4)
#define GEN9_GRDOM_GUC (1 << 5)
#define GEN8_GRDOM_MEDIA2 (1 << 7)
+/* GEN11 changed all bit defs except for FULL & RENDER */
+#define GEN11_GRDOM_FULL GEN6_GRDOM_FULL
+#define GEN11_GRDOM_RENDER GEN6_GRDOM_RENDER
+#define GEN11_GRDOM_BLT (1 << 2)
+#define GEN11_GRDOM_GUC (1 << 3)
+#define GEN11_GRDOM_MEDIA (1 << 5)
+#define GEN11_GRDOM_MEDIA2 (1 << 6)
+#define GEN11_GRDOM_MEDIA3 (1 << 7)
+#define GEN11_GRDOM_MEDIA4 (1 << 8)
+#define GEN11_GRDOM_VECS (1 << 13)
+#define GEN11_GRDOM_VECS2 (1 << 14)
#define RING_PP_DIR_BASE(engine) _MMIO((engine)->mmio_base+0x228)
#define RING_PP_DIR_BASE_READ(engine) _MMIO((engine)->mmio_base+0x518)
diff --git a/drivers/gpu/drm/i915/intel_uncore.c b/drivers/gpu/drm/i915/intel_uncore.c
index 44f0c5ab58e3..c49beb326414 100644
--- a/drivers/gpu/drm/i915/intel_uncore.c
+++ b/drivers/gpu/drm/i915/intel_uncore.c
@@ -1847,6 +1847,50 @@ static int gen6_reset_engines(struct drm_i915_private *dev_priv,
return gen6_hw_domain_reset(dev_priv, hw_mask);
}
+/**
+ * gen11_reset_engines - reset individual engines
+ * @dev_priv: i915 device
+ * @engine_mask: mask of intel_ring_flag() engines or ALL_ENGINES for full reset
+ *
+ * This function will reset the individual engines that are set in engine_mask.
+ * If you provide ALL_ENGINES as mask, a full global domain reset will be issued.
+ *
+ * Note: It is the responsibility of the caller to handle the difference between
+ * asking for a full domain reset versus a reset of all available individual engines.
+ *
+ * Returns 0 on success, nonzero on error.
+ */
+static int gen11_reset_engines(struct drm_i915_private *dev_priv,
+ unsigned engine_mask)
+{
+ struct intel_engine_cs *engine;
+ const u32 hw_engine_mask[I915_NUM_ENGINES] = {
+ [RCS] = GEN11_GRDOM_RENDER,
+ [BCS] = GEN11_GRDOM_BLT,
+ [VCS] = GEN11_GRDOM_MEDIA,
+ [VCS2] = GEN11_GRDOM_MEDIA2,
+ [VCS3] = GEN11_GRDOM_MEDIA3,
+ [VCS4] = GEN11_GRDOM_MEDIA4,
+ [VECS] = GEN11_GRDOM_VECS,
+ [VECS2] = GEN11_GRDOM_VECS2,
+ };
+ u32 hw_mask;
+
+ BUILD_BUG_ON(VECS2 + 1 != I915_NUM_ENGINES);
+
+ if (engine_mask == ALL_ENGINES) {
+ hw_mask = GEN11_GRDOM_FULL;
+ } else {
+ unsigned int tmp;
+
+ hw_mask = 0;
+ for_each_engine_masked(engine, dev_priv, engine_mask, tmp)
+ hw_mask |= hw_engine_mask[engine->id];
+ }
+
+ return gen6_hw_domain_reset(dev_priv, hw_mask);
+}
+
/**
* __intel_wait_for_register_fw - wait until register matches expected state
* @dev_priv: the i915 device
@@ -1994,7 +2038,10 @@ static int gen8_reset_engines(struct drm_i915_private *dev_priv,
if (gen8_reset_engine_start(engine))
goto not_ready;
- return gen6_reset_engines(dev_priv, engine_mask);
+ if (INTEL_GEN(dev_priv) >= 11)
+ return gen11_reset_engines(dev_priv, engine_mask);
+ else
+ return gen6_reset_engines(dev_priv, engine_mask);
not_ready:
for_each_engine_masked(engine, dev_priv, engine_mask, tmp)
@@ -2079,12 +2126,14 @@ bool intel_has_reset_engine(struct drm_i915_private *dev_priv)
int intel_reset_guc(struct drm_i915_private *dev_priv)
{
+ u32 guc_domain = INTEL_GEN(dev_priv) >= 11 ? GEN11_GRDOM_GUC :
+ GEN9_GRDOM_GUC;
int ret;
GEM_BUG_ON(!HAS_GUC(dev_priv));
intel_uncore_forcewake_get(dev_priv, FORCEWAKE_ALL);
- ret = gen6_hw_domain_reset(dev_priv, GEN9_GRDOM_GUC);
+ ret = gen6_hw_domain_reset(dev_priv, guc_domain);
intel_uncore_forcewake_put(dev_priv, FORCEWAKE_ALL);
return ret;
--
2.14.3
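The mask translation done by gen11_reset_engines() can be tried outside the
driver with a small sketch like the one below. The GEN11_GRDOM_* bit
positions are copied from the i915_reg.h hunk above. The local enum, the
SKETCH_ALL_ENGINES value, the FULL/RENDER values (bits 0 and 1, as I recall
the GEN6 definitions, which are not shown in this patch) and the assumption
that bit N of engine_mask selects engine id N are all stand-ins for
illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Local stand-in for enum intel_engine_id; the ordering is an assumption. */
enum sketch_engine_id {
	RCS, BCS, VCS, VCS2, VCS3, VCS4, VECS, VECS2, SKETCH_NUM_ENGINES
};

#define SKETCH_ALL_ENGINES	(~0u)	/* stand-in for ALL_ENGINES */

/* Assumed to match GEN6_GRDOM_FULL and GEN6_GRDOM_RENDER. */
#define GEN11_GRDOM_FULL	(1 << 0)
#define GEN11_GRDOM_RENDER	(1 << 1)
/* Copied from the hunk in this patch. */
#define GEN11_GRDOM_BLT		(1 << 2)
#define GEN11_GRDOM_GUC		(1 << 3)
#define GEN11_GRDOM_MEDIA	(1 << 5)
#define GEN11_GRDOM_MEDIA2	(1 << 6)
#define GEN11_GRDOM_MEDIA3	(1 << 7)
#define GEN11_GRDOM_MEDIA4	(1 << 8)
#define GEN11_GRDOM_VECS	(1 << 13)
#define GEN11_GRDOM_VECS2	(1 << 14)

/* Hypothetical helper: translate an engine mask into reset register bits. */
static uint32_t gen11_hw_reset_mask(unsigned int engine_mask)
{
	static const uint32_t hw_engine_mask[SKETCH_NUM_ENGINES] = {
		[RCS]   = GEN11_GRDOM_RENDER,
		[BCS]   = GEN11_GRDOM_BLT,
		[VCS]   = GEN11_GRDOM_MEDIA,
		[VCS2]  = GEN11_GRDOM_MEDIA2,
		[VCS3]  = GEN11_GRDOM_MEDIA3,
		[VCS4]  = GEN11_GRDOM_MEDIA4,
		[VECS]  = GEN11_GRDOM_VECS,
		[VECS2] = GEN11_GRDOM_VECS2,
	};
	uint32_t hw_mask = 0;
	unsigned int id;

	/* ALL_ENGINES requests a full domain reset, not a per-engine OR. */
	if (engine_mask == SKETCH_ALL_ENGINES)
		return GEN11_GRDOM_FULL;

	for (id = 0; id < SKETCH_NUM_ENGINES; id++)
		if (engine_mask & (1u << id))
			hw_mask |= hw_engine_mask[id];

	return hw_mask;
}
```

The sketch stops at computing the mask. In the patch the result is passed to
gen6_hw_domain_reset(), which is left out here.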
^ permalink raw reply related [flat|nested] 118+ messages in thread
* [PATCH 22/27] drm/i915/icl: Add configuring MOCS in new Icelake engines
2018-01-09 23:28 ` [PATCH 10/27] drm/i915/icl: Enhanced execution list support Paulo Zanoni
` (10 preceding siblings ...)
2018-01-09 23:28 ` [PATCH 21/27] drm/i915/icl: Add reset control register changes Paulo Zanoni
@ 2018-01-09 23:28 ` Paulo Zanoni
2018-01-09 23:28 ` [PATCH 23/27] drm/i915/icl: Split out the servicing of the Selector and Shared IIR registers Paulo Zanoni
` (7 subsequent siblings)
19 siblings, 0 replies; 118+ messages in thread
From: Paulo Zanoni @ 2018-01-09 23:28 UTC (permalink / raw)
To: intel-gfx
From: Tomasz Lis <tomasz.lis@intel.com>
In Icelake, there are more engines on which Memory Object Control States need
to be configured. Besides adding Icelake under the Skylake config, the patch makes
sure MOCS register addresses for the new engines are properly defined.
An additional patch might be needed later, in case the specification
proposes different MOCS config values for Icelake than for previous gens.
v2: Restricted comments to gen11, updated description, renamed defines.
v3: Used proper engine indexes for gen11.
v4: Removed engines which are not part of gen11.0
v5: Style fixes (proposed by mwajdeczko)
BSpec: 19405
BSpec: 21140
Cc: Oscar Mateo Lozano <oscar.mateo@intel.com>
Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Signed-off-by: Tomasz Lis <tomasz.lis@intel.com>
---
drivers/gpu/drm/i915/i915_reg.h | 2 ++
drivers/gpu/drm/i915/intel_mocs.c | 5 ++++-
2 files changed, 6 insertions(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
index b36550831807..eb6c7dcd4db0 100644
--- a/drivers/gpu/drm/i915/i915_reg.h
+++ b/drivers/gpu/drm/i915/i915_reg.h
@@ -9678,6 +9678,8 @@ enum skl_power_gate {
#define GEN9_MFX1_MOCS(i) _MMIO(0xca00 + (i) * 4) /* Media 1 MOCS registers */
#define GEN9_VEBOX_MOCS(i) _MMIO(0xcb00 + (i) * 4) /* Video MOCS registers */
#define GEN9_BLT_MOCS(i) _MMIO(0xcc00 + (i) * 4) /* Blitter MOCS registers */
+/* Media decoder 2 MOCS registers */
+#define GEN11_MFX2_MOCS(i) _MMIO(0x10000 + (i) * 4)
/* gamt regs */
#define GEN8_L3_LRA_1_GPGPU _MMIO(0x4dd4)
diff --git a/drivers/gpu/drm/i915/intel_mocs.c b/drivers/gpu/drm/i915/intel_mocs.c
index f4c46b0b8f0a..11a37fbd720e 100644
--- a/drivers/gpu/drm/i915/intel_mocs.c
+++ b/drivers/gpu/drm/i915/intel_mocs.c
@@ -178,7 +178,8 @@ static bool get_mocs_settings(struct drm_i915_private *dev_priv,
{
bool result = false;
- if (IS_GEN9_BC(dev_priv) || IS_CANNONLAKE(dev_priv)) {
+ if (IS_GEN9_BC(dev_priv) || IS_CANNONLAKE(dev_priv) ||
+ IS_ICELAKE(dev_priv)) {
table->size = ARRAY_SIZE(skylake_mocs_table);
table->table = skylake_mocs_table;
result = true;
@@ -217,6 +218,8 @@ static i915_reg_t mocs_register(enum intel_engine_id engine_id, int index)
return GEN9_VEBOX_MOCS(index);
case VCS2:
return GEN9_MFX1_MOCS(index);
+ case VCS3:
+ return GEN11_MFX2_MOCS(index);
default:
MISSING_CASE(engine_id);
return INVALID_MMIO_REG;
--
2.14.3
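
For readers following the MOCS hunks above, here is a minimal standalone sketch of the per-engine register lookup the patch extends. It is not driver code: the `sk_` names and the 0 return value for unhandled engines are hypothetical, and only the two base offsets visible in the diff (0xca00 for GEN9_MFX1_MOCS and 0x10000 for GEN11_MFX2_MOCS) are taken from the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical engine ids for this sketch; not the driver's enum intel_engine_id. */
enum sk_engine { SK_VCS2, SK_VCS3, SK_OTHER };

/* Base offsets copied from the i915_reg.h hunk in this patch. */
#define SK_MFX1_MOCS_BASE 0xca00u   /* GEN9_MFX1_MOCS, used for VCS2 */
#define SK_MFX2_MOCS_BASE 0x10000u  /* GEN11_MFX2_MOCS, used for VCS3 */

/* Each MOCS entry is one 32-bit register, hence the stride of 4 bytes. */
static_assert(sizeof(uint32_t) == 4, "MOCS entries are 32-bit registers");

/*
 * Return the MMIO offset of MOCS entry `index` for the given engine,
 * or 0 when the engine is not handled (the driver returns
 * INVALID_MMIO_REG and logs MISSING_CASE instead).
 */
static uint32_t sk_mocs_offset(enum sk_engine engine, unsigned int index)
{
	switch (engine) {
	case SK_VCS2:
		return SK_MFX1_MOCS_BASE + index * 4;
	case SK_VCS3:
		return SK_MFX2_MOCS_BASE + index * 4;
	default:
		return 0;
	}
}
```

The point of the sketch is only that the new engine needs its own base offset; the per-entry stride and the table contents are shared with the existing engines.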
^ permalink raw reply related [flat|nested] 118+ messages in thread
* [PATCH 23/27] drm/i915/icl: Split out the servicing of the Selector and Shared IIR registers
2018-01-09 23:28 ` [PATCH 10/27] drm/i915/icl: Enhanced execution list support Paulo Zanoni
` (11 preceding siblings ...)
2018-01-09 23:28 ` [PATCH 22/27] drm/i915/icl: Add configuring MOCS in new Icelake engines Paulo Zanoni
@ 2018-01-09 23:28 ` Paulo Zanoni
2018-01-09 23:28 ` [PATCH 24/27] drm/i915/icl: Handle RPS interrupts correctly for Gen11 Paulo Zanoni
` (6 subsequent siblings)
19 siblings, 0 replies; 118+ messages in thread
From: Paulo Zanoni @ 2018-01-09 23:28 UTC (permalink / raw)
To: intel-gfx
From: Oscar Mateo <oscar.mateo@intel.com>
Both for clarity and so that we can reuse it later on.
v2:
- local_clock returns a u64 (Tvrtko)
- Use the funky BIT(bit) version (Tvrtko)
- wait_start not required (Tvrtko)
- Use time_after64 (Oscar)
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Signed-off-by: Oscar Mateo <oscar.mateo@intel.com>
---
drivers/gpu/drm/i915/i915_irq.c | 58 +++++++++++++++++++++++++----------------
1 file changed, 35 insertions(+), 23 deletions(-)
diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
index 49fb8d60f770..c5bc0e8ae071 100644
--- a/drivers/gpu/drm/i915/i915_irq.c
+++ b/drivers/gpu/drm/i915/i915_irq.c
@@ -243,6 +243,37 @@ void i915_hotplug_interrupt_update(struct drm_i915_private *dev_priv,
spin_unlock_irq(&dev_priv->irq_lock);
}
+static u16 gen11_service_shared_iir(struct drm_i915_private *dev_priv,
+ unsigned int bank,
+ unsigned int bit)
+{
+ u64 wait_end;
+ u16 irq;
+ u32 ident;
+
+ I915_WRITE_FW(GEN11_IIR_REG_SELECTOR(bank), BIT(bit));
+ /*
+ * NB: Specs do not specify how long to spin wait.
+ * Taking 100us as an educated guess
+ */
+ wait_end = (local_clock() >> 10) + 100;
+ do {
+ ident = I915_READ_FW(GEN11_INTR_IDENTITY_REG(bank));
+ } while (!(ident & GEN11_INTR_DATA_VALID) &&
+ !time_after64(local_clock() >> 10, wait_end));
+
+ if (!(ident & GEN11_INTR_DATA_VALID))
+ DRM_ERROR("INTR_IDENTITY_REG%u:%u timed out!\n", bank, bit);
+
+ irq = ident & GEN11_INTR_ENGINE_MASK;
+ if (!irq)
+ DRM_ERROR("INTR_IDENTITY_REG%u:%u blank!\n", bank, bit);
+
+ I915_WRITE_FW(GEN11_INTR_IDENTITY_REG(bank), ident);
+
+ return irq;
+}
+
/**
* ilk_update_display_irq - update DEIMR
* @dev_priv: driver private
@@ -2768,10 +2799,9 @@ gen11_gt_irq_handler(struct drm_i915_private *dev_priv, u32 master_ctl)
{
irqreturn_t ret = IRQ_NONE;
u16 irq[2][32];
- u32 dw, ident;
+ u32 dw;
unsigned long tmp;
unsigned int bank, bit, engine;
- unsigned long wait_start, wait_end;
memset(irq, 0, sizeof(irq));
@@ -2781,27 +2811,9 @@ gen11_gt_irq_handler(struct drm_i915_private *dev_priv, u32 master_ctl)
if (!dw)
DRM_ERROR("GT_INTR_DW%u blank!\n", bank);
tmp = dw;
- for_each_set_bit(bit, &tmp, 32) {
- I915_WRITE_FW(GEN11_IIR_REG_SELECTOR(bank), 1 << bit);
- wait_start = local_clock() >> 10;
- /* NB: Specs do not specify how long to spin wait.
- * Taking 100us as an educated guess */
- wait_end = wait_start + 100;
- do {
- ident = I915_READ_FW(GEN11_INTR_IDENTITY_REG(bank));
- } while (!(ident & GEN11_INTR_DATA_VALID) &&
- !time_after((unsigned long)local_clock() >> 10, wait_end));
-
- if (!(ident & GEN11_INTR_DATA_VALID))
- DRM_ERROR("INTR_IDENTITY_REG%u:%u timed out!\n",
- bank, bit);
-
- irq[bank][bit] = ident & GEN11_INTR_ENGINE_MASK;
- if (!irq[bank][bit])
- DRM_ERROR("INTR_IDENTITY_REG%u:%u blank!\n",
- bank, bit);
- I915_WRITE_FW(GEN11_INTR_IDENTITY_REG(bank), ident);
- }
+ for_each_set_bit(bit, &tmp, 32)
+ irq[bank][bit] =
+ gen11_service_shared_iir(dev_priv, bank, bit);
I915_WRITE_FW(GEN11_GT_INTR_DW(bank), dw);
}
}
--
2.14.3
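
The bounded spin-wait that this patch factors out can be illustrated outside the kernel. The sketch below is an assumption-laden model, not the driver's implementation: the clock and the identity register are faked in a hypothetical `struct sk_dev`, `SK_DATA_VALID` is a stand-in bit, and `sk_time_after64()` mirrors the wraparound-safe comparison that time_after64() is understood to perform. As in the patch, a nanosecond clock shifted right by 10 is used as a rough microsecond count, with a budget of 100.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SK_DATA_VALID (1u << 31)  /* hypothetical stand-in for GEN11_INTR_DATA_VALID */

/* Wraparound-safe "a is after b" for 64-bit counters. */
static bool sk_time_after64(uint64_t a, uint64_t b)
{
	return (int64_t)(b - a) < 0;
}

/* Fake device: a clock we advance by hand and a register that becomes valid later. */
struct sk_dev {
	uint64_t now_ns;       /* fake nanosecond clock */
	uint64_t valid_at_ns;  /* time from which the register reports valid data */
	uint32_t payload;      /* low bits returned once valid */
};

/* Every clock read advances the fake time by 1000 ns. */
static uint64_t sk_clock(struct sk_dev *d)
{
	d->now_ns += 1000;
	return d->now_ns;
}

static uint32_t sk_read_ident(const struct sk_dev *d)
{
	return d->now_ns >= d->valid_at_ns ? (SK_DATA_VALID | d->payload) : 0;
}

/* Poll until the valid bit is set or roughly 100us have elapsed. */
static bool sk_wait_valid(struct sk_dev *d, uint32_t *ident)
{
	assert(d && ident);

	uint64_t wait_end = (sk_clock(d) >> 10) + 100;

	do {
		*ident = sk_read_ident(d);
	} while (!(*ident & SK_DATA_VALID) &&
		 !sk_time_after64(sk_clock(d) >> 10, wait_end));

	return (*ident & SK_DATA_VALID) != 0;
}
```

Note that the loop always reads the register at least once before looking at the clock, so a register that is already valid never pays for the timeout check.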
^ permalink raw reply related [flat|nested] 118+ messages in thread
* [PATCH 24/27] drm/i915/icl: Handle RPS interrupts correctly for Gen11
2018-01-09 23:28 ` [PATCH 10/27] drm/i915/icl: Enhanced execution list support Paulo Zanoni
` (12 preceding siblings ...)
2018-01-09 23:28 ` [PATCH 23/27] drm/i915/icl: Split out the servicing of the Selector and Shared IIR registers Paulo Zanoni
@ 2018-01-09 23:28 ` Paulo Zanoni
2018-01-09 23:28 ` [PATCH 25/27] drm/i915/icl: Enable RC6 and RPS in Gen11 Paulo Zanoni
` (5 subsequent siblings)
19 siblings, 0 replies; 118+ messages in thread
From: Paulo Zanoni @ 2018-01-09 23:28 UTC (permalink / raw)
To: intel-gfx; +Cc: Paulo Zanoni
From: Oscar Mateo <oscar.mateo@intel.com>
Using the new hierarchical interrupt infrastructure.
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Sagar Arun Kamble <sagar.a.kamble@intel.com>
Cc: Paulo Zanoni <paulo.r.zanoni@intel.com>
Signed-off-by: Oscar Mateo <oscar.mateo@intel.com>
---
drivers/gpu/drm/i915/i915_irq.c | 68 +++++++++++++++++++++++++++++++++-------
drivers/gpu/drm/i915/intel_drv.h | 1 +
drivers/gpu/drm/i915/intel_pm.c | 6 ++--
3 files changed, 60 insertions(+), 15 deletions(-)
diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
index c5bc0e8ae071..08aa7d7de163 100644
--- a/drivers/gpu/drm/i915/i915_irq.c
+++ b/drivers/gpu/drm/i915/i915_irq.c
@@ -339,17 +339,29 @@ void gen5_disable_gt_irq(struct drm_i915_private *dev_priv, uint32_t mask)
static i915_reg_t gen6_pm_iir(struct drm_i915_private *dev_priv)
{
+ WARN_ON_ONCE(INTEL_GEN(dev_priv) >= 11);
+
return INTEL_GEN(dev_priv) >= 8 ? GEN8_GT_IIR(2) : GEN6_PMIIR;
}
static i915_reg_t gen6_pm_imr(struct drm_i915_private *dev_priv)
{
- return INTEL_GEN(dev_priv) >= 8 ? GEN8_GT_IMR(2) : GEN6_PMIMR;
+ if (INTEL_GEN(dev_priv) >= 11)
+ return GEN11_GPM_WGBOXPERF_INTR_MASK;
+ else if (INTEL_GEN(dev_priv) >= 8)
+ return GEN8_GT_IMR(2);
+ else
+ return GEN6_PMIMR;
}
static i915_reg_t gen6_pm_ier(struct drm_i915_private *dev_priv)
{
- return INTEL_GEN(dev_priv) >= 8 ? GEN8_GT_IER(2) : GEN6_PMIER;
+ if (INTEL_GEN(dev_priv) >= 11)
+ return GEN11_GPM_WGBOXPERF_INTR_ENABLE;
+ else if (INTEL_GEN(dev_priv) >= 8)
+ return GEN8_GT_IER(2);
+ else
+ return GEN6_PMIER;
}
/**
@@ -431,6 +443,28 @@ static void gen6_disable_pm_irq(struct drm_i915_private *dev_priv, u32 disable_m
/* though a barrier is missing here, but don't really need a one */
}
+void gen11_reset_rps_interrupts(struct drm_i915_private *dev_priv)
+{
+ u32 dw;
+
+ spin_lock_irq(&dev_priv->irq_lock);
+
+ /*
+ * According to the BSpec, DW_IIR bits cannot be cleared without
+ * first servicing the Selector & Shared IIR registers.
+ */
+ dw = I915_READ_FW(GEN11_GT_INTR_DW0);
+ while (dw & BIT(GEN11_GTPM)) {
+ gen11_service_shared_iir(dev_priv, 0, GEN11_GTPM);
+ I915_WRITE_FW(GEN11_GT_INTR_DW0, BIT(GEN11_GTPM));
+ dw = I915_READ_FW(GEN11_GT_INTR_DW0);
+ }
+
+ dev_priv->gt_pm.rps.pm_iir = 0;
+
+ spin_unlock_irq(&dev_priv->irq_lock);
+}
+
void gen6_reset_rps_interrupts(struct drm_i915_private *dev_priv)
{
spin_lock_irq(&dev_priv->irq_lock);
@@ -446,12 +480,12 @@ void gen6_enable_rps_interrupts(struct drm_i915_private *dev_priv)
if (READ_ONCE(rps->interrupts_enabled))
return;
- if (WARN_ON_ONCE(IS_GEN11(dev_priv)))
- return;
-
spin_lock_irq(&dev_priv->irq_lock);
WARN_ON_ONCE(rps->pm_iir);
- WARN_ON_ONCE(I915_READ(gen6_pm_iir(dev_priv)) & dev_priv->pm_rps_events);
+ if (INTEL_GEN(dev_priv) >= 11)
+ WARN_ON_ONCE(I915_READ_FW(GEN11_GT_INTR_DW0) & BIT(GEN11_GTPM));
+ else
+ WARN_ON_ONCE(I915_READ(gen6_pm_iir(dev_priv)) & dev_priv->pm_rps_events);
rps->interrupts_enabled = true;
gen6_enable_pm_irq(dev_priv, dev_priv->pm_rps_events);
@@ -465,9 +499,6 @@ void gen6_disable_rps_interrupts(struct drm_i915_private *dev_priv)
if (!READ_ONCE(rps->interrupts_enabled))
return;
- if (WARN_ON_ONCE(IS_GEN11(dev_priv)))
- return;
-
spin_lock_irq(&dev_priv->irq_lock);
rps->interrupts_enabled = false;
@@ -484,7 +515,10 @@ void gen6_disable_rps_interrupts(struct drm_i915_private *dev_priv)
* state of the worker can be discarded.
*/
cancel_work_sync(&rps->work);
- gen6_reset_rps_interrupts(dev_priv);
+ if (INTEL_GEN(dev_priv) >= 11)
+ gen11_reset_rps_interrupts(dev_priv);
+ else
+ gen6_reset_rps_interrupts(dev_priv);
}
void gen9_reset_guc_interrupts(struct drm_i915_private *dev_priv)
@@ -2847,8 +2881,8 @@ gen11_gt_irq_handler(struct drm_i915_private *dev_priv, u32 master_ctl)
}
if (irq[0][GEN11_GTPM] & dev_priv->pm_rps_events) {
+ gen6_rps_irq_handler(dev_priv, (u32)irq[0][GEN11_GTPM]);
ret = IRQ_HANDLED;
- gen6_rps_irq_handler(dev_priv, tmp);
}
return ret;
@@ -3319,6 +3353,9 @@ static void gen11_gt_irq_reset(struct drm_i915_private *dev_priv)
I915_WRITE(GEN11_VCS0_VCS1_INTR_MASK, ~0);
I915_WRITE(GEN11_VCS2_VCS3_INTR_MASK, ~0);
I915_WRITE(GEN11_VECS0_VECS1_INTR_MASK, ~0);
+
+ I915_WRITE(GEN11_GPM_WGBOXPERF_INTR_ENABLE, 0);
+ I915_WRITE(GEN11_GPM_WGBOXPERF_INTR_MASK, ~0);
}
static void gen11_irq_reset(struct drm_device *dev)
@@ -3854,7 +3891,14 @@ static void gen11_gt_irq_postinstall(struct drm_i915_private *dev_priv)
I915_WRITE(GEN11_VCS2_VCS3_INTR_MASK, ~(irqs | irqs << 16));
I915_WRITE(GEN11_VECS0_VECS1_INTR_MASK, ~(irqs | irqs << 16));
- dev_priv->pm_imr = 0xffffffff; /* TODO */
+ /*
+ * RPS interrupts will get enabled/disabled on demand when RPS itself
+ * is enabled/disabled.
+ */
+ dev_priv->pm_ier = 0x0;
+ dev_priv->pm_imr = ~dev_priv->pm_ier;
+ I915_WRITE(GEN11_GPM_WGBOXPERF_INTR_ENABLE, 0);
+ I915_WRITE(GEN11_GPM_WGBOXPERF_INTR_MASK, ~0);
}
static int gen11_irq_postinstall(struct drm_device *dev)
diff --git a/drivers/gpu/drm/i915/intel_drv.h b/drivers/gpu/drm/i915/intel_drv.h
index 731dc36d7129..7778535bbf73 100644
--- a/drivers/gpu/drm/i915/intel_drv.h
+++ b/drivers/gpu/drm/i915/intel_drv.h
@@ -1315,6 +1315,7 @@ void gen5_enable_gt_irq(struct drm_i915_private *dev_priv, uint32_t mask);
void gen5_disable_gt_irq(struct drm_i915_private *dev_priv, uint32_t mask);
void gen6_mask_pm_irq(struct drm_i915_private *dev_priv, u32 mask);
void gen6_unmask_pm_irq(struct drm_i915_private *dev_priv, u32 mask);
+void gen11_reset_rps_interrupts(struct drm_i915_private *dev_priv);
void gen6_reset_rps_interrupts(struct drm_i915_private *dev_priv);
void gen6_enable_rps_interrupts(struct drm_i915_private *dev_priv);
void gen6_disable_rps_interrupts(struct drm_i915_private *dev_priv);
diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
index 8d02d8abeea3..17252caeeb4b 100644
--- a/drivers/gpu/drm/i915/intel_pm.c
+++ b/drivers/gpu/drm/i915/intel_pm.c
@@ -7957,10 +7957,10 @@ void intel_sanitize_gt_powersave(struct drm_i915_private *dev_priv)
dev_priv->gt_pm.rc6.enabled = true; /* force RC6 disabling */
intel_disable_gt_powersave(dev_priv);
- if (INTEL_GEN(dev_priv) < 11)
- gen6_reset_rps_interrupts(dev_priv);
+ if (INTEL_GEN(dev_priv) >= 11)
+ gen11_reset_rps_interrupts(dev_priv);
else
- WARN_ON_ONCE(1);
+ gen6_reset_rps_interrupts(dev_priv);
}
static inline void intel_disable_llc_pstate(struct drm_i915_private *i915)
--
2.14.3
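
The drain loop in gen11_reset_rps_interrupts() can be modelled with a fake register, which may help when reasoning about why it loops rather than clearing once. Everything named `sk_` below is hypothetical, including the bit position chosen for the GTPM bit and the write-one-to-clear behaviour of the fake DW0 register; the only thing taken from the patch is the order of operations (service the shared IIR, clear the DW bit, re-read).

```c
#include <assert.h>
#include <stdint.h>

#define SK_GTPM_BIT (1u << 16)  /* hypothetical position of the GTPM bit in DW0 */

/* Fake interrupt state standing in for the hardware registers. */
struct sk_irq_state {
	uint32_t dw0;          /* fake GT_INTR_DW0 */
	unsigned int pending;  /* how many more times the bit will re-assert */
	unsigned int serviced; /* how many times the shared IIR was serviced */
};

/* Stand-in for gen11_service_shared_iir(): only counts the calls. */
static void sk_service_shared_iir(struct sk_irq_state *s)
{
	s->serviced++;
}

/* Assumed write-one-to-clear; a still-pending source re-asserts the bit. */
static void sk_write_dw0(struct sk_irq_state *s, uint32_t val)
{
	s->dw0 &= ~val;
	if (s->pending) {
		s->pending--;
		s->dw0 |= SK_GTPM_BIT;
	}
}

/* Service and clear until the GTPM bit stays clear, as in the patch. */
static void sk_reset_rps_interrupts(struct sk_irq_state *s)
{
	assert(s);

	uint32_t dw = s->dw0;

	while (dw & SK_GTPM_BIT) {
		sk_service_shared_iir(s);
		sk_write_dw0(s, SK_GTPM_BIT);
		dw = s->dw0;
	}
}
```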
^ permalink raw reply related [flat|nested] 118+ messages in thread
* [PATCH 25/27] drm/i915/icl: Enable RC6 and RPS in Gen11
2018-01-09 23:28 ` [PATCH 10/27] drm/i915/icl: Enhanced execution list support Paulo Zanoni
` (13 preceding siblings ...)
2018-01-09 23:28 ` [PATCH 24/27] drm/i915/icl: Handle RPS interrupts correctly for Gen11 Paulo Zanoni
@ 2018-01-09 23:28 ` Paulo Zanoni
2018-01-09 23:28 ` [PATCH 26/27] drm/i915/icl: allow the reg_read ioctl to read the RCS TIMESTAMP register Paulo Zanoni
` (4 subsequent siblings)
19 siblings, 0 replies; 118+ messages in thread
From: Paulo Zanoni @ 2018-01-09 23:28 UTC (permalink / raw)
To: intel-gfx; +Cc: Paulo Zanoni
From: Oscar Mateo <oscar.mateo@intel.com>
AFAICT, once the new interrupt is in place, the rest should behave the
same as Gen10.
v2: Update ring frequencies (Sagar)
Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Sagar Arun Kamble <sagar.a.kamble@intel.com>
Cc: Paulo Zanoni <paulo.r.zanoni@intel.com>
Signed-off-by: Oscar Mateo <oscar.mateo@intel.com>
---
drivers/gpu/drm/i915/i915_debugfs.c | 10 +++++-----
drivers/gpu/drm/i915/intel_pm.c | 10 ++++------
2 files changed, 9 insertions(+), 11 deletions(-)
diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
index e66318e1f76e..f8ecbf6af83e 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -1243,20 +1243,20 @@ static int i915_frequency_info(struct seq_file *m, void *unused)
max_freq = (IS_GEN9_LP(dev_priv) ? rp_state_cap >> 0 :
rp_state_cap >> 16) & 0xff;
max_freq *= (IS_GEN9_BC(dev_priv) ||
- IS_CANNONLAKE(dev_priv) ? GEN9_FREQ_SCALER : 1);
+ INTEL_GEN(dev_priv) >= 10 ? GEN9_FREQ_SCALER : 1);
seq_printf(m, "Lowest (RPN) frequency: %dMHz\n",
intel_gpu_freq(dev_priv, max_freq));
max_freq = (rp_state_cap & 0xff00) >> 8;
max_freq *= (IS_GEN9_BC(dev_priv) ||
- IS_CANNONLAKE(dev_priv) ? GEN9_FREQ_SCALER : 1);
+ INTEL_GEN(dev_priv) >= 10 ? GEN9_FREQ_SCALER : 1);
seq_printf(m, "Nominal (RP1) frequency: %dMHz\n",
intel_gpu_freq(dev_priv, max_freq));
max_freq = (IS_GEN9_LP(dev_priv) ? rp_state_cap >> 16 :
rp_state_cap >> 0) & 0xff;
max_freq *= (IS_GEN9_BC(dev_priv) ||
- IS_CANNONLAKE(dev_priv) ? GEN9_FREQ_SCALER : 1);
+ INTEL_GEN(dev_priv) >= 10 ? GEN9_FREQ_SCALER : 1);
seq_printf(m, "Max non-overclocked (RP0) frequency: %dMHz\n",
intel_gpu_freq(dev_priv, max_freq));
seq_printf(m, "Max overclocked frequency: %dMHz\n",
@@ -1844,7 +1844,7 @@ static int i915_ring_freq_table(struct seq_file *m, void *unused)
if (ret)
goto out;
- if (IS_GEN9_BC(dev_priv) || IS_CANNONLAKE(dev_priv)) {
+ if (IS_GEN9_BC(dev_priv) || INTEL_GEN(dev_priv) >= 10) {
/* Convert GT frequency to 50 HZ units */
min_gpu_freq = rps->min_freq_softlimit / GEN9_FREQ_SCALER;
max_gpu_freq = rps->max_freq_softlimit / GEN9_FREQ_SCALER;
@@ -1863,7 +1863,7 @@ static int i915_ring_freq_table(struct seq_file *m, void *unused)
seq_printf(m, "%d\t\t%d\t\t\t\t%d\n",
intel_gpu_freq(dev_priv, (gpu_freq *
(IS_GEN9_BC(dev_priv) ||
- IS_CANNONLAKE(dev_priv) ?
+ INTEL_GEN(dev_priv) >= 10 ?
GEN9_FREQ_SCALER : 1))),
((ia_freq >> 0) & 0xff) * 100,
((ia_freq >> 8) & 0xff) * 100);
diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
index 17252caeeb4b..65c700a876c2 100644
--- a/drivers/gpu/drm/i915/intel_pm.c
+++ b/drivers/gpu/drm/i915/intel_pm.c
@@ -6523,7 +6523,7 @@ static void gen6_init_rps_frequencies(struct drm_i915_private *dev_priv)
rps->efficient_freq = rps->rp1_freq;
if (IS_HASWELL(dev_priv) || IS_BROADWELL(dev_priv) ||
- IS_GEN9_BC(dev_priv) || IS_CANNONLAKE(dev_priv)) {
+ IS_GEN9_BC(dev_priv) || INTEL_GEN(dev_priv) >= 10) {
u32 ddcc_status = 0;
if (sandybridge_pcode_read(dev_priv,
@@ -6536,7 +6536,7 @@ static void gen6_init_rps_frequencies(struct drm_i915_private *dev_priv)
rps->max_freq);
}
- if (IS_GEN9_BC(dev_priv) || IS_CANNONLAKE(dev_priv)) {
+ if (IS_GEN9_BC(dev_priv) || INTEL_GEN(dev_priv) >= 10) {
/* Store the frequency values in 16.66 MHZ units, which is
* the natural hardware unit for SKL
*/
@@ -6849,7 +6849,7 @@ static void gen6_update_ring_freq(struct drm_i915_private *dev_priv)
/* convert DDR frequency from units of 266.6MHz to bandwidth */
min_ring_freq = mult_frac(min_ring_freq, 8, 3);
- if (IS_GEN9_BC(dev_priv) || IS_CANNONLAKE(dev_priv)) {
+ if (IS_GEN9_BC(dev_priv) || INTEL_GEN(dev_priv) >= 10) {
/* Convert GT frequency to 50 HZ units */
min_gpu_freq = rps->min_freq / GEN9_FREQ_SCALER;
max_gpu_freq = rps->max_freq / GEN9_FREQ_SCALER;
@@ -6867,7 +6867,7 @@ static void gen6_update_ring_freq(struct drm_i915_private *dev_priv)
int diff = max_gpu_freq - gpu_freq;
unsigned int ia_freq = 0, ring_freq = 0;
- if (IS_GEN9_BC(dev_priv) || IS_CANNONLAKE(dev_priv)) {
+ if (IS_GEN9_BC(dev_priv) || INTEL_GEN(dev_priv) >= 10) {
/*
* ring_freq = 2 * GT. ring_freq is in 100MHz units
* No floor required for ring frequency on SKL.
@@ -8073,8 +8073,6 @@ static void intel_enable_rps(struct drm_i915_private *dev_priv)
cherryview_enable_rps(dev_priv);
} else if (IS_VALLEYVIEW(dev_priv)) {
valleyview_enable_rps(dev_priv);
- } else if (WARN_ON_ONCE(INTEL_GEN(dev_priv) >= 11)) {
- /* TODO */
} else if (INTEL_GEN(dev_priv) >= 9) {
gen9_enable_rps(dev_priv);
} else if (IS_BROADWELL(dev_priv)) {
--
2.14.3
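
To make the frequency arithmetic in the debugfs hunks easier to follow, here is a standalone sketch of how the three RP fields are pulled out of rp_state_cap and scaled. The field positions and the LP/non-LP swap are copied from the context lines above. The value 3 for the scaler is an assumption (it matches the 50 MHz to 16.66 MHz ratio mentioned in the code comments), and the `sk_` names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SK_FREQ_SCALER 3u  /* assumed value of GEN9_FREQ_SCALER */
static_assert(SK_FREQ_SCALER == 3u, "50 MHz / 3 = 16.66 MHz units");

struct sk_rp_caps {
	uint32_t rp0;  /* max non-overclocked */
	uint32_t rp1;  /* nominal */
	uint32_t rpn;  /* lowest */
};

/*
 * Decode rp_state_cap. On LP parts RPN sits in bits 7:0 and RP0 in
 * bits 23:16; on the others the two are swapped. RP1 is always bits 15:8.
 * When `scaled` is true the raw 50 MHz units are converted to 16.66 MHz
 * units, which after this patch applies to gen9 BC and to gen >= 10.
 */
static struct sk_rp_caps sk_decode_rp_state_cap(uint32_t cap, bool is_lp,
						bool scaled)
{
	uint32_t mul = scaled ? SK_FREQ_SCALER : 1;
	struct sk_rp_caps c;

	c.rpn = ((is_lp ? cap >> 0 : cap >> 16) & 0xff) * mul;
	c.rp1 = ((cap & 0xff00) >> 8) * mul;
	c.rp0 = ((is_lp ? cap >> 16 : cap >> 0) & 0xff) * mul;

	return c;
}
```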
^ permalink raw reply related [flat|nested] 118+ messages in thread
* [PATCH 26/27] drm/i915/icl: allow the reg_read ioctl to read the RCS TIMESTAMP register
2018-01-09 23:28 ` [PATCH 10/27] drm/i915/icl: Enhanced execution list support Paulo Zanoni
` (14 preceding siblings ...)
2018-01-09 23:28 ` [PATCH 25/27] drm/i915/icl: Enable RC6 and RPS in Gen11 Paulo Zanoni
@ 2018-01-09 23:28 ` Paulo Zanoni
2018-01-11 1:19 ` Rodrigo Vivi
2018-01-09 23:28 ` [PATCH 27/27] drm/i915/gen11: add support for reading the timestamp frequency Paulo Zanoni
` (3 subsequent siblings)
19 siblings, 1 reply; 118+ messages in thread
From: Paulo Zanoni @ 2018-01-09 23:28 UTC (permalink / raw)
To: intel-gfx; +Cc: Anuj Phogat, Nanley Chery, Paulo Zanoni, Rodrigo Vivi
This enables the Mesa driver to advertise support for ARB_timer_query,
and thus an OpenGL version higher than 3.2.
Based on the CNL patch by Nanley Chery.
v2: Rebase.
Cc: Anuj Phogat <anuj.phogat@intel.com>
Cc: Nanley Chery <nanley.g.chery@intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Requested-by: Anuj Phogat <anuj.phogat@intel.com>
Tested-by: Anuj Phogat <anuj.phogat@intel.com>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
---
drivers/gpu/drm/i915/intel_uncore.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/i915/intel_uncore.c b/drivers/gpu/drm/i915/intel_uncore.c
index c49beb326414..ae1196b6e1f4 100644
--- a/drivers/gpu/drm/i915/intel_uncore.c
+++ b/drivers/gpu/drm/i915/intel_uncore.c
@@ -1588,7 +1588,7 @@ static const struct reg_whitelist {
} reg_read_whitelist[] = { {
.offset_ldw = RING_TIMESTAMP(RENDER_RING_BASE),
.offset_udw = RING_TIMESTAMP_UDW(RENDER_RING_BASE),
- .gen_mask = INTEL_GEN_MASK(4, 10),
+ .gen_mask = INTEL_GEN_MASK(4, 11),
.size = 8
} };
--
2.14.3
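
As a rough illustration of what widening the range from (4, 10) to (4, 11) does, the sketch below builds a generation bitmask in which gen N maps to bit N-1. That mapping is an assumption about how INTEL_GEN_MASK is defined, and the `sk_` helpers are hypothetical, not driver functions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Build a mask with bits (first-1)..(last-1) set; gen N maps to bit N-1. */
static uint32_t sk_gen_mask(unsigned int first, unsigned int last)
{
	assert(first >= 1 && first <= last && last <= 32);

	uint32_t upto_last = last == 32 ? UINT32_MAX : (1u << last) - 1;
	uint32_t below_first = (1u << (first - 1)) - 1;

	return upto_last & ~below_first;
}

/* True when `gen` is covered by `mask`. */
static bool sk_gen_allowed(uint32_t mask, unsigned int gen)
{
	return gen >= 1 && gen <= 32 && (mask & (1u << (gen - 1))) != 0;
}
```

Under that assumption a whitelist entry with the (4, 10) range rejects gen11 simply because bit 10 is clear, which is the case this one-line patch addresses.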
^ permalink raw reply related [flat|nested] 118+ messages in thread
* Re: [PATCH 26/27] drm/i915/icl: allow the reg_read ioctl to read the RCS TIMESTAMP register
2018-01-09 23:28 ` [PATCH 26/27] drm/i915/icl: allow the reg_read ioctl to read the RCS TIMESTAMP register Paulo Zanoni
@ 2018-01-11 1:19 ` Rodrigo Vivi
0 siblings, 0 replies; 118+ messages in thread
From: Rodrigo Vivi @ 2018-01-11 1:19 UTC (permalink / raw)
To: Paulo Zanoni; +Cc: intel-gfx, Anuj Phogat, Nanley Chery
On Tue, Jan 09, 2018 at 11:28:34PM +0000, Paulo Zanoni wrote:
> This enables the Mesa driver to advertise support for ARB_timer_query,
> and thus an OpenGL version higher than 3.2.
>
> Based on the CNL patch by Nanley Chery.
>
> v2: Rebase.
>
> Cc: Anuj Phogat <anuj.phogat@intel.com>
> Cc: Nanley Chery <nanley.g.chery@intel.com>
> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
> Requested-by: Anuj Phogat <anuj.phogat@intel.com>
> Tested-by: Anuj Phogat <anuj.phogat@intel.com>
> Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
> ---
> drivers/gpu/drm/i915/intel_uncore.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/i915/intel_uncore.c b/drivers/gpu/drm/i915/intel_uncore.c
> index c49beb326414..ae1196b6e1f4 100644
> --- a/drivers/gpu/drm/i915/intel_uncore.c
> +++ b/drivers/gpu/drm/i915/intel_uncore.c
> @@ -1588,7 +1588,7 @@ static const struct reg_whitelist {
> } reg_read_whitelist[] = { {
> .offset_ldw = RING_TIMESTAMP(RENDER_RING_BASE),
> .offset_udw = RING_TIMESTAMP_UDW(RENDER_RING_BASE),
> - .gen_mask = INTEL_GEN_MASK(4, 10),
> + .gen_mask = INTEL_GEN_MASK(4, 11),
I don't like this mask, but I don't have a better solution, and
I'm glad we didn't forget this gen.
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
> .size = 8
> } };
>
> --
> 2.14.3
>
^ permalink raw reply [flat|nested] 118+ messages in thread
* [PATCH 27/27] drm/i915/gen11: add support for reading the timestamp frequency
2018-01-09 23:28 ` [PATCH 10/27] drm/i915/icl: Enhanced execution list support Paulo Zanoni
` (15 preceding siblings ...)
2018-01-09 23:28 ` [PATCH 26/27] drm/i915/icl: allow the reg_read ioctl to read the RCS TIMESTAMP register Paulo Zanoni
@ 2018-01-09 23:28 ` Paulo Zanoni
2018-03-28 11:34 ` Lionel Landwerlin
2018-01-10 9:45 ` [PATCH 10/27] drm/i915/icl: Enhanced execution list support Chris Wilson
` (2 subsequent siblings)
19 siblings, 1 reply; 118+ messages in thread
From: Paulo Zanoni @ 2018-01-09 23:28 UTC (permalink / raw)
To: intel-gfx; +Cc: Paulo Zanoni
The only thing that differs here is that the crystal clock freq now
has four possible values.
This patch gets rid of the "Unknown gen, unable to compute..." message
at boot for gen11.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
---
drivers/gpu/drm/i915/i915_reg.h | 6 +++
drivers/gpu/drm/i915/intel_device_info.c | 71 +++++++++++++++++++++++++-------
2 files changed, 61 insertions(+), 16 deletions(-)
diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
index eb6c7dcd4db0..fde88cd91ef1 100644
--- a/drivers/gpu/drm/i915/i915_reg.h
+++ b/drivers/gpu/drm/i915/i915_reg.h
@@ -1138,6 +1138,12 @@ static inline bool i915_mmio_reg_valid(i915_reg_t reg)
#define GEN9_RPM_CONFIG0_CRYSTAL_CLOCK_FREQ_MASK (1 << GEN9_RPM_CONFIG0_CRYSTAL_CLOCK_FREQ_SHIFT)
#define GEN9_RPM_CONFIG0_CRYSTAL_CLOCK_FREQ_19_2_MHZ 0
#define GEN9_RPM_CONFIG0_CRYSTAL_CLOCK_FREQ_24_MHZ 1
+#define GEN11_RPM_CONFIG0_CRYSTAL_CLOCK_FREQ_SHIFT 3
+#define GEN11_RPM_CONFIG0_CRYSTAL_CLOCK_FREQ_MASK (0x7 << GEN11_RPM_CONFIG0_CRYSTAL_CLOCK_FREQ_SHIFT)
+#define GEN11_RPM_CONFIG0_CRYSTAL_CLOCK_FREQ_24_MHZ 0
+#define GEN11_RPM_CONFIG0_CRYSTAL_CLOCK_FREQ_19_2_MHZ 1
+#define GEN11_RPM_CONFIG0_CRYSTAL_CLOCK_FREQ_38_4_MHZ 2
+#define GEN11_RPM_CONFIG0_CRYSTAL_CLOCK_FREQ_25_MHZ 3
#define GEN10_RPM_CONFIG0_CTC_SHIFT_PARAMETER_SHIFT 1
#define GEN10_RPM_CONFIG0_CTC_SHIFT_PARAMETER_MASK (0x3 << GEN10_RPM_CONFIG0_CTC_SHIFT_PARAMETER_SHIFT)
diff --git a/drivers/gpu/drm/i915/intel_device_info.c b/drivers/gpu/drm/i915/intel_device_info.c
index 895c41ef4abf..168f6ba83ddd 100644
--- a/drivers/gpu/drm/i915/intel_device_info.c
+++ b/drivers/gpu/drm/i915/intel_device_info.c
@@ -395,6 +395,52 @@ static u32 read_reference_ts_freq(struct drm_i915_private *dev_priv)
return base_freq + frac_freq;
}
+static u32 gen10_get_crystal_clock_freq(struct drm_i915_private *dev_priv,
+ u32 rpm_config_reg)
+{
+ u32 f19_2_mhz = 19200;
+ u32 f24_mhz = 24000;
+ u32 crystal_clock = (rpm_config_reg &
+ GEN9_RPM_CONFIG0_CRYSTAL_CLOCK_FREQ_MASK) >>
+ GEN9_RPM_CONFIG0_CRYSTAL_CLOCK_FREQ_SHIFT;
+
+ switch (crystal_clock) {
+ case GEN9_RPM_CONFIG0_CRYSTAL_CLOCK_FREQ_19_2_MHZ:
+ return f19_2_mhz;
+ case GEN9_RPM_CONFIG0_CRYSTAL_CLOCK_FREQ_24_MHZ:
+ return f24_mhz;
+ default:
+ MISSING_CASE(crystal_clock);
+ return 0;
+ }
+}
+
+static u32 gen11_get_crystal_clock_freq(struct drm_i915_private *dev_priv,
+ u32 rpm_config_reg)
+{
+ u32 f19_2_mhz = 19200;
+ u32 f24_mhz = 24000;
+ u32 f25_mhz = 25000;
+ u32 f38_4_mhz = 38400;
+ u32 crystal_clock = (rpm_config_reg &
+ GEN11_RPM_CONFIG0_CRYSTAL_CLOCK_FREQ_MASK) >>
+ GEN11_RPM_CONFIG0_CRYSTAL_CLOCK_FREQ_SHIFT;
+
+ switch (crystal_clock) {
+ case GEN11_RPM_CONFIG0_CRYSTAL_CLOCK_FREQ_24_MHZ:
+ return f24_mhz;
+ case GEN11_RPM_CONFIG0_CRYSTAL_CLOCK_FREQ_19_2_MHZ:
+ return f19_2_mhz;
+ case GEN11_RPM_CONFIG0_CRYSTAL_CLOCK_FREQ_38_4_MHZ:
+ return f38_4_mhz;
+ case GEN11_RPM_CONFIG0_CRYSTAL_CLOCK_FREQ_25_MHZ:
+ return f25_mhz;
+ default:
+ MISSING_CASE(crystal_clock);
+ return 0;
+ }
+}
+
static u32 read_timestamp_frequency(struct drm_i915_private *dev_priv)
{
u32 f12_5_mhz = 12500;
@@ -435,10 +481,9 @@ static u32 read_timestamp_frequency(struct drm_i915_private *dev_priv)
}
return freq;
- } else if (INTEL_GEN(dev_priv) <= 10) {
+ } else if (INTEL_GEN(dev_priv) <= 11) {
u32 ctc_reg = I915_READ(CTC_MODE);
u32 freq = 0;
- u32 rpm_config_reg = 0;
/* First figure out the reference frequency. There are 2 ways
* we can compute the frequency, either through the
@@ -448,20 +493,14 @@ static u32 read_timestamp_frequency(struct drm_i915_private *dev_priv)
if ((ctc_reg & CTC_SOURCE_PARAMETER_MASK) == CTC_SOURCE_DIVIDE_LOGIC) {
freq = read_reference_ts_freq(dev_priv);
} else {
- u32 crystal_clock;
-
- rpm_config_reg = I915_READ(RPM_CONFIG0);
- crystal_clock = (rpm_config_reg &
- GEN9_RPM_CONFIG0_CRYSTAL_CLOCK_FREQ_MASK) >>
- GEN9_RPM_CONFIG0_CRYSTAL_CLOCK_FREQ_SHIFT;
- switch (crystal_clock) {
- case GEN9_RPM_CONFIG0_CRYSTAL_CLOCK_FREQ_19_2_MHZ:
- freq = f19_2_mhz;
- break;
- case GEN9_RPM_CONFIG0_CRYSTAL_CLOCK_FREQ_24_MHZ:
- freq = f24_mhz;
- break;
- }
+ u32 rpm_config_reg = I915_READ(RPM_CONFIG0);
+
+ if (INTEL_GEN(dev_priv) <= 10)
+ freq = gen10_get_crystal_clock_freq(dev_priv,
+ rpm_config_reg);
+ else
+ freq = gen11_get_crystal_clock_freq(dev_priv,
+ rpm_config_reg);
/* Now figure out how the command stream's timestamp
* register increments from this frequency (it might
--
2.14.3
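
The gen11 crystal clock decode added above boils down to a 3-bit field lookup. A minimal standalone sketch follows, using the shift, mask and encodings from the i915_reg.h hunk. The `sk_` names are hypothetical, the returned numbers are the same ones the patch uses (19200 for 19.2 MHz and so on), and 0 is returned for an unknown encoding where the driver would also log MISSING_CASE.

```c
#include <assert.h>
#include <stdint.h>

/* Field layout copied from the GEN11_RPM_CONFIG0_CRYSTAL_CLOCK_FREQ_* defines. */
#define SK_GEN11_CRYSTAL_SHIFT 3
#define SK_GEN11_CRYSTAL_MASK  (0x7u << SK_GEN11_CRYSTAL_SHIFT)
static_assert(SK_GEN11_CRYSTAL_MASK == 0x38u, "field occupies bits 5:3");

/* Map the crystal clock field of RPM_CONFIG0 to a frequency value. */
static uint32_t sk_gen11_crystal_clock_freq(uint32_t rpm_config0)
{
	switch ((rpm_config0 & SK_GEN11_CRYSTAL_MASK) >> SK_GEN11_CRYSTAL_SHIFT) {
	case 0:
		return 24000;  /* 24 MHz */
	case 1:
		return 19200;  /* 19.2 MHz */
	case 2:
		return 38400;  /* 38.4 MHz */
	case 3:
		return 25000;  /* 25 MHz */
	default:
		return 0;      /* encodings 4..7 are not defined by the patch */
	}
}
```

Bits outside the field are ignored, so the helper can be fed the raw register value directly.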
^ permalink raw reply related [flat|nested] 118+ messages in thread
* Re: [PATCH 27/27] drm/i915/gen11: add support for reading the timestamp frequency
2018-01-09 23:28 ` [PATCH 27/27] drm/i915/gen11: add support for reading the timestamp frequency Paulo Zanoni
@ 2018-03-28 11:34 ` Lionel Landwerlin
0 siblings, 0 replies; 118+ messages in thread
From: Lionel Landwerlin @ 2018-03-28 11:34 UTC (permalink / raw)
To: Paulo Zanoni, intel-gfx
On 09/01/18 23:28, Paulo Zanoni wrote:
> The only thing that differs here is that the crystal clock freq now
> has four possible values.
>
> This patch gets rid of the "Unknown gen, unable to compute..." message
> at boot for gen11.
>
> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Still
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
> Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
> ---
> drivers/gpu/drm/i915/i915_reg.h | 6 +++
> drivers/gpu/drm/i915/intel_device_info.c | 71 +++++++++++++++++++++++++-------
> 2 files changed, 61 insertions(+), 16 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
> index eb6c7dcd4db0..fde88cd91ef1 100644
> --- a/drivers/gpu/drm/i915/i915_reg.h
> +++ b/drivers/gpu/drm/i915/i915_reg.h
> @@ -1138,6 +1138,12 @@ static inline bool i915_mmio_reg_valid(i915_reg_t reg)
> #define GEN9_RPM_CONFIG0_CRYSTAL_CLOCK_FREQ_MASK (1 << GEN9_RPM_CONFIG0_CRYSTAL_CLOCK_FREQ_SHIFT)
> #define GEN9_RPM_CONFIG0_CRYSTAL_CLOCK_FREQ_19_2_MHZ 0
> #define GEN9_RPM_CONFIG0_CRYSTAL_CLOCK_FREQ_24_MHZ 1
> +#define GEN11_RPM_CONFIG0_CRYSTAL_CLOCK_FREQ_SHIFT 3
> +#define GEN11_RPM_CONFIG0_CRYSTAL_CLOCK_FREQ_MASK (0x7 << GEN11_RPM_CONFIG0_CRYSTAL_CLOCK_FREQ_SHIFT)
> +#define GEN11_RPM_CONFIG0_CRYSTAL_CLOCK_FREQ_24_MHZ 0
> +#define GEN11_RPM_CONFIG0_CRYSTAL_CLOCK_FREQ_19_2_MHZ 1
> +#define GEN11_RPM_CONFIG0_CRYSTAL_CLOCK_FREQ_38_4_MHZ 2
> +#define GEN11_RPM_CONFIG0_CRYSTAL_CLOCK_FREQ_25_MHZ 3
> #define GEN10_RPM_CONFIG0_CTC_SHIFT_PARAMETER_SHIFT 1
> #define GEN10_RPM_CONFIG0_CTC_SHIFT_PARAMETER_MASK (0x3 << GEN10_RPM_CONFIG0_CTC_SHIFT_PARAMETER_SHIFT)
>
> diff --git a/drivers/gpu/drm/i915/intel_device_info.c b/drivers/gpu/drm/i915/intel_device_info.c
> index 895c41ef4abf..168f6ba83ddd 100644
> --- a/drivers/gpu/drm/i915/intel_device_info.c
> +++ b/drivers/gpu/drm/i915/intel_device_info.c
> @@ -395,6 +395,52 @@ static u32 read_reference_ts_freq(struct drm_i915_private *dev_priv)
> return base_freq + frac_freq;
> }
>
> +static u32 gen10_get_crystal_clock_freq(struct drm_i915_private *dev_priv,
> + u32 rpm_config_reg)
> +{
> + u32 f19_2_mhz = 19200;
> + u32 f24_mhz = 24000;
> + u32 crystal_clock = (rpm_config_reg &
> + GEN9_RPM_CONFIG0_CRYSTAL_CLOCK_FREQ_MASK) >>
> + GEN9_RPM_CONFIG0_CRYSTAL_CLOCK_FREQ_SHIFT;
> +
> + switch (crystal_clock) {
> + case GEN9_RPM_CONFIG0_CRYSTAL_CLOCK_FREQ_19_2_MHZ:
> + return f19_2_mhz;
> + case GEN9_RPM_CONFIG0_CRYSTAL_CLOCK_FREQ_24_MHZ:
> + return f24_mhz;
> + default:
> + MISSING_CASE(crystal_clock);
> + return 0;
> + }
> +}
> +
> +static u32 gen11_get_crystal_clock_freq(struct drm_i915_private *dev_priv,
> + u32 rpm_config_reg)
> +{
> + u32 f19_2_mhz = 19200;
> + u32 f24_mhz = 24000;
> + u32 f25_mhz = 25000;
> + u32 f38_4_mhz = 38400;
> + u32 crystal_clock = (rpm_config_reg &
> + GEN11_RPM_CONFIG0_CRYSTAL_CLOCK_FREQ_MASK) >>
> + GEN11_RPM_CONFIG0_CRYSTAL_CLOCK_FREQ_SHIFT;
> +
> + switch (crystal_clock) {
> + case GEN11_RPM_CONFIG0_CRYSTAL_CLOCK_FREQ_24_MHZ:
> + return f24_mhz;
> + case GEN11_RPM_CONFIG0_CRYSTAL_CLOCK_FREQ_19_2_MHZ:
> + return f19_2_mhz;
> + case GEN11_RPM_CONFIG0_CRYSTAL_CLOCK_FREQ_38_4_MHZ:
> + return f38_4_mhz;
> + case GEN11_RPM_CONFIG0_CRYSTAL_CLOCK_FREQ_25_MHZ:
> + return f25_mhz;
> + default:
> + MISSING_CASE(crystal_clock);
> + return 0;
> + }
> +}
> +
> static u32 read_timestamp_frequency(struct drm_i915_private *dev_priv)
> {
> u32 f12_5_mhz = 12500;
> @@ -435,10 +481,9 @@ static u32 read_timestamp_frequency(struct drm_i915_private *dev_priv)
> }
>
> return freq;
> - } else if (INTEL_GEN(dev_priv) <= 10) {
> + } else if (INTEL_GEN(dev_priv) <= 11) {
> u32 ctc_reg = I915_READ(CTC_MODE);
> u32 freq = 0;
> - u32 rpm_config_reg = 0;
>
> /* First figure out the reference frequency. There are 2 ways
> * we can compute the frequency, either through the
> @@ -448,20 +493,14 @@ static u32 read_timestamp_frequency(struct drm_i915_private *dev_priv)
> if ((ctc_reg & CTC_SOURCE_PARAMETER_MASK) == CTC_SOURCE_DIVIDE_LOGIC) {
> freq = read_reference_ts_freq(dev_priv);
> } else {
> - u32 crystal_clock;
> -
> - rpm_config_reg = I915_READ(RPM_CONFIG0);
> - crystal_clock = (rpm_config_reg &
> - GEN9_RPM_CONFIG0_CRYSTAL_CLOCK_FREQ_MASK) >>
> - GEN9_RPM_CONFIG0_CRYSTAL_CLOCK_FREQ_SHIFT;
> - switch (crystal_clock) {
> - case GEN9_RPM_CONFIG0_CRYSTAL_CLOCK_FREQ_19_2_MHZ:
> - freq = f19_2_mhz;
> - break;
> - case GEN9_RPM_CONFIG0_CRYSTAL_CLOCK_FREQ_24_MHZ:
> - freq = f24_mhz;
> - break;
> - }
> + u32 rpm_config_reg = I915_READ(RPM_CONFIG0);
> +
> + if (INTEL_GEN(dev_priv) <= 10)
> + freq = gen10_get_crystal_clock_freq(dev_priv,
> + rpm_config_reg);
> + else
> + freq = gen11_get_crystal_clock_freq(dev_priv,
> + rpm_config_reg);
>
> /* Now figure out how the command stream's timestamp
> * register increments from this frequency (it might
^ permalink raw reply [flat|nested] 118+ messages in thread
* Re: [PATCH 10/27] drm/i915/icl: Enhanced execution list support
2018-01-09 23:28 ` [PATCH 10/27] drm/i915/icl: Enhanced execution list support Paulo Zanoni
` (16 preceding siblings ...)
2018-01-09 23:28 ` [PATCH 27/27] drm/i915/gen11: add support for reading the timestamp frequency Paulo Zanoni
@ 2018-01-10 9:45 ` Chris Wilson
2018-01-11 19:55 ` Daniele Ceraolo Spurio
2018-01-17 21:53 ` [PATCH v5] " Daniele Ceraolo Spurio
19 siblings, 0 replies; 118+ messages in thread
From: Chris Wilson @ 2018-01-10 9:45 UTC (permalink / raw)
To: Paulo Zanoni, intel-gfx; +Cc: Thomas Daniel, Rodrigo Vivi
Quoting Paulo Zanoni (2018-01-09 23:28:18)
> From: Thomas Daniel <thomas.daniel@intel.com>
>
> Supports two-element submission using the new enhanced execlist mechanism
>
> v2: Rebase.
> v3: Switch from !IS_GEN11 to GEN < 11 (Daniele Ceraolo Spurio).
> v4: Use the elsq registers instead of elsp. (Daniele Ceraolo Spurio)
>
> Signed-off-by: Thomas Daniel <thomas.daniel@intel.com>
> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
> ---
> drivers/gpu/drm/i915/intel_lrc.c | 21 ++++++++++++++++++++-
> drivers/gpu/drm/i915/intel_lrc.h | 3 +++
> 2 files changed, 23 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
> index de41ad2d5fbc..3c6f587fa903 100644
> --- a/drivers/gpu/drm/i915/intel_lrc.c
> +++ b/drivers/gpu/drm/i915/intel_lrc.c
> @@ -430,9 +430,18 @@ static inline void elsp_write(u64 desc, u32 __iomem *elsp)
>
> static void execlists_submit_ports(struct intel_engine_cs *engine)
> {
> + struct drm_i915_private *dev_priv = engine->i915;
> struct execlist_port *port = engine->execlists.port;
> + u32 __iomem *elsq =
> + engine->i915->regs + i915_mmio_reg_offset(RING_ELSQ(engine));
Overwrite engine->execlists.elsp with the alternate address.
> unsigned int n;
>
> + /*
> + * Gen11+ note: the submit queue is not cleared after being submitted
> + * to the HW so we need to make sure we always clean it up. This is
> + * currently ensured by the fact that we always write the same number
> + * of elsq entries, keep this in mind before changing the loop below.
> + */
> for (n = execlists_num_ports(&engine->execlists); n--; ) {
> struct drm_i915_gem_request *rq;
> unsigned int count;
> @@ -456,8 +465,18 @@ static void execlists_submit_ports(struct intel_engine_cs *engine)
> desc = 0;
> }
>
> - elsp_write(desc, engine->execlists.elsp);
> + if (INTEL_GEN(engine->i915) >= 11) {
> + writel(lower_32_bits(desc), elsq + n * 2);
> + writel(upper_32_bits(desc), elsq + n * 2 + 1);
> + } else {
> + elsp_write(desc, engine->execlists.elsp);
> + }
Missed the other consumer of elsp_write, preemption. So add the offset
to elsp_write() and do the magic there. It may even be worth a vfunc.
-Chris
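The vfunc idea above could look roughly like the following user-space sketch. It assumes a plain array in place of the MMIO window, and `struct fake_execlists`, `write_desc`, `legacy_write_desc` and `queue_write_desc` are hypothetical names chosen for illustration, not anything in the driver:

```c
#include <stdint.h>

/* Hypothetical stand-in for the per-engine execlists state (not the i915 struct). */
struct fake_execlists {
	uint32_t *regs;			/* fake register window: plain memory, not MMIO */
	unsigned int elsp_writes;	/* dwords pushed through the single port */
	/* per-generation hook: write the descriptor for port n */
	void (*write_desc)(struct fake_execlists *el, uint64_t desc,
			   unsigned int n);
};

/* Pre-Gen11 style: every descriptor goes through the same port register. */
static void legacy_write_desc(struct fake_execlists *el, uint64_t desc,
			      unsigned int n)
{
	(void)n;				/* the port index is implicit */
	el->regs[0] = (uint32_t)(desc >> 32);	/* upper dword */
	el->regs[0] = (uint32_t)desc;		/* lower dword */
	el->elsp_writes += 2;
}

/* Gen11 style: descriptor n lands in its own pair of queue dwords. */
static void queue_write_desc(struct fake_execlists *el, uint64_t desc,
			     unsigned int n)
{
	el->regs[n * 2] = (uint32_t)desc;
	el->regs[n * 2 + 1] = (uint32_t)(desc >> 32);
}

/* Pick the hook once at init so callers never test the generation. */
static void fake_execlists_init(struct fake_execlists *el, uint32_t *regs,
				int gen)
{
	el->regs = regs;
	el->elsp_writes = 0;
	el->write_desc = gen >= 11 ? queue_write_desc : legacy_write_desc;
}
```

With a hook like this, both the submission loop and the preemption path would call `el->write_desc(el, desc, n)` and neither would need its own generation check.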
^ permalink raw reply [flat|nested] 118+ messages in thread
* Re: [PATCH 10/27] drm/i915/icl: Enhanced execution list support
2018-01-09 23:28 ` [PATCH 10/27] drm/i915/icl: Enhanced execution list support Paulo Zanoni
` (17 preceding siblings ...)
2018-01-10 9:45 ` [PATCH 10/27] drm/i915/icl: Enhanced execution list support Chris Wilson
@ 2018-01-11 19:55 ` Daniele Ceraolo Spurio
2018-01-11 20:55 ` Daniele Ceraolo Spurio
2018-01-17 21:53 ` [PATCH v5] " Daniele Ceraolo Spurio
19 siblings, 1 reply; 118+ messages in thread
From: Daniele Ceraolo Spurio @ 2018-01-11 19:55 UTC (permalink / raw)
To: Paulo Zanoni, intel-gfx; +Cc: Thomas Daniel, Rodrigo Vivi
On 09/01/18 15:28, Paulo Zanoni wrote:
> From: Thomas Daniel <thomas.daniel@intel.com>
>
> Supports two-element submission using the new enhanced execlist mechanism
>
This could use a few lines to describe enhanced execlist. Something like:
"Enhanced Execlists is an upgraded version of execlists which supports
up to 8 ports. The lrcs to be submitted are written to a submit queue,
which is then loaded on the HW. When writing to the ELSP register, the
lrcs are written cyclically in the queue from position 0 to position 7.
Alternatively, it is possible to write directly in the individual
positions of the queue using the ELSQ registers. To be able to re-use
all the existing code we're using the latter method and we're currently
limiting ourselves to only using 2 elements"
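As a rough illustration of the two write methods described above, here is a toy user-space model. It is only a sketch of the description in this mail, not of the real hardware: `struct toy_engine` and the `toy_*` helpers are hypothetical names, and the "load" step is modelled as a plain copy that leaves the staged queue untouched.

```c
#include <stdint.h>
#include <string.h>

#define SQ_ENTRIES 8	/* "up to 8 ports" per the description above */

/* Toy model of one engine; every name here is hypothetical. */
struct toy_engine {
	uint64_t submit_queue[SQ_ENTRIES];	/* staged descriptors */
	uint64_t loaded[SQ_ENTRIES];		/* what the HW saw at the last load */
	unsigned int elsp_cursor;		/* next slot for a cyclic write */
};

/* ELSP-style write: slots are filled cyclically, 0..7 and then wrapping. */
static void toy_elsp_write(struct toy_engine *e, uint64_t desc)
{
	e->submit_queue[e->elsp_cursor] = desc;
	e->elsp_cursor = (e->elsp_cursor + 1) % SQ_ENTRIES;
}

/* ELSQ-style write: the caller picks the slot directly. */
static void toy_elsq_write(struct toy_engine *e, unsigned int slot,
			   uint64_t desc)
{
	e->submit_queue[slot % SQ_ENTRIES] = desc;
}

/* Load: the HW picks up the queue; the staged queue is NOT cleared. */
static void toy_load(struct toy_engine *e)
{
	memcpy(e->loaded, e->submit_queue, sizeof(e->loaded));
}
```

In this model a slot that is not rewritten before the next `toy_load()` is picked up again with its old value, which is why the patch always writes the same number of entries.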
> v2: Rebase.
> v3: Switch from !IS_GEN11 to GEN < 11 (Daniele Ceraolo Spurio).
> v4: Use the elsq registers instead of elsp. (Daniele Ceraolo Spurio)
>
> Signed-off-by: Thomas Daniel <thomas.daniel@intel.com>
> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
> ---
> drivers/gpu/drm/i915/intel_lrc.c | 21 ++++++++++++++++++++-
> drivers/gpu/drm/i915/intel_lrc.h | 3 +++
> 2 files changed, 23 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
> index de41ad2d5fbc..3c6f587fa903 100644
> --- a/drivers/gpu/drm/i915/intel_lrc.c
> +++ b/drivers/gpu/drm/i915/intel_lrc.c
> @@ -430,9 +430,18 @@ static inline void elsp_write(u64 desc, u32 __iomem *elsp)
>
> static void execlists_submit_ports(struct intel_engine_cs *engine)
> {
> + struct drm_i915_private *dev_priv = engine->i915;
> struct execlist_port *port = engine->execlists.port;
> + u32 __iomem *elsq =
> + engine->i915->regs + i915_mmio_reg_offset(RING_ELSQ(engine));
Should we cache this, like we do for execlists.elsp?
> unsigned int n;
>
> + /*
> + * Gen11+ note: the submit queue is not cleared after being submitted
> + * to the HW so we need to make sure we always clean it up. This is
> + * currently ensured by the fact that we always write the same number
> + * of elsq entries, keep this in mind before changing the loop below.
> + */
> for (n = execlists_num_ports(&engine->execlists); n--; ) {
> struct drm_i915_gem_request *rq;
> unsigned int count;
> @@ -456,8 +465,18 @@ static void execlists_submit_ports(struct intel_engine_cs *engine)
> desc = 0;
> }
>
> - elsp_write(desc, engine->execlists.elsp);
> + if (INTEL_GEN(engine->i915) >= 11) {
> + writel(lower_32_bits(desc), elsq + n * 2);
> + writel(upper_32_bits(desc), elsq + n * 2 + 1);
> + } else {
> + elsp_write(desc, engine->execlists.elsp);
> + }
> }
> +
> + /* for gen11+ we need to manually load the submit queue */
> + if (INTEL_GEN(engine->i915) >= 11)
> + I915_WRITE_FW(RING_ELCR(engine), ELCR_LOAD);
> +
> execlists_clear_active(&engine->execlists, EXECLISTS_ACTIVE_HWACK);
> }
>
> diff --git a/drivers/gpu/drm/i915/intel_lrc.h b/drivers/gpu/drm/i915/intel_lrc.h
> index 6d4f9b995a11..cb00e1dd6ed2 100644
> --- a/drivers/gpu/drm/i915/intel_lrc.h
> +++ b/drivers/gpu/drm/i915/intel_lrc.h
> @@ -38,6 +38,9 @@
> #define CTX_CTRL_ENGINE_CTX_RESTORE_INHIBIT (1 << 0)
> #define CTX_CTRL_RS_CTX_ENABLE (1 << 1)
> #define RING_CONTEXT_STATUS_BUF_BASE(engine) _MMIO((engine)->mmio_base + 0x370)
> +#define RING_ELSQ(engine) _MMIO((engine)->mmio_base + 0x510)
> +#define RING_ELCR(engine) _MMIO((engine)->mmio_base + 0x550)
Do we need to add the new regs to the forcewake domain identification in
logical_ring_setup? They should be in the same well as ELSP so maybe
just that one is enough.
> +#define ELCR_LOAD (1 << 0)
> #define RING_CONTEXT_STATUS_BUF_LO(engine, i) _MMIO((engine)->mmio_base + 0x370 + (i) * 8)
> #define RING_CONTEXT_STATUS_BUF_HI(engine, i) _MMIO((engine)->mmio_base + 0x370 + (i) * 8 + 4)
> #define RING_CONTEXT_STATUS_PTR(engine) _MMIO((engine)->mmio_base + 0x3a0)
>
Side note: we do have pre-emption enabled in the features, but this patch
does not update inject_preempt_context to use the new way to submit to
the HW, so I guess that will probably break.
Daniele
^ permalink raw reply [flat|nested] 118+ messages in thread
* Re: [PATCH 10/27] drm/i915/icl: Enhanced execution list support
2018-01-11 19:55 ` Daniele Ceraolo Spurio
@ 2018-01-11 20:55 ` Daniele Ceraolo Spurio
0 siblings, 0 replies; 118+ messages in thread
From: Daniele Ceraolo Spurio @ 2018-01-11 20:55 UTC (permalink / raw)
To: Paulo Zanoni, intel-gfx; +Cc: Thomas Daniel, Rodrigo Vivi
The review from Chris ended up in my spam folder and I missed it;
apologies for duplicating some of the comments.
Daniele
On 11/01/18 11:55, Daniele Ceraolo Spurio wrote:
>
>
> On 09/01/18 15:28, Paulo Zanoni wrote:
>> From: Thomas Daniel <thomas.daniel@intel.com>
>>
>> Supports two-element submission using the new enhanced execlist mechanism
>>
>
> This could use a few lines to describe enhanced execlist. Something like:
>
> "Enhanced Execlists is an upgraded version of execlists which supports
> up to 8 ports. The lrcs to be submitted are written to a submit queue,
> which is then loaded on the HW. When writing to the ELSP register, the
> lrcs are written cyclically in the queue from position 0 to position 7.
> Alternatively, it is possible to write directly in the individual
> positions of the queue using the ELSQ registers. To be able to re-use
> all the existing code we're using the latter method and we're currently
> limiting ourselves to only using 2 elements"
>
>> v2: Rebase.
>> v3: Switch from !IS_GEN11 to GEN < 11 (Daniele Ceraolo Spurio).
>> v4: Use the elsq registers instead of elsp. (Daniele Ceraolo Spurio)
>>
>> Signed-off-by: Thomas Daniel <thomas.daniel@intel.com>
>> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
>> ---
>> drivers/gpu/drm/i915/intel_lrc.c | 21 ++++++++++++++++++++-
>> drivers/gpu/drm/i915/intel_lrc.h | 3 +++
>> 2 files changed, 23 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/gpu/drm/i915/intel_lrc.c
>> b/drivers/gpu/drm/i915/intel_lrc.c
>> index de41ad2d5fbc..3c6f587fa903 100644
>> --- a/drivers/gpu/drm/i915/intel_lrc.c
>> +++ b/drivers/gpu/drm/i915/intel_lrc.c
>> @@ -430,9 +430,18 @@ static inline void elsp_write(u64 desc, u32
>> __iomem *elsp)
>> static void execlists_submit_ports(struct intel_engine_cs *engine)
>> {
>> + struct drm_i915_private *dev_priv = engine->i915;
>> struct execlist_port *port = engine->execlists.port;
>> + u32 __iomem *elsq =
>> + engine->i915->regs + i915_mmio_reg_offset(RING_ELSQ(engine));
>
> Should we cache this, like we do for execlists.elsp?
>
>> unsigned int n;
>> + /*
>> + * Gen11+ note: the submit queue is not cleared after being
>> submitted
>> + * to the HW so we need to make sure we always clean it up. This is
>> + * currently ensured by the fact that we always write the same
>> number
>> + * of elsq entries, keep this in mind before changing the loop
>> below.
>> + */
>> for (n = execlists_num_ports(&engine->execlists); n--; ) {
>> struct drm_i915_gem_request *rq;
>> unsigned int count;
>> @@ -456,8 +465,18 @@ static void execlists_submit_ports(struct
>> intel_engine_cs *engine)
>> desc = 0;
>> }
>> - elsp_write(desc, engine->execlists.elsp);
>> + if (INTEL_GEN(engine->i915) >= 11) {
>> + writel(lower_32_bits(desc), elsq + n * 2);
>> + writel(upper_32_bits(desc), elsq + n * 2 + 1);
>> + } else {
>> + elsp_write(desc, engine->execlists.elsp);
>> + }
>> }
>> +
>> + /* for gen11+ we need to manually load the submit queue */
>> + if (INTEL_GEN(engine->i915) >= 11)
>> + I915_WRITE_FW(RING_ELCR(engine), ELCR_LOAD);
>> +
>> execlists_clear_active(&engine->execlists, EXECLISTS_ACTIVE_HWACK);
>> }
>> diff --git a/drivers/gpu/drm/i915/intel_lrc.h
>> b/drivers/gpu/drm/i915/intel_lrc.h
>> index 6d4f9b995a11..cb00e1dd6ed2 100644
>> --- a/drivers/gpu/drm/i915/intel_lrc.h
>> +++ b/drivers/gpu/drm/i915/intel_lrc.h
>> @@ -38,6 +38,9 @@
>> #define CTX_CTRL_ENGINE_CTX_RESTORE_INHIBIT (1 << 0)
>> #define CTX_CTRL_RS_CTX_ENABLE (1 << 1)
>> #define RING_CONTEXT_STATUS_BUF_BASE(engine)
>> _MMIO((engine)->mmio_base + 0x370)
>> +#define RING_ELSQ(engine) _MMIO((engine)->mmio_base + 0x510)
>> +#define RING_ELCR(engine) _MMIO((engine)->mmio_base + 0x550)
>
> Do we need to add the new regs to the forcewake domain identification in
> logical_ring_setup? They should be in the same well as ELSP so maybe
> just that one is enough.
>
>> +#define ELCR_LOAD (1 << 0)
>> #define RING_CONTEXT_STATUS_BUF_LO(engine, i)
>> _MMIO((engine)->mmio_base + 0x370 + (i) * 8)
>> #define RING_CONTEXT_STATUS_BUF_HI(engine, i)
>> _MMIO((engine)->mmio_base + 0x370 + (i) * 8 + 4)
>> #define RING_CONTEXT_STATUS_PTR(engine)
>> _MMIO((engine)->mmio_base + 0x3a0)
>>
>
> Side note: we do have pre-emption enabled in the features, but this patch
> does not update inject_preempt_context to use the new way to submit to
> the HW, so I guess that will probably break.
>
> Daniele
^ permalink raw reply [flat|nested] 118+ messages in thread
* [PATCH v5] drm/i915/icl: Enhanced execution list support
2018-01-09 23:28 ` [PATCH 10/27] drm/i915/icl: Enhanced execution list support Paulo Zanoni
` (18 preceding siblings ...)
2018-01-11 19:55 ` Daniele Ceraolo Spurio
@ 2018-01-17 21:53 ` Daniele Ceraolo Spurio
2018-01-19 13:05 ` Mika Kuoppala
19 siblings, 1 reply; 118+ messages in thread
From: Daniele Ceraolo Spurio @ 2018-01-17 21:53 UTC (permalink / raw)
To: intel-gfx; +Cc: Thomas Daniel, Rodrigo Vivi
From: Thomas Daniel <thomas.daniel@intel.com>
Enhanced Execlists is an upgraded version of execlists which supports
up to 8 ports. The lrcs to be submitted are written to a submit queue,
which is then loaded on the HW. When writing to the ELSP register, the
lrcs are written cyclically in the queue from position 0 to position 7.
Alternatively, it is possible to write directly in the individual
positions of the queue using the ELSQ registers. To be able to re-use
all the existing code we're using the latter method and we're currently
limiting ourselves to only using 2 elements.
The preemption flow is slightly different with enhanced execlists, so
this patch turns preemption off temporarily for Gen11+ while we wait for
the new mechanism to land.
v2: Rebase.
v3: Switch from !IS_GEN11 to GEN < 11 (Daniele Ceraolo Spurio).
v4: Use the elsq registers instead of elsp. (Daniele Ceraolo Spurio)
v5: Reword commit, rename regs to be closer to specs, turn off
preemption (Daniele), reuse engine->execlists.elsp (Chris)
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Thomas Daniel <thomas.daniel@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
---
drivers/gpu/drm/i915/i915_drv.h | 5 ++++-
drivers/gpu/drm/i915/intel_lrc.c | 35 ++++++++++++++++++++++++++++-----
drivers/gpu/drm/i915/intel_lrc.h | 3 +++
drivers/gpu/drm/i915/intel_ringbuffer.h | 6 ++++--
4 files changed, 41 insertions(+), 8 deletions(-)
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index c42015b..3163543 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -2738,8 +2738,11 @@ static inline unsigned int i915_sg_segment_size(void)
#define HAS_LOGICAL_RING_CONTEXTS(dev_priv) \
((dev_priv)->info.has_logical_ring_contexts)
+
+/* XXX: Preemption disabled for Gen11+ until support for new flow lands */
#define HAS_LOGICAL_RING_PREEMPTION(dev_priv) \
- ((dev_priv)->info.has_logical_ring_preemption)
+ ((dev_priv)->info.has_logical_ring_preemption && \
+ INTEL_GEN(dev_priv) < 11)
#define HAS_EXECLISTS(dev_priv) HAS_LOGICAL_RING_CONTEXTS(dev_priv)
diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index ff25f20..67ad7c9 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -428,11 +428,24 @@ static inline void elsp_write(u64 desc, u32 __iomem *elsp)
writel(lower_32_bits(desc), elsp);
}
+static inline void elsqc_write(u64 desc, u32 __iomem *elsqc, u32 port)
+{
+ writel(lower_32_bits(desc), elsqc + port * 2);
+ writel(upper_32_bits(desc), elsqc + port * 2 + 1);
+}
+
static void execlists_submit_ports(struct intel_engine_cs *engine)
{
+ struct drm_i915_private *dev_priv = engine->i915;
struct execlist_port *port = engine->execlists.port;
unsigned int n;
+ /*
+ * Gen11+ note: the submit queue is not cleared after being submitted
+ * to the HW so we need to make sure we always clean it up. This is
+ * currently ensured by the fact that we always write the same number
+ * of elsq entries, keep this in mind before changing the loop below.
+ */
for (n = execlists_num_ports(&engine->execlists); n--; ) {
struct drm_i915_gem_request *rq;
unsigned int count;
@@ -456,8 +469,16 @@ static void execlists_submit_ports(struct intel_engine_cs *engine)
desc = 0;
}
- elsp_write(desc, engine->execlists.elsp);
+ if (INTEL_GEN(dev_priv) >= 11)
+ elsqc_write(desc, engine->execlists.els, n);
+ else
+ elsp_write(desc, engine->execlists.els);
}
+
+ /* for gen11+ we need to manually load the submit queue */
+ if (INTEL_GEN(dev_priv) >= 11)
+ I915_WRITE_FW(RING_EXECLIST_CONTROL(engine), EL_CTRL_LOAD);
+
execlists_clear_active(&engine->execlists, EXECLISTS_ACTIVE_HWACK);
}
@@ -506,9 +527,9 @@ static void inject_preempt_context(struct intel_engine_cs *engine)
GEM_TRACE("%s\n", engine->name);
for (n = execlists_num_ports(&engine->execlists); --n; )
- elsp_write(0, engine->execlists.elsp);
+ elsp_write(0, engine->execlists.els);
- elsp_write(ce->lrc_desc, engine->execlists.elsp);
+ elsp_write(ce->lrc_desc, engine->execlists.els);
execlists_clear_active(&engine->execlists, EXECLISTS_ACTIVE_HWACK);
}
@@ -2016,8 +2037,12 @@ static int logical_ring_init(struct intel_engine_cs *engine)
if (ret)
goto error;
- engine->execlists.elsp =
- engine->i915->regs + i915_mmio_reg_offset(RING_ELSP(engine));
+ if (INTEL_GEN(engine->i915) >= 11)
+ engine->execlists.els = engine->i915->regs +
+ i915_mmio_reg_offset(RING_EXECLIST_SQ_CONTENTS(engine));
+ else
+ engine->execlists.els = engine->i915->regs +
+ i915_mmio_reg_offset(RING_ELSP(engine));
return 0;
diff --git a/drivers/gpu/drm/i915/intel_lrc.h b/drivers/gpu/drm/i915/intel_lrc.h
index 6d4f9b9..3ab4266 100644
--- a/drivers/gpu/drm/i915/intel_lrc.h
+++ b/drivers/gpu/drm/i915/intel_lrc.h
@@ -38,6 +38,9 @@
#define CTX_CTRL_ENGINE_CTX_RESTORE_INHIBIT (1 << 0)
#define CTX_CTRL_RS_CTX_ENABLE (1 << 1)
#define RING_CONTEXT_STATUS_BUF_BASE(engine) _MMIO((engine)->mmio_base + 0x370)
+#define RING_EXECLIST_SQ_CONTENTS(engine) _MMIO((engine)->mmio_base + 0x510)
+#define RING_EXECLIST_CONTROL(engine) _MMIO((engine)->mmio_base + 0x550)
+#define EL_CTRL_LOAD (1 << 0)
#define RING_CONTEXT_STATUS_BUF_LO(engine, i) _MMIO((engine)->mmio_base + 0x370 + (i) * 8)
#define RING_CONTEXT_STATUS_BUF_HI(engine, i) _MMIO((engine)->mmio_base + 0x370 + (i) * 8 + 4)
#define RING_CONTEXT_STATUS_PTR(engine) _MMIO((engine)->mmio_base + 0x3a0)
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h
index c5ff203..d36bb73 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.h
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.h
@@ -200,9 +200,11 @@ struct intel_engine_execlists {
bool no_priolist;
/**
- * @elsp: the ExecList Submission Port register
+ * @els: gen-specific execlist submission register
+ * set to the ExecList Submission Port (elsp) register pre-Gen11 and to
+ * the ExecList Submission Queue Contents register array for Gen11+
*/
- u32 __iomem *elsp;
+ u32 __iomem *els;
/**
* @port: execlist port states
--
1.9.1
^ permalink raw reply related [flat|nested] 118+ messages in thread
* Re: [PATCH v5] drm/i915/icl: Enhanced execution list support
2018-01-17 21:53 ` [PATCH v5] " Daniele Ceraolo Spurio
@ 2018-01-19 13:05 ` Mika Kuoppala
2018-01-19 16:15 ` Daniele Ceraolo Spurio
0 siblings, 1 reply; 118+ messages in thread
From: Mika Kuoppala @ 2018-01-19 13:05 UTC (permalink / raw)
To: Daniele Ceraolo Spurio, intel-gfx; +Cc: Thomas Daniel, Rodrigo Vivi
Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> writes:
> From: Thomas Daniel <thomas.daniel@intel.com>
>
> Enhanced Execlists is an upgraded version of execlists which supports
> up to 8 ports. The lrcs to be submitted are written to a submit queue,
> which is then loaded on the HW. When writing to the ELSP register, the
> lrcs are written cyclically in the queue from position 0 to position 7.
> Alternatively, it is possible to write directly in the individual
> positions of the queue using the ELSQ registers. To be able to re-use
> all the existing code we're using the latter method and we're currently
> limiting ourselves to only using 2 elements.
>
> The preemption flow is slightly different with enhanced execlists, so
> this patch turns preemption off temporarily for Gen11+ while we wait for
> the new mechanism to land.
>
> v2: Rebase.
> v3: Switch from !IS_GEN11 to GEN < 11 (Daniele Ceraolo Spurio).
> v4: Use the elsq registers instead of elsp. (Daniele Ceraolo Spurio)
> v5: Reword commit, rename regs to be closer to specs, turn off
> preemption (Daniele), reuse engine->execlists.elsp (Chris)
>
> Cc: Chris Wilson <chris@chris-wilson.co.uk>
> Signed-off-by: Thomas Daniel <thomas.daniel@intel.com>
> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
> Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
I was going to adopt this patch from Rodrigo but you were faster.
I chose to stash the elsq and use it as a gen11 vs rest toggle:
Relevant bits:
+static inline void write_port(struct intel_engine_execlists * const execlists,
+ unsigned int n,
+ u64 desc)
+{
+ if (execlists->elsq)
+ gen11_elsq_write(desc, n, execlists->elsq);
+ else
+ gen8_elsp_write(desc, execlists->elsp);
+}
+
+static inline void submit_ports(struct intel_engine_execlists * const execlists)
+{
+ /* for gen11+ we need to manually load the submit queue */
+ if (execlists->elsq) {
+ struct intel_engine_cs *engine =
+ container_of(execlists,
+ struct intel_engine_cs,
+ execlists);
+ struct drm_i915_private *dev_priv = engine->i915;
+
+ I915_WRITE_FW(RING_ELCR(engine), ELCR_LOAD);
+ }
+}
+
...
-Mika
> ---
> drivers/gpu/drm/i915/i915_drv.h | 5 ++++-
> drivers/gpu/drm/i915/intel_lrc.c | 35 ++++++++++++++++++++++++++++-----
> drivers/gpu/drm/i915/intel_lrc.h | 3 +++
> drivers/gpu/drm/i915/intel_ringbuffer.h | 6 ++++--
> 4 files changed, 41 insertions(+), 8 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> index c42015b..3163543 100644
> --- a/drivers/gpu/drm/i915/i915_drv.h
> +++ b/drivers/gpu/drm/i915/i915_drv.h
> @@ -2738,8 +2738,11 @@ static inline unsigned int i915_sg_segment_size(void)
>
> #define HAS_LOGICAL_RING_CONTEXTS(dev_priv) \
> ((dev_priv)->info.has_logical_ring_contexts)
> +
> +/* XXX: Preemption disabled for Gen11+ until support for new flow lands */
> #define HAS_LOGICAL_RING_PREEMPTION(dev_priv) \
> - ((dev_priv)->info.has_logical_ring_preemption)
> + ((dev_priv)->info.has_logical_ring_preemption && \
> + INTEL_GEN(dev_priv) < 11)
>
> #define HAS_EXECLISTS(dev_priv) HAS_LOGICAL_RING_CONTEXTS(dev_priv)
>
> diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
> index ff25f20..67ad7c9 100644
> --- a/drivers/gpu/drm/i915/intel_lrc.c
> +++ b/drivers/gpu/drm/i915/intel_lrc.c
> @@ -428,11 +428,24 @@ static inline void elsp_write(u64 desc, u32 __iomem *elsp)
> writel(lower_32_bits(desc), elsp);
> }
>
> +static inline void elsqc_write(u64 desc, u32 __iomem *elsqc, u32 port)
> +{
> + writel(lower_32_bits(desc), elsqc + port * 2);
> + writel(upper_32_bits(desc), elsqc + port * 2 + 1);
> +}
> +
> static void execlists_submit_ports(struct intel_engine_cs *engine)
> {
> + struct drm_i915_private *dev_priv = engine->i915;
> struct execlist_port *port = engine->execlists.port;
> unsigned int n;
>
> + /*
> + * Gen11+ note: the submit queue is not cleared after being submitted
> + * to the HW so we need to make sure we always clean it up. This is
> + * currently ensured by the fact that we always write the same number
> + * of elsq entries, keep this in mind before changing the loop below.
> + */
> for (n = execlists_num_ports(&engine->execlists); n--; ) {
> struct drm_i915_gem_request *rq;
> unsigned int count;
> @@ -456,8 +469,16 @@ static void execlists_submit_ports(struct intel_engine_cs *engine)
> desc = 0;
> }
>
> - elsp_write(desc, engine->execlists.elsp);
> + if (INTEL_GEN(dev_priv) >= 11)
> + elsqc_write(desc, engine->execlists.els, n);
> + else
> + elsp_write(desc, engine->execlists.els);
> }
> +
> + /* for gen11+ we need to manually load the submit queue */
> + if (INTEL_GEN(dev_priv) >= 11)
> + I915_WRITE_FW(RING_EXECLIST_CONTROL(engine), EL_CTRL_LOAD);
> +
> execlists_clear_active(&engine->execlists, EXECLISTS_ACTIVE_HWACK);
> }
>
> @@ -506,9 +527,9 @@ static void inject_preempt_context(struct intel_engine_cs *engine)
>
> GEM_TRACE("%s\n", engine->name);
> for (n = execlists_num_ports(&engine->execlists); --n; )
> - elsp_write(0, engine->execlists.elsp);
> + elsp_write(0, engine->execlists.els);
>
> - elsp_write(ce->lrc_desc, engine->execlists.elsp);
> + elsp_write(ce->lrc_desc, engine->execlists.els);
> execlists_clear_active(&engine->execlists, EXECLISTS_ACTIVE_HWACK);
> }
>
> @@ -2016,8 +2037,12 @@ static int logical_ring_init(struct intel_engine_cs *engine)
> if (ret)
> goto error;
>
> - engine->execlists.elsp =
> - engine->i915->regs + i915_mmio_reg_offset(RING_ELSP(engine));
> + if (INTEL_GEN(engine->i915) >= 11)
> + engine->execlists.els = engine->i915->regs +
> + i915_mmio_reg_offset(RING_EXECLIST_SQ_CONTENTS(engine));
> + else
> + engine->execlists.els = engine->i915->regs +
> + i915_mmio_reg_offset(RING_ELSP(engine));
>
> return 0;
>
> diff --git a/drivers/gpu/drm/i915/intel_lrc.h b/drivers/gpu/drm/i915/intel_lrc.h
> index 6d4f9b9..3ab4266 100644
> --- a/drivers/gpu/drm/i915/intel_lrc.h
> +++ b/drivers/gpu/drm/i915/intel_lrc.h
> @@ -38,6 +38,9 @@
> #define CTX_CTRL_ENGINE_CTX_RESTORE_INHIBIT (1 << 0)
> #define CTX_CTRL_RS_CTX_ENABLE (1 << 1)
> #define RING_CONTEXT_STATUS_BUF_BASE(engine) _MMIO((engine)->mmio_base + 0x370)
> +#define RING_EXECLIST_SQ_CONTENTS(engine) _MMIO((engine)->mmio_base + 0x510)
> +#define RING_EXECLIST_CONTROL(engine) _MMIO((engine)->mmio_base + 0x550)
> +#define EL_CTRL_LOAD (1 << 0)
> #define RING_CONTEXT_STATUS_BUF_LO(engine, i) _MMIO((engine)->mmio_base + 0x370 + (i) * 8)
> #define RING_CONTEXT_STATUS_BUF_HI(engine, i) _MMIO((engine)->mmio_base + 0x370 + (i) * 8 + 4)
> #define RING_CONTEXT_STATUS_PTR(engine) _MMIO((engine)->mmio_base + 0x3a0)
> diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h
> index c5ff203..d36bb73 100644
> --- a/drivers/gpu/drm/i915/intel_ringbuffer.h
> +++ b/drivers/gpu/drm/i915/intel_ringbuffer.h
> @@ -200,9 +200,11 @@ struct intel_engine_execlists {
> bool no_priolist;
>
> /**
> - * @elsp: the ExecList Submission Port register
> + * @els: gen-specific execlist submission register
> + * set to the ExecList Submission Port (elsp) register pre-Gen11 and to
> + * the ExecList Submission Queue Contents register array for Gen11+
> */
> - u32 __iomem *elsp;
> + u32 __iomem *els;
>
> /**
> * @port: execlist port states
> --
> 1.9.1
>
^ permalink raw reply [flat|nested] 118+ messages in thread
* Re: [PATCH v5] drm/i915/icl: Enhanced execution list support
2018-01-19 13:05 ` Mika Kuoppala
@ 2018-01-19 16:15 ` Daniele Ceraolo Spurio
2018-01-22 15:08 ` Mika Kuoppala
0 siblings, 1 reply; 118+ messages in thread
From: Daniele Ceraolo Spurio @ 2018-01-19 16:15 UTC (permalink / raw)
To: Mika Kuoppala, intel-gfx; +Cc: Thomas Daniel, Rodrigo Vivi
On 19/01/18 05:05, Mika Kuoppala wrote:
> Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> writes:
>
>> From: Thomas Daniel <thomas.daniel@intel.com>
>>
>> Enhanced Execlists is an upgraded version of execlists which supports
>> up to 8 ports. The lrcs to be submitted are written to a submit queue,
>> which is then loaded on the HW. When writing to the ELSP register, the
>> lrcs are written cyclically in the queue from position 0 to position 7.
>> Alternatively, it is possible to write directly in the individual
>> positions of the queue using the ELSQ registers. To be able to re-use
>> all the existing code we're using the latter method and we're currently
>> limiting ourselves to only using 2 elements.
>>
>> The preemption flow is slightly different with enhanced execlists, so
>> this patch turns preemption off temporarily for Gen11+ while we wait for
>> the new mechanism to land.
>>
>> v2: Rebase.
>> v3: Switch from !IS_GEN11 to GEN < 11 (Daniele Ceraolo Spurio).
>> v4: Use the elsq registers instead of elsp. (Daniele Ceraolo Spurio)
>> v5: Reword commit, rename regs to be closer to specs, turn off
>> preemption (Daniele), reuse engine->execlists.elsp (Chris)
>>
>> Cc: Chris Wilson <chris@chris-wilson.co.uk>
>> Signed-off-by: Thomas Daniel <thomas.daniel@intel.com>
>> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
>> Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
>
> I was going to adopt this patch from Rodrigo but you were faster.
>
> I chose to stash the elsq and use it as a gen11 vs rest toggle:
>
> Relevant bits:
>
> +static inline void write_port(struct intel_engine_execlists * const execlists,
> + unsigned int n,
> + u64 desc)
> +{
> + if (execlists->elsq)
> + gen11_elsq_write(desc, n, execlists->elsq);
> + else
> + gen8_elsp_write(desc, execlists->elsp);
> +}
> +
> +static inline void submit_ports(struct intel_engine_execlists * const execlists)
> +{
> + /* for gen11+ we need to manually load the submit queue */
> + if (execlists->elsq) {
> + struct intel_engine_cs *engine =
> + container_of(execlists,
> + struct intel_engine_cs,
> + execlists);
> + struct drm_i915_private *dev_priv = engine->i915;
> +
> + I915_WRITE_FW(RING_ELCR(engine), ELCR_LOAD);
> + }
> +}
> +
>
I was undecided about hiding the code in sub-functions because of the
pre-emption path. There is no need in gen11 to inject a context to
preempt to idle, so the inject_preempt function will be pre-gen11 only
and therefore I'd prefer to keep a direct call to elsp_write there. IMHO
it'd be cleaner to have similar code in both places, hence the
open-coding. That said, I'd be happy to change it as you proposed if
there is a general preference to abstract things a bit in the shared
path even if the pre-emption path stays different.
Regarding using execlists->elsq as a toggle, I was thinking that we
could have a device info flag instead, so we could use it even before
setting execlists->elsq. Any preference on this?
Thanks,
Daniele
P.S. If you want to take over feel free to send an updated patch ;)
> ...
> -Mika
>
>> ---
>> drivers/gpu/drm/i915/i915_drv.h | 5 ++++-
>> drivers/gpu/drm/i915/intel_lrc.c | 35 ++++++++++++++++++++++++++++-----
>> drivers/gpu/drm/i915/intel_lrc.h | 3 +++
>> drivers/gpu/drm/i915/intel_ringbuffer.h | 6 ++++--
>> 4 files changed, 41 insertions(+), 8 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
>> index c42015b..3163543 100644
>> --- a/drivers/gpu/drm/i915/i915_drv.h
>> +++ b/drivers/gpu/drm/i915/i915_drv.h
>> @@ -2738,8 +2738,11 @@ static inline unsigned int i915_sg_segment_size(void)
>>
>> #define HAS_LOGICAL_RING_CONTEXTS(dev_priv) \
>> ((dev_priv)->info.has_logical_ring_contexts)
>> +
>> +/* XXX: Preemption disabled for Gen11+ until support for new flow lands */
>> #define HAS_LOGICAL_RING_PREEMPTION(dev_priv) \
>> - ((dev_priv)->info.has_logical_ring_preemption)
>> + ((dev_priv)->info.has_logical_ring_preemption && \
>> + INTEL_GEN(dev_priv) < 11)
>>
>> #define HAS_EXECLISTS(dev_priv) HAS_LOGICAL_RING_CONTEXTS(dev_priv)
>>
>> diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
>> index ff25f20..67ad7c9 100644
>> --- a/drivers/gpu/drm/i915/intel_lrc.c
>> +++ b/drivers/gpu/drm/i915/intel_lrc.c
>> @@ -428,11 +428,24 @@ static inline void elsp_write(u64 desc, u32 __iomem *elsp)
>> writel(lower_32_bits(desc), elsp);
>> }
>>
>> +static inline void elsqc_write(u64 desc, u32 __iomem *elsqc, u32 port)
>> +{
>> + writel(lower_32_bits(desc), elsqc + port * 2);
>> + writel(upper_32_bits(desc), elsqc + port * 2 + 1);
>> +}
>> +
>> static void execlists_submit_ports(struct intel_engine_cs *engine)
>> {
>> + struct drm_i915_private *dev_priv = engine->i915;
>> struct execlist_port *port = engine->execlists.port;
>> unsigned int n;
>>
>> + /*
>> + * Gen11+ note: the submit queue is not cleared after being submitted
>> + * to the HW so we need to make sure we always clean it up. This is
>> + * currently ensured by the fact that we always write the same number
>> + * of elsq entries, keep this in mind before changing the loop below.
>> + */
>> for (n = execlists_num_ports(&engine->execlists); n--; ) {
>> struct drm_i915_gem_request *rq;
>> unsigned int count;
>> @@ -456,8 +469,16 @@ static void execlists_submit_ports(struct intel_engine_cs *engine)
>> desc = 0;
>> }
>>
>> - elsp_write(desc, engine->execlists.elsp);
>> + if (INTEL_GEN(dev_priv) >= 11)
>> + elsqc_write(desc, engine->execlists.els, n);
>> + else
>> + elsp_write(desc, engine->execlists.els);
>> }
>> +
>> + /* for gen11+ we need to manually load the submit queue */
>> + if (INTEL_GEN(dev_priv) >= 11)
>> + I915_WRITE_FW(RING_EXECLIST_CONTROL(engine), EL_CTRL_LOAD);
>> +
>> execlists_clear_active(&engine->execlists, EXECLISTS_ACTIVE_HWACK);
>> }
>>
>> @@ -506,9 +527,9 @@ static void inject_preempt_context(struct intel_engine_cs *engine)
>>
>> GEM_TRACE("%s\n", engine->name);
>> for (n = execlists_num_ports(&engine->execlists); --n; )
>> - elsp_write(0, engine->execlists.elsp);
>> + elsp_write(0, engine->execlists.els);
>>
>> - elsp_write(ce->lrc_desc, engine->execlists.elsp);
>> + elsp_write(ce->lrc_desc, engine->execlists.els);
>> execlists_clear_active(&engine->execlists, EXECLISTS_ACTIVE_HWACK);
>> }
>>
>> @@ -2016,8 +2037,12 @@ static int logical_ring_init(struct intel_engine_cs *engine)
>> if (ret)
>> goto error;
>>
>> - engine->execlists.elsp =
>> - engine->i915->regs + i915_mmio_reg_offset(RING_ELSP(engine));
>> + if (INTEL_GEN(engine->i915) >= 11)
>> + engine->execlists.els = engine->i915->regs +
>> + i915_mmio_reg_offset(RING_EXECLIST_SQ_CONTENTS(engine));
>> + else
>> + engine->execlists.els = engine->i915->regs +
>> + i915_mmio_reg_offset(RING_ELSP(engine));
>>
>> return 0;
>>
>> diff --git a/drivers/gpu/drm/i915/intel_lrc.h b/drivers/gpu/drm/i915/intel_lrc.h
>> index 6d4f9b9..3ab4266 100644
>> --- a/drivers/gpu/drm/i915/intel_lrc.h
>> +++ b/drivers/gpu/drm/i915/intel_lrc.h
>> @@ -38,6 +38,9 @@
>> #define CTX_CTRL_ENGINE_CTX_RESTORE_INHIBIT (1 << 0)
>> #define CTX_CTRL_RS_CTX_ENABLE (1 << 1)
>> #define RING_CONTEXT_STATUS_BUF_BASE(engine) _MMIO((engine)->mmio_base + 0x370)
>> +#define RING_EXECLIST_SQ_CONTENTS(engine) _MMIO((engine)->mmio_base + 0x510)
>> +#define RING_EXECLIST_CONTROL(engine) _MMIO((engine)->mmio_base + 0x550)
>> +#define EL_CTRL_LOAD (1 << 0)
>> #define RING_CONTEXT_STATUS_BUF_LO(engine, i) _MMIO((engine)->mmio_base + 0x370 + (i) * 8)
>> #define RING_CONTEXT_STATUS_BUF_HI(engine, i) _MMIO((engine)->mmio_base + 0x370 + (i) * 8 + 4)
>> #define RING_CONTEXT_STATUS_PTR(engine) _MMIO((engine)->mmio_base + 0x3a0)
>> diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h
>> index c5ff203..d36bb73 100644
>> --- a/drivers/gpu/drm/i915/intel_ringbuffer.h
>> +++ b/drivers/gpu/drm/i915/intel_ringbuffer.h
>> @@ -200,9 +200,11 @@ struct intel_engine_execlists {
>> bool no_priolist;
>>
>> /**
>> - * @elsp: the ExecList Submission Port register
>> + * @els: gen-specific execlist submission register
>> + * set to the ExecList Submission Port (elsp) register pre-Gen11 and to
>> + * the ExecList Submission Queue Contents register array for Gen11+
>> */
>> - u32 __iomem *elsp;
>> + u32 __iomem *els;
>>
>> /**
>> * @port: execlist port states
>> --
>> 1.9.1
>>
^ permalink raw reply [flat|nested] 118+ messages in thread
* Re: [PATCH v5] drm/i915/icl: Enhanced execution list support
2018-01-19 16:15 ` Daniele Ceraolo Spurio
@ 2018-01-22 15:08 ` Mika Kuoppala
2018-01-22 15:13 ` Chris Wilson
0 siblings, 1 reply; 118+ messages in thread
From: Mika Kuoppala @ 2018-01-22 15:08 UTC (permalink / raw)
To: Daniele Ceraolo Spurio, intel-gfx; +Cc: Thomas Daniel, Rodrigo Vivi
Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> writes:
> On 19/01/18 05:05, Mika Kuoppala wrote:
>> Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> writes:
>>
>>> From: Thomas Daniel <thomas.daniel@intel.com>
>>>
>>> Enhanced Execlists is an upgraded version of execlists which supports
>>> up to 8 ports. The lrcs to be submitted are written to a submit queue,
>>> which is then loaded on the HW. When writing to the ELSP register, the
>>> lrcs are written cyclically in the queue from position 0 to position 7.
>>> Alternatively, it is possible to write directly in the individual
>>> positions of the queue using the ELSQ registers. To be able to re-use
>>> all the existing code we're using the latter method and we're currently
>>> limiting ourselves to only using 2 elements.
>>>
>>> The preemption flow is slightly different with enhanced execlists, so
>>> this patch turns preemption off temporarily for Gen11+ while we wait for
>>> the new mechanism to land.
>>>
>>> v2: Rebase.
>>> v3: Switch from !IS_GEN11 to GEN < 11 (Daniele Ceraolo Spurio).
>>> v4: Use the elsq registers instead of elsp. (Daniele Ceraolo Spurio)
>>> v5: Reword commit, rename regs to be closer to specs, turn off
>>> preemption (Daniele), reuse engine->execlists.elsp (Chris)
>>>
>>> Cc: Chris Wilson <chris@chris-wilson.co.uk>
>>> Signed-off-by: Thomas Daniel <thomas.daniel@intel.com>
>>> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
>>> Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
>>
>> Was going to adopt this patch from Rodrigo but you were faster.
>>
>> I chose to stash the elsq and use it as a gen11 vs rest toggle:
>>
>> Relevant bits:
>>
>> +static inline void write_port(struct intel_engine_execlists * const execlists,
>> + unsigned int n,
>> + u64 desc)
>> +{
>> + if (execlists->elsq)
>> + gen11_elsq_write(desc, n, execlists->elsq);
>> + else
>> + gen8_elsp_write(desc, execlists->elsp);
>> +}
>> +
>> +static inline void submit_ports(struct intel_engine_execlists * const execlists)
>> +{
>> + /* for gen11+ we need to manually load the submit queue */
>> + if (execlists->elsq) {
>> + struct intel_engine_cs *engine =
>> + container_of(execlists,
>> + struct intel_engine_cs,
>> + execlists);
>> + struct drm_i915_private *dev_priv = engine->i915;
>> +
>> + I915_WRITE_FW(RING_ELCR(engine), ELCR_LOAD);
>> + }
>> +}
>> +
>>
>
> I was undecided about hiding the code in sub-functions because of the
> pre-emption path. There is no need in gen11 to inject a context to
> preempt to idle, so the inject_preempt function will be pre-gen11 only
> and therefore I'd prefer to keep a direct call to elsp_write there. IMHO
> it'd be cleaner to have similar code in both places, hence the
> open-coding. This said, I'd be happy to change it like you proposed if
> there is a general preference to abstract things a bit in the shared
> path even if the pre-emption path stays different.
>
Please don't change it. I did the more abstract version before
learning that gen11 doesn't need the special preempt switch.
> Regarding using execlists->elsq as a toggle, I was thinking that we
> could have a device info flag instead, so we could use it even before
> setting execlists->elsq. Any preference on this?
has_logical_ring_elsq? Doesn't taste bad.
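A rough sketch of the two toggles side by side, just to illustrate the
ordering concern. The structs below are simplified stand-ins and not the
real i915 ones; only the flag name comes from this thread, everything
else is an assumption made for illustration:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-ins for the driver structures. */
struct device_info {
	bool has_logical_ring_elsq;	/* flag name floated in this thread */
};

struct execlists {
	uint32_t *elsp;	/* pre-Gen11 submission port */
	uint32_t *elsq;	/* Gen11+ submit queue contents, NULL until init */
};

/* Toggle on the pointer: only meaningful once init has run. */
static bool uses_elsq_by_pointer(const struct execlists *el)
{
	return el->elsq != NULL;
}

/* Toggle on the device info flag: valid before any pointer is set up. */
static bool uses_elsq_by_flag(const struct device_info *info)
{
	return info->has_logical_ring_elsq;
}

/* Mock init: picks which pointer to fill in based on the flag. */
static void execlists_init(struct execlists *el,
			   const struct device_info *info,
			   uint32_t *regs)
{
	if (info->has_logical_ring_elsq)
		el->elsq = regs;
	else
		el->elsp = regs;

	/* After init both toggles must agree. */
	assert(uses_elsq_by_pointer(el) == uses_elsq_by_flag(info));
}
```

Before execlists_init() runs, only the flag gives the right answer; the
pointer check reports "no elsq" even on a device that has one.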
> Thanks,
> Daniele
>
> P.S. If you want to take over feel free to send an updated patch ;)
>
No need to take over, I thought it was an orphaned patch :)
-Mika
>> ...
>> -Mika
>>
>>> ---
>>> drivers/gpu/drm/i915/i915_drv.h | 5 ++++-
>>> drivers/gpu/drm/i915/intel_lrc.c | 35 ++++++++++++++++++++++++++++-----
>>> drivers/gpu/drm/i915/intel_lrc.h | 3 +++
>>> drivers/gpu/drm/i915/intel_ringbuffer.h | 6 ++++--
>>> 4 files changed, 41 insertions(+), 8 deletions(-)
>>>
>>> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
>>> index c42015b..3163543 100644
>>> --- a/drivers/gpu/drm/i915/i915_drv.h
>>> +++ b/drivers/gpu/drm/i915/i915_drv.h
>>> @@ -2738,8 +2738,11 @@ static inline unsigned int i915_sg_segment_size(void)
>>>
>>> #define HAS_LOGICAL_RING_CONTEXTS(dev_priv) \
>>> ((dev_priv)->info.has_logical_ring_contexts)
>>> +
>>> +/* XXX: Preemption disabled for Gen11+ until support for new flow lands */
>>> #define HAS_LOGICAL_RING_PREEMPTION(dev_priv) \
>>> - ((dev_priv)->info.has_logical_ring_preemption)
>>> + ((dev_priv)->info.has_logical_ring_preemption && \
>>> + INTEL_GEN(dev_priv) < 11)
>>>
>>> #define HAS_EXECLISTS(dev_priv) HAS_LOGICAL_RING_CONTEXTS(dev_priv)
>>>
>>> diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
>>> index ff25f20..67ad7c9 100644
>>> --- a/drivers/gpu/drm/i915/intel_lrc.c
>>> +++ b/drivers/gpu/drm/i915/intel_lrc.c
>>> @@ -428,11 +428,24 @@ static inline void elsp_write(u64 desc, u32 __iomem *elsp)
>>> writel(lower_32_bits(desc), elsp);
>>> }
>>>
>>> +static inline void elsqc_write(u64 desc, u32 __iomem *elsqc, u32 port)
>>> +{
>>> + writel(lower_32_bits(desc), elsqc + port * 2);
>>> + writel(upper_32_bits(desc), elsqc + port * 2 + 1);
>>> +}
>>> +
>>> static void execlists_submit_ports(struct intel_engine_cs *engine)
>>> {
>>> + struct drm_i915_private *dev_priv = engine->i915;
>>> struct execlist_port *port = engine->execlists.port;
>>> unsigned int n;
>>>
>>> + /*
>>> + * Gen11+ note: the submit queue is not cleared after being submitted
>>> + * to the HW so we need to make sure we always clean it up. This is
>>> + * currently ensured by the fact that we always write the same number
>>> + * of elsq entries, keep this in mind before changing the loop below.
>>> + */
>>> for (n = execlists_num_ports(&engine->execlists); n--; ) {
>>> struct drm_i915_gem_request *rq;
>>> unsigned int count;
>>> @@ -456,8 +469,16 @@ static void execlists_submit_ports(struct intel_engine_cs *engine)
>>> desc = 0;
>>> }
>>>
>>> - elsp_write(desc, engine->execlists.elsp);
>>> + if (INTEL_GEN(dev_priv) >= 11)
>>> + elsqc_write(desc, engine->execlists.els, n);
>>> + else
>>> + elsp_write(desc, engine->execlists.els);
>>> }
>>> +
>>> + /* for gen11+ we need to manually load the submit queue */
>>> + if (INTEL_GEN(dev_priv) >= 11)
>>> + I915_WRITE_FW(RING_EXECLIST_CONTROL(engine), EL_CTRL_LOAD);
>>> +
>>> execlists_clear_active(&engine->execlists, EXECLISTS_ACTIVE_HWACK);
>>> }
>>>
>>> @@ -506,9 +527,9 @@ static void inject_preempt_context(struct intel_engine_cs *engine)
>>>
>>> GEM_TRACE("%s\n", engine->name);
>>> for (n = execlists_num_ports(&engine->execlists); --n; )
>>> - elsp_write(0, engine->execlists.elsp);
>>> + elsp_write(0, engine->execlists.els);
>>>
>>> - elsp_write(ce->lrc_desc, engine->execlists.elsp);
>>> + elsp_write(ce->lrc_desc, engine->execlists.els);
>>> execlists_clear_active(&engine->execlists, EXECLISTS_ACTIVE_HWACK);
>>> }
>>>
>>> @@ -2016,8 +2037,12 @@ static int logical_ring_init(struct intel_engine_cs *engine)
>>> if (ret)
>>> goto error;
>>>
>>> - engine->execlists.elsp =
>>> - engine->i915->regs + i915_mmio_reg_offset(RING_ELSP(engine));
>>> + if (INTEL_GEN(engine->i915) >= 11)
>>> + engine->execlists.els = engine->i915->regs +
>>> + i915_mmio_reg_offset(RING_EXECLIST_SQ_CONTENTS(engine));
>>> + else
>>> + engine->execlists.els = engine->i915->regs +
>>> + i915_mmio_reg_offset(RING_ELSP(engine));
>>>
>>> return 0;
>>>
>>> diff --git a/drivers/gpu/drm/i915/intel_lrc.h b/drivers/gpu/drm/i915/intel_lrc.h
>>> index 6d4f9b9..3ab4266 100644
>>> --- a/drivers/gpu/drm/i915/intel_lrc.h
>>> +++ b/drivers/gpu/drm/i915/intel_lrc.h
>>> @@ -38,6 +38,9 @@
>>> #define CTX_CTRL_ENGINE_CTX_RESTORE_INHIBIT (1 << 0)
>>> #define CTX_CTRL_RS_CTX_ENABLE (1 << 1)
>>> #define RING_CONTEXT_STATUS_BUF_BASE(engine) _MMIO((engine)->mmio_base + 0x370)
>>> +#define RING_EXECLIST_SQ_CONTENTS(engine) _MMIO((engine)->mmio_base + 0x510)
>>> +#define RING_EXECLIST_CONTROL(engine) _MMIO((engine)->mmio_base + 0x550)
>>> +#define EL_CTRL_LOAD (1 << 0)
>>> #define RING_CONTEXT_STATUS_BUF_LO(engine, i) _MMIO((engine)->mmio_base + 0x370 + (i) * 8)
>>> #define RING_CONTEXT_STATUS_BUF_HI(engine, i) _MMIO((engine)->mmio_base + 0x370 + (i) * 8 + 4)
>>> #define RING_CONTEXT_STATUS_PTR(engine) _MMIO((engine)->mmio_base + 0x3a0)
>>> diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h
>>> index c5ff203..d36bb73 100644
>>> --- a/drivers/gpu/drm/i915/intel_ringbuffer.h
>>> +++ b/drivers/gpu/drm/i915/intel_ringbuffer.h
>>> @@ -200,9 +200,11 @@ struct intel_engine_execlists {
>>> bool no_priolist;
>>>
>>> /**
>>> - * @elsp: the ExecList Submission Port register
>>> + * @els: gen-specific execlist submission register
>>> + * set to the ExecList Submission Port (elsp) register pre-Gen11 and to
>>> + * the ExecList Submission Queue Contents register array for Gen11+
>>> */
>>> - u32 __iomem *elsp;
>>> + u32 __iomem *els;
>>>
>>> /**
>>> * @port: execlist port states
>>> --
>>> 1.9.1
>>>
^ permalink raw reply [flat|nested] 118+ messages in thread
* Re: [PATCH v5] drm/i915/icl: Enhanced execution list support
2018-01-22 15:08 ` Mika Kuoppala
@ 2018-01-22 15:13 ` Chris Wilson
2018-01-22 16:09 ` Daniele Ceraolo Spurio
0 siblings, 1 reply; 118+ messages in thread
From: Chris Wilson @ 2018-01-22 15:13 UTC (permalink / raw)
To: Mika Kuoppala, Daniele Ceraolo Spurio, intel-gfx
Cc: Thomas Daniel, Rodrigo Vivi
Quoting Mika Kuoppala (2018-01-22 15:08:16)
> Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> writes:
>
> > On 19/01/18 05:05, Mika Kuoppala wrote:
> >> Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> writes:
> >>
> >>> From: Thomas Daniel <thomas.daniel@intel.com>
> >>>
> >>> Enhanced Execlists is an upgraded version of execlists which supports
> >>> up to 8 ports. The lrcs to be submitted are written to a submit queue,
> >>> which is then loaded on the HW. When writing to the ELSP register, the
> >>> lrcs are written cyclically in the queue from position 0 to position 7.
> >>> Alternatively, it is possible to write directly in the individual
> >>> positions of the queue using the ELSQ registers. To be able to re-use
> >>> all the existing code we're using the latter method and we're currently
> >>> limiting ourselves to only using 2 elements.
> >>>
> >>> The preemption flow is slightly different with enhanced execlists, so
> >>> this patch turns preemption off temporarily for Gen11+ while we wait for
> >>> the new mechanism to land.
> >>>
> >>> v2: Rebase.
> >>> v3: Switch from !IS_GEN11 to GEN < 11 (Daniele Ceraolo Spurio).
> >>> v4: Use the elsq registers instead of elsp. (Daniele Ceraolo Spurio)
> >>> v5: Reword commit, rename regs to be closer to specs, turn off
> >>> preemption (Daniele), reuse engine->execlists.elsp (Chris)
> >>>
> >>> Cc: Chris Wilson <chris@chris-wilson.co.uk>
> >>> Signed-off-by: Thomas Daniel <thomas.daniel@intel.com>
> >>> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
> >>> Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
> >>
> >> Was going to adopt this patch from Rodrigo but you were faster.
> >>
> >> I chose to stash the elsq and use it as a gen11 vs rest toggle:
> >>
> >> Relevant bits:
> >>
> >> +static inline void write_port(struct intel_engine_execlists * const execlists,
> >> + unsigned int n,
> >> + u64 desc)
> >> +{
> >> + if (execlists->elsq)
> >> + gen11_elsq_write(desc, n, execlists->elsq);
> >> + else
> >> + gen8_elsp_write(desc, execlists->elsp);
> >> +}
> >> +
> >> +static inline void submit_ports(struct intel_engine_execlists * const execlists)
> >> +{
> >> + /* for gen11+ we need to manually load the submit queue */
> >> + if (execlists->elsq) {
> >> + struct intel_engine_cs *engine =
> >> + container_of(execlists,
> >> + struct intel_engine_cs,
> >> + execlists);
> >> + struct drm_i915_private *dev_priv = engine->i915;
> >> +
> >> + I915_WRITE_FW(RING_ELCR(engine), ELCR_LOAD);
> >> + }
> >> +}
> >> +
> >>
> >
> > I was undecided about hiding the code in sub-functions because of the
> > pre-emption path. There is no need in gen11 to inject a context to
> > preempt to idle,
Really? The preempt-to-idle is so that we can sync the bookkeeping with
the pending CS interrupts. The HW doesn't require it currently either,
it's the SW that does. If you have a way to avoid that, that should be
applicable to the current code as well?
-Chris
^ permalink raw reply [flat|nested] 118+ messages in thread
* Re: [PATCH v5] drm/i915/icl: Enhanced execution list support
2018-01-22 15:13 ` Chris Wilson
@ 2018-01-22 16:09 ` Daniele Ceraolo Spurio
2018-01-22 17:32 ` Chris Wilson
0 siblings, 1 reply; 118+ messages in thread
From: Daniele Ceraolo Spurio @ 2018-01-22 16:09 UTC (permalink / raw)
To: Chris Wilson, Mika Kuoppala, intel-gfx; +Cc: Thomas Daniel, Rodrigo Vivi
On 22/01/18 07:13, Chris Wilson wrote:
> Quoting Mika Kuoppala (2018-01-22 15:08:16)
>> Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> writes:
>>
>>> On 19/01/18 05:05, Mika Kuoppala wrote:
>>>> Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> writes:
>>>>
>>>>> From: Thomas Daniel <thomas.daniel@intel.com>
>>>>>
>>>>> Enhanced Execlists is an upgraded version of execlists which supports
>>>>> up to 8 ports. The lrcs to be submitted are written to a submit queue,
>>>>> which is then loaded on the HW. When writing to the ELSP register, the
>>>>> lrcs are written cyclically in the queue from position 0 to position 7.
>>>>> Alternatively, it is possible to write directly in the individual
>>>>> positions of the queue using the ELSQ registers. To be able to re-use
>>>>> all the existing code we're using the latter method and we're currently
>>>>> limiting ourselves to only using 2 elements.
>>>>>
>>>>> The preemption flow is slightly different with enhanced execlists, so
>>>>> this patch turns preemption off temporarily for Gen11+ while we wait for
>>>>> the new mechanism to land.
>>>>>
>>>>> v2: Rebase.
>>>>> v3: Switch from !IS_GEN11 to GEN < 11 (Daniele Ceraolo Spurio).
>>>>> v4: Use the elsq registers instead of elsp. (Daniele Ceraolo Spurio)
>>>>> v5: Reword commit, rename regs to be closer to specs, turn off
>>>>> preemption (Daniele), reuse engine->execlists.elsp (Chris)
>>>>>
>>>>> Cc: Chris Wilson <chris@chris-wilson.co.uk>
>>>>> Signed-off-by: Thomas Daniel <thomas.daniel@intel.com>
>>>>> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
>>>>> Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
>>>>
>>>> Was going to adopt this patch from Rodrigo but you were faster.
>>>>
>>>> I chose to stash the elsq and use it as a gen11 vs rest toggle:
>>>>
>>>> Relevant bits:
>>>>
>>>> +static inline void write_port(struct intel_engine_execlists * const execlists,
>>>> + unsigned int n,
>>>> + u64 desc)
>>>> +{
>>>> + if (execlists->elsq)
>>>> + gen11_elsq_write(desc, n, execlists->elsq);
>>>> + else
>>>> + gen8_elsp_write(desc, execlists->elsp);
>>>> +}
>>>> +
>>>> +static inline void submit_ports(struct intel_engine_execlists * const execlists)
>>>> +{
>>>> + /* for gen11+ we need to manually load the submit queue */
>>>> + if (execlists->elsq) {
>>>> + struct intel_engine_cs *engine =
>>>> + container_of(execlists,
>>>> + struct intel_engine_cs,
>>>> + execlists);
>>>> + struct drm_i915_private *dev_priv = engine->i915;
>>>> +
>>>> + I915_WRITE_FW(RING_ELCR(engine), ELCR_LOAD);
>>>> + }
>>>> +}
>>>> +
>>>>
>>>
>>> I was undecided about hiding the code in sub-functions because of the
>>> pre-emption path. There is no need in gen11 to inject a context to
>>> preempt to idle,
>
> Really? The preempt-to-idle is so that we can sync the bookkeeping with
> the pending CS interrupts. The HW doesn't require it currently either,
> it's the SW that does. If you have a way to avoid that, that should be
> applicable to the current code as well?
> -Chris
>
We can't avoid preempt-to-idle, but we can do it in a simpler way. There is
a bit in RING_EXECLIST_CONTROL that triggers a preempt-to-idle, without
the need to do a ctx injection. We'll need to move preemption completion
detection to the CSB value instead of the HWSP write; I'm not sure about the
impact of that on our bookkeeping.
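To make the flow concrete, here is a minimal mock of it. Only
EL_CTRL_LOAD comes from the patch; the preempt bit position, the CSB
flag and the mock_* helpers are hypothetical placeholders, and the way
the mock "hardware" clears the request is an assumption, not something
taken from the spec:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define EL_CTRL_LOAD		(1u << 0)	/* from the patch under review */
#define EL_CTRL_PREEMPT_TO_IDLE	(1u << 1)	/* hypothetical bit position */
#define CSB_PREEMPT_IDLE	(1u << 29)	/* hypothetical CSB status flag */

/* Mock engine state standing in for the MMIO register and one CSB entry. */
struct mock_engine {
	uint32_t execlist_control;	/* stand-in for RING_EXECLIST_CONTROL */
	uint32_t csb_status;		/* stand-in for a CSB status dword */
};

/* SW side: request preempt-to-idle with a register write, no ctx injection. */
static void request_preempt_to_idle(struct mock_engine *e)
{
	/* One outstanding request at a time in this sketch. */
	assert(!(e->execlist_control & EL_CTRL_PREEMPT_TO_IDLE));
	e->execlist_control |= EL_CTRL_PREEMPT_TO_IDLE;
}

/* Mock HW side: consume the request and report completion through the CSB. */
static void mock_hw_step(struct mock_engine *e)
{
	if (e->execlist_control & EL_CTRL_PREEMPT_TO_IDLE) {
		e->execlist_control &= ~EL_CTRL_PREEMPT_TO_IDLE;
		e->csb_status |= CSB_PREEMPT_IDLE;
	}
}

/* SW side: completion is read from the CSB value, not from a HWSP write. */
static bool preempt_complete(const struct mock_engine *e)
{
	return (e->csb_status & CSB_PREEMPT_IDLE) != 0;
}
```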
Daniele
^ permalink raw reply [flat|nested] 118+ messages in thread
* Re: [PATCH v5] drm/i915/icl: Enhanced execution list support
2018-01-22 16:09 ` Daniele Ceraolo Spurio
@ 2018-01-22 17:32 ` Chris Wilson
2018-01-22 21:38 ` Daniele Ceraolo Spurio
0 siblings, 1 reply; 118+ messages in thread
From: Chris Wilson @ 2018-01-22 17:32 UTC (permalink / raw)
To: Daniele Ceraolo Spurio, Mika Kuoppala, intel-gfx
Cc: Thomas Daniel, Rodrigo Vivi
Quoting Daniele Ceraolo Spurio (2018-01-22 16:09:28)
>
>
> On 22/01/18 07:13, Chris Wilson wrote:
> > Quoting Mika Kuoppala (2018-01-22 15:08:16)
> >> Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> writes:
> >>
> >>> On 19/01/18 05:05, Mika Kuoppala wrote:
> >>>> Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> writes:
> >>>>
> >>>>> From: Thomas Daniel <thomas.daniel@intel.com>
> >>>>>
> >>>>> Enhanced Execlists is an upgraded version of execlists which supports
> >>>>> up to 8 ports. The lrcs to be submitted are written to a submit queue,
> >>>>> which is then loaded on the HW. When writing to the ELSP register, the
> >>>>> lrcs are written cyclically in the queue from position 0 to position 7.
> >>>>> Alternatively, it is possible to write directly in the individual
> >>>>> positions of the queue using the ELSQ registers. To be able to re-use
> >>>>> all the existing code we're using the latter method and we're currently
> >>>>> limiting ourselves to only using 2 elements.
> >>>>>
> >>>>> The preemption flow is slightly different with enhanced execlists, so
> >>>>> this patch turns preemption off temporarily for Gen11+ while we wait for
> >>>>> the new mechanism to land.
> >>>>>
> >>>>> v2: Rebase.
> >>>>> v3: Switch from !IS_GEN11 to GEN < 11 (Daniele Ceraolo Spurio).
> >>>>> v4: Use the elsq registers instead of elsp. (Daniele Ceraolo Spurio)
> >>>>> v5: Reword commit, rename regs to be closer to specs, turn off
> >>>>> preemption (Daniele), reuse engine->execlists.elsp (Chris)
> >>>>>
> >>>>> Cc: Chris Wilson <chris@chris-wilson.co.uk>
> >>>>> Signed-off-by: Thomas Daniel <thomas.daniel@intel.com>
> >>>>> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
> >>>>> Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
> >>>>
> >>>> Was going to adopt this patch from Rodrigo but you were faster.
> >>>>
> >>>> I chose to stash the elsq and use it as a gen11 vs rest toggle:
> >>>>
> >>>> Relevant bits:
> >>>>
> >>>> +static inline void write_port(struct intel_engine_execlists * const execlists,
> >>>> + unsigned int n,
> >>>> + u64 desc)
> >>>> +{
> >>>> + if (execlists->elsq)
> >>>> + gen11_elsq_write(desc, n, execlists->elsq);
> >>>> + else
> >>>> + gen8_elsp_write(desc, execlists->elsp);
> >>>> +}
> >>>> +
> >>>> +static inline void submit_ports(struct intel_engine_execlists * const execlists)
> >>>> +{
> >>>> + /* for gen11+ we need to manually load the submit queue */
> >>>> + if (execlists->elsq) {
> >>>> + struct intel_engine_cs *engine =
> >>>> + container_of(execlists,
> >>>> + struct intel_engine_cs,
> >>>> + execlists);
> >>>> + struct drm_i915_private *dev_priv = engine->i915;
> >>>> +
> >>>> + I915_WRITE_FW(RING_ELCR(engine), ELCR_LOAD);
> >>>> + }
> >>>> +}
> >>>> +
> >>>>
> >>>
> >>> I was undecided about hiding the code in sub-functions because of the
> >>> pre-emption path. There is no need in gen11 to inject a context to
> >>> preempt to idle,
> >
> > Really? The preempt-to-idle is so that we can sync the bookkeeping with
> > the pending CS interrupts. The HW doesn't require it currently either,
> > it's the SW that does. If you have a way to avoid that, that should be
> > applicable to the current code as well?
> > -Chris
> >
>
> We can't avoid preempt-to-idle, but we can do it in a simpler way. There is
> a bit in RING_EXECLIST_CONTROL that triggers a preempt-to-idle, without
> the need to do a ctx injection. We'll need to move preemption completion
> detection to the CSB value instead of the HWSP write; I'm not sure about the
> impact of that on our bookkeeping.
Ah, shucks :( I was hoping there's a simple way to avoid idling.
I have to ask, is it worth it? If we still have to do a CS interrupt
round trip, what's the difference? Hmm. I wonder if, assuming that the
preemption is nearly instantaneous (say 10us), we could short-circuit
the interrupt and poll, falling back to the interrupt if it takes longer,
and/or resetting the GPU to guarantee latencies.
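Roughly what I have in mind, as a standalone sketch. All the names here
are made up for illustration, and the budget is counted in poll
iterations only to keep the example self-contained; a real version
would bound the spin by time (the ~10us above) and arm the interrupt on
fallback:

```c
#include <assert.h>
#include <stdbool.h>

enum wait_result {
	WAIT_POLLED,		/* event landed while we were spinning */
	WAIT_NEEDS_INTERRUPT,	/* budget exhausted, fall back to the irq */
};

/* Hypothetical completion check: returns true once the event has landed. */
typedef bool (*done_fn)(void *ctx);

/* Spin for at most max_polls checks, then tell the caller to fall back. */
static enum wait_result poll_then_fallback(done_fn done, void *ctx,
					   unsigned int max_polls)
{
	assert(done);

	while (max_polls--) {
		if (done(ctx))
			return WAIT_POLLED;
	}

	return WAIT_NEEDS_INTERRUPT;
}

/* Mock event source: reports completion after a fixed number of checks. */
struct mock_event {
	unsigned int checks_left;
};

static bool mock_done(void *ctx)
{
	struct mock_event *ev = ctx;

	if (ev->checks_left == 0)
		return true;

	ev->checks_left--;
	return false;
}
```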
-Chris
^ permalink raw reply [flat|nested] 118+ messages in thread
* Re: [PATCH v5] drm/i915/icl: Enhanced execution list support
2018-01-22 17:32 ` Chris Wilson
@ 2018-01-22 21:38 ` Daniele Ceraolo Spurio
0 siblings, 0 replies; 118+ messages in thread
From: Daniele Ceraolo Spurio @ 2018-01-22 21:38 UTC (permalink / raw)
To: Chris Wilson, Mika Kuoppala, intel-gfx; +Cc: Thomas Daniel, Rodrigo Vivi
On 22/01/18 09:32, Chris Wilson wrote:
> Quoting Daniele Ceraolo Spurio (2018-01-22 16:09:28)
>>
>>
>> On 22/01/18 07:13, Chris Wilson wrote:
>>> Quoting Mika Kuoppala (2018-01-22 15:08:16)
>>>> Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> writes:
>>>>
>>>>> On 19/01/18 05:05, Mika Kuoppala wrote:
>>>>>> Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> writes:
>>>>>>
>>>>>>> From: Thomas Daniel <thomas.daniel@intel.com>
>>>>>>>
>>>>>>> Enhanced Execlists is an upgraded version of execlists which supports
>>>>>>> up to 8 ports. The lrcs to be submitted are written to a submit queue,
>>>>>>> which is then loaded on the HW. When writing to the ELSP register, the
>>>>>>> lrcs are written cyclically in the queue from position 0 to position 7.
>>>>>>> Alternatively, it is possible to write directly in the individual
>>>>>>> positions of the queue using the ELSQ registers. To be able to re-use
>>>>>>> all the existing code we're using the latter method and we're currently
>>>>>>> limiting ourselves to only using 2 elements.
>>>>>>>
>>>>>>> The preemption flow is slightly different with enhanced execlists, so
>>>>>>> this patch turns preemption off temporarily for Gen11+ while we wait for
>>>>>>> the new mechanism to land.
>>>>>>>
>>>>>>> v2: Rebase.
>>>>>>> v3: Switch from !IS_GEN11 to GEN < 11 (Daniele Ceraolo Spurio).
>>>>>>> v4: Use the elsq registers instead of elsp. (Daniele Ceraolo Spurio)
>>>>>>> v5: Reword commit, rename regs to be closer to specs, turn off
>>>>>>> preemption (Daniele), reuse engine->execlists.elsp (Chris)
>>>>>>>
>>>>>>> Cc: Chris Wilson <chris@chris-wilson.co.uk>
>>>>>>> Signed-off-by: Thomas Daniel <thomas.daniel@intel.com>
>>>>>>> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
>>>>>>> Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
>>>>>>
>>>>>> Was going to adopt this patch from Rodrigo but you were faster.
>>>>>>
>>>>>> I chose to stash the elsq and use it as a gen11 vs rest toggle:
>>>>>>
>>>>>> Relevant bits:
>>>>>>
>>>>>> +static inline void write_port(struct intel_engine_execlists * const execlists,
>>>>>> + unsigned int n,
>>>>>> + u64 desc)
>>>>>> +{
>>>>>> + if (execlists->elsq)
>>>>>> + gen11_elsq_write(desc, n, execlists->elsq);
>>>>>> + else
>>>>>> + gen8_elsp_write(desc, execlists->elsp);
>>>>>> +}
>>>>>> +
>>>>>> +static inline void submit_ports(struct intel_engine_execlists * const execlists)
>>>>>> +{
>>>>>> + /* for gen11+ we need to manually load the submit queue */
>>>>>> + if (execlists->elsq) {
>>>>>> + struct intel_engine_cs *engine =
>>>>>> + container_of(execlists,
>>>>>> + struct intel_engine_cs,
>>>>>> + execlists);
>>>>>> + struct drm_i915_private *dev_priv = engine->i915;
>>>>>> +
>>>>>> + I915_WRITE_FW(RING_ELCR(engine), ELCR_LOAD);
>>>>>> + }
>>>>>> +}
>>>>>> +
>>>>>>
>>>>>
>>>>> I was undecided about hiding the code in sub-functions because of the
>>>>> pre-emption path. There is no need in gen11 to inject a context to
>>>>> preempt to idle,
>>>
>>> Really? The preempt-to-idle is so that we can sync the bookkeeping with
>>> the pending CS interrupts. The HW doesn't require it currently either,
>>> it's the SW that does. If you have a way to avoid that, that should be
>>> applicable to the current code as well?
>>> -Chris
>>>
>>
>> We can't avoid preempt-to-idle, but we can do it in a simpler way. There is
>> a bit in RING_EXECLIST_CONTROL that triggers a preempt-to-idle, without
>> the need to do a ctx injection. We'll need to move preemption completion
>> detection to the CSB value instead of the HWSP write; I'm not sure about the
>> impact of that on our bookkeeping.
>
> Ah, shucks :( I was hoping there's a simple way to avoid idling.
>
> I have to ask, is it worth it? If we still have to do a CS interrupt
> round trip, what's the difference? Hmm. I wonder if, assuming that the
> preemption is nearly instantaneous (say 10us), we could short-circuit
> the interrupt and poll, falling back to the interrupt if it takes longer,
> and/or resetting the GPU to guarantee latencies.
> -Chris
>
The SW cost shouldn't be too bad, so having one fewer ctx switch should
give us some benefits. Trying a poll with a fall-back sounds nice,
but we'll probably have to wait until we have timing info to see if the
polling makes sense at all.
Daniele
^ permalink raw reply [flat|nested] 118+ messages in thread