* [PATCH v2 1/4] drm/sched: Fix entities with 0 rqs.
@ 2019-01-30 1:53 Bas Nieuwenhuizen
From: Bas Nieuwenhuizen @ 2019-01-30 1:53 UTC (permalink / raw)
To: amd-gfx@lists.freedesktop.org, dri-devel@lists.freedesktop.org
Cc: Bas Nieuwenhuizen
Some blocks in amdgpu can have 0 rqs.
Job creation already fails with -ENOENT when entity->rq is NULL,
so jobs cannot be pushed. Without a rq there is no scheduler to
pop jobs, and rq selection already does the right thing with a
list of length 0.
So the operations we need to fix are:
- Creation, do not set rq to rq_list[0] if the list can have length 0.
- Do not flush any jobs when there is no rq.
- On entity destruction handle the rq = NULL case.
- On set_priority, do not try to change the rq if it is NULL.
Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
---
drivers/gpu/drm/scheduler/sched_entity.c | 39 ++++++++++++++++--------
1 file changed, 26 insertions(+), 13 deletions(-)
diff --git a/drivers/gpu/drm/scheduler/sched_entity.c b/drivers/gpu/drm/scheduler/sched_entity.c
index 4463d3826ecb..8e31b6628d09 100644
--- a/drivers/gpu/drm/scheduler/sched_entity.c
+++ b/drivers/gpu/drm/scheduler/sched_entity.c
@@ -52,12 +52,12 @@ int drm_sched_entity_init(struct drm_sched_entity *entity,
{
int i;
- if (!(entity && rq_list && num_rq_list > 0 && rq_list[0]))
+ if (!(entity && rq_list && (num_rq_list == 0 || rq_list[0])))
return -EINVAL;
memset(entity, 0, sizeof(struct drm_sched_entity));
INIT_LIST_HEAD(&entity->list);
- entity->rq = rq_list[0];
+ entity->rq = NULL;
entity->guilty = guilty;
entity->num_rq_list = num_rq_list;
entity->rq_list = kcalloc(num_rq_list, sizeof(struct drm_sched_rq *),
@@ -67,6 +67,10 @@ int drm_sched_entity_init(struct drm_sched_entity *entity,
for (i = 0; i < num_rq_list; ++i)
entity->rq_list[i] = rq_list[i];
+
+ if (num_rq_list)
+ entity->rq = rq_list[0];
+
entity->last_scheduled = NULL;
spin_lock_init(&entity->rq_lock);
@@ -165,6 +169,9 @@ long drm_sched_entity_flush(struct drm_sched_entity *entity, long timeout)
struct task_struct *last_user;
long ret = timeout;
+ if (!entity->rq)
+ return 0;
+
sched = entity->rq->sched;
/**
* The client will not queue more IBs during this fini, consume existing
@@ -264,20 +271,24 @@ static void drm_sched_entity_kill_jobs(struct drm_sched_entity *entity)
*/
void drm_sched_entity_fini(struct drm_sched_entity *entity)
{
- struct drm_gpu_scheduler *sched;
+ struct drm_gpu_scheduler *sched = NULL;
- sched = entity->rq->sched;
- drm_sched_rq_remove_entity(entity->rq, entity);
+ if (entity->rq) {
+ sched = entity->rq->sched;
+ drm_sched_rq_remove_entity(entity->rq, entity);
+ }
/* Consumption of existing IBs wasn't completed. Forcefully
* remove them here.
*/
if (spsc_queue_peek(&entity->job_queue)) {
- /* Park the kernel for a moment to make sure it isn't processing
- * our enity.
- */
- kthread_park(sched->thread);
- kthread_unpark(sched->thread);
+ if (sched) {
+ /* Park the kernel for a moment to make sure it isn't processing
+ * our entity.
+ */
+ kthread_park(sched->thread);
+ kthread_unpark(sched->thread);
+ }
if (entity->dependency) {
dma_fence_remove_callback(entity->dependency,
&entity->cb);
@@ -362,9 +373,11 @@ void drm_sched_entity_set_priority(struct drm_sched_entity *entity,
for (i = 0; i < entity->num_rq_list; ++i)
drm_sched_entity_set_rq_priority(&entity->rq_list[i], priority);
- drm_sched_rq_remove_entity(entity->rq, entity);
- drm_sched_entity_set_rq_priority(&entity->rq, priority);
- drm_sched_rq_add_entity(entity->rq, entity);
+ if (entity->rq) {
+ drm_sched_rq_remove_entity(entity->rq, entity);
+ drm_sched_entity_set_rq_priority(&entity->rq, priority);
+ drm_sched_rq_add_entity(entity->rq, entity);
+ }
spin_unlock(&entity->rq_lock);
}
--
2.20.1
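As an aside, the intent of the four fixes can be sketched in plain, userspace-compilable C. This is a simplified model for illustration only; the struct and function names below are invented and merely mirror the patch, they are not the kernel implementation:

```c
#include <assert.h>
#include <stddef.h>

struct rq;                      /* stand-in for struct drm_sched_rq */

/* Simplified model of drm_sched_entity. */
struct entity_model {
    struct rq **rq_list;
    unsigned    num_rq_list;
    struct rq  *rq;             /* NULL when no run queues exist */
};

/* Mirrors the fixed drm_sched_entity_init(): num_rq_list == 0 is now
 * valid, and entity->rq stays NULL instead of reading rq_list[0]. */
static int entity_init_model(struct entity_model *e,
                             struct rq **rq_list, unsigned num_rq_list)
{
    if (!(e && rq_list && (num_rq_list == 0 || rq_list[0])))
        return -1;              /* -EINVAL in the kernel */

    e->rq_list = rq_list;
    e->num_rq_list = num_rq_list;
    e->rq = num_rq_list ? rq_list[0] : NULL;
    return 0;
}

/* Mirrors the fixed flush path: nothing to flush without a run queue. */
static long entity_flush_model(struct entity_model *e, long timeout)
{
    if (!e->rq)
        return 0;
    return timeout;             /* the real code waits on the scheduler */
}

/* Returns 0 when the 0-rq paths behave as the patch intends. */
static int demo_zero_rq_entity(void)
{
    struct entity_model e;
    struct rq *rq_list[1] = { NULL };   /* a list of length 0 */

    if (entity_init_model(&e, rq_list, 0) != 0)
        return 1;
    if (e.rq != NULL)
        return 2;
    if (entity_flush_model(&e, 100) != 0)
        return 3;
    return 0;
}
```

Note that, as in the patch, a non-NULL rq_list pointer is still required even when its length is 0; only the rq_list[0] dereference is guarded.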
_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx
* [PATCH v2 2/4] drm/amdgpu: Only add rqs for initialized rings.
From: Bas Nieuwenhuizen @ 2019-01-30 1:53 UTC (permalink / raw)
To: amd-gfx@lists.freedesktop.org, dri-devel@lists.freedesktop.org
Cc: Bas Nieuwenhuizen
I don't see another way to figure out whether a ring is initialized
when the hardware block might not be initialized.
Entities have been fixed up to handle num_rqs = 0.
Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
---
drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c | 11 ++++++++---
1 file changed, 8 insertions(+), 3 deletions(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c
index d85184b5b35c..30407e55593b 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c
@@ -124,6 +124,7 @@ static int amdgpu_ctx_init(struct amdgpu_device *adev,
struct amdgpu_ring *rings[AMDGPU_MAX_RINGS];
struct drm_sched_rq *rqs[AMDGPU_MAX_RINGS];
unsigned num_rings;
+ unsigned num_rqs = 0;
switch (i) {
case AMDGPU_HW_IP_GFX:
@@ -166,12 +167,16 @@ static int amdgpu_ctx_init(struct amdgpu_device *adev,
break;
}
- for (j = 0; j < num_rings; ++j)
- rqs[j] = &rings[j]->sched.sched_rq[priority];
+ for (j = 0; j < num_rings; ++j) {
+ if (rings[j]->adev) {
+ rqs[num_rqs++] =
+ &rings[j]->sched.sched_rq[priority];
+ }
+ }
for (j = 0; j < amdgpu_ctx_num_entities[i]; ++j)
r = drm_sched_entity_init(&ctx->entities[i][j].entity,
- rqs, num_rings, &ctx->guilty);
+ rqs, num_rqs, &ctx->guilty);
if (r)
goto error_cleanup_entities;
}
--
2.20.1
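The filtering loop above is a standard compaction pattern; a toy model (all names invented for illustration, not the amdgpu code) shows why num_rqs can legitimately end up smaller than num_rings:

```c
#include <assert.h>

/* Toy model of the amdgpu_ctx_init() change: only rings whose ->adev
 * is set (i.e. that were actually initialized) contribute a run queue. */
struct ring_model {
    void *adev;  /* NULL when the ring was never initialized */
    int   rq;    /* stand-in for &ring->sched.sched_rq[priority] */
};

static unsigned collect_rqs(struct ring_model *rings, unsigned num_rings,
                            int *rqs)
{
    unsigned j, num_rqs = 0;

    for (j = 0; j < num_rings; ++j) {
        if (!rings[j].adev)   /* the "continue" style Christian asked for */
            continue;
        rqs[num_rqs++] = rings[j].rq;
    }
    return num_rqs;
}

/* Returns 0 when uninitialized rings are skipped and the rest compacted. */
static int demo_collect_rqs(void)
{
    int dummy;
    struct ring_model rings[3] = {
        { &dummy, 10 }, { 0, 20 }, { &dummy, 30 },
    };
    int rqs[3];
    unsigned n = collect_rqs(rings, 3, rqs);

    if (n != 2)
        return 1;
    if (rqs[0] != 10 || rqs[1] != 30)
        return 2;
    return 0;
}
```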
* [PATCH v2 3/4] drm/amdgpu: Check if fd really is an amdgpu fd.
From: Bas Nieuwenhuizen @ 2019-01-30 1:53 UTC (permalink / raw)
To: amd-gfx, dri-devel
Otherwise we interpret the file's private data as drm & amdgpu data
while it might not be, which could lead to memory corruption.
Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
---
drivers/gpu/drm/amd/amdgpu/amdgpu.h | 2 ++
drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c | 16 ++++++++++++++++
drivers/gpu/drm/amd/amdgpu/amdgpu_sched.c | 10 +++++++---
3 files changed, 25 insertions(+), 3 deletions(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
index d67f8b1dfe80..17290cdb8ed8 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
@@ -411,6 +411,8 @@ struct amdgpu_fpriv {
struct amdgpu_ctx_mgr ctx_mgr;
};
+int amdgpu_file_to_fpriv(struct file *filp, struct amdgpu_fpriv **fpriv);
+
int amdgpu_ib_get(struct amdgpu_device *adev, struct amdgpu_vm *vm,
unsigned size, struct amdgpu_ib *ib);
void amdgpu_ib_free(struct amdgpu_device *adev, struct amdgpu_ib *ib,
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
index c806f984bcc5..90a520034c89 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
@@ -1176,6 +1176,22 @@ static const struct file_operations amdgpu_driver_kms_fops = {
#endif
};
+int amdgpu_file_to_fpriv(struct file *filp, struct amdgpu_fpriv **fpriv)
+{
+ struct drm_file *file;
+
+ if (!filp)
+ return -EINVAL;
+
+ if (filp->f_op != &amdgpu_driver_kms_fops) {
+ return -EINVAL;
+ }
+
+ file = filp->private_data;
+ *fpriv = file->driver_priv;
+ return 0;
+}
+
static bool
amdgpu_get_crtc_scanout_position(struct drm_device *dev, unsigned int pipe,
bool in_vblank_irq, int *vpos, int *hpos,
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_sched.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_sched.c
index 1cafe8d83a4d..0b70410488b6 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_sched.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_sched.c
@@ -54,16 +54,20 @@ static int amdgpu_sched_process_priority_override(struct amdgpu_device *adev,
enum drm_sched_priority priority)
{
struct file *filp = fget(fd);
- struct drm_file *file;
struct amdgpu_fpriv *fpriv;
struct amdgpu_ctx *ctx;
uint32_t id;
+ int r;
if (!filp)
return -EINVAL;
- file = filp->private_data;
- fpriv = file->driver_priv;
+ r = amdgpu_file_to_fpriv(filp, &fpriv);
+ if (r) {
+ fput(filp);
+ return r;
+ }
+
idr_for_each_entry(&fpriv->ctx_mgr.ctx_handles, ctx, id)
amdgpu_ctx_priority_override(ctx, priority);
--
2.20.1
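The check relies on pointer identity of the file_operations table: a file opened on another driver carries a different fops pointer even if its private data happens to have a compatible layout. A self-contained toy model of the same pattern (invented names, not the kernel API):

```c
#include <assert.h>
#include <stddef.h>

struct fops_model { int dummy; };           /* stand-in for file_operations */

struct file_model {
    const struct fops_model *f_op;
    void *private_data;
};

/* The driver's own fops table; its address identifies "our" files. */
static const struct fops_model amdgpu_fops_model;

/* Mirrors amdgpu_file_to_fpriv(): only trust private_data when the
 * fops pointer is ours. */
static int file_to_fpriv_model(struct file_model *filp, void **fpriv)
{
    if (!filp || filp->f_op != &amdgpu_fops_model)
        return -1;              /* -EINVAL in the kernel */
    *fpriv = filp->private_data;
    return 0;
}

/* Returns 0 when foreign files are rejected and our own accepted. */
static int demo_file_check(void)
{
    static const struct fops_model other_fops;
    int priv;
    void *out = NULL;
    struct file_model good = { &amdgpu_fops_model, &priv };
    struct file_model bad  = { &other_fops, &priv };

    if (file_to_fpriv_model(&bad, &out) == 0)
        return 1;
    if (file_to_fpriv_model(&good, &out) != 0)
        return 2;
    if (out != &priv)
        return 3;
    return 0;
}
```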
* [PATCH v2 4/4] drm/amdgpu: Add command to override the context priority.
From: Bas Nieuwenhuizen @ 2019-01-30 1:53 UTC (permalink / raw)
To: amd-gfx, dri-devel
Given a master fd we can then override the priority of the context
in another fd.
Using these overrides was recommended by Christian instead of trying
to submit from a master fd. I am adding a way to override a single
context rather than the entire process, so that we can raise the
priority of a single Vulkan queue without effectively raising it for
the whole process.
The flags field is reused, as it was checked to be 0 anyway, so
nothing used it. This is source-incompatible (due to the name change),
but ABI-compatible.
Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
---
drivers/gpu/drm/amd/amdgpu/amdgpu_sched.c | 41 ++++++++++++++++++++++-
include/uapi/drm/amdgpu_drm.h | 3 +-
2 files changed, 42 insertions(+), 2 deletions(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_sched.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_sched.c
index 0b70410488b6..0767a93e4d91 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_sched.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_sched.c
@@ -76,6 +76,39 @@ static int amdgpu_sched_process_priority_override(struct amdgpu_device *adev,
return 0;
}
+static int amdgpu_sched_context_priority_override(struct amdgpu_device *adev,
+ int fd,
+ unsigned ctx_id,
+ enum drm_sched_priority priority)
+{
+ struct file *filp = fget(fd);
+ struct amdgpu_fpriv *fpriv;
+ struct amdgpu_ctx *ctx;
+ int r;
+
+ if (!filp)
+ return -EINVAL;
+
+ r = amdgpu_file_to_fpriv(filp, &fpriv);
+ if (r) {
+ fput(filp);
+ return r;
+ }
+
+ ctx = amdgpu_ctx_get(fpriv, ctx_id);
+
+ if (!ctx) {
+ fput(filp);
+ return -EINVAL;
+ }
+
+ amdgpu_ctx_priority_override(ctx, priority);
+ amdgpu_ctx_put(ctx);
+ fput(filp);
+
+ return 0;
+}
+
int amdgpu_sched_ioctl(struct drm_device *dev, void *data,
struct drm_file *filp)
{
@@ -85,7 +118,7 @@ int amdgpu_sched_ioctl(struct drm_device *dev, void *data,
int r;
priority = amdgpu_to_sched_priority(args->in.priority);
- if (args->in.flags || priority == DRM_SCHED_PRIORITY_INVALID)
+ if (priority == DRM_SCHED_PRIORITY_INVALID)
return -EINVAL;
switch (args->in.op) {
@@ -94,6 +127,12 @@ int amdgpu_sched_ioctl(struct drm_device *dev, void *data,
args->in.fd,
priority);
break;
+ case AMDGPU_SCHED_OP_CONTEXT_PRIORITY_OVERRIDE:
+ r = amdgpu_sched_context_priority_override(adev,
+ args->in.fd,
+ args->in.ctx_id,
+ priority);
+ break;
default:
DRM_ERROR("Invalid sched op specified: %d\n", args->in.op);
r = -EINVAL;
diff --git a/include/uapi/drm/amdgpu_drm.h b/include/uapi/drm/amdgpu_drm.h
index faaad04814e4..30fa340790b2 100644
--- a/include/uapi/drm/amdgpu_drm.h
+++ b/include/uapi/drm/amdgpu_drm.h
@@ -275,13 +275,14 @@ union drm_amdgpu_vm {
/* sched ioctl */
#define AMDGPU_SCHED_OP_PROCESS_PRIORITY_OVERRIDE 1
+#define AMDGPU_SCHED_OP_CONTEXT_PRIORITY_OVERRIDE 2
struct drm_amdgpu_sched_in {
/* AMDGPU_SCHED_OP_* */
__u32 op;
__u32 fd;
__s32 priority;
- __u32 flags;
+ __u32 ctx_id;
};
union drm_amdgpu_sched {
--
2.20.1
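The ABI-compatibility claim can be checked mechanically: the old and new layouts of drm_amdgpu_sched_in (transcribed from the diff above into standalone structs for illustration) have identical sizes and field offsets; only the name of the last field changes:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Layout before the patch. */
struct sched_in_old {
    uint32_t op;
    uint32_t fd;
    int32_t  priority;
    uint32_t flags;
};

/* Layout after the patch: flags renamed to ctx_id, same size/offsets. */
struct sched_in_new {
    uint32_t op;
    uint32_t fd;
    int32_t  priority;
    uint32_t ctx_id;
};

/* Returns 0 when the rename changes neither size nor offsets, i.e. the
 * change is ABI compatible (though source incompatible). */
static int demo_abi_compat(void)
{
    if (sizeof(struct sched_in_old) != sizeof(struct sched_in_new))
        return 1;
    if (offsetof(struct sched_in_old, flags) !=
        offsetof(struct sched_in_new, ctx_id))
        return 2;
    return 0;
}
```

Old binaries passing flags = 0 with AMDGPU_SCHED_OP_PROCESS_PRIORITY_OVERRIDE keep working, since that op ignores the field; the new op is the only one that interprets it as a context id.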
* Re: [PATCH v2 2/4] drm/amdgpu: Only add rqs for initialized rings.
From: Christian König @ 2019-01-30 10:42 UTC (permalink / raw)
To: Bas Nieuwenhuizen, amd-gfx@lists.freedesktop.org,
dri-devel@lists.freedesktop.org
Am 30.01.19 um 02:53 schrieb Bas Nieuwenhuizen:
> I don't see another way to figure out if a ring is initialized if
> the hardware block might not be initialized.
>
> Entities have been fixed up to handle num_rqs = 0.
>
> Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
> ---
> drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c | 11 ++++++++---
> 1 file changed, 8 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c
> index d85184b5b35c..30407e55593b 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c
> @@ -124,6 +124,7 @@ static int amdgpu_ctx_init(struct amdgpu_device *adev,
> struct amdgpu_ring *rings[AMDGPU_MAX_RINGS];
> struct drm_sched_rq *rqs[AMDGPU_MAX_RINGS];
> unsigned num_rings;
> + unsigned num_rqs = 0;
>
> switch (i) {
> case AMDGPU_HW_IP_GFX:
> @@ -166,12 +167,16 @@ static int amdgpu_ctx_init(struct amdgpu_device *adev,
> break;
> }
>
> - for (j = 0; j < num_rings; ++j)
> - rqs[j] = &rings[j]->sched.sched_rq[priority];
> + for (j = 0; j < num_rings; ++j) {
> + if (rings[j]->adev) {
Better do "if (!rings[j]->adev) continue;".
With that done the patch is Reviewed-by: Christian König
<christian.koenig@amd.com>.
Regards,
Christian.
> + rqs[num_rqs++] =
> + &rings[j]->sched.sched_rq[priority];
> + }
> + }
>
> for (j = 0; j < amdgpu_ctx_num_entities[i]; ++j)
> r = drm_sched_entity_init(&ctx->entities[i][j].entity,
> - rqs, num_rings, &ctx->guilty);
> + rqs, num_rqs, &ctx->guilty);
> if (r)
> goto error_cleanup_entities;
> }
* Re: [PATCH v2 1/4] drm/sched: Fix entities with 0 rqs.
From: Christian König @ 2019-01-30 10:43 UTC (permalink / raw)
To: Bas Nieuwenhuizen, amd-gfx, dri-devel
Am 30.01.19 um 02:53 schrieb Bas Nieuwenhuizen:
> Some blocks in amdgpu can have 0 rqs.
>
> Job creation already fails with -ENOENT when entity->rq is NULL,
> so jobs cannot be pushed. Without a rq there is no scheduler to
> pop jobs, and rq selection already does the right thing with a
> list of length 0.
>
> So the operations we need to fix are:
> - Creation, do not set rq to rq_list[0] if the list can have length 0.
> - Do not flush any jobs when there is no rq.
> - On entity destruction handle the rq = NULL case.
> - on set_priority, do not try to change the rq if it is NULL.
>
> Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
One minor comment on patch #2, apart from that the series is
Reviewed-by: Christian König <christian.koenig@amd.com>.
I'm going to make the change on #2 and pick them up for inclusion in
amd-staging-drm-next.
Thanks for the help,
Christian.
* Re: [PATCH v2 1/4] drm/sched: Fix entities with 0 rqs.
From: Alex Deucher via amd-gfx @ 2019-02-13 21:03 UTC (permalink / raw)
To: Christian Koenig
Cc: Mailing list - DRI developers, amd-gfx list, Bas Nieuwenhuizen
On Wed, Jan 30, 2019 at 5:43 AM Christian König
<ckoenig.leichtzumerken@gmail.com> wrote:
>
> Am 30.01.19 um 02:53 schrieb Bas Nieuwenhuizen:
> > Some blocks in amdgpu can have 0 rqs.
> >
> > Job creation already fails with -ENOENT when entity->rq is NULL,
> > so jobs cannot be pushed. Without a rq there is no scheduler to
> > pop jobs, and rq selection already does the right thing with a
> > list of length 0.
> >
> > So the operations we need to fix are:
> > - Creation, do not set rq to rq_list[0] if the list can have length 0.
> > - Do not flush any jobs when there is no rq.
> > - On entity destruction handle the rq = NULL case.
> > - on set_priority, do not try to change the rq if it is NULL.
> >
> > Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
>
> One minor comment on patch #2, apart from that the series is
> Reviewed-by: Christian König <christian.koenig@amd.com>.
>
> I'm going to make the change on #2 and pick them up for inclusion in
> amd-staging-drm-next.
Hi Christian,
I haven't seen these land yet. Just want to make sure they don't fall
through the cracks.
Alex
>
> Thanks for the help,
> Christian.
>
* Re: [PATCH v2 1/4] drm/sched: Fix entities with 0 rqs.
From: Christian König via amd-gfx @ 2019-02-14 9:08 UTC (permalink / raw)
To: Alex Deucher, Christian Koenig
Cc: Christian König, amd-gfx list, Mailing list - DRI developers,
Bas Nieuwenhuizen
Am 13.02.19 um 22:03 schrieb Alex Deucher via amd-gfx:
> On Wed, Jan 30, 2019 at 5:43 AM Christian König
> <ckoenig.leichtzumerken@gmail.com> wrote:
>> Am 30.01.19 um 02:53 schrieb Bas Nieuwenhuizen:
>>> Some blocks in amdgpu can have 0 rqs.
>>>
>>> Job creation already fails with -ENOENT when entity->rq is NULL,
>>> so jobs cannot be pushed. Without a rq there is no scheduler to
>>> pop jobs, and rq selection already does the right thing with a
>>> list of length 0.
>>>
>>> So the operations we need to fix are:
>>> - Creation, do not set rq to rq_list[0] if the list can have length 0.
>>> - Do not flush any jobs when there is no rq.
>>> - On entity destruction handle the rq = NULL case.
>>> - on set_priority, do not try to change the rq if it is NULL.
>>>
>>> Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
>> One minor comment on patch #2, apart from that the series is
>> Reviewed-by: Christian König <christian.koenig@amd.com>.
>>
>> I'm going to make the change on #2 and pick them up for inclusion in
>> amd-staging-drm-next.
> Hi Christian,
>
> I haven't seen these land yet. Just want to make sure they don't fall
> through the cracks.
Thanks for the reminder, I'm really having trouble catching up on
applying patches lately.
Christian.
>
> Alex
>
>> Thanks for the help,
>> Christian.
>>