* [PATCH 1/3] drm/scheduler: fix a corner case in dependency optimization
@ 2018-05-25  4:45 Nayan Deshmukh
  2018-05-25  4:45 ` [PATCH 2/3] drm/scheduler: add documentation Nayan Deshmukh
                   ` (2 more replies)
  0 siblings, 3 replies; 13+ messages in thread
From: Nayan Deshmukh @ 2018-05-25  4:45 UTC (permalink / raw)
  To: dri-devel; +Cc: Nayan Deshmukh, christian.koenig

When checking whether a dependency fence belongs to the same entity,
compare it with the scheduled as well as the finished fence. Earlier we
were only comparing it with the scheduled fence.
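
For reference, this works because the two fences of a job are initialized
with adjacent fence contexts. Roughly how drm_sched_fence_create() sets
them up (a simplified sketch, allocation and error handling trimmed):

    seq = atomic_inc_return(&entity->fence_seq);
    dma_fence_init(&fence->scheduled, &drm_sched_fence_ops_scheduled,
                   &fence->lock, entity->fence_context, seq);
    dma_fence_init(&fence->finished, &drm_sched_fence_ops_finished,
                   &fence->lock, entity->fence_context + 1, seq);

A dependency whose context is fence_context or fence_context + 1 is
therefore by construction one of our own fences and can be ignored.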

Signed-off-by: Nayan Deshmukh <nayan26deshmukh@gmail.com>
---
 drivers/gpu/drm/scheduler/gpu_scheduler.c | 9 +++++++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/scheduler/gpu_scheduler.c b/drivers/gpu/drm/scheduler/gpu_scheduler.c
index df1578d6f42e..44d480768dfe 100644
--- a/drivers/gpu/drm/scheduler/gpu_scheduler.c
+++ b/drivers/gpu/drm/scheduler/gpu_scheduler.c
@@ -349,8 +349,13 @@ static bool drm_sched_entity_add_dependency_cb(struct drm_sched_entity *entity)
 	struct dma_fence * fence = entity->dependency;
 	struct drm_sched_fence *s_fence;
 
-	if (fence->context == entity->fence_context) {
-		/* We can ignore fences from ourself */
+	if (fence->context == entity->fence_context ||
+            fence->context == entity->fence_context + 1) {
+                /*
+                 * Fence is a scheduled/finished fence from a job
+                 * which belongs to the same entity, we can ignore
+                 * fences from ourself
+                 */
 		dma_fence_put(entity->dependency);
 		return false;
 	}
-- 
2.14.3


* [PATCH 2/3] drm/scheduler: add documentation
  2018-05-25  4:45 [PATCH 1/3] drm/scheduler: fix a corner case in dependency optimization Nayan Deshmukh
@ 2018-05-25  4:45 ` Nayan Deshmukh
  2018-05-25 12:06   ` Christian König
                     ` (2 more replies)
  2018-05-25  4:45 ` [PATCH 3/3] drm/doc: add a chapter for gpu scheduler Nayan Deshmukh
  2018-05-25 12:01 ` [PATCH 1/3] drm/scheduler: fix a corner case in dependency optimization Christian König
  2 siblings, 3 replies; 13+ messages in thread
From: Nayan Deshmukh @ 2018-05-25  4:45 UTC (permalink / raw)
  To: dri-devel; +Cc: Nayan Deshmukh, Alex Deucher, christian.koenig

Convert the existing raw comments into kernel-doc format and add new
documentation.
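
For reference, the kernel-doc format the comments are converted to looks
like the following (a generic example; function_name and @arg are
placeholders, not symbols from this patch):

    /**
     * function_name - short one-line description
     *
     * @arg: description of the argument
     *
     * Longer description of what the function does.
     *
     * Returns 0 on success or a negative error code on failure.
     */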

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Nayan Deshmukh <nayan26deshmukh@gmail.com>
---
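A quick illustration of the typical driver-side flow against the entry
points documented here (struct foo_job and fjob are hypothetical names,
error handling omitted):

    int r = drm_sched_job_init(&fjob->base, sched, entity, fjob);
    if (r)
            return r;

    /* The finished fence is valid right after init and is what should be
     * used as the out-fence of the job.
     */
    fence = dma_fence_get(&fjob->base.s_fence->finished);

    drm_sched_entity_push_job(&fjob->base, entity);
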
 drivers/gpu/drm/scheduler/gpu_scheduler.c | 214 ++++++++++++++++++++++++------
 include/drm/gpu_scheduler.h               | 153 +++++++++++++++++----
 2 files changed, 296 insertions(+), 71 deletions(-)

diff --git a/drivers/gpu/drm/scheduler/gpu_scheduler.c b/drivers/gpu/drm/scheduler/gpu_scheduler.c
index 44d480768dfe..c70c983e3e74 100644
--- a/drivers/gpu/drm/scheduler/gpu_scheduler.c
+++ b/drivers/gpu/drm/scheduler/gpu_scheduler.c
@@ -21,6 +21,29 @@
  *
  */
 
+/**
+ * DOC: Overview
+ *
+ * The GPU scheduler provides entities which allow userspace to push jobs
+ * into software queues which are then scheduled on a hardware run queue.
+ * The software queues have a priority among them. The scheduler selects the entities
+ * from the run queue using FIFO. The scheduler provides dependency handling
+ * features among jobs. The driver is supposed to provide functions for backend
+ * operations to the scheduler like submitting a job to hardware run queue,
+ * returning the dependency of a job etc.
+ *
+ * The organisation of the scheduler is the following:-
+ *
+ * 1. Each ring buffer has one scheduler
+ * 2. Each scheduler has multiple run queues with different priorities
+ *    (i.e. HIGH_HW,HIGH_SW, KERNEL, NORMAL)
+ * 3. Each run queue has a queue of entities to schedule
+ * 4. Entities themselves maintain a queue of jobs that will be scheduled on
+ *    the hardware.
+ *
+ * The jobs in an entity are always scheduled in the order that they were pushed.
+ */
+
 #include <linux/kthread.h>
 #include <linux/wait.h>
 #include <linux/sched.h>
@@ -39,7 +62,13 @@ static bool drm_sched_entity_is_ready(struct drm_sched_entity *entity);
 static void drm_sched_wakeup(struct drm_gpu_scheduler *sched);
 static void drm_sched_process_job(struct dma_fence *f, struct dma_fence_cb *cb);
 
-/* Initialize a given run queue struct */
+/**
+ * drm_sched_rq_init - initialize a given run queue struct
+ *
+ * @rq: scheduler run queue
+ *
+ * Initializes a scheduler runqueue.
+ */
 static void drm_sched_rq_init(struct drm_sched_rq *rq)
 {
 	spin_lock_init(&rq->lock);
@@ -47,6 +76,14 @@ static void drm_sched_rq_init(struct drm_sched_rq *rq)
 	rq->current_entity = NULL;
 }
 
+/**
+ * drm_sched_rq_add_entity - add an entity
+ *
+ * @rq: scheduler run queue
+ * @entity: scheduler entity
+ *
+ * Adds a scheduler entity to the run queue.
+ */
 static void drm_sched_rq_add_entity(struct drm_sched_rq *rq,
 				    struct drm_sched_entity *entity)
 {
@@ -57,6 +94,14 @@ static void drm_sched_rq_add_entity(struct drm_sched_rq *rq,
 	spin_unlock(&rq->lock);
 }
 
+/**
+ * drm_sched_rq_remove_entity - remove an entity
+ *
+ * @rq: scheduler run queue
+ * @entity: scheduler entity
+ *
+ * Removes a scheduler entity from the run queue.
+ */
 static void drm_sched_rq_remove_entity(struct drm_sched_rq *rq,
 				       struct drm_sched_entity *entity)
 {
@@ -70,9 +115,9 @@ static void drm_sched_rq_remove_entity(struct drm_sched_rq *rq,
 }
 
 /**
- * Select an entity which could provide a job to run
+ * drm_sched_rq_select_entity - Select an entity which could provide a job to run
  *
- * @rq		The run queue to check.
+ * @rq: scheduler run queue to check.
  *
  * Try to find a ready entity, returns NULL if none found.
  */
@@ -112,15 +157,16 @@ drm_sched_rq_select_entity(struct drm_sched_rq *rq)
 }
 
 /**
- * Init a context entity used by scheduler when submit to HW ring.
+ * drm_sched_entity_init - Init a context entity used by scheduler when
+ * submit to HW ring.
  *
- * @sched	The pointer to the scheduler
- * @entity	The pointer to a valid drm_sched_entity
- * @rq		The run queue this entity belongs
- * @guilty      atomic_t set to 1 when a job on this queue
- *              is found to be guilty causing a timeout
+ * @sched: scheduler instance
+ * @entity: scheduler entity to init
+ * @rq: the run queue this entity belongs to
+ * @guilty: atomic_t set to 1 when a job on this queue
+ *          is found to be guilty causing a timeout
  *
- * return 0 if succeed. negative error code on failure
+ * Returns 0 on success or a negative error code on failure.
 */
 int drm_sched_entity_init(struct drm_gpu_scheduler *sched,
 			  struct drm_sched_entity *entity,
@@ -149,10 +195,10 @@ int drm_sched_entity_init(struct drm_gpu_scheduler *sched,
 EXPORT_SYMBOL(drm_sched_entity_init);
 
 /**
- * Query if entity is initialized
+ * drm_sched_entity_is_initialized - Query if entity is initialized
  *
- * @sched       Pointer to scheduler instance
- * @entity	The pointer to a valid scheduler entity
+ * @sched: Pointer to scheduler instance
+ * @entity: The pointer to a valid scheduler entity
  *
  * return true if entity is initialized, false otherwise
 */
@@ -164,11 +210,11 @@ static bool drm_sched_entity_is_initialized(struct drm_gpu_scheduler *sched,
 }
 
 /**
- * Check if entity is idle
+ * drm_sched_entity_is_idle - Check if entity is idle
  *
- * @entity	The pointer to a valid scheduler entity
+ * @entity: scheduler entity
  *
- * Return true if entity don't has any unscheduled jobs.
+ * Returns true if the entity does not have any unscheduled jobs.
  */
 static bool drm_sched_entity_is_idle(struct drm_sched_entity *entity)
 {
@@ -180,9 +226,9 @@ static bool drm_sched_entity_is_idle(struct drm_sched_entity *entity)
 }
 
 /**
- * Check if entity is ready
+ * drm_sched_entity_is_ready - Check if entity is ready
  *
- * @entity	The pointer to a valid scheduler entity
+ * @entity: scheduler entity
  *
  * Return true if entity could provide a job.
  */
@@ -210,12 +256,12 @@ static void drm_sched_entity_kill_jobs_cb(struct dma_fence *f,
 
 
 /**
- * Destroy a context entity
+ * drm_sched_entity_do_release - Destroy a context entity
  *
- * @sched       Pointer to scheduler instance
- * @entity	The pointer to a valid scheduler entity
+ * @sched: scheduler instance
+ * @entity: scheduler entity
  *
- * Splitting drm_sched_entity_fini() into two functions, The first one is does the waiting,
+ * Splitting drm_sched_entity_fini() into two functions. The first one does the waiting,
  * removes the entity from the runqueue and returns an error when the process was killed.
  */
 void drm_sched_entity_do_release(struct drm_gpu_scheduler *sched,
@@ -237,12 +283,13 @@ void drm_sched_entity_do_release(struct drm_gpu_scheduler *sched,
 EXPORT_SYMBOL(drm_sched_entity_do_release);
 
 /**
- * Destroy a context entity
+ * drm_sched_entity_cleanup - Destroy a context entity
  *
- * @sched       Pointer to scheduler instance
- * @entity	The pointer to a valid scheduler entity
+ * @sched: scheduler instance
+ * @entity: scheduler entity
  *
- * The second one then goes over the entity and signals all jobs with an error code.
+ * This should be called after @drm_sched_entity_do_release. It goes over the
+ * entity and signals all jobs with an error code if the process was killed.
  */
 void drm_sched_entity_cleanup(struct drm_gpu_scheduler *sched,
 			   struct drm_sched_entity *entity)
@@ -281,6 +328,14 @@ void drm_sched_entity_cleanup(struct drm_gpu_scheduler *sched,
 }
 EXPORT_SYMBOL(drm_sched_entity_cleanup);
 
+/**
+ * drm_sched_entity_fini - Destroy a context entity
+ *
+ * @sched: scheduler instance
+ * @entity: scheduler entity
+ *
+ * Calls drm_sched_entity_do_release() and drm_sched_entity_cleanup()
+ */
 void drm_sched_entity_fini(struct drm_gpu_scheduler *sched,
 				struct drm_sched_entity *entity)
 {
@@ -306,6 +361,15 @@ static void drm_sched_entity_clear_dep(struct dma_fence *f, struct dma_fence_cb
 	dma_fence_put(f);
 }
 
+/**
+ * drm_sched_entity_set_rq - Sets the run queue for an entity
+ *
+ * @entity: scheduler entity
+ * @rq: scheduler run queue
+ *
+ * Sets the run queue for an entity and removes the entity from the previous
+ * run queue in which it was present.
+ */
 void drm_sched_entity_set_rq(struct drm_sched_entity *entity,
 			     struct drm_sched_rq *rq)
 {
@@ -325,6 +389,14 @@ void drm_sched_entity_set_rq(struct drm_sched_entity *entity,
 }
 EXPORT_SYMBOL(drm_sched_entity_set_rq);
 
+/**
+ * drm_sched_dependency_optimized
+ *
+ * @fence: the dependency fence
+ * @entity: the entity which depends on the above fence
+ *
+ * Returns true if the dependency can be optimized and false otherwise
+ */
 bool drm_sched_dependency_optimized(struct dma_fence* fence,
 				    struct drm_sched_entity *entity)
 {
@@ -413,9 +485,10 @@ drm_sched_entity_pop_job(struct drm_sched_entity *entity)
 }
 
 /**
- * Submit a job to the job queue
+ * drm_sched_entity_push_job - Submit a job to the entity's job queue
  *
- * @sched_job		The pointer to job required to submit
+ * @sched_job: job to submit
+ * @entity: scheduler entity
  *
  * Note: To guarantee that the order of insertion to queue matches
  * the job's fence sequence number this function should be
@@ -506,6 +579,13 @@ static void drm_sched_job_timedout(struct work_struct *work)
 	job->sched->ops->timedout_job(job);
 }
 
+/**
+ * drm_sched_hw_job_reset - stop the scheduler if it contains the bad job
+ *
+ * @sched: scheduler instance
+ * @bad: bad scheduler job
+ *
+ */
 void drm_sched_hw_job_reset(struct drm_gpu_scheduler *sched, struct drm_sched_job *bad)
 {
 	struct drm_sched_job *s_job;
@@ -550,6 +630,12 @@ void drm_sched_hw_job_reset(struct drm_gpu_scheduler *sched, struct drm_sched_jo
 }
 EXPORT_SYMBOL(drm_sched_hw_job_reset);
 
+/**
+ * drm_sched_job_recovery - recover jobs after a reset
+ *
+ * @sched: scheduler instance
+ *
+ */
 void drm_sched_job_recovery(struct drm_gpu_scheduler *sched)
 {
 	struct drm_sched_job *s_job, *tmp;
@@ -599,10 +685,17 @@ void drm_sched_job_recovery(struct drm_gpu_scheduler *sched)
 EXPORT_SYMBOL(drm_sched_job_recovery);
 
 /**
- * Init a sched_job with basic field
+ * drm_sched_job_init - init a scheduler job
  *
- * Note: Refer to drm_sched_entity_push_job documentation
+ * @job: scheduler job to init
+ * @sched: scheduler instance
+ * @entity: scheduler entity to use
+ * @owner: job owner for debugging
+ *
+ * Refer to drm_sched_entity_push_job() documentation
  * for locking considerations.
+ *
+ * Returns 0 for success, negative error code otherwise.
  */
 int drm_sched_job_init(struct drm_sched_job *job,
 		       struct drm_gpu_scheduler *sched,
@@ -626,7 +719,11 @@ int drm_sched_job_init(struct drm_sched_job *job,
 EXPORT_SYMBOL(drm_sched_job_init);
 
 /**
- * Return ture if we can push more jobs to the hw.
+ * drm_sched_ready - is the scheduler ready
+ *
+ * @sched: scheduler instance
+ *
+ * Return true if we can push more jobs to the hw, otherwise false.
  */
 static bool drm_sched_ready(struct drm_gpu_scheduler *sched)
 {
@@ -635,7 +732,10 @@ static bool drm_sched_ready(struct drm_gpu_scheduler *sched)
 }
 
 /**
- * Wake up the scheduler when it is ready
+ * drm_sched_wakeup - Wake up the scheduler when it is ready
+ *
+ * @sched: scheduler instance
+ *
  */
 static void drm_sched_wakeup(struct drm_gpu_scheduler *sched)
 {
@@ -644,8 +744,12 @@ static void drm_sched_wakeup(struct drm_gpu_scheduler *sched)
 }
 
 /**
- * Select next entity to process
-*/
+ * drm_sched_select_entity - Select next entity to process
+ *
+ * @sched: scheduler instance
+ *
+ * Returns the entity to process or NULL if none are found.
+ */
 static struct drm_sched_entity *
 drm_sched_select_entity(struct drm_gpu_scheduler *sched)
 {
@@ -665,6 +769,14 @@ drm_sched_select_entity(struct drm_gpu_scheduler *sched)
 	return entity;
 }
 
+/**
+ * drm_sched_process_job - process a job
+ *
+ * @f: fence
+ * @cb: fence callbacks
+ *
+ * Called after job has finished execution.
+ */
 static void drm_sched_process_job(struct dma_fence *f, struct dma_fence_cb *cb)
 {
 	struct drm_sched_fence *s_fence =
@@ -680,6 +792,13 @@ static void drm_sched_process_job(struct dma_fence *f, struct dma_fence_cb *cb)
 	wake_up_interruptible(&sched->wake_up_worker);
 }
 
+/**
+ * drm_sched_blocked - check if the scheduler is blocked
+ *
+ * @sched: scheduler instance
+ *
+ * Returns true if blocked, otherwise false.
+ */
 static bool drm_sched_blocked(struct drm_gpu_scheduler *sched)
 {
 	if (kthread_should_park()) {
@@ -690,6 +809,13 @@ static bool drm_sched_blocked(struct drm_gpu_scheduler *sched)
 	return false;
 }
 
+/**
+ * drm_sched_main - main scheduler thread
+ *
+ * @param: scheduler instance
+ *
+ * Returns 0.
+ */
 static int drm_sched_main(void *param)
 {
 	struct sched_param sparam = {.sched_priority = 1};
@@ -744,15 +870,17 @@ static int drm_sched_main(void *param)
 }
 
 /**
- * Init a gpu scheduler instance
+ * drm_sched_init - Init a gpu scheduler instance
  *
- * @sched		The pointer to the scheduler
- * @ops			The backend operations for this scheduler.
- * @hw_submissions	Number of hw submissions to do.
- * @name		Name used for debugging
+ * @sched: scheduler instance
+ * @ops: backend operations for this scheduler
+ * @hw_submission: number of hw submissions that can be in flight
+ * @hang_limit: number of times to allow a job to hang before dropping it
+ * @timeout: timeout value in jiffies for the scheduler
+ * @name: name used for debugging
  *
  * Return 0 on success, otherwise error code.
-*/
+ */
 int drm_sched_init(struct drm_gpu_scheduler *sched,
 		   const struct drm_sched_backend_ops *ops,
 		   unsigned hw_submission,
@@ -788,9 +916,11 @@ int drm_sched_init(struct drm_gpu_scheduler *sched,
 EXPORT_SYMBOL(drm_sched_init);
 
 /**
- * Destroy a gpu scheduler
+ * drm_sched_fini - Destroy a gpu scheduler
+ *
+ * @sched: scheduler instance
  *
- * @sched	The pointer to the scheduler
+ * Tears down and cleans up the scheduler.
  */
 void drm_sched_fini(struct drm_gpu_scheduler *sched)
 {
diff --git a/include/drm/gpu_scheduler.h b/include/drm/gpu_scheduler.h
index dec655894d08..496442f12bff 100644
--- a/include/drm/gpu_scheduler.h
+++ b/include/drm/gpu_scheduler.h
@@ -43,13 +43,33 @@ enum drm_sched_priority {
 };
 
 /**
- * drm_sched_entity - A wrapper around a job queue (typically attached
- * to the DRM file_priv).
+ * struct drm_sched_entity - A wrapper around a job queue (typically
+ * attached to the DRM file_priv).
+ *
+ * @list: used to append this struct to the list of entities in the
+ *        runqueue.
+ * @rq: runqueue to which this entity belongs.
+ * @rq_lock: lock to modify the runqueue to which this entity belongs.
+ * @sched: the scheduler instance to which this entity is enqueued.
+ * @job_queue: the list of jobs of this entity.
+ * @fence_seq: a linearly increasing seqno incremented with each
+ *             new &drm_sched_fence which is part of the entity.
+ * @fence_context: a unique context for all the fences which belong
+ *                 to this entity.
+ *                 The &drm_sched_fence.scheduled uses the
+ *                 fence_context but &drm_sched_fence.finished uses
+ *                 fence_context + 1.
+ * @dependency: the dependency fence of the job which is on the top
+ *              of the job queue.
+ * @cb: callback for the dependency fence above.
+ * @guilty: points to ctx's guilty.
+ * @fini_status: contains the exit status in case the process was signalled.
+ * @last_scheduled: points to the finished fence of the last scheduled job.
  *
  * Entities will emit jobs in order to their corresponding hardware
  * ring, and the scheduler will alternate between entities based on
  * scheduling policy.
-*/
+ */
 struct drm_sched_entity {
 	struct list_head		list;
 	struct drm_sched_rq		*rq;
@@ -63,47 +83,96 @@ struct drm_sched_entity {
 
 	struct dma_fence		*dependency;
 	struct dma_fence_cb		cb;
-	atomic_t			*guilty; /* points to ctx's guilty */
-	int            fini_status;
-	struct dma_fence    *last_scheduled;
+	atomic_t			*guilty;
+	int                             fini_status;
+	struct dma_fence                *last_scheduled;
 };
 
 /**
+ * struct drm_sched_rq - queue of entities to be scheduled.
+ *
+ * @lock: to modify the entities list.
+ * @entities: list of the entities to be scheduled.
+ * @current_entity: the entity which is to be scheduled.
+ *
  * Run queue is a set of entities scheduling command submissions for
  * one specific ring. It implements the scheduling policy that selects
  * the next entity to emit commands from.
-*/
+ */
 struct drm_sched_rq {
 	spinlock_t			lock;
 	struct list_head		entities;
 	struct drm_sched_entity		*current_entity;
 };
 
+/**
+ * struct drm_sched_fence - fences corresponding to the scheduling of a job.
+ */
 struct drm_sched_fence {
+        /**
+         * @scheduled: this fence is what will be signaled by the scheduler
+         * when the job is scheduled.
+         */
 	struct dma_fence		scheduled;
 
-	/* This fence is what will be signaled by the scheduler when
-	 * the job is completed.
-	 *
-	 * When setting up an out fence for the job, you should use
-	 * this, since it's available immediately upon
-	 * drm_sched_job_init(), and the fence returned by the driver
-	 * from run_job() won't be created until the dependencies have
-	 * resolved.
-	 */
+        /**
+         * @finished: this fence is what will be signaled by the scheduler
+         * when the job is completed.
+         *
+         * When setting up an out fence for the job, you should use
+         * this, since it's available immediately upon
+         * drm_sched_job_init(), and the fence returned by the driver
+         * from run_job() won't be created until the dependencies have
+         * resolved.
+         */
 	struct dma_fence		finished;
 
+        /**
+         * @cb: the callback for the parent fence below.
+         */
 	struct dma_fence_cb		cb;
+        /**
+         * @parent: the fence returned by &drm_sched_backend_ops.run_job
+         * when scheduling the job on hardware. We signal the
+         * &drm_sched_fence.finished fence once parent is signalled.
+         */
 	struct dma_fence		*parent;
+        /**
+         * @sched: the scheduler instance to which the job having this struct
+         * belongs.
+         */
 	struct drm_gpu_scheduler	*sched;
+        /**
+         * @lock: the lock used by the scheduled and the finished fences.
+         */
 	spinlock_t			lock;
+        /**
+         * @owner: job owner for debugging
+         */
 	void				*owner;
 };
 
 struct drm_sched_fence *to_drm_sched_fence(struct dma_fence *f);
 
 /**
- * drm_sched_job - A job to be run by an entity.
+ * struct drm_sched_job - A job to be run by an entity.
+ *
+ * @queue_node: used to append this struct to the queue of jobs in an entity.
+ * @sched: the scheduler instance on which this job is scheduled.
+ * @s_fence: contains the fences for the scheduling of job.
+ * @finish_cb: the callback for the finished fence.
+ * @finish_work: schedules the function @drm_sched_job_finish once the job has
+ *               finished to remove the job from the
+ *               @drm_gpu_scheduler.ring_mirror_list.
+ * @node: used to append this struct to the @drm_gpu_scheduler.ring_mirror_list.
+ * @work_tdr: schedules a delayed call to @drm_sched_job_timedout after the timeout
+ *            interval is over.
+ * @id: a unique id assigned to each job scheduled on the scheduler.
+ * @karma: increment on every hang caused by this job. If this exceeds the hang
+ *         limit of the scheduler then the job is marked guilty and will not
+ *         be scheduled further.
+ * @s_priority: the priority of the job.
+ * @entity: the entity to which this job belongs.
  *
  * A job is created by the driver using drm_sched_job_init(), and
  * should call drm_sched_entity_push_job() once it wants the scheduler
@@ -130,38 +199,64 @@ static inline bool drm_sched_invalidate_job(struct drm_sched_job *s_job,
 }
 
 /**
+ * struct drm_sched_backend_ops
+ *
  * Define the backend operations called by the scheduler,
- * these functions should be implemented in driver side
-*/
+ * these functions should be implemented in driver side.
+ */
 struct drm_sched_backend_ops {
-	/* Called when the scheduler is considering scheduling this
-	 * job next, to get another struct dma_fence for this job to
+	/**
+         * @dependency: Called when the scheduler is considering scheduling
+         * this job next, to get another struct dma_fence for this job to
 	 * block on.  Once it returns NULL, run_job() may be called.
 	 */
 	struct dma_fence *(*dependency)(struct drm_sched_job *sched_job,
 					struct drm_sched_entity *s_entity);
 
-	/* Called to execute the job once all of the dependencies have
-	 * been resolved.  This may be called multiple times, if
+	/**
+         * @run_job: Called to execute the job once all of the dependencies
+         * have been resolved.  This may be called multiple times, if
 	 * timedout_job() has happened and drm_sched_job_recovery()
 	 * decides to try it again.
 	 */
 	struct dma_fence *(*run_job)(struct drm_sched_job *sched_job);
 
-	/* Called when a job has taken too long to execute, to trigger
-	 * GPU recovery.
+	/**
+         * @timedout_job: Called when a job has taken too long to execute,
+         * to trigger GPU recovery.
 	 */
 	void (*timedout_job)(struct drm_sched_job *sched_job);
 
-	/* Called once the job's finished fence has been signaled and
-	 * it's time to clean it up.
+	/**
+         * @free_job: Called once the job's finished fence has been signaled
+         * and it's time to clean it up.
 	 */
 	void (*free_job)(struct drm_sched_job *sched_job);
 };
 
 /**
- * One scheduler is implemented for each hardware ring
-*/
+ * struct drm_gpu_scheduler
+ *
+ * @ops: backend operations provided by the driver.
+ * @hw_submission_limit: the max size of the hardware queue.
+ * @timeout: the time after which a job is removed from the scheduler.
+ * @name: name of the ring for which this scheduler is being used.
+ * @sched_rq: priority wise array of run queues.
+ * @wake_up_worker: the wait queue on which the scheduler sleeps until a job
+ *                  is ready to be scheduled.
+ * @job_scheduled: once @drm_sched_entity_do_release is called the scheduler
+ *                 waits on this wait queue until all the scheduled jobs are
+ *                 finished.
+ * @hw_rq_count: the number of jobs currently in the hardware queue.
+ * @job_id_count: used to assign a unique id to each job.
+ * @thread: the kthread on which the scheduler will run.
+ * @ring_mirror_list: the list of jobs which are currently in the job queue.
+ * @job_list_lock: lock to protect the ring_mirror_list.
+ * @hang_limit: once the hangs caused by a job cross this limit then it is
+ *              marked guilty and will not be considered for scheduling further.
+ *
+ * One scheduler is implemented for each hardware ring.
+ */
 struct drm_gpu_scheduler {
 	const struct drm_sched_backend_ops	*ops;
 	uint32_t			hw_submission_limit;
-- 
2.14.3


* [PATCH 3/3] drm/doc: add a chapter for gpu scheduler
  2018-05-25  4:45 [PATCH 1/3] drm/scheduler: fix a corner case in dependency optimization Nayan Deshmukh
  2018-05-25  4:45 ` [PATCH 2/3] drm/scheduler: add documentation Nayan Deshmukh
@ 2018-05-25  4:45 ` Nayan Deshmukh
  2018-05-25 14:54   ` Alex Deucher
  2018-05-25 12:01 ` [PATCH 1/3] drm/scheduler: fix a corner case in dependency optimization Christian König
  2 siblings, 1 reply; 13+ messages in thread
From: Nayan Deshmukh @ 2018-05-25  4:45 UTC (permalink / raw)
  To: dri-devel; +Cc: Nayan Deshmukh, christian.koenig

Signed-off-by: Nayan Deshmukh <nayan26deshmukh@gmail.com>
---
 Documentation/gpu/drm-mm.rst | 18 ++++++++++++++++++
 1 file changed, 18 insertions(+)

diff --git a/Documentation/gpu/drm-mm.rst b/Documentation/gpu/drm-mm.rst
index b08e9dcd9177..96ebcc2a7b41 100644
--- a/Documentation/gpu/drm-mm.rst
+++ b/Documentation/gpu/drm-mm.rst
@@ -496,3 +496,21 @@ DRM Sync Objects
 
 .. kernel-doc:: drivers/gpu/drm/drm_syncobj.c
    :export:
+
+GPU Scheduler
+=============
+
+Overview
+--------
+
+.. kernel-doc:: drivers/gpu/drm/scheduler/gpu_scheduler.c
+   :doc: Overview
+
+Scheduler Function References
+-----------------------------
+
+.. kernel-doc:: include/drm/gpu_scheduler.h
+   :internal:
+
+.. kernel-doc:: drivers/gpu/drm/scheduler/gpu_scheduler.c
+   :export:
-- 
2.14.3


* Re: [PATCH 1/3] drm/scheduler: fix a corner case in dependency optimization
  2018-05-25  4:45 [PATCH 1/3] drm/scheduler: fix a corner case in dependency optimization Nayan Deshmukh
  2018-05-25  4:45 ` [PATCH 2/3] drm/scheduler: add documentation Nayan Deshmukh
  2018-05-25  4:45 ` [PATCH 3/3] drm/doc: add a chapter for gpu scheduler Nayan Deshmukh
@ 2018-05-25 12:01 ` Christian König
  2 siblings, 0 replies; 13+ messages in thread
From: Christian König @ 2018-05-25 12:01 UTC (permalink / raw)
  To: Nayan Deshmukh, dri-devel

On 25.05.2018 06:45, Nayan Deshmukh wrote:
> When checking whether a dependency fence belongs to the same entity,
> compare it with the scheduled as well as the finished fence. Earlier we
> were only comparing it with the scheduled fence.
>
> Signed-off-by: Nayan Deshmukh <nayan26deshmukh@gmail.com>

Reviewed and pushed this patch into our internal repository.

Now going to take a look at the rest,
Christian.


* Re: [PATCH 2/3] drm/scheduler: add documentation
  2018-05-25  4:45 ` [PATCH 2/3] drm/scheduler: add documentation Nayan Deshmukh
@ 2018-05-25 12:06   ` Christian König
  2018-05-25 14:54   ` Alex Deucher
  2018-05-28  8:09   ` Daniel Vetter
  2 siblings, 0 replies; 13+ messages in thread
From: Christian König @ 2018-05-25 12:06 UTC (permalink / raw)
  To: Nayan Deshmukh, dri-devel; +Cc: Alex Deucher

On 25.05.2018 06:45, Nayan Deshmukh wrote:
> Convert the existing raw comments into kernel-doc format and add new
> documentation.
>
> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
> Signed-off-by: Nayan Deshmukh <nayan26deshmukh@gmail.com>
> ---
>   drivers/gpu/drm/scheduler/gpu_scheduler.c | 214 ++++++++++++++++++++++++------
>   include/drm/gpu_scheduler.h               | 153 +++++++++++++++++----
>   2 files changed, 296 insertions(+), 71 deletions(-)
>
> diff --git a/drivers/gpu/drm/scheduler/gpu_scheduler.c b/drivers/gpu/drm/scheduler/gpu_scheduler.c
> index 44d480768dfe..c70c983e3e74 100644
> --- a/drivers/gpu/drm/scheduler/gpu_scheduler.c
> +++ b/drivers/gpu/drm/scheduler/gpu_scheduler.c
> @@ -21,6 +21,29 @@
>    *
>    */
>   
> +/**
> + * DOC: Overview
> + *
> + * The GPU scheduler provides entities which allow userspace to push jobs
> + * into software queues which are then scheduled on a hardware run queue.
> + * The software queues have a priority among them. The scheduler selects the entities
> + * from the run queue using FIFO. The scheduler provides dependency handling
> + * features among jobs. The driver is supposed to provide functions for backend
> + * operations to the scheduler like submitting a job to hardware run queue,
> + * returning the dependency of a job etc.
s/dependency/dependencies/

Alex, Michel or others: The description seems right, but it sounds like
we could still improve the wording a bit.

Since I'm not a native speaker of English either and honestly not so
good at it, do you guys have any suggestions on how to better write that?

Apart from that looks really good to me,
Christian.


* Re: [PATCH 2/3] drm/scheduler: add documentation
  2018-05-25  4:45 ` [PATCH 2/3] drm/scheduler: add documentation Nayan Deshmukh
  2018-05-25 12:06   ` Christian König
@ 2018-05-25 14:54   ` Alex Deucher
  2018-05-28  7:09     ` Nayan Deshmukh
  2018-05-28  8:09   ` Daniel Vetter
  2 siblings, 1 reply; 13+ messages in thread
From: Alex Deucher @ 2018-05-25 14:54 UTC (permalink / raw)
  To: Nayan Deshmukh
  Cc: Alex Deucher, Christian Koenig, Maling list - DRI developers

On Fri, May 25, 2018 at 12:45 AM, Nayan Deshmukh
<nayan26deshmukh@gmail.com> wrote:
> Convert the existing raw comments into kernel-doc format and add new
> documentation.
>
> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
> Signed-off-by: Nayan Deshmukh <nayan26deshmukh@gmail.com>

Looks good.  Just a couple of comments below to clarify the language.
With those fixed:
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>

> ---
>  drivers/gpu/drm/scheduler/gpu_scheduler.c | 214 ++++++++++++++++++++++++------
>  include/drm/gpu_scheduler.h               | 153 +++++++++++++++++----
>  2 files changed, 296 insertions(+), 71 deletions(-)
>
> diff --git a/drivers/gpu/drm/scheduler/gpu_scheduler.c b/drivers/gpu/drm/scheduler/gpu_scheduler.c
> index 44d480768dfe..c70c983e3e74 100644
> --- a/drivers/gpu/drm/scheduler/gpu_scheduler.c
> +++ b/drivers/gpu/drm/scheduler/gpu_scheduler.c
> @@ -21,6 +21,29 @@
>   *
>   */
>
> +/**
> + * DOC: Overview
> + *
> + * The GPU scheduler provides entities which allow userspace to push jobs
> + * into software queues which are then scheduled on a hardware run queue.
> + * The software queues have a priority among them. The scheduler selects the entities
> + * from the run queue using FIFO. The scheduler provides dependency handling

a FIFO

> + * features among jobs. The driver is supposed to provide functions for backend

The driver provides callback functions for backend
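
Something along those lines, i.e. the driver fills in a set of callbacks
(a minimal sketch for a hypothetical foo driver; to_foo_job() and
foo_hw_submit() are made up for the example):

    static struct dma_fence *foo_dependency(struct drm_sched_job *sched_job,
                                            struct drm_sched_entity *s_entity)
    {
            /* Next fence the job still has to wait on, NULL when ready. */
            return NULL;
    }

    static struct dma_fence *foo_run_job(struct drm_sched_job *sched_job)
    {
            /* Push the job to the hardware run queue, return its hw fence. */
            return foo_hw_submit(to_foo_job(sched_job));
    }

    static void foo_timedout_job(struct drm_sched_job *sched_job)
    {
            /* Kick off GPU recovery. */
    }

    static void foo_free_job(struct drm_sched_job *sched_job)
    {
            kfree(to_foo_job(sched_job));
    }

    static const struct drm_sched_backend_ops foo_sched_ops = {
            .dependency   = foo_dependency,
            .run_job      = foo_run_job,
            .timedout_job = foo_timedout_job,
            .free_job     = foo_free_job,
    };

Real drivers (amdgpu, etc.) of course do a lot more in each hook.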

> + * operations to the scheduler like submitting a job to hardware run queue,
> + * returning the dependency of a job etc.
> + *
> + * The organisation of the scheduler is the following:-

drop the -

> + *
> + * 1. Each ring buffer has one scheduler

Each hw run queue

> + * 2. Each scheduler has multiple run queues with different priorities
> + *    (i.e. HIGH_HW, HIGH_SW, KERNEL, NORMAL)

s/i.e./e.g.,/

> + * 3. Each run queue has a queue of entities to schedule

Each scheduler run queue has a queue of entities to schedule

> + * 4. Entities themselves maintain a queue of jobs that will be scheduled on
> + *    the hardware.
> + *
> + * The jobs in an entity are always scheduled in the order that they were pushed.
> + */
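
As a concrete illustration of points 1-4 above, a minimal driver-side flow
might look like the sketch below. It is only a sketch based on the function
signatures documented in this patch; my_ring, my_sched_ops, the queue size
and the timeout are hypothetical placeholders.

struct my_ring {
	struct drm_gpu_scheduler sched;	/* 1. one scheduler per hw ring */
};

static int my_ring_init(struct my_ring *ring, struct drm_sched_entity *entity)
{
	int r;

	/* hw queue bounded to 16 in-flight jobs, 1s timeout (hypothetical) */
	r = drm_sched_init(&ring->sched, &my_sched_ops, 16, 0,
			   msecs_to_jiffies(1000), "my-ring");
	if (r)
		return r;

	/* 2./3. attach the entity to one of the priority run queues */
	return drm_sched_entity_init(&ring->sched, entity,
				     &ring->sched.sched_rq[DRM_SCHED_PRIORITY_NORMAL],
				     NULL);
}

static int my_submit(struct my_ring *ring, struct drm_sched_entity *entity,
		     struct drm_sched_job *job)
{
	/* 4. jobs are pushed to the entity and run in push order; init and
	 * push should happen under a common lock, see the
	 * drm_sched_entity_push_job() documentation further down.
	 */
	int r = drm_sched_job_init(job, &ring->sched, entity, NULL);

	if (r)
		return r;
	drm_sched_entity_push_job(job, entity);
	return 0;
}
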
> +
>  #include <linux/kthread.h>
>  #include <linux/wait.h>
>  #include <linux/sched.h>
> @@ -39,7 +62,13 @@ static bool drm_sched_entity_is_ready(struct drm_sched_entity *entity);
>  static void drm_sched_wakeup(struct drm_gpu_scheduler *sched);
>  static void drm_sched_process_job(struct dma_fence *f, struct dma_fence_cb *cb);
>
> -/* Initialize a given run queue struct */
> +/**
> + * drm_sched_rq_init - initialize a given run queue struct
> + *
> + * @rq: scheduler run queue
> + *
> + * Initializes a scheduler runqueue.
> + */
>  static void drm_sched_rq_init(struct drm_sched_rq *rq)
>  {
>         spin_lock_init(&rq->lock);
> @@ -47,6 +76,14 @@ static void drm_sched_rq_init(struct drm_sched_rq *rq)
>         rq->current_entity = NULL;
>  }
>
> +/**
> + * drm_sched_rq_add_entity - add an entity
> + *
> + * @rq: scheduler run queue
> + * @entity: scheduler entity
> + *
> + * Adds a scheduler entity to the run queue.
> + */
>  static void drm_sched_rq_add_entity(struct drm_sched_rq *rq,
>                                     struct drm_sched_entity *entity)
>  {
> @@ -57,6 +94,14 @@ static void drm_sched_rq_add_entity(struct drm_sched_rq *rq,
>         spin_unlock(&rq->lock);
>  }
>
> +/**
> + * drm_sched_rq_remove_entity - remove an entity
> + *
> + * @rq: scheduler run queue
> + * @entity: scheduler entity
> + *
> + * Removes a scheduler entity from the run queue.
> + */
>  static void drm_sched_rq_remove_entity(struct drm_sched_rq *rq,
>                                        struct drm_sched_entity *entity)
>  {
> @@ -70,9 +115,9 @@ static void drm_sched_rq_remove_entity(struct drm_sched_rq *rq,
>  }
>
>  /**
> - * Select an entity which could provide a job to run
> + * drm_sched_rq_select_entity - Select an entity which could provide a job to run
>   *
> - * @rq         The run queue to check.
> + * @rq: scheduler run queue to check.
>   *
>   * Try to find a ready entity, returns NULL if none found.
>   */
> @@ -112,15 +157,16 @@ drm_sched_rq_select_entity(struct drm_sched_rq *rq)
>  }
>
>  /**
> - * Init a context entity used by scheduler when submit to HW ring.
> + * drm_sched_entity_init - Init a context entity used by scheduler when
> + * submit to HW ring.
>   *
> - * @sched      The pointer to the scheduler
> - * @entity     The pointer to a valid drm_sched_entity
> - * @rq         The run queue this entity belongs
> - * @guilty      atomic_t set to 1 when a job on this queue
> - *              is found to be guilty causing a timeout
> + * @sched: scheduler instance
> + * @entity: scheduler entity to init
> + * @rq: the run queue this entity belongs
> + * @guilty: atomic_t set to 1 when a job on this queue
> + *          is found to be guilty causing a timeout
>   *
> - * return 0 if succeed. negative error code on failure
> + * Returns 0 on success or a negative error code on failure.
>  */
>  int drm_sched_entity_init(struct drm_gpu_scheduler *sched,
>                           struct drm_sched_entity *entity,
> @@ -149,10 +195,10 @@ int drm_sched_entity_init(struct drm_gpu_scheduler *sched,
>  EXPORT_SYMBOL(drm_sched_entity_init);
>
>  /**
> - * Query if entity is initialized
> + * drm_sched_entity_is_initialized - Query if entity is initialized
>   *
> - * @sched       Pointer to scheduler instance
> - * @entity     The pointer to a valid scheduler entity
> + * @sched: Pointer to scheduler instance
> + * @entity: The pointer to a valid scheduler entity
>   *
>   * return true if entity is initialized, false otherwise
>  */
> @@ -164,11 +210,11 @@ static bool drm_sched_entity_is_initialized(struct drm_gpu_scheduler *sched,
>  }
>
>  /**
> - * Check if entity is idle
> + * drm_sched_entity_is_idle - Check if entity is idle
>   *
> - * @entity     The pointer to a valid scheduler entity
> + * @entity: scheduler entity
>   *
> - * Return true if entity don't has any unscheduled jobs.
> + * Returns true if the entity does not have any unscheduled jobs.
>   */
>  static bool drm_sched_entity_is_idle(struct drm_sched_entity *entity)
>  {
> @@ -180,9 +226,9 @@ static bool drm_sched_entity_is_idle(struct drm_sched_entity *entity)
>  }
>
>  /**
> - * Check if entity is ready
> + * drm_sched_entity_is_ready - Check if entity is ready
>   *
> - * @entity     The pointer to a valid scheduler entity
> + * @entity: scheduler entity
>   *
>   * Return true if entity could provide a job.
>   */
> @@ -210,12 +256,12 @@ static void drm_sched_entity_kill_jobs_cb(struct dma_fence *f,
>
>
>  /**
> - * Destroy a context entity
> + * drm_sched_entity_do_release - Destroy a context entity
>   *
> - * @sched       Pointer to scheduler instance
> - * @entity     The pointer to a valid scheduler entity
> + * @sched: scheduler instance
> + * @entity: scheduler entity
>   *
> - * Splitting drm_sched_entity_fini() into two functions, The first one is does the waiting,
> + * Splitting drm_sched_entity_fini() into two functions. The first one does the waiting,
>   * removes the entity from the runqueue and returns an error when the process was killed.
>   */
>  void drm_sched_entity_do_release(struct drm_gpu_scheduler *sched,
> @@ -237,12 +283,13 @@ void drm_sched_entity_do_release(struct drm_gpu_scheduler *sched,
>  EXPORT_SYMBOL(drm_sched_entity_do_release);
>
>  /**
> - * Destroy a context entity
> + * drm_sched_entity_cleanup - Destroy a context entity
>   *
> - * @sched       Pointer to scheduler instance
> - * @entity     The pointer to a valid scheduler entity
> + * @sched: scheduler instance
> + * @entity: scheduler entity
>   *
> - * The second one then goes over the entity and signals all jobs with an error code.
> + * This should be called after @drm_sched_entity_do_release. It goes over the
> + * entity and signals all jobs with an error code if the process was killed.
>   */
>  void drm_sched_entity_cleanup(struct drm_gpu_scheduler *sched,
>                            struct drm_sched_entity *entity)
> @@ -281,6 +328,14 @@ void drm_sched_entity_cleanup(struct drm_gpu_scheduler *sched,
>  }
>  EXPORT_SYMBOL(drm_sched_entity_cleanup);
>
> +/**
> + * drm_sched_entity_fini - Destroy a context entity
> + *
> + * @sched: scheduler instance
> + * @entity: scheduler entity
> + *
> + * Calls drm_sched_entity_do_release() and drm_sched_entity_cleanup()
> + */
>  void drm_sched_entity_fini(struct drm_gpu_scheduler *sched,
>                                 struct drm_sched_entity *entity)
>  {
> @@ -306,6 +361,15 @@ static void drm_sched_entity_clear_dep(struct dma_fence *f, struct dma_fence_cb
>         dma_fence_put(f);
>  }
>
> +/**
> + * drm_sched_entity_set_rq - Sets the run queue for an entity
> + *
> + * @entity: scheduler entity
> + * @rq: scheduler run queue
> + *
> + * Sets the run queue for an entity and removes the entity from the previous
> + * run queue in which it was present.
> + */
>  void drm_sched_entity_set_rq(struct drm_sched_entity *entity,
>                              struct drm_sched_rq *rq)
>  {
> @@ -325,6 +389,14 @@ void drm_sched_entity_set_rq(struct drm_sched_entity *entity,
>  }
>  EXPORT_SYMBOL(drm_sched_entity_set_rq);
>
> +/**
> + * drm_sched_dependency_optimized
> + *
> + * @fence: the dependency fence
> + * @entity: the entity which depends on the above fence
> + *
> + * Returns true if the dependency can be optimized and false otherwise
> + */
>  bool drm_sched_dependency_optimized(struct dma_fence* fence,
>                                     struct drm_sched_entity *entity)
>  {
> @@ -413,9 +485,10 @@ drm_sched_entity_pop_job(struct drm_sched_entity *entity)
>  }
>
>  /**
> - * Submit a job to the job queue
> + * drm_sched_entity_push_job - Submit a job to the entity's job queue
>   *
> - * @sched_job          The pointer to job required to submit
> + * @sched_job: job to submit
> + * @entity: scheduler entity
>   *
>   * Note: To guarantee that the order of insertion to queue matches
>   * the job's fence sequence number this function should be
> @@ -506,6 +579,13 @@ static void drm_sched_job_timedout(struct work_struct *work)
>         job->sched->ops->timedout_job(job);
>  }
>
> +/**
> + * drm_sched_hw_job_reset - stop the scheduler if it contains the bad job
> + *
> + * @sched: scheduler instance
> + * @bad: bad scheduler job
> + *
> + */
>  void drm_sched_hw_job_reset(struct drm_gpu_scheduler *sched, struct drm_sched_job *bad)
>  {
>         struct drm_sched_job *s_job;
> @@ -550,6 +630,12 @@ void drm_sched_hw_job_reset(struct drm_gpu_scheduler *sched, struct drm_sched_jo
>  }
>  EXPORT_SYMBOL(drm_sched_hw_job_reset);
>
> +/**
> + * drm_sched_job_recovery - recover jobs after a reset
> + *
> + * @sched: scheduler instance
> + *
> + */
>  void drm_sched_job_recovery(struct drm_gpu_scheduler *sched)
>  {
>         struct drm_sched_job *s_job, *tmp;
> @@ -599,10 +685,17 @@ void drm_sched_job_recovery(struct drm_gpu_scheduler *sched)
>  EXPORT_SYMBOL(drm_sched_job_recovery);
>
>  /**
> - * Init a sched_job with basic field
> + * drm_sched_job_init - init a scheduler job
>   *
> - * Note: Refer to drm_sched_entity_push_job documentation
> + * @job: scheduler job to init
> + * @sched: scheduler instance
> + * @entity: scheduler entity to use
> + * @owner: job owner for debugging
> + *
> + * Refer to drm_sched_entity_push_job() documentation
>   * for locking considerations.
> + *
> + * Returns 0 for success, negative error code otherwise.
>   */
>  int drm_sched_job_init(struct drm_sched_job *job,
>                        struct drm_gpu_scheduler *sched,
> @@ -626,7 +719,11 @@ int drm_sched_job_init(struct drm_sched_job *job,
>  EXPORT_SYMBOL(drm_sched_job_init);
>
>  /**
> - * Return ture if we can push more jobs to the hw.
> + * drm_sched_ready - is the scheduler ready
> + *
> + * @sched: scheduler instance
> + *
> + * Return true if we can push more jobs to the hw, otherwise false.
>   */
>  static bool drm_sched_ready(struct drm_gpu_scheduler *sched)
>  {
> @@ -635,7 +732,10 @@ static bool drm_sched_ready(struct drm_gpu_scheduler *sched)
>  }
>
>  /**
> - * Wake up the scheduler when it is ready
> + * drm_sched_wakeup - Wake up the scheduler when it is ready
> + *
> + * @sched: scheduler instance
> + *
>   */
>  static void drm_sched_wakeup(struct drm_gpu_scheduler *sched)
>  {
> @@ -644,8 +744,12 @@ static void drm_sched_wakeup(struct drm_gpu_scheduler *sched)
>  }
>
>  /**
> - * Select next entity to process
> -*/
> + * drm_sched_select_entity - Select next entity to process
> + *
> + * @sched: scheduler instance
> + *
> + * Returns the entity to process or NULL if none are found.
> + */
>  static struct drm_sched_entity *
>  drm_sched_select_entity(struct drm_gpu_scheduler *sched)
>  {
> @@ -665,6 +769,14 @@ drm_sched_select_entity(struct drm_gpu_scheduler *sched)
>         return entity;
>  }
>
> +/**
> + * drm_sched_process_job - process a job
> + *
> + * @f: fence
> + * @cb: fence callbacks
> + *
> + * Called after job has finished execution.
> + */
>  static void drm_sched_process_job(struct dma_fence *f, struct dma_fence_cb *cb)
>  {
>         struct drm_sched_fence *s_fence =
> @@ -680,6 +792,13 @@ static void drm_sched_process_job(struct dma_fence *f, struct dma_fence_cb *cb)
>         wake_up_interruptible(&sched->wake_up_worker);
>  }
>
> +/**
> + * drm_sched_blocked - check if the scheduler is blocked
> + *
> + * @sched: scheduler instance
> + *
> + * Returns true if blocked, otherwise false.
> + */
>  static bool drm_sched_blocked(struct drm_gpu_scheduler *sched)
>  {
>         if (kthread_should_park()) {
> @@ -690,6 +809,13 @@ static bool drm_sched_blocked(struct drm_gpu_scheduler *sched)
>         return false;
>  }
>
> +/**
> + * drm_sched_main - main scheduler thread
> + *
> + * @param: scheduler instance
> + *
> + * Returns 0.
> + */
>  static int drm_sched_main(void *param)
>  {
>         struct sched_param sparam = {.sched_priority = 1};
> @@ -744,15 +870,17 @@ static int drm_sched_main(void *param)
>  }
>
>  /**
> - * Init a gpu scheduler instance
> + * drm_sched_init - Init a gpu scheduler instance
>   *
> - * @sched              The pointer to the scheduler
> - * @ops                        The backend operations for this scheduler.
> - * @hw_submissions     Number of hw submissions to do.
> - * @name               Name used for debugging
> + * @sched: scheduler instance
> + * @ops: backend operations for this scheduler
> + * @hw_submission: number of hw submissions that can be in flight
> + * @hang_limit: number of times to allow a job to hang before dropping it
> + * @timeout: timeout value in jiffies for the scheduler
> + * @name: name used for debugging
>   *
>   * Return 0 on success, otherwise error code.
> -*/
> + */
>  int drm_sched_init(struct drm_gpu_scheduler *sched,
>                    const struct drm_sched_backend_ops *ops,
>                    unsigned hw_submission,
> @@ -788,9 +916,11 @@ int drm_sched_init(struct drm_gpu_scheduler *sched,
>  EXPORT_SYMBOL(drm_sched_init);
>
>  /**
> - * Destroy a gpu scheduler
> + * drm_sched_fini - Destroy a gpu scheduler
> + *
> + * @sched: scheduler instance
>   *
> - * @sched      The pointer to the scheduler
> + * Tears down and cleans up the scheduler.
>   */
>  void drm_sched_fini(struct drm_gpu_scheduler *sched)
>  {
> diff --git a/include/drm/gpu_scheduler.h b/include/drm/gpu_scheduler.h
> index dec655894d08..496442f12bff 100644
> --- a/include/drm/gpu_scheduler.h
> +++ b/include/drm/gpu_scheduler.h
> @@ -43,13 +43,33 @@ enum drm_sched_priority {
>  };
>
>  /**
> - * drm_sched_entity - A wrapper around a job queue (typically attached
> - * to the DRM file_priv).
> + * struct drm_sched_entity - A wrapper around a job queue (typically
> + * attached to the DRM file_priv).
> + *
> + * @list: used to append this struct to the list of entities in the
> + *        runqueue.
> + * @rq: runqueue to which this entity belongs.
> + * @rq_lock: lock to modify the runqueue to which this entity belongs.
> + * @sched: the scheduler instance to which this entity is enqueued.
> + * @job_queue: the list of jobs of this entity.
> + * @fence_seq: a linearly increasing seqno incremented with each
> + *             new &drm_sched_fence which is part of the entity.
> + * @fence_context: a unique context for all the fences which belong
> + *                 to this entity.
> + *                 The &drm_sched_fence.scheduled uses the
> + *                 fence_context but &drm_sched_fence.finished uses
> + *                 fence_context + 1.
> + * @dependency: the dependency fence of the job which is on the top
> + *              of the job queue.
> + * @cb: callback for the dependency fence above.
> + * @guilty: points to ctx's guilty.
> + * @fini_status: contains the exit status in case the process was signalled.
> + * @last_scheduled: points to the finished fence of the last scheduled job.
>   *
>   * Entities will emit jobs in order to their corresponding hardware
>   * ring, and the scheduler will alternate between entities based on
>   * scheduling policy.
> -*/
> + */
>  struct drm_sched_entity {
>         struct list_head                list;
>         struct drm_sched_rq             *rq;
> @@ -63,47 +83,96 @@ struct drm_sched_entity {
>
>         struct dma_fence                *dependency;
>         struct dma_fence_cb             cb;
> -       atomic_t                        *guilty; /* points to ctx's guilty */
> -       int            fini_status;
> -       struct dma_fence    *last_scheduled;
> +       atomic_t                        *guilty;
> +       int                             fini_status;
> +       struct dma_fence                *last_scheduled;
>  };
>
>  /**
> + * struct drm_sched_rq - queue of entities to be scheduled.
> + *
> + * @lock: to modify the entities list.
> + * @entities: list of the entities to be scheduled.
> + * @current_entity: the entity which is to be scheduled.
> + *
>   * Run queue is a set of entities scheduling command submissions for
>   * one specific ring. It implements the scheduling policy that selects
>   * the next entity to emit commands from.
> -*/
> + */
>  struct drm_sched_rq {
>         spinlock_t                      lock;
>         struct list_head                entities;
>         struct drm_sched_entity         *current_entity;
>  };
>
> +/**
> + * struct drm_sched_fence - fences corresponding to the scheduling of a job.
> + */
>  struct drm_sched_fence {
> +        /**
> +         * @scheduled: this fence is what will be signaled by the scheduler
> +         * when the job is scheduled.
> +         */
>         struct dma_fence                scheduled;
>
> -       /* This fence is what will be signaled by the scheduler when
> -        * the job is completed.
> -        *
> -        * When setting up an out fence for the job, you should use
> -        * this, since it's available immediately upon
> -        * drm_sched_job_init(), and the fence returned by the driver
> -        * from run_job() won't be created until the dependencies have
> -        * resolved.
> -        */
> +        /**
> +         * @finished: this fence is what will be signaled by the scheduler
> +         * when the job is completed.
> +         *
> +         * When setting up an out fence for the job, you should use
> +         * this, since it's available immediately upon
> +         * drm_sched_job_init(), and the fence returned by the driver
> +         * from run_job() won't be created until the dependencies have
> +         * resolved.
> +         */
>         struct dma_fence                finished;
>
> +        /**
> +         * @cb: the callback for the parent fence below.
> +         */
>         struct dma_fence_cb             cb;
> +        /**
> +         * @parent: the fence returned by &drm_sched_backend_ops.run_job
> +         * when scheduling the job on hardware. We signal the
> +         * &drm_sched_fence.finished fence once parent is signalled.
> +         */
>         struct dma_fence                *parent;
> +        /**
> +         * @sched: the scheduler instance to which the job having this struct
> +         * belongs.
> +         */
>         struct drm_gpu_scheduler        *sched;
> +        /**
> +         * @lock: the lock used by the scheduled and the finished fences.
> +         */
>         spinlock_t                      lock;
> +        /**
> +         * @owner: job owner for debugging
> +         */
>         void                            *owner;
>  };
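
A small hedged sketch of the usage hinted at in @finished above: since the
fence exists as soon as drm_sched_job_init() has run, a driver can hand it
out as the job's out fence before run_job() ever produces the hardware
fence. The helper name is hypothetical; @s_fence is the field documented in
&drm_sched_job below.

static struct dma_fence *my_job_out_fence(struct drm_sched_job *job)
{
	/* @s_fence is set up by drm_sched_job_init(); take a reference
	 * because the caller will keep the fence past job completion
	 */
	return dma_fence_get(&job->s_fence->finished);
}
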
>
>  struct drm_sched_fence *to_drm_sched_fence(struct dma_fence *f);
>
>  /**
> - * drm_sched_job - A job to be run by an entity.
> + * struct drm_sched_job - A job to be run by an entity.
> + *
> + * @queue_node: used to append this struct to the queue of jobs in an entity.
> + * @sched: the scheduler instance on which this job is scheduled.
> + * @s_fence: contains the fences for the scheduling of job.
> + * @finish_cb: the callback for the finished fence.
> + * @finish_work: schedules the function @drm_sched_job_finish once the job has
> + *               finished, to remove the job from the
> + *               @drm_gpu_scheduler.ring_mirror_list.
> + * @node: used to append this struct to the @drm_gpu_scheduler.ring_mirror_list.
> + * @work_tdr: schedules a delayed call to @drm_sched_job_timedout after the timeout
> + *            interval is over.
> + * @id: a unique id assigned to each job scheduled on the scheduler.
> + * @karma: increment on every hang caused by this job. If this exceeds the hang
> + *         limit of the scheduler then the job is marked guilty and will not
> + *         be scheduled further.
> + * @s_priority: the priority of the job.
> + * @entity: the entity to which this job belongs.
>   *
>   * A job is created by the driver using drm_sched_job_init(), and
>   * should call drm_sched_entity_push_job() once it wants the scheduler
> @@ -130,38 +199,64 @@ static inline bool drm_sched_invalidate_job(struct drm_sched_job *s_job,
>  }
>
>  /**
> + * struct drm_sched_backend_ops
> + *
>   * Define the backend operations called by the scheduler,
> - * these functions should be implemented in driver side
> -*/
> + * these functions should be implemented in driver side.
> + */
>  struct drm_sched_backend_ops {
> -       /* Called when the scheduler is considering scheduling this
> -        * job next, to get another struct dma_fence for this job to
> +       /**
> +         * @dependency: Called when the scheduler is considering scheduling
> +         * this job next, to get another struct dma_fence for this job to
>          * block on.  Once it returns NULL, run_job() may be called.
>          */
>         struct dma_fence *(*dependency)(struct drm_sched_job *sched_job,
>                                         struct drm_sched_entity *s_entity);
>
> -       /* Called to execute the job once all of the dependencies have
> -        * been resolved.  This may be called multiple times, if
> +       /**
> +         * @run_job: Called to execute the job once all of the dependencies
> +         * have been resolved.  This may be called multiple times, if
>          * timedout_job() has happened and drm_sched_job_recovery()
>          * decides to try it again.
>          */
>         struct dma_fence *(*run_job)(struct drm_sched_job *sched_job);
>
> -       /* Called when a job has taken too long to execute, to trigger
> -        * GPU recovery.
> +       /**
> +         * @timedout_job: Called when a job has taken too long to execute,
> +         * to trigger GPU recovery.
>          */
>         void (*timedout_job)(struct drm_sched_job *sched_job);
>
> -       /* Called once the job's finished fence has been signaled and
> -        * it's time to clean it up.
> +       /**
> +         * @free_job: Called once the job's finished fence has been signaled
> +         * and it's time to clean it up.
>          */
>         void (*free_job)(struct drm_sched_job *sched_job);
>  };
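
Filled in on the driver side, these hooks might look like the following
sketch; the my_* helpers are hypothetical placeholders and only the four
callback roles come from the documentation above.

static struct dma_fence *my_dep(struct drm_sched_job *sched_job,
				struct drm_sched_entity *s_entity)
{
	/* return one unsignaled fence at a time; NULL means the job
	 * is ready and run_job() may be called
	 */
	return my_job_next_dependency(sched_job);
}

static struct dma_fence *my_run_job(struct drm_sched_job *sched_job)
{
	/* submit to the hw ring; may run again after job recovery */
	return my_hw_submit(sched_job);
}

static void my_timedout_job(struct drm_sched_job *sched_job)
{
	my_trigger_gpu_recovery(sched_job);
}

static void my_free_job(struct drm_sched_job *sched_job)
{
	my_job_cleanup(sched_job);	/* finished fence has signaled */
}

static const struct drm_sched_backend_ops my_sched_ops = {
	.dependency	= my_dep,
	.run_job	= my_run_job,
	.timedout_job	= my_timedout_job,
	.free_job	= my_free_job,
};
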
>
>  /**
> - * One scheduler is implemented for each hardware ring
> -*/
> + * struct drm_gpu_scheduler
> + *
> + * @ops: backend operations provided by the driver.
> + * @hw_submission_limit: the max size of the hardware queue.
> + * @timeout: the time after which a job is removed from the scheduler.
> + * @name: name of the ring for which this scheduler is being used.
> + * @sched_rq: priority wise array of run queues.
> + * @wake_up_worker: the wait queue on which the scheduler sleeps until a job
> + *                  is ready to be scheduled.
> + * @job_scheduled: once @drm_sched_entity_do_release is called the scheduler
> + *                 waits on this wait queue until all the scheduled jobs are
> + *                 finished.
> + * @hw_rq_count: the number of jobs currently in the hardware queue.
> + * @job_id_count: used to assign a unique id to each job.
> + * @thread: the kthread on which the scheduler runs.
> + * @ring_mirror_list: the list of jobs which are currently in the job queue.
> + * @job_list_lock: lock to protect the ring_mirror_list.
> + * @hang_limit: once the hangs by a job cross this limit then it is marked
> + *              guilty and it will not be considered for scheduling further.
> + *
> + * One scheduler is implemented for each hardware ring.
> + */
>  struct drm_gpu_scheduler {
>         const struct drm_sched_backend_ops      *ops;
>         uint32_t                        hw_submission_limit;
> --
> 2.14.3
>
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 3/3] drm/doc: add a chapter for gpu scheduler
  2018-05-25  4:45 ` [PATCH 3/3] drm/doc: add a chapter for gpu scheduler Nayan Deshmukh
@ 2018-05-25 14:54   ` Alex Deucher
  0 siblings, 0 replies; 13+ messages in thread
From: Alex Deucher @ 2018-05-25 14:54 UTC (permalink / raw)
  To: Nayan Deshmukh; +Cc: Christian Koenig, Maling list - DRI developers

On Fri, May 25, 2018 at 12:45 AM, Nayan Deshmukh
<nayan26deshmukh@gmail.com> wrote:
> Signed-off-by: Nayan Deshmukh <nayan26deshmukh@gmail.com>

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>

> ---
>  Documentation/gpu/drm-mm.rst | 18 ++++++++++++++++++
>  1 file changed, 18 insertions(+)
>
> diff --git a/Documentation/gpu/drm-mm.rst b/Documentation/gpu/drm-mm.rst
> index b08e9dcd9177..96ebcc2a7b41 100644
> --- a/Documentation/gpu/drm-mm.rst
> +++ b/Documentation/gpu/drm-mm.rst
> @@ -496,3 +496,21 @@ DRM Sync Objects
>
>  .. kernel-doc:: drivers/gpu/drm/drm_syncobj.c
>     :export:
> +
> +GPU Scheduler
> +=============
> +
> +Overview
> +--------
> +
> +.. kernel-doc:: drivers/gpu/drm/scheduler/gpu_scheduler.c
> +   :doc: Overview
> +
> +Scheduler Function References
> +-----------------------------
> +
> +.. kernel-doc:: include/drm/gpu_scheduler.h
> +   :internal:
> +
> +.. kernel-doc:: drivers/gpu/drm/scheduler/gpu_scheduler.c
> +   :export:
> --
> 2.14.3
>
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 2/3] drm/scheduler: add documentation
  2018-05-25 14:54   ` Alex Deucher
@ 2018-05-28  7:09     ` Nayan Deshmukh
  0 siblings, 0 replies; 13+ messages in thread
From: Nayan Deshmukh @ 2018-05-28  7:09 UTC (permalink / raw)
  To: Alex Deucher; +Cc: Alex Deucher, Christian Koenig, Maling list - DRI developers

If there are no more objections/suggestions I will send the patch with
the changes suggested by Alex and Christian later today.

Nayan

On Fri, May 25, 2018 at 8:24 PM, Alex Deucher <alexdeucher@gmail.com> wrote:
> On Fri, May 25, 2018 at 12:45 AM, Nayan Deshmukh
> <nayan26deshmukh@gmail.com> wrote:
>> convert existing raw comments into kernel-doc format as well
>> as add new documentation
>>
>> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
>> Signed-off-by: Nayan Deshmukh <nayan26deshmukh@gmail.com>
>
> Looks good.  Just a couple of comments below to clarify the language.
> With those fixed:
> Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
>
>> ---
>>  drivers/gpu/drm/scheduler/gpu_scheduler.c | 214 ++++++++++++++++++++++++------
>>  include/drm/gpu_scheduler.h               | 153 +++++++++++++++++----
>>  2 files changed, 296 insertions(+), 71 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/scheduler/gpu_scheduler.c b/drivers/gpu/drm/scheduler/gpu_scheduler.c
>> index 44d480768dfe..c70c983e3e74 100644
>> --- a/drivers/gpu/drm/scheduler/gpu_scheduler.c
>> +++ b/drivers/gpu/drm/scheduler/gpu_scheduler.c
>> @@ -21,6 +21,29 @@
>>   *
>>   */
>>
>> +/**
>> + * DOC: Overview
>> + *
>> + * The GPU scheduler provides entities which allow userspace to push jobs
>> + * into software queues which are then scheduled on a hardware run queue.
>> + * The software queues have a priority among them. The scheduler selects the entities
>> + * from the run queue using FIFO. The scheduler provides dependency handling
>
> a FIFO
>
>> + * features among jobs. The driver is supposed to provide functions for backend
>
> The driver provides callback functions for backend
>
>> + * operations to the scheduler like submitting a job to a hardware run queue,
>> + * returning the dependency of a job etc.
>> + *
>> + * The organisation of the scheduler is the following:-
>
> drop the -
>
>> + *
>> + * 1. Each ring buffer has one scheduler
>
> Each hw run queue
>
>> + * 2. Each scheduler has multiple run queues with different priorities
>> + *    (i.e. HIGH_HW, HIGH_SW, KERNEL, NORMAL)
>
> s/i.e./e.g.,/
>
>> + * 3. Each run queue has a queue of entities to schedule
>
> Each scheduler run queue has a queue of entities to schedule
>
>> + * 4. Entities themselves maintain a queue of jobs that will be scheduled on
>> + *    the hardware.
>> + *
>> + * The jobs in an entity are always scheduled in the order that they were pushed.
>> + */
>> +
>>  #include <linux/kthread.h>
>>  #include <linux/wait.h>
>>  #include <linux/sched.h>
>> @@ -39,7 +62,13 @@ static bool drm_sched_entity_is_ready(struct drm_sched_entity *entity);
>>  static void drm_sched_wakeup(struct drm_gpu_scheduler *sched);
>>  static void drm_sched_process_job(struct dma_fence *f, struct dma_fence_cb *cb);
>>
>> -/* Initialize a given run queue struct */
>> +/**
>> + * drm_sched_rq_init - initialize a given run queue struct
>> + *
>> + * @rq: scheduler run queue
>> + *
>> + * Initializes a scheduler runqueue.
>> + */
>>  static void drm_sched_rq_init(struct drm_sched_rq *rq)
>>  {
>>         spin_lock_init(&rq->lock);
>> @@ -47,6 +76,14 @@ static void drm_sched_rq_init(struct drm_sched_rq *rq)
>>         rq->current_entity = NULL;
>>  }
>>
>> +/**
>> + * drm_sched_rq_add_entity - add an entity
>> + *
>> + * @rq: scheduler run queue
>> + * @entity: scheduler entity
>> + *
>> + * Adds a scheduler entity to the run queue.
>> + */
>>  static void drm_sched_rq_add_entity(struct drm_sched_rq *rq,
>>                                     struct drm_sched_entity *entity)
>>  {
>> @@ -57,6 +94,14 @@ static void drm_sched_rq_add_entity(struct drm_sched_rq *rq,
>>         spin_unlock(&rq->lock);
>>  }
>>
>> +/**
>> + * drm_sched_rq_remove_entity - remove an entity
>> + *
>> + * @rq: scheduler run queue
>> + * @entity: scheduler entity
>> + *
>> + * Removes a scheduler entity from the run queue.
>> + */
>>  static void drm_sched_rq_remove_entity(struct drm_sched_rq *rq,
>>                                        struct drm_sched_entity *entity)
>>  {
>> @@ -70,9 +115,9 @@ static void drm_sched_rq_remove_entity(struct drm_sched_rq *rq,
>>  }
>>
>>  /**
>> - * Select an entity which could provide a job to run
>> + * drm_sched_rq_select_entity - Select an entity which could provide a job to run
>>   *
>> - * @rq         The run queue to check.
>> + * @rq: scheduler run queue to check.
>>   *
>>   * Try to find a ready entity, returns NULL if none found.
>>   */
>> @@ -112,15 +157,16 @@ drm_sched_rq_select_entity(struct drm_sched_rq *rq)
>>  }
>>
>>  /**
>> - * Init a context entity used by scheduler when submit to HW ring.
>> + * drm_sched_entity_init - Init a context entity used by scheduler when
>> + * submit to HW ring.
>>   *
>> - * @sched      The pointer to the scheduler
>> - * @entity     The pointer to a valid drm_sched_entity
>> - * @rq         The run queue this entity belongs
>> - * @guilty      atomic_t set to 1 when a job on this queue
>> - *              is found to be guilty causing a timeout
>> + * @sched: scheduler instance
>> + * @entity: scheduler entity to init
>> + * @rq: the run queue this entity belongs
>> + * @guilty: atomic_t set to 1 when a job on this queue
>> + *          is found to be guilty causing a timeout
>>   *
>> - * return 0 if succeed. negative error code on failure
>> + * Returns 0 on success or a negative error code on failure.
>>  */
>>  int drm_sched_entity_init(struct drm_gpu_scheduler *sched,
>>                           struct drm_sched_entity *entity,
>> @@ -149,10 +195,10 @@ int drm_sched_entity_init(struct drm_gpu_scheduler *sched,
>>  EXPORT_SYMBOL(drm_sched_entity_init);
>>
>>  /**
>> - * Query if entity is initialized
>> + * drm_sched_entity_is_initialized - Query if entity is initialized
>>   *
>> - * @sched       Pointer to scheduler instance
>> - * @entity     The pointer to a valid scheduler entity
>> + * @sched: Pointer to scheduler instance
>> + * @entity: The pointer to a valid scheduler entity
>>   *
>>   * return true if entity is initialized, false otherwise
>>  */
>> @@ -164,11 +210,11 @@ static bool drm_sched_entity_is_initialized(struct drm_gpu_scheduler *sched,
>>  }
>>
>>  /**
>> - * Check if entity is idle
>> + * drm_sched_entity_is_idle - Check if entity is idle
>>   *
>> - * @entity     The pointer to a valid scheduler entity
>> + * @entity: scheduler entity
>>   *
>> - * Return true if entity don't has any unscheduled jobs.
>> + * Returns true if the entity does not have any unscheduled jobs.
>>   */
>>  static bool drm_sched_entity_is_idle(struct drm_sched_entity *entity)
>>  {
>> @@ -180,9 +226,9 @@ static bool drm_sched_entity_is_idle(struct drm_sched_entity *entity)
>>  }
>>
>>  /**
>> - * Check if entity is ready
>> + * drm_sched_entity_is_ready - Check if entity is ready
>>   *
>> - * @entity     The pointer to a valid scheduler entity
>> + * @entity: scheduler entity
>>   *
>>   * Return true if entity could provide a job.
>>   */
>> @@ -210,12 +256,12 @@ static void drm_sched_entity_kill_jobs_cb(struct dma_fence *f,
>>
>>
>>  /**
>> - * Destroy a context entity
>> + * drm_sched_entity_do_release - Destroy a context entity
>>   *
>> - * @sched       Pointer to scheduler instance
>> - * @entity     The pointer to a valid scheduler entity
>> + * @sched: scheduler instance
>> + * @entity: scheduler entity
>>   *
>> - * Splitting drm_sched_entity_fini() into two functions, The first one is does the waiting,
>> + * Splitting drm_sched_entity_fini() into two functions. The first one does the waiting,
>>   * removes the entity from the runqueue and returns an error when the process was killed.
>>   */
>>  void drm_sched_entity_do_release(struct drm_gpu_scheduler *sched,
>> @@ -237,12 +283,13 @@ void drm_sched_entity_do_release(struct drm_gpu_scheduler *sched,
>>  EXPORT_SYMBOL(drm_sched_entity_do_release);
>>
>>  /**
>> - * Destroy a context entity
>> + * drm_sched_entity_cleanup - Destroy a context entity
>>   *
>> - * @sched       Pointer to scheduler instance
>> - * @entity     The pointer to a valid scheduler entity
>> + * @sched: scheduler instance
>> + * @entity: scheduler entity
>>   *
>> - * The second one then goes over the entity and signals all jobs with an error code.
>> + * This should be called after @drm_sched_entity_do_release. It goes over the
>> + * entity and signals all jobs with an error code if the process was killed.
>>   */
>>  void drm_sched_entity_cleanup(struct drm_gpu_scheduler *sched,
>>                            struct drm_sched_entity *entity)
>> @@ -281,6 +328,14 @@ void drm_sched_entity_cleanup(struct drm_gpu_scheduler *sched,
>>  }
>>  EXPORT_SYMBOL(drm_sched_entity_cleanup);
>>
>> +/**
>> + * drm_sched_entity_fini - Destroy a context entity
>> + *
>> + * @sched: scheduler instance
>> + * @entity: scheduler entity
>> + *
>> + * Calls drm_sched_entity_do_release() and drm_sched_entity_cleanup()
>> + */
>>  void drm_sched_entity_fini(struct drm_gpu_scheduler *sched,
>>                                 struct drm_sched_entity *entity)
>>  {
>> @@ -306,6 +361,15 @@ static void drm_sched_entity_clear_dep(struct dma_fence *f, struct dma_fence_cb
>>         dma_fence_put(f);
>>  }
>>
>> +/**
>> + * drm_sched_entity_set_rq - Sets the run queue for an entity
>> + *
>> + * @entity: scheduler entity
>> + * @rq: scheduler run queue
>> + *
>> + * Sets the run queue for an entity and removes the entity from the previous
>> + * run queue in which it was present.
>> + */
>>  void drm_sched_entity_set_rq(struct drm_sched_entity *entity,
>>                              struct drm_sched_rq *rq)
>>  {
>> @@ -325,6 +389,14 @@ void drm_sched_entity_set_rq(struct drm_sched_entity *entity,
>>  }
>>  EXPORT_SYMBOL(drm_sched_entity_set_rq);
>>
>> +/**
>> + * drm_sched_dependency_optimized
>> + *
>> + * @fence: the dependency fence
>> + * @entity: the entity which depends on the above fence
>> + *
>> + * Returns true if the dependency can be optimized and false otherwise
>> + */
>>  bool drm_sched_dependency_optimized(struct dma_fence* fence,
>>                                     struct drm_sched_entity *entity)
>>  {
>> @@ -413,9 +485,10 @@ drm_sched_entity_pop_job(struct drm_sched_entity *entity)
>>  }
>>
>>  /**
>> - * Submit a job to the job queue
>> + * drm_sched_entity_push_job - Submit a job to the entity's job queue
>>   *
>> - * @sched_job          The pointer to job required to submit
>> + * @sched_job: job to submit
>> + * @entity: scheduler entity
>>   *
>>   * Note: To guarantee that the order of insertion to queue matches
>>   * the job's fence sequence number this function should be
>> @@ -506,6 +579,13 @@ static void drm_sched_job_timedout(struct work_struct *work)
>>         job->sched->ops->timedout_job(job);
>>  }
>>
>> +/**
>> + * drm_sched_hw_job_reset - stop the scheduler if it contains the bad job
>> + *
>> + * @sched: scheduler instance
>> + * @bad: bad scheduler job
>> + *
>> + */
>>  void drm_sched_hw_job_reset(struct drm_gpu_scheduler *sched, struct drm_sched_job *bad)
>>  {
>>         struct drm_sched_job *s_job;
>> @@ -550,6 +630,12 @@ void drm_sched_hw_job_reset(struct drm_gpu_scheduler *sched, struct drm_sched_jo
>>  }
>>  EXPORT_SYMBOL(drm_sched_hw_job_reset);
>>
>> +/**
>> + * drm_sched_job_recovery - recover jobs after a reset
>> + *
>> + * @sched: scheduler instance
>> + *
>> + */
>>  void drm_sched_job_recovery(struct drm_gpu_scheduler *sched)
>>  {
>>         struct drm_sched_job *s_job, *tmp;
>> @@ -599,10 +685,17 @@ void drm_sched_job_recovery(struct drm_gpu_scheduler *sched)
>>  EXPORT_SYMBOL(drm_sched_job_recovery);
>>
>>  /**
>> - * Init a sched_job with basic field
>> + * drm_sched_job_init - init a scheduler job
>>   *
>> - * Note: Refer to drm_sched_entity_push_job documentation
>> + * @job: scheduler job to init
>> + * @sched: scheduler instance
>> + * @entity: scheduler entity to use
>> + * @owner: job owner for debugging
>> + *
>> + * Refer to drm_sched_entity_push_job() documentation
>>   * for locking considerations.
>> + *
>> + * Returns 0 for success, negative error code otherwise.
>>   */
>>  int drm_sched_job_init(struct drm_sched_job *job,
>>                        struct drm_gpu_scheduler *sched,
>> @@ -626,7 +719,11 @@ int drm_sched_job_init(struct drm_sched_job *job,
>>  EXPORT_SYMBOL(drm_sched_job_init);
>>
>>  /**
>> - * Return ture if we can push more jobs to the hw.
>> + * drm_sched_ready - is the scheduler ready
>> + *
>> + * @sched: scheduler instance
>> + *
>> + * Return true if we can push more jobs to the hw, otherwise false.
>>   */
>>  static bool drm_sched_ready(struct drm_gpu_scheduler *sched)
>>  {
>> @@ -635,7 +732,10 @@ static bool drm_sched_ready(struct drm_gpu_scheduler *sched)
>>  }
>>
>>  /**
>> - * Wake up the scheduler when it is ready
>> + * drm_sched_wakeup - Wake up the scheduler when it is ready
>> + *
>> + * @sched: scheduler instance
>> + *
>>   */
>>  static void drm_sched_wakeup(struct drm_gpu_scheduler *sched)
>>  {
>> @@ -644,8 +744,12 @@ static void drm_sched_wakeup(struct drm_gpu_scheduler *sched)
>>  }
>>
>>  /**
>> - * Select next entity to process
>> -*/
>> + * drm_sched_select_entity - Select next entity to process
>> + *
>> + * @sched: scheduler instance
>> + *
>> + * Returns the entity to process or NULL if none are found.
>> + */
>>  static struct drm_sched_entity *
>>  drm_sched_select_entity(struct drm_gpu_scheduler *sched)
>>  {
>> @@ -665,6 +769,14 @@ drm_sched_select_entity(struct drm_gpu_scheduler *sched)
>>         return entity;
>>  }
>>
>> +/**
>> + * drm_sched_process_job - process a job
>> + *
>> + * @f: fence
>> + * @cb: fence callbacks
>> + *
>> + * Called after job has finished execution.
>> + */
>>  static void drm_sched_process_job(struct dma_fence *f, struct dma_fence_cb *cb)
>>  {
>>         struct drm_sched_fence *s_fence =
>> @@ -680,6 +792,13 @@ static void drm_sched_process_job(struct dma_fence *f, struct dma_fence_cb *cb)
>>         wake_up_interruptible(&sched->wake_up_worker);
>>  }
>>
>> +/**
>> + * drm_sched_blocked - check if the scheduler is blocked
>> + *
>> + * @sched: scheduler instance
>> + *
>> + * Returns true if blocked, otherwise false.
>> + */
>>  static bool drm_sched_blocked(struct drm_gpu_scheduler *sched)
>>  {
>>         if (kthread_should_park()) {
>> @@ -690,6 +809,13 @@ static bool drm_sched_blocked(struct drm_gpu_scheduler *sched)
>>         return false;
>>  }
>>
>> +/**
>> + * drm_sched_main - main scheduler thread
>> + *
>> + * @param: scheduler instance
>> + *
>> + * Returns 0.
>> + */
>>  static int drm_sched_main(void *param)
>>  {
>>         struct sched_param sparam = {.sched_priority = 1};
>> @@ -744,15 +870,17 @@ static int drm_sched_main(void *param)
>>  }
>>
>>  /**
>> - * Init a gpu scheduler instance
>> + * drm_sched_init - Init a gpu scheduler instance
>>   *
>> - * @sched              The pointer to the scheduler
>> - * @ops                        The backend operations for this scheduler.
>> - * @hw_submissions     Number of hw submissions to do.
>> - * @name               Name used for debugging
>> + * @sched: scheduler instance
>> + * @ops: backend operations for this scheduler
>> + * @hw_submission: number of hw submissions that can be in flight
>> + * @hang_limit: number of times to allow a job to hang before dropping it
>> + * @timeout: timeout value in jiffies for the scheduler
>> + * @name: name used for debugging
>>   *
>>   * Return 0 on success, otherwise error code.
>> -*/
>> + */
>>  int drm_sched_init(struct drm_gpu_scheduler *sched,
>>                    const struct drm_sched_backend_ops *ops,
>>                    unsigned hw_submission,
>> @@ -788,9 +916,11 @@ int drm_sched_init(struct drm_gpu_scheduler *sched,
>>  EXPORT_SYMBOL(drm_sched_init);
>>
>>  /**
>> - * Destroy a gpu scheduler
>> + * drm_sched_fini - Destroy a gpu scheduler
>> + *
>> + * @sched: scheduler instance
>>   *
>> - * @sched      The pointer to the scheduler
>> + * Tears down and cleans up the scheduler.
>>   */
>>  void drm_sched_fini(struct drm_gpu_scheduler *sched)
>>  {
>> diff --git a/include/drm/gpu_scheduler.h b/include/drm/gpu_scheduler.h
>> index dec655894d08..496442f12bff 100644
>> --- a/include/drm/gpu_scheduler.h
>> +++ b/include/drm/gpu_scheduler.h
>> @@ -43,13 +43,33 @@ enum drm_sched_priority {
>>  };
>>
>>  /**
>> - * drm_sched_entity - A wrapper around a job queue (typically attached
>> - * to the DRM file_priv).
>> + * struct drm_sched_entity - A wrapper around a job queue (typically
>> + * attached to the DRM file_priv).
>> + *
>> + * @list: used to append this struct to the list of entities in the
>> + *        runqueue.
>> + * @rq: runqueue to which this entity belongs.
>> + * @rq_lock: lock to modify the runqueue to which this entity belongs.
>> + * @sched: the scheduler instance to which this entity is enqueued.
>> + * @job_queue: the list of jobs of this entity.
>> + * @fence_seq: a linearly increasing seqno incremented with each
>> + *             new &drm_sched_fence which is part of the entity.
>> + * @fence_context: a unique context for all the fences which belong
>> + *                 to this entity.
>> + *                 The &drm_sched_fence.scheduled uses the
>> + *                 fence_context but &drm_sched_fence.finished uses
>> + *                 fence_context + 1.
>> + * @dependency: the dependency fence of the job which is on the top
>> + *              of the job queue.
>> + * @cb: callback for the dependency fence above.
>> + * @guilty: points to ctx's guilty.
>> + * @fini_status: contains the exit status in case the process was signalled.
>> + * @last_scheduled: points to the finished fence of the last scheduled job.
>>   *
>>   * Entities will emit jobs in order to their corresponding hardware
>>   * ring, and the scheduler will alternate between entities based on
>>   * scheduling policy.
>> -*/
>> + */
>>  struct drm_sched_entity {
>>         struct list_head                list;
>>         struct drm_sched_rq             *rq;
>> @@ -63,47 +83,96 @@ struct drm_sched_entity {
>>
>>         struct dma_fence                *dependency;
>>         struct dma_fence_cb             cb;
>> -       atomic_t                        *guilty; /* points to ctx's guilty */
>> -       int            fini_status;
>> -       struct dma_fence    *last_scheduled;
>> +       atomic_t                        *guilty;
>> +       int                             fini_status;
>> +       struct dma_fence                *last_scheduled;
>>  };
>>
>>  /**
>> + * struct drm_sched_rq - queue of entities to be scheduled.
>> + *
>> + * @lock: to modify the entities list.
>> + * @entities: list of the entities to be scheduled.
>> + * @current_entity: the entity which is to be scheduled.
>> + *
>>   * Run queue is a set of entities scheduling command submissions for
>>   * one specific ring. It implements the scheduling policy that selects
>>   * the next entity to emit commands from.
>> -*/
>> + */
>>  struct drm_sched_rq {
>>         spinlock_t                      lock;
>>         struct list_head                entities;
>>         struct drm_sched_entity         *current_entity;
>>  };
>>
>> +/**
>> + * struct drm_sched_fence - fences corresponding to the scheduling of a job.
>> + */
>>  struct drm_sched_fence {
>> +        /**
>> +         * @scheduled: this fence is what will be signaled by the scheduler
>> +         * when the job is scheduled.
>> +         */
>>         struct dma_fence                scheduled;
>>
>> -       /* This fence is what will be signaled by the scheduler when
>> -        * the job is completed.
>> -        *
>> -        * When setting up an out fence for the job, you should use
>> -        * this, since it's available immediately upon
>> -        * drm_sched_job_init(), and the fence returned by the driver
>> -        * from run_job() won't be created until the dependencies have
>> -        * resolved.
>> -        */
>> +        /**
>> +         * @finished: this fence is what will be signaled by the scheduler
>> +         * when the job is completed.
>> +         *
>> +         * When setting up an out fence for the job, you should use
>> +         * this, since it's available immediately upon
>> +         * drm_sched_job_init(), and the fence returned by the driver
>> +         * from run_job() won't be created until the dependencies have
>> +         * resolved.
>> +         */
>>         struct dma_fence                finished;
>>
>> +        /**
>> +         * @cb: the callback for the parent fence below.
>> +         */
>>         struct dma_fence_cb             cb;
>> +        /**
>> +         * @parent: the fence returned by &drm_sched_backend_ops.run_job
>> +         * when scheduling the job on hardware. We signal the
>> +         * &drm_sched_fence.finished fence once parent is signalled.
>> +         */
>>         struct dma_fence                *parent;
>> +        /**
>> +         * @sched: the scheduler instance to which the job having this struct
>> +         * belongs to.
>> +         */
>>         struct drm_gpu_scheduler        *sched;
>> +        /**
>> +         * @lock: the lock used by the scheduled and the finished fences.
>> +         */
>>         spinlock_t                      lock;
>> +        /**
>> +         * @owner: job owner for debugging
>> +         */
>>         void                            *owner;
>>  };
>>
>>  struct drm_sched_fence *to_drm_sched_fence(struct dma_fence *f);
>>
>>  /**
>> - * drm_sched_job - A job to be run by an entity.
>> + * struct drm_sched_job - A job to be run by an entity.
>> + *
>> + * @queue_node: used to append this struct to the queue of jobs in an entity.
>> + * @sched: the scheduler instance on which this job is scheduled.
>> + * @s_fence: contains the fences for the scheduling of job.
>> + * @finish_cb: the callback for the finished fence.
>> + * @finish_work: schedules the function @drm_sched_job_finish once the job has
>> + *               finished, to remove the job from the
>> + *               @drm_gpu_scheduler.ring_mirror_list.
>> + * @node: used to append this struct to the @drm_gpu_scheduler.ring_mirror_list.
>> + * @work_tdr: schedules a delayed call to @drm_sched_job_timedout after the timeout
>> + *            interval is over.
>> + * @id: a unique id assigned to each job scheduled on the scheduler.
>> + * @karma: increment on every hang caused by this job. If this exceeds the hang
>> + *         limit of the scheduler then the job is marked guilty and will not
>> + *         be scheduled further.
>> + * @s_priority: the priority of the job.
>> + * @entity: the entity to which this job belongs.
>>   *
>>   * A job is created by the driver using drm_sched_job_init(), and
>>   * should call drm_sched_entity_push_job() once it wants the scheduler
>> @@ -130,38 +199,64 @@ static inline bool drm_sched_invalidate_job(struct drm_sched_job *s_job,
>>  }
>>
>>  /**
>> + * struct drm_sched_backend_ops
>> + *
>>   * Define the backend operations called by the scheduler,
>> - * these functions should be implemented in driver side
>> -*/
>> + * these functions should be implemented in driver side.
>> + */
>>  struct drm_sched_backend_ops {
>> -       /* Called when the scheduler is considering scheduling this
>> -        * job next, to get another struct dma_fence for this job to
>> +       /**
>> +         * @dependency: Called when the scheduler is considering scheduling
>> +         * this job next, to get another struct dma_fence for this job to
>>          * block on.  Once it returns NULL, run_job() may be called.
>>          */
>>         struct dma_fence *(*dependency)(struct drm_sched_job *sched_job,
>>                                         struct drm_sched_entity *s_entity);
>>
>> -       /* Called to execute the job once all of the dependencies have
>> -        * been resolved.  This may be called multiple times, if
>> +       /**
>> +         * @run_job: Called to execute the job once all of the dependencies
>> +         * have been resolved.  This may be called multiple times, if
>>          * timedout_job() has happened and drm_sched_job_recovery()
>>          * decides to try it again.
>>          */
>>         struct dma_fence *(*run_job)(struct drm_sched_job *sched_job);
>>
>> -       /* Called when a job has taken too long to execute, to trigger
>> -        * GPU recovery.
>> +       /**
>> +         * @timedout_job: Called when a job has taken too long to execute,
>> +         * to trigger GPU recovery.
>>          */
>>         void (*timedout_job)(struct drm_sched_job *sched_job);
>>
>> -       /* Called once the job's finished fence has been signaled and
>> -        * it's time to clean it up.
>> +       /**
>> +         * @free_job: Called once the job's finished fence has been signaled
>> +         * and it's time to clean it up.
>>          */
>>         void (*free_job)(struct drm_sched_job *sched_job);
>>  };
>>
>>  /**
>> - * One scheduler is implemented for each hardware ring
>> -*/
>> + * struct drm_gpu_scheduler
>> + *
>> + * @ops: backend operations provided by the driver.
>> + * @hw_submission_limit: the max size of the hardware queue.
>> + * @timeout: the time after which a job is removed from the scheduler.
>> + * @name: name of the ring for which this scheduler is being used.
>> + * @sched_rq: priority wise array of run queues.
>> + * @wake_up_worker: the wait queue on which the scheduler sleeps until a job
>> + *                  is ready to be scheduled.
>> + * @job_scheduled: once @drm_sched_entity_do_release is called the scheduler
>> + *                 waits on this wait queue until all the scheduled jobs are
>> + *                 finished.
>> + * @hw_rq_count: the number of jobs currently in the hardware queue.
>> + * @job_id_count: used to assign a unique id to each job.
>> + * @thread: the kthread on which the scheduler runs.
>> + * @ring_mirror_list: the list of jobs which are currently in the job queue.
>> + * @job_list_lock: lock to protect the ring_mirror_list.
>> + * @hang_limit: once the hangs by a job cross this limit then it is marked
>> + *              guilty and it will not be considered for scheduling further.
>> + *
>> + * One scheduler is implemented for each hardware ring.
>> + */
>>  struct drm_gpu_scheduler {
>>         const struct drm_sched_backend_ops      *ops;
>>         uint32_t                        hw_submission_limit;
>> --
>> 2.14.3
>>
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 2/3] drm/scheduler: add documentation
  2018-05-25  4:45 ` [PATCH 2/3] drm/scheduler: add documentation Nayan Deshmukh
  2018-05-25 12:06   ` Christian König
  2018-05-25 14:54   ` Alex Deucher
@ 2018-05-28  8:09   ` Daniel Vetter
  2018-05-28  8:31     ` Nayan Deshmukh
  2 siblings, 1 reply; 13+ messages in thread
From: Daniel Vetter @ 2018-05-28  8:09 UTC (permalink / raw)
  To: Nayan Deshmukh; +Cc: Alex Deucher, Christian König, dri-devel

On Fri, May 25, 2018 at 6:45 AM, Nayan Deshmukh
<nayan26deshmukh@gmail.com> wrote:
> convert existing raw comments into kernel-doc format as well
> as add new documentation
>
> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
> Signed-off-by: Nayan Deshmukh <nayan26deshmukh@gmail.com>
> ---
>  drivers/gpu/drm/scheduler/gpu_scheduler.c | 214 ++++++++++++++++++++++++------
>  include/drm/gpu_scheduler.h               | 153 +++++++++++++++++----
>  2 files changed, 296 insertions(+), 71 deletions(-)

Please also include all the new scheduler docs into
Documentation/gpu/drm-mm.rst (I think that's the most suitable place)
and make sure the resulting docs look good and hyperlinks all work
correctly using

$ make htmldocs

Thanks, Daniel

>
> diff --git a/drivers/gpu/drm/scheduler/gpu_scheduler.c b/drivers/gpu/drm/scheduler/gpu_scheduler.c
> index 44d480768dfe..c70c983e3e74 100644
> --- a/drivers/gpu/drm/scheduler/gpu_scheduler.c
> +++ b/drivers/gpu/drm/scheduler/gpu_scheduler.c
> @@ -21,6 +21,29 @@
>   *
>   */
>
> +/**
> + * DOC: Overview
> + *
> + * The GPU scheduler provides entities which allow userspace to push jobs
> + * into software queues which are then scheduled on a hardware run queue.
> + * The software queues have a priority among them. The scheduler selects the entities
> + * from the run queue using FIFO. The scheduler provides dependency handling
> + * features among jobs. The driver is supposed to provide functions for backend
> + * operations to the scheduler like submitting a job to a hardware run queue,
> + * returning the dependency of a job etc.
> + *
> + * The organisation of the scheduler is the following:-
> + *
> + * 1. Each ring buffer has one scheduler
> + * 2. Each scheduler has multiple run queues with different priorities
> + *    (i.e. HIGH_HW, HIGH_SW, KERNEL, NORMAL)
> + * 3. Each run queue has a queue of entities to schedule
> + * 4. Entities themselves maintain a queue of jobs that will be scheduled on
> + *    the hardware.
> + *
> + * The jobs in an entity are always scheduled in the order that they were pushed.
> + */
> +
>  #include <linux/kthread.h>
>  #include <linux/wait.h>
>  #include <linux/sched.h>
> @@ -39,7 +62,13 @@ static bool drm_sched_entity_is_ready(struct drm_sched_entity *entity);
>  static void drm_sched_wakeup(struct drm_gpu_scheduler *sched);
>  static void drm_sched_process_job(struct dma_fence *f, struct dma_fence_cb *cb);
>
> -/* Initialize a given run queue struct */
> +/**
> + * drm_sched_rq_init - initialize a given run queue struct
> + *
> + * @rq: scheduler run queue
> + *
> + * Initializes a scheduler runqueue.
> + */
>  static void drm_sched_rq_init(struct drm_sched_rq *rq)
>  {
>         spin_lock_init(&rq->lock);
> @@ -47,6 +76,14 @@ static void drm_sched_rq_init(struct drm_sched_rq *rq)
>         rq->current_entity = NULL;
>  }
>
> +/**
> + * drm_sched_rq_add_entity - add an entity
> + *
> + * @rq: scheduler run queue
> + * @entity: scheduler entity
> + *
> + * Adds a scheduler entity to the run queue.
> + */
>  static void drm_sched_rq_add_entity(struct drm_sched_rq *rq,
>                                     struct drm_sched_entity *entity)
>  {
> @@ -57,6 +94,14 @@ static void drm_sched_rq_add_entity(struct drm_sched_rq *rq,
>         spin_unlock(&rq->lock);
>  }
>
> +/**
> + * drm_sched_rq_remove_entity - remove an entity
> + *
> + * @rq: scheduler run queue
> + * @entity: scheduler entity
> + *
> + * Removes a scheduler entity from the run queue.
> + */
>  static void drm_sched_rq_remove_entity(struct drm_sched_rq *rq,
>                                        struct drm_sched_entity *entity)
>  {
> @@ -70,9 +115,9 @@ static void drm_sched_rq_remove_entity(struct drm_sched_rq *rq,
>  }
>
>  /**
> - * Select an entity which could provide a job to run
> + * drm_sched_rq_select_entity - Select an entity which could provide a job to run
>   *
> - * @rq         The run queue to check.
> + * @rq: scheduler run queue to check.
>   *
>   * Try to find a ready entity, returns NULL if none found.
>   */
> @@ -112,15 +157,16 @@ drm_sched_rq_select_entity(struct drm_sched_rq *rq)
>  }
>
>  /**
> - * Init a context entity used by scheduler when submit to HW ring.
> + * drm_sched_entity_init - Init a context entity used by the scheduler when
> + * submitting to the HW ring.
>   *
> - * @sched      The pointer to the scheduler
> - * @entity     The pointer to a valid drm_sched_entity
> - * @rq         The run queue this entity belongs
> - * @guilty      atomic_t set to 1 when a job on this queue
> - *              is found to be guilty causing a timeout
> + * @sched: scheduler instance
> + * @entity: scheduler entity to init
> + * @rq: the run queue this entity belongs
> + * @guilty: atomic_t set to 1 when a job on this queue
> + *          is found to be guilty causing a timeout
>   *
> - * return 0 if succeed. negative error code on failure
> + * Returns 0 on success or a negative error code on failure.
>  */
>  int drm_sched_entity_init(struct drm_gpu_scheduler *sched,
>                           struct drm_sched_entity *entity,
> @@ -149,10 +195,10 @@ int drm_sched_entity_init(struct drm_gpu_scheduler *sched,
>  EXPORT_SYMBOL(drm_sched_entity_init);
>
>  /**
> - * Query if entity is initialized
> + * drm_sched_entity_is_initialized - Query if entity is initialized
>   *
> - * @sched       Pointer to scheduler instance
> - * @entity     The pointer to a valid scheduler entity
> + * @sched: Pointer to scheduler instance
> + * @entity: The pointer to a valid scheduler entity
>   *
>   * return true if entity is initialized, false otherwise
>  */
> @@ -164,11 +210,11 @@ static bool drm_sched_entity_is_initialized(struct drm_gpu_scheduler *sched,
>  }
>
>  /**
> - * Check if entity is idle
> + * drm_sched_entity_is_idle - Check if entity is idle
>   *
> - * @entity     The pointer to a valid scheduler entity
> + * @entity: scheduler entity
>   *
> - * Return true if entity don't has any unscheduled jobs.
> + * Returns true if the entity does not have any unscheduled jobs.
>   */
>  static bool drm_sched_entity_is_idle(struct drm_sched_entity *entity)
>  {
> @@ -180,9 +226,9 @@ static bool drm_sched_entity_is_idle(struct drm_sched_entity *entity)
>  }
>
>  /**
> - * Check if entity is ready
> + * drm_sched_entity_is_ready - Check if entity is ready
>   *
> - * @entity     The pointer to a valid scheduler entity
> + * @entity: scheduler entity
>   *
>   * Return true if entity could provide a job.
>   */
> @@ -210,12 +256,12 @@ static void drm_sched_entity_kill_jobs_cb(struct dma_fence *f,
>
>
>  /**
> - * Destroy a context entity
> + * drm_sched_entity_do_release - Destroy a context entity
>   *
> - * @sched       Pointer to scheduler instance
> - * @entity     The pointer to a valid scheduler entity
> + * @sched: scheduler instance
> + * @entity: scheduler entity
>   *
> - * Splitting drm_sched_entity_fini() into two functions, The first one is does the waiting,
> + * Splitting drm_sched_entity_fini() into two functions, the first one does the waiting,
>   * removes the entity from the runqueue and returns an error when the process was killed.
>   */
>  void drm_sched_entity_do_release(struct drm_gpu_scheduler *sched,
> @@ -237,12 +283,13 @@ void drm_sched_entity_do_release(struct drm_gpu_scheduler *sched,
>  EXPORT_SYMBOL(drm_sched_entity_do_release);
>
>  /**
> - * Destroy a context entity
> + * drm_sched_entity_cleanup - Destroy a context entity
>   *
> - * @sched       Pointer to scheduler instance
> - * @entity     The pointer to a valid scheduler entity
> + * @sched: scheduler instance
> + * @entity: scheduler entity
>   *
> - * The second one then goes over the entity and signals all jobs with an error code.
> + * This should be called after drm_sched_entity_do_release(). It goes over the
> + * entity and signals all jobs with an error code if the process was killed.
>   */
>  void drm_sched_entity_cleanup(struct drm_gpu_scheduler *sched,
>                            struct drm_sched_entity *entity)
> @@ -281,6 +328,14 @@ void drm_sched_entity_cleanup(struct drm_gpu_scheduler *sched,
>  }
>  EXPORT_SYMBOL(drm_sched_entity_cleanup);
>
> +/**
> + * drm_sched_entity_fini - Destroy a context entity
> + *
> + * @sched: scheduler instance
> + * @entity: scheduler entity
> + *
> + * Calls drm_sched_entity_do_release() and drm_sched_entity_cleanup()
> + */
>  void drm_sched_entity_fini(struct drm_gpu_scheduler *sched,
>                                 struct drm_sched_entity *entity)
>  {
> @@ -306,6 +361,15 @@ static void drm_sched_entity_clear_dep(struct dma_fence *f, struct dma_fence_cb
>         dma_fence_put(f);
>  }
>
> +/**
> + * drm_sched_entity_set_rq - Sets the run queue for an entity
> + *
> + * @entity: scheduler entity
> + * @rq: scheduler run queue
> + *
> + * Sets the run queue for an entity and removes the entity from the previous
> + * run queue in which it was present.
> + */
>  void drm_sched_entity_set_rq(struct drm_sched_entity *entity,
>                              struct drm_sched_rq *rq)
>  {
> @@ -325,6 +389,14 @@ void drm_sched_entity_set_rq(struct drm_sched_entity *entity,
>  }
>  EXPORT_SYMBOL(drm_sched_entity_set_rq);
>
> +/**
> + * drm_sched_dependency_optimized - test if the dependency can be optimized
> + *
> + * @fence: the dependency fence
> + * @entity: the entity which depends on the above fence
> + *
> + * Returns true if the dependency can be optimized and false otherwise
> + */
>  bool drm_sched_dependency_optimized(struct dma_fence* fence,
>                                     struct drm_sched_entity *entity)
>  {
> @@ -413,9 +485,10 @@ drm_sched_entity_pop_job(struct drm_sched_entity *entity)
>  }
>
>  /**
> - * Submit a job to the job queue
> + * drm_sched_entity_push_job - Submit a job to the entity's job queue
>   *
> - * @sched_job          The pointer to job required to submit
> + * @sched_job: job to submit
> + * @entity: scheduler entity
>   *
>   * Note: To guarantee that the order of insertion to queue matches
>   * the job's fence sequence number this function should be
> @@ -506,6 +579,13 @@ static void drm_sched_job_timedout(struct work_struct *work)
>         job->sched->ops->timedout_job(job);
>  }
>
> +/**
> + * drm_sched_hw_job_reset - stop the scheduler if it contains the bad job
> + *
> + * @sched: scheduler instance
> + * @bad: bad scheduler job
> + *
> + */
>  void drm_sched_hw_job_reset(struct drm_gpu_scheduler *sched, struct drm_sched_job *bad)
>  {
>         struct drm_sched_job *s_job;
> @@ -550,6 +630,12 @@ void drm_sched_hw_job_reset(struct drm_gpu_scheduler *sched, struct drm_sched_jo
>  }
>  EXPORT_SYMBOL(drm_sched_hw_job_reset);
>
> +/**
> + * drm_sched_job_recovery - recover jobs after a reset
> + *
> + * @sched: scheduler instance
> + *
> + */
>  void drm_sched_job_recovery(struct drm_gpu_scheduler *sched)
>  {
>         struct drm_sched_job *s_job, *tmp;
> @@ -599,10 +685,17 @@ void drm_sched_job_recovery(struct drm_gpu_scheduler *sched)
>  EXPORT_SYMBOL(drm_sched_job_recovery);
>
>  /**
> - * Init a sched_job with basic field
> + * drm_sched_job_init - init a scheduler job
>   *
> - * Note: Refer to drm_sched_entity_push_job documentation
> + * @job: scheduler job to init
> + * @sched: scheduler instance
> + * @entity: scheduler entity to use
> + * @owner: job owner for debugging
> + *
> + * Refer to drm_sched_entity_push_job() documentation
>   * for locking considerations.
> + *
> + * Returns 0 for success, negative error code otherwise.
>   */
>  int drm_sched_job_init(struct drm_sched_job *job,
>                        struct drm_gpu_scheduler *sched,
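
To make the drm_sched_job_init()/drm_sched_entity_push_job() pairing
documented above concrete, submission in a driver would look roughly
like this; foo_* is invented, and the common lock that the push_job
comment asks for plus the error paths are trimmed:

  struct foo_job {
          struct drm_sched_job base;
          /* ... command stream ... */
  };

  int foo_submit(struct foo_ctx *ctx, struct foo_ring *ring, void *owner)
  {
          struct foo_job *fjob = kzalloc(sizeof(*fjob), GFP_KERNEL);
          int r;

          if (!fjob)
                  return -ENOMEM;

          r = drm_sched_job_init(&fjob->base, &ring->sched,
                                 &ctx->entity, owner);
          if (r) {
                  kfree(fjob);
                  return r;
          }

          /* &fjob->base.s_fence->finished is the out-fence to hand out */

          drm_sched_entity_push_job(&fjob->base, &ctx->entity);
          return 0;
  }
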
> @@ -626,7 +719,11 @@ int drm_sched_job_init(struct drm_sched_job *job,
>  EXPORT_SYMBOL(drm_sched_job_init);
>
>  /**
> - * Return ture if we can push more jobs to the hw.
> + * drm_sched_ready - is the scheduler ready
> + *
> + * @sched: scheduler instance
> + *
> + * Return true if we can push more jobs to the hw, otherwise false.
>   */
>  static bool drm_sched_ready(struct drm_gpu_scheduler *sched)
>  {
> @@ -635,7 +732,10 @@ static bool drm_sched_ready(struct drm_gpu_scheduler *sched)
>  }
>
>  /**
> - * Wake up the scheduler when it is ready
> + * drm_sched_wakeup - Wake up the scheduler when it is ready
> + *
> + * @sched: scheduler instance
> + *
>   */
>  static void drm_sched_wakeup(struct drm_gpu_scheduler *sched)
>  {
> @@ -644,8 +744,12 @@ static void drm_sched_wakeup(struct drm_gpu_scheduler *sched)
>  }
>
>  /**
> - * Select next entity to process
> -*/
> + * drm_sched_select_entity - Select next entity to process
> + *
> + * @sched: scheduler instance
> + *
> + * Returns the entity to process or NULL if none are found.
> + */
>  static struct drm_sched_entity *
>  drm_sched_select_entity(struct drm_gpu_scheduler *sched)
>  {
> @@ -665,6 +769,14 @@ drm_sched_select_entity(struct drm_gpu_scheduler *sched)
>         return entity;
>  }
>
> +/**
> + * drm_sched_process_job - process a job
> + *
> + * @f: fence
> + * @cb: fence callbacks
> + *
> + * Called after a job has finished execution.
> + */
>  static void drm_sched_process_job(struct dma_fence *f, struct dma_fence_cb *cb)
>  {
>         struct drm_sched_fence *s_fence =
> @@ -680,6 +792,13 @@ static void drm_sched_process_job(struct dma_fence *f, struct dma_fence_cb *cb)
>         wake_up_interruptible(&sched->wake_up_worker);
>  }
>
> +/**
> + * drm_sched_blocked - check if the scheduler is blocked
> + *
> + * @sched: scheduler instance
> + *
> + * Returns true if blocked, otherwise false.
> + */
>  static bool drm_sched_blocked(struct drm_gpu_scheduler *sched)
>  {
>         if (kthread_should_park()) {
> @@ -690,6 +809,13 @@ static bool drm_sched_blocked(struct drm_gpu_scheduler *sched)
>         return false;
>  }
>
> +/**
> + * drm_sched_main - main scheduler thread
> + *
> + * @param: scheduler instance
> + *
> + * Returns 0.
> + */
>  static int drm_sched_main(void *param)
>  {
>         struct sched_param sparam = {.sched_priority = 1};
> @@ -744,15 +870,17 @@ static int drm_sched_main(void *param)
>  }
>
>  /**
> - * Init a gpu scheduler instance
> + * drm_sched_init - Init a gpu scheduler instance
>   *
> - * @sched              The pointer to the scheduler
> - * @ops                        The backend operations for this scheduler.
> - * @hw_submissions     Number of hw submissions to do.
> - * @name               Name used for debugging
> + * @sched: scheduler instance
> + * @ops: backend operations for this scheduler
> + * @hw_submission: number of hw submissions that can be in flight
> + * @hang_limit: number of times to allow a job to hang before dropping it
> + * @timeout: timeout value in jiffies for the scheduler
> + * @name: name used for debugging
>   *
>   * Return 0 on success, otherwise error code.
> -*/
> + */
>  int drm_sched_init(struct drm_gpu_scheduler *sched,
>                    const struct drm_sched_backend_ops *ops,
>                    unsigned hw_submission,
> @@ -788,9 +916,11 @@ int drm_sched_init(struct drm_gpu_scheduler *sched,
>  EXPORT_SYMBOL(drm_sched_init);
>
>  /**
> - * Destroy a gpu scheduler
> + * drm_sched_fini - Destroy a gpu scheduler
> + *
> + * @sched: scheduler instance
>   *
> - * @sched      The pointer to the scheduler
> + * Tears down and cleans up the scheduler.
>   */
>  void drm_sched_fini(struct drm_gpu_scheduler *sched)
>  {
> diff --git a/include/drm/gpu_scheduler.h b/include/drm/gpu_scheduler.h
> index dec655894d08..496442f12bff 100644
> --- a/include/drm/gpu_scheduler.h
> +++ b/include/drm/gpu_scheduler.h
> @@ -43,13 +43,33 @@ enum drm_sched_priority {
>  };
>
>  /**
> - * drm_sched_entity - A wrapper around a job queue (typically attached
> - * to the DRM file_priv).
> + * struct drm_sched_entity - A wrapper around a job queue (typically
> + * attached to the DRM file_priv).
> + *
> + * @list: used to append this struct to the list of entities in the
> + *        runqueue.
> + * @rq: runqueue to which this entity belongs.
> + * @rq_lock: lock to modify the runqueue to which this entity belongs.
> + * @sched: the scheduler instance to which this entity is enqueued.
> + * @job_queue: the list of jobs of this entity.
> + * @fence_seq: a linearly increasing seqno incremented with each
> + *             new &drm_sched_fence which is part of the entity.
> + * @fence_context: a unique context for all the fences which belong
> + *                 to this entity.
> + *                 The &drm_sched_fence.scheduled uses the
> + *                 fence_context but &drm_sched_fence.finished uses
> + *                 fence_context + 1.
> + * @dependency: the dependency fence of the job which is on the top
> + *              of the job queue.
> + * @cb: callback for the dependency fence above.
> + * @guilty: points to ctx's guilty.
> + * @fini_status: contains the exit status in case the process was signalled.
> + * @last_scheduled: points to the finished fence of the last scheduled job.
>   *
>   * Entities will emit jobs in order to their corresponding hardware
>   * ring, and the scheduler will alternate between entities based on
>   * scheduling policy.
> -*/
> + */
>  struct drm_sched_entity {
>         struct list_head                list;
>         struct drm_sched_rq             *rq;
> @@ -63,47 +83,96 @@ struct drm_sched_entity {
>
>         struct dma_fence                *dependency;
>         struct dma_fence_cb             cb;
> -       atomic_t                        *guilty; /* points to ctx's guilty */
> -       int            fini_status;
> -       struct dma_fence    *last_scheduled;
> +       atomic_t                        *guilty;
> +       int                             fini_status;
> +       struct dma_fence                *last_scheduled;
>  };
>
>  /**
> + * struct drm_sched_rq - queue of entities to be scheduled.
> + *
> + * @lock: to modify the entities list.
> + * @entities: list of the entities to be scheduled.
> + * @current_entity: the entity which is to be scheduled.
> + *
>   * Run queue is a set of entities scheduling command submissions for
>   * one specific ring. It implements the scheduling policy that selects
>   * the next entity to emit commands from.
> -*/
> + */
>  struct drm_sched_rq {
>         spinlock_t                      lock;
>         struct list_head                entities;
>         struct drm_sched_entity         *current_entity;
>  };
>
> +/**
> + * struct drm_sched_fence - fences corresponding to the scheduling of a job.
> + */
>  struct drm_sched_fence {
> +        /**
> +         * @scheduled: this fence is what will be signaled by the scheduler
> +         * when the job is scheduled.
> +         */
>         struct dma_fence                scheduled;
>
> -       /* This fence is what will be signaled by the scheduler when
> -        * the job is completed.
> -        *
> -        * When setting up an out fence for the job, you should use
> -        * this, since it's available immediately upon
> -        * drm_sched_job_init(), and the fence returned by the driver
> -        * from run_job() won't be created until the dependencies have
> -        * resolved.
> -        */
> +        /**
> +         * @finished: this fence is what will be signaled by the scheduler
> +         * when the job is completed.
> +         *
> +         * When setting up an out fence for the job, you should use
> +         * this, since it's available immediately upon
> +         * drm_sched_job_init(), and the fence returned by the driver
> +         * from run_job() won't be created until the dependencies have
> +         * resolved.
> +         */
>         struct dma_fence                finished;
>
> +        /**
> +         * @cb: the callback for the parent fence below.
> +         */
>         struct dma_fence_cb             cb;
> +        /**
> +         * @parent: the fence returned by &drm_sched_backend_ops.run_job
> +         * when scheduling the job on hardware. We signal the
> +         * &drm_sched_fence.finished fence once parent is signalled.
> +         */
>         struct dma_fence                *parent;
> +        /**
> +         * @sched: the scheduler instance to which the job having this struct
> +         * belongs.
> +         */
>         struct drm_gpu_scheduler        *sched;
> +        /**
> +         * @lock: the lock used by the scheduled and the finished fences.
> +         */
>         spinlock_t                      lock;
> +        /**
> +         * @owner: job owner for debugging
> +         */
>         void                            *owner;
>  };
>
>  struct drm_sched_fence *to_drm_sched_fence(struct dma_fence *f);
>
>  /**
> - * drm_sched_job - A job to be run by an entity.
> + * struct drm_sched_job - A job to be run by an entity.
> + *
> + * @queue_node: used to append this struct to the queue of jobs in an entity.
> + * @sched: the scheduler instance on which this job is scheduled.
> + * @s_fence: contains the fences for the scheduling of the job.
> + * @finish_cb: the callback for the finished fence.
> + * @finish_work: schedules the function drm_sched_job_finish() once the job has
> + *               finished to remove the job from the
> + *               &drm_gpu_scheduler.ring_mirror_list.
> + * @node: used to append this struct to the &drm_gpu_scheduler.ring_mirror_list.
> + * @work_tdr: schedules a delayed call to drm_sched_job_timedout() after the timeout
> + *            interval is over.
> + * @id: a unique id assigned to each job scheduled on the scheduler.
> + * @karma: incremented on every hang caused by this job. If this exceeds the hang
> + *         limit of the scheduler then the job is marked guilty and will not
> + *         be scheduled further.
> + * @s_priority: the priority of the job.
> + * @entity: the entity to which this job belongs.
>   *
>   * A job is created by the driver using drm_sched_job_init(), and
>   * should call drm_sched_entity_push_job() once it wants the scheduler
> @@ -130,38 +199,64 @@ static inline bool drm_sched_invalidate_job(struct drm_sched_job *s_job,
>  }
>
>  /**
> + * struct drm_sched_backend_ops
> + *
>   * Define the backend operations called by the scheduler,
> - * these functions should be implemented in driver side
> -*/
> + * these functions should be implemented on the driver side.
> + */
>  struct drm_sched_backend_ops {
> -       /* Called when the scheduler is considering scheduling this
> -        * job next, to get another struct dma_fence for this job to
> +       /**
> +         * @dependency: Called when the scheduler is considering scheduling
> +         * this job next, to get another struct dma_fence for this job to
>          * block on.  Once it returns NULL, run_job() may be called.
>          */
>         struct dma_fence *(*dependency)(struct drm_sched_job *sched_job,
>                                         struct drm_sched_entity *s_entity);
>
> -       /* Called to execute the job once all of the dependencies have
> -        * been resolved.  This may be called multiple times, if
> +       /**
> +         * @run_job: Called to execute the job once all of the dependencies
> +         * have been resolved.  This may be called multiple times, if
>          * timedout_job() has happened and drm_sched_job_recovery()
>          * decides to try it again.
>          */
>         struct dma_fence *(*run_job)(struct drm_sched_job *sched_job);
>
> -       /* Called when a job has taken too long to execute, to trigger
> -        * GPU recovery.
> +       /**
> +         * @timedout_job: Called when a job has taken too long to execute,
> +         * to trigger GPU recovery.
>          */
>         void (*timedout_job)(struct drm_sched_job *sched_job);
>
> -       /* Called once the job's finished fence has been signaled and
> -        * it's time to clean it up.
> +       /**
> +         * @free_job: Called once the job's finished fence has been signaled
> +         * and it's time to clean it up.
>          */
>         void (*free_job)(struct drm_sched_job *sched_job);
>  };
>
>  /**
> - * One scheduler is implemented for each hardware ring
> -*/
> + * struct drm_gpu_scheduler
> + *
> + * @ops: backend operations provided by the driver.
> + * @hw_submission_limit: the max size of the hardware queue.
> + * @timeout: the time after which a job is removed from the scheduler.
> + * @name: name of the ring for which this scheduler is being used.
> + * @sched_rq: priority wise array of run queues.
> + * @wake_up_worker: the wait queue on which the scheduler sleeps until a job
> + *                  is ready to be scheduled.
> + * @job_scheduled: once drm_sched_entity_do_release() is called the scheduler
> + *                 waits on this wait queue until all the scheduled jobs are
> + *                 finished.
> + * @hw_rq_count: the number of jobs currently in the hardware queue.
> + * @job_id_count: used to assign a unique id to each job.
> + * @thread: the kthread on which the scheduler runs.
> + * @ring_mirror_list: the list of jobs which are currently in the job queue.
> + * @job_list_lock: lock to protect the ring_mirror_list.
> + * @hang_limit: once the hangs caused by a job cross this limit then it is marked
> + *              guilty and it will not be considered for scheduling further.
> + *
> + * One scheduler is implemented for each hardware ring.
> + */
>  struct drm_gpu_scheduler {
>         const struct drm_sched_backend_ops      *ops;
>         uint32_t                        hw_submission_limit;
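
For illustration, a minimal driver-side implementation of the backend
ops documented above has roughly this shape; to_foo_job(),
foo_job_next_dependency() and foo_ring_emit() are hypothetical helpers:

  static struct dma_fence *foo_dependency(struct drm_sched_job *sched_job,
                                          struct drm_sched_entity *s_entity)
  {
          /* next unsignaled fence this job still waits on, NULL when done */
          return foo_job_next_dependency(to_foo_job(sched_job));
  }

  static struct dma_fence *foo_run_job(struct drm_sched_job *sched_job)
  {
          /* push the command stream to the ring, return the hw fence */
          return foo_ring_emit(to_foo_job(sched_job));
  }

  static void foo_timedout_job(struct drm_sched_job *sched_job)
  {
          /* kick off reset/recovery, see drm_sched_hw_job_reset() */
  }

  static void foo_free_job(struct drm_sched_job *sched_job)
  {
          kfree(to_foo_job(sched_job));
  }

  static const struct drm_sched_backend_ops foo_sched_ops = {
          .dependency   = foo_dependency,
          .run_job      = foo_run_job,
          .timedout_job = foo_timedout_job,
          .free_job     = foo_free_job,
  };
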
> --
> 2.14.3
>



-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 2/3] drm/scheduler: add documentation
  2018-05-28  8:09   ` Daniel Vetter
@ 2018-05-28  8:31     ` Nayan Deshmukh
  2018-05-29  5:53       ` [PATCH v2] " Nayan Deshmukh
  2018-05-29  8:05       ` [PATCH 2/3] " Daniel Vetter
  0 siblings, 2 replies; 13+ messages in thread
From: Nayan Deshmukh @ 2018-05-28  8:31 UTC (permalink / raw)
  To: Daniel Vetter; +Cc: Alex Deucher, Christian König, dri-devel

I have done that already and sent a patch along with this one, the last
patch of this series. I have tried to take care of all the hyperlinks.

On Mon, May 28, 2018 at 1:39 PM, Daniel Vetter <daniel@ffwll.ch> wrote:
> On Fri, May 25, 2018 at 6:45 AM, Nayan Deshmukh
> <nayan26deshmukh@gmail.com> wrote:
>> convert existing raw comments into kernel-doc format as well
>> as add new documentation
>>
>> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
>> Signed-off-by: Nayan Deshmukh <nayan26deshmukh@gmail.com>
>> ---
>>  drivers/gpu/drm/scheduler/gpu_scheduler.c | 214 ++++++++++++++++++++++++------
>>  include/drm/gpu_scheduler.h               | 153 +++++++++++++++++----
>>  2 files changed, 296 insertions(+), 71 deletions(-)
>
> Please also include all the new scheduler docs into
> Documentation/gpu/drm-mm.rst (I think that's the most suitable place)
> and make sure the resulting docs look good and hyperlinks all work
> correctly using
>
> $ make htmldocs
>
> Thanks, Daniel
>
>> [snip]
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 13+ messages in thread

* [PATCH v2] drm/scheduler: add documentation
  2018-05-28  8:31     ` Nayan Deshmukh
@ 2018-05-29  5:53       ` Nayan Deshmukh
  2018-05-29  6:38         ` Christian König
  2018-05-29  8:05       ` [PATCH 2/3] " Daniel Vetter
  1 sibling, 1 reply; 13+ messages in thread
From: Nayan Deshmukh @ 2018-05-29  5:53 UTC (permalink / raw)
  To: dri-devel; +Cc: Nayan Deshmukh, Alex Deucher, christian.koenig

convert existing raw comments into kernel-doc format as well
as add new documentation

v2: reword the overview

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Nayan Deshmukh <nayan26deshmukh@gmail.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
---
 drivers/gpu/drm/scheduler/gpu_scheduler.c | 214 ++++++++++++++++++++++++------
 include/drm/gpu_scheduler.h               | 153 +++++++++++++++++----
 2 files changed, 296 insertions(+), 71 deletions(-)

diff --git a/drivers/gpu/drm/scheduler/gpu_scheduler.c b/drivers/gpu/drm/scheduler/gpu_scheduler.c
index 44d480768dfe..8c1e80c9b674 100644
--- a/drivers/gpu/drm/scheduler/gpu_scheduler.c
+++ b/drivers/gpu/drm/scheduler/gpu_scheduler.c
@@ -21,6 +21,29 @@
  *
  */
 
+/**
+ * DOC: Overview
+ *
+ * The GPU scheduler provides entities which allow userspace to push jobs
+ * into software queues which are then scheduled on a hardware run queue.
+ * The software queues have a priority among them. The scheduler selects the entities
+ * from the run queue using a FIFO. The scheduler provides dependency handling
+ * features among jobs. The driver is supposed to provide callback functions for
+ * backend operations to the scheduler like submitting a job to the hardware run queue,
+ * returning the dependencies of a job etc.
+ *
+ * The organisation of the scheduler is the following:
+ *
+ * 1. Each hw run queue has one scheduler
+ * 2. Each scheduler has multiple run queues with different priorities
+ *    (e.g., HIGH_HW, HIGH_SW, KERNEL, NORMAL)
+ * 3. Each scheduler run queue has a queue of entities to schedule
+ * 4. Entities themselves maintain a queue of jobs that will be scheduled on
+ *    the hardware.
+ *
+ * The jobs in an entity are always scheduled in the order that they were pushed.
+ */
+
 #include <linux/kthread.h>
 #include <linux/wait.h>
 #include <linux/sched.h>
@@ -39,7 +62,13 @@ static bool drm_sched_entity_is_ready(struct drm_sched_entity *entity);
 static void drm_sched_wakeup(struct drm_gpu_scheduler *sched);
 static void drm_sched_process_job(struct dma_fence *f, struct dma_fence_cb *cb);
 
-/* Initialize a given run queue struct */
+/**
+ * drm_sched_rq_init - initialize a given run queue struct
+ *
+ * @rq: scheduler run queue
+ *
+ * Initializes a scheduler runqueue.
+ */
 static void drm_sched_rq_init(struct drm_sched_rq *rq)
 {
 	spin_lock_init(&rq->lock);
@@ -47,6 +76,14 @@ static void drm_sched_rq_init(struct drm_sched_rq *rq)
 	rq->current_entity = NULL;
 }
 
+/**
+ * drm_sched_rq_add_entity - add an entity
+ *
+ * @rq: scheduler run queue
+ * @entity: scheduler entity
+ *
+ * Adds a scheduler entity to the run queue.
+ */
 static void drm_sched_rq_add_entity(struct drm_sched_rq *rq,
 				    struct drm_sched_entity *entity)
 {
@@ -57,6 +94,14 @@ static void drm_sched_rq_add_entity(struct drm_sched_rq *rq,
 	spin_unlock(&rq->lock);
 }
 
+/**
+ * drm_sched_rq_remove_entity - remove an entity
+ *
+ * @rq: scheduler run queue
+ * @entity: scheduler entity
+ *
+ * Removes a scheduler entity from the run queue.
+ */
 static void drm_sched_rq_remove_entity(struct drm_sched_rq *rq,
 				       struct drm_sched_entity *entity)
 {
@@ -70,9 +115,9 @@ static void drm_sched_rq_remove_entity(struct drm_sched_rq *rq,
 }
 
 /**
- * Select an entity which could provide a job to run
+ * drm_sched_rq_select_entity - Select an entity which could provide a job to run
  *
- * @rq		The run queue to check.
+ * @rq: scheduler run queue to check.
  *
  * Try to find a ready entity, returns NULL if none found.
  */
@@ -112,15 +157,16 @@ drm_sched_rq_select_entity(struct drm_sched_rq *rq)
 }
 
 /**
- * Init a context entity used by scheduler when submit to HW ring.
+ * drm_sched_entity_init - Init a context entity used by the scheduler when
+ * submitting to the HW ring.
  *
- * @sched	The pointer to the scheduler
- * @entity	The pointer to a valid drm_sched_entity
- * @rq		The run queue this entity belongs
- * @guilty      atomic_t set to 1 when a job on this queue
- *              is found to be guilty causing a timeout
+ * @sched: scheduler instance
+ * @entity: scheduler entity to init
+ * @rq: the run queue this entity belongs
+ * @guilty: atomic_t set to 1 when a job on this queue
+ *          is found to be guilty causing a timeout
  *
- * return 0 if succeed. negative error code on failure
+ * Returns 0 on success or a negative error code on failure.
 */
 int drm_sched_entity_init(struct drm_gpu_scheduler *sched,
 			  struct drm_sched_entity *entity,
@@ -149,10 +195,10 @@ int drm_sched_entity_init(struct drm_gpu_scheduler *sched,
 EXPORT_SYMBOL(drm_sched_entity_init);
 
 /**
- * Query if entity is initialized
+ * drm_sched_entity_is_initialized - Query if entity is initialized
  *
- * @sched       Pointer to scheduler instance
- * @entity	The pointer to a valid scheduler entity
+ * @sched: Pointer to scheduler instance
+ * @entity: The pointer to a valid scheduler entity
  *
  * return true if entity is initialized, false otherwise
 */
@@ -164,11 +210,11 @@ static bool drm_sched_entity_is_initialized(struct drm_gpu_scheduler *sched,
 }
 
 /**
- * Check if entity is idle
+ * drm_sched_entity_is_idle - Check if entity is idle
  *
- * @entity	The pointer to a valid scheduler entity
+ * @entity: scheduler entity
  *
- * Return true if entity don't has any unscheduled jobs.
+ * Returns true if the entity does not have any unscheduled jobs.
  */
 static bool drm_sched_entity_is_idle(struct drm_sched_entity *entity)
 {
@@ -180,9 +226,9 @@ static bool drm_sched_entity_is_idle(struct drm_sched_entity *entity)
 }
 
 /**
- * Check if entity is ready
+ * drm_sched_entity_is_ready - Check if entity is ready
  *
- * @entity	The pointer to a valid scheduler entity
+ * @entity: scheduler entity
  *
  * Return true if entity could provide a job.
  */
@@ -210,12 +256,12 @@ static void drm_sched_entity_kill_jobs_cb(struct dma_fence *f,
 
 
 /**
- * Destroy a context entity
+ * drm_sched_entity_do_release - Destroy a context entity
  *
- * @sched       Pointer to scheduler instance
- * @entity	The pointer to a valid scheduler entity
+ * @sched: scheduler instance
+ * @entity: scheduler entity
  *
- * Splitting drm_sched_entity_fini() into two functions, The first one is does the waiting,
+ * Splitting drm_sched_entity_fini() into two functions, the first one does the waiting,
  * removes the entity from the runqueue and returns an error when the process was killed.
  */
 void drm_sched_entity_do_release(struct drm_gpu_scheduler *sched,
@@ -237,12 +283,13 @@ void drm_sched_entity_do_release(struct drm_gpu_scheduler *sched,
 EXPORT_SYMBOL(drm_sched_entity_do_release);
 
 /**
- * Destroy a context entity
+ * drm_sched_entity_cleanup - Destroy a context entity
  *
- * @sched       Pointer to scheduler instance
- * @entity	The pointer to a valid scheduler entity
+ * @sched: scheduler instance
+ * @entity: scheduler entity
  *
- * The second one then goes over the entity and signals all jobs with an error code.
+ * This should be called after drm_sched_entity_do_release(). It goes over the
+ * entity and signals all jobs with an error code if the process was killed.
  */
 void drm_sched_entity_cleanup(struct drm_gpu_scheduler *sched,
 			   struct drm_sched_entity *entity)
@@ -281,6 +328,14 @@ void drm_sched_entity_cleanup(struct drm_gpu_scheduler *sched,
 }
 EXPORT_SYMBOL(drm_sched_entity_cleanup);
 
+/**
+ * drm_sched_entity_fini - Destroy a context entity
+ *
+ * @sched: scheduler instance
+ * @entity: scheduler entity
+ *
+ * Calls drm_sched_entity_do_release() and drm_sched_entity_cleanup()
+ */
 void drm_sched_entity_fini(struct drm_gpu_scheduler *sched,
 				struct drm_sched_entity *entity)
 {
@@ -306,6 +361,15 @@ static void drm_sched_entity_clear_dep(struct dma_fence *f, struct dma_fence_cb
 	dma_fence_put(f);
 }
 
+/**
+ * drm_sched_entity_set_rq - Sets the run queue for an entity
+ *
+ * @entity: scheduler entity
+ * @rq: scheduler run queue
+ *
+ * Sets the run queue for an entity and removes the entity from the previous
+ * run queue in which it was present.
+ */
 void drm_sched_entity_set_rq(struct drm_sched_entity *entity,
 			     struct drm_sched_rq *rq)
 {
@@ -325,6 +389,14 @@ void drm_sched_entity_set_rq(struct drm_sched_entity *entity,
 }
 EXPORT_SYMBOL(drm_sched_entity_set_rq);
 
+/**
+ * drm_sched_dependency_optimized - test if the dependency can be optimized
+ *
+ * @fence: the dependency fence
+ * @entity: the entity which depends on the above fence
+ *
+ * Returns true if the dependency can be optimized and false otherwise
+ */
 bool drm_sched_dependency_optimized(struct dma_fence* fence,
 				    struct drm_sched_entity *entity)
 {
@@ -413,9 +485,10 @@ drm_sched_entity_pop_job(struct drm_sched_entity *entity)
 }
 
 /**
- * Submit a job to the job queue
+ * drm_sched_entity_push_job - Submit a job to the entity's job queue
  *
- * @sched_job		The pointer to job required to submit
+ * @sched_job: job to submit
+ * @entity: scheduler entity
  *
  * Note: To guarantee that the order of insertion to queue matches
  * the job's fence sequence number this function should be
@@ -506,6 +579,13 @@ static void drm_sched_job_timedout(struct work_struct *work)
 	job->sched->ops->timedout_job(job);
 }
 
+/**
+ * drm_sched_hw_job_reset - stop the scheduler if it contains the bad job
+ *
+ * @sched: scheduler instance
+ * @bad: bad scheduler job
+ *
+ */
 void drm_sched_hw_job_reset(struct drm_gpu_scheduler *sched, struct drm_sched_job *bad)
 {
 	struct drm_sched_job *s_job;
@@ -550,6 +630,12 @@ void drm_sched_hw_job_reset(struct drm_gpu_scheduler *sched, struct drm_sched_jo
 }
 EXPORT_SYMBOL(drm_sched_hw_job_reset);
 
+/**
+ * drm_sched_job_recovery - recover jobs after a reset
+ *
+ * @sched: scheduler instance
+ *
+ */
 void drm_sched_job_recovery(struct drm_gpu_scheduler *sched)
 {
 	struct drm_sched_job *s_job, *tmp;
@@ -599,10 +685,17 @@ void drm_sched_job_recovery(struct drm_gpu_scheduler *sched)
 EXPORT_SYMBOL(drm_sched_job_recovery);
 
 /**
- * Init a sched_job with basic field
+ * drm_sched_job_init - init a scheduler job
  *
- * Note: Refer to drm_sched_entity_push_job documentation
+ * @job: scheduler job to init
+ * @sched: scheduler instance
+ * @entity: scheduler entity to use
+ * @owner: job owner for debugging
+ *
+ * Refer to drm_sched_entity_push_job() documentation
  * for locking considerations.
+ *
+ * Returns 0 for success, negative error code otherwise.
  */
 int drm_sched_job_init(struct drm_sched_job *job,
 		       struct drm_gpu_scheduler *sched,
@@ -626,7 +719,11 @@ int drm_sched_job_init(struct drm_sched_job *job,
 EXPORT_SYMBOL(drm_sched_job_init);
 
 /**
- * Return ture if we can push more jobs to the hw.
+ * drm_sched_ready - is the scheduler ready
+ *
+ * @sched: scheduler instance
+ *
+ * Return true if we can push more jobs to the hw, otherwise false.
  */
 static bool drm_sched_ready(struct drm_gpu_scheduler *sched)
 {
@@ -635,7 +732,10 @@ static bool drm_sched_ready(struct drm_gpu_scheduler *sched)
 }
 
 /**
- * Wake up the scheduler when it is ready
+ * drm_sched_wakeup - Wake up the scheduler when it is ready
+ *
+ * @sched: scheduler instance
+ *
  */
 static void drm_sched_wakeup(struct drm_gpu_scheduler *sched)
 {
@@ -644,8 +744,12 @@ static void drm_sched_wakeup(struct drm_gpu_scheduler *sched)
 }
 
 /**
- * Select next entity to process
-*/
+ * drm_sched_select_entity - Select next entity to process
+ *
+ * @sched: scheduler instance
+ *
+ * Returns the entity to process or NULL if none are found.
+ */
 static struct drm_sched_entity *
 drm_sched_select_entity(struct drm_gpu_scheduler *sched)
 {
@@ -665,6 +769,14 @@ drm_sched_select_entity(struct drm_gpu_scheduler *sched)
 	return entity;
 }
 
+/**
+ * drm_sched_process_job - process a job
+ *
+ * @f: fence
+ * @cb: fence callback
+ *
+ * Called after the job has finished execution.
+ */
 static void drm_sched_process_job(struct dma_fence *f, struct dma_fence_cb *cb)
 {
 	struct drm_sched_fence *s_fence =
@@ -680,6 +792,13 @@ static void drm_sched_process_job(struct dma_fence *f, struct dma_fence_cb *cb)
 	wake_up_interruptible(&sched->wake_up_worker);
 }
 
+/**
+ * drm_sched_blocked - check if the scheduler is blocked
+ *
+ * @sched: scheduler instance
+ *
+ * Returns true if blocked, otherwise false.
+ */
 static bool drm_sched_blocked(struct drm_gpu_scheduler *sched)
 {
 	if (kthread_should_park()) {
@@ -690,6 +809,13 @@ static bool drm_sched_blocked(struct drm_gpu_scheduler *sched)
 	return false;
 }
 
+/**
+ * drm_sched_main - main scheduler thread
+ *
+ * @param: scheduler instance
+ *
+ * Returns 0.
+ */
 static int drm_sched_main(void *param)
 {
 	struct sched_param sparam = {.sched_priority = 1};
@@ -744,15 +870,17 @@ static int drm_sched_main(void *param)
 }
 
 /**
- * Init a gpu scheduler instance
+ * drm_sched_init - Init a gpu scheduler instance
  *
- * @sched		The pointer to the scheduler
- * @ops			The backend operations for this scheduler.
- * @hw_submissions	Number of hw submissions to do.
- * @name		Name used for debugging
+ * @sched: scheduler instance
+ * @ops: backend operations for this scheduler
+ * @hw_submission: number of hw submissions that can be in flight
+ * @hang_limit: number of times to allow a job to hang before dropping it
+ * @timeout: timeout value in jiffies for the scheduler
+ * @name: name used for debugging
  *
  * Return 0 on success, otherwise error code.
-*/
+ */
 int drm_sched_init(struct drm_gpu_scheduler *sched,
 		   const struct drm_sched_backend_ops *ops,
 		   unsigned hw_submission,
@@ -788,9 +916,11 @@ int drm_sched_init(struct drm_gpu_scheduler *sched,
 EXPORT_SYMBOL(drm_sched_init);
 
 /**
- * Destroy a gpu scheduler
+ * drm_sched_fini - Destroy a gpu scheduler
+ *
+ * @sched: scheduler instance
  *
- * @sched	The pointer to the scheduler
+ * Tears down and cleans up the scheduler.
  */
 void drm_sched_fini(struct drm_gpu_scheduler *sched)
 {
diff --git a/include/drm/gpu_scheduler.h b/include/drm/gpu_scheduler.h
index dec655894d08..496442f12bff 100644
--- a/include/drm/gpu_scheduler.h
+++ b/include/drm/gpu_scheduler.h
@@ -43,13 +43,33 @@ enum drm_sched_priority {
 };
 
 /**
- * drm_sched_entity - A wrapper around a job queue (typically attached
- * to the DRM file_priv).
+ * struct drm_sched_entity - A wrapper around a job queue (typically
+ * attached to the DRM file_priv).
+ *
+ * @list: used to append this struct to the list of entities in the
+ *        runqueue.
+ * @rq: runqueue to which this entity belongs.
+ * @rq_lock: lock to modify the runqueue to which this entity belongs.
+ * @sched: the scheduler instance to which this entity is enqueued.
+ * @job_queue: the list of jobs of this entity.
+ * @fence_seq: a linearly increasing seqno incremented with each
+ *             new &drm_sched_fence which is part of the entity.
+ * @fence_context: a unique context for all the fences which belong
+ *                 to this entity.
+ *                 The &drm_sched_fence.scheduled uses the
+ *                 fence_context but &drm_sched_fence.finished uses
+ *                 fence_context + 1.
+ * @dependency: the dependency fence of the job which is on the top
+ *              of the job queue.
+ * @cb: callback for the dependency fence above.
+ * @guilty: points to ctx's guilty, an atomic_t set to 1 when a job on this entity caused a timeout.
+ * @fini_status: contains the exit status in case the process was signalled.
+ * @last_scheduled: points to the finished fence of the last scheduled job.
  *
  * Entities will emit jobs in order to their corresponding hardware
  * ring, and the scheduler will alternate between entities based on
  * scheduling policy.
-*/
+ */
 struct drm_sched_entity {
 	struct list_head		list;
 	struct drm_sched_rq		*rq;
@@ -63,47 +83,96 @@ struct drm_sched_entity {
 
 	struct dma_fence		*dependency;
 	struct dma_fence_cb		cb;
-	atomic_t			*guilty; /* points to ctx's guilty */
-	int            fini_status;
-	struct dma_fence    *last_scheduled;
+	atomic_t			*guilty;
+	int                             fini_status;
+	struct dma_fence                *last_scheduled;
 };
 
 /**
+ * struct drm_sched_rq - queue of entities to be scheduled.
+ *
+ * @lock: to modify the entities list.
+ * @entities: list of the entities to be scheduled.
+ * @current_entity: the entity which is to be scheduled.
+ *
  * Run queue is a set of entities scheduling command submissions for
  * one specific ring. It implements the scheduling policy that selects
  * the next entity to emit commands from.
-*/
+ */
 struct drm_sched_rq {
 	spinlock_t			lock;
 	struct list_head		entities;
 	struct drm_sched_entity		*current_entity;
 };
 
+/**
+ * struct drm_sched_fence - fences corresponding to the scheduling of a job.
+ */
 struct drm_sched_fence {
+        /**
+         * @scheduled: this fence is what will be signaled by the scheduler
+         * when the job is scheduled.
+         */
 	struct dma_fence		scheduled;
 
-	/* This fence is what will be signaled by the scheduler when
-	 * the job is completed.
-	 *
-	 * When setting up an out fence for the job, you should use
-	 * this, since it's available immediately upon
-	 * drm_sched_job_init(), and the fence returned by the driver
-	 * from run_job() won't be created until the dependencies have
-	 * resolved.
-	 */
+        /**
+         * @finished: this fence is what will be signaled by the scheduler
+         * when the job is completed.
+         *
+         * When setting up an out fence for the job, you should use
+         * this, since it's available immediately upon
+         * drm_sched_job_init(), and the fence returned by the driver
+         * from run_job() won't be created until the dependencies have
+         * resolved.
+         */
 	struct dma_fence		finished;
 
+        /**
+         * @cb: the callback for the parent fence below.
+         */
 	struct dma_fence_cb		cb;
+        /**
+         * @parent: the fence returned by &drm_sched_backend_ops.run_job
+         * when scheduling the job on hardware. We signal the
+         * &drm_sched_fence.finished fence once parent is signalled.
+         */
 	struct dma_fence		*parent;
+        /**
+         * @sched: the scheduler instance to which the job having this struct
+         * belongs to.
+         */
 	struct drm_gpu_scheduler	*sched;
+        /**
+         * @lock: the lock used by the scheduled and the finished fences.
+         */
 	spinlock_t			lock;
+        /**
+         * @owner: job owner for debugging
+         */
 	void				*owner;
 };
 
 struct drm_sched_fence *to_drm_sched_fence(struct dma_fence *f);
 
 /**
- * drm_sched_job - A job to be run by an entity.
+ * struct drm_sched_job - A job to be run by an entity.
+ *
+ * @queue_node: used to append this struct to the queue of jobs in an entity.
+ * @sched: the scheduler instance on which this job is scheduled.
+ * @s_fence: contains the fences for the scheduling of the job.
+ * @finish_cb: the callback for the finished fence.
+ * @finish_work: schedules the function drm_sched_job_finish() once the job has
+ *               finished to remove the job from the
+ *               &drm_gpu_scheduler.ring_mirror_list.
+ * @node: used to append this struct to the &drm_gpu_scheduler.ring_mirror_list.
+ * @work_tdr: schedules a delayed call to drm_sched_job_timedout() after the
+ *            timeout interval is over.
+ * @id: a unique id assigned to each job scheduled on the scheduler.
+ * @karma: incremented on every hang caused by this job. If this exceeds the hang
+ *         limit of the scheduler then the job is marked guilty and will not
+ *         be scheduled further.
+ * @s_priority: the priority of the job.
+ * @entity: the entity to which this job belongs.
  *
  * A job is created by the driver using drm_sched_job_init(), and
  * should call drm_sched_entity_push_job() once it wants the scheduler
@@ -130,38 +199,64 @@ static inline bool drm_sched_invalidate_job(struct drm_sched_job *s_job,
 }
 
 /**
+ * struct drm_sched_backend_ops
+ *
  * Define the backend operations called by the scheduler,
- * these functions should be implemented in driver side
-*/
+ * these functions should be implemented on the driver side.
+ */
 struct drm_sched_backend_ops {
-	/* Called when the scheduler is considering scheduling this
-	 * job next, to get another struct dma_fence for this job to
+	/**
+         * @dependency: Called when the scheduler is considering scheduling
+         * this job next, to get another struct dma_fence for this job to
 	 * block on.  Once it returns NULL, run_job() may be called.
 	 */
 	struct dma_fence *(*dependency)(struct drm_sched_job *sched_job,
 					struct drm_sched_entity *s_entity);
 
-	/* Called to execute the job once all of the dependencies have
-	 * been resolved.  This may be called multiple times, if
+	/**
+         * @run_job: Called to execute the job once all of the dependencies
+         * have been resolved.  This may be called multiple times, if
 	 * timedout_job() has happened and drm_sched_job_recovery()
 	 * decides to try it again.
 	 */
 	struct dma_fence *(*run_job)(struct drm_sched_job *sched_job);
 
-	/* Called when a job has taken too long to execute, to trigger
-	 * GPU recovery.
+	/**
+         * @timedout_job: Called when a job has taken too long to execute,
+         * to trigger GPU recovery.
 	 */
 	void (*timedout_job)(struct drm_sched_job *sched_job);
 
-	/* Called once the job's finished fence has been signaled and
-	 * it's time to clean it up.
+	/**
+         * @free_job: Called once the job's finished fence has been signaled
+         * and it's time to clean it up.
 	 */
 	void (*free_job)(struct drm_sched_job *sched_job);
 };
 
 /**
- * One scheduler is implemented for each hardware ring
-*/
+ * struct drm_gpu_scheduler
+ *
+ * @ops: backend operations provided by the driver.
+ * @hw_submission_limit: the max size of the hardware queue.
+ * @timeout: the time after which a job is removed from the scheduler.
+ * @name: name of the ring for which this scheduler is being used.
+ * @sched_rq: priority wise array of run queues.
+ * @wake_up_worker: the wait queue on which the scheduler sleeps until a job
+ *                  is ready to be scheduled.
+ * @job_scheduled: once drm_sched_entity_do_release() is called the scheduler
+ *                 waits on this wait queue until all the scheduled jobs are
+ *                 finished.
+ * @hw_rq_count: the number of jobs currently in the hardware queue.
+ * @job_id_count: used to assign a unique id to each job.
+ * @thread: the kthread on which the scheduler runs.
+ * @ring_mirror_list: the list of jobs which are currently in the job queue.
+ * @job_list_lock: lock to protect the ring_mirror_list.
+ * @hang_limit: once the number of hangs caused by a job crosses this limit
+ *              then it is marked guilty and will not be considered for scheduling further.
+ *
+ * One scheduler is implemented for each hardware ring.
+ */
 struct drm_gpu_scheduler {
 	const struct drm_sched_backend_ops	*ops;
 	uint32_t			hw_submission_limit;
-- 
2.14.3


^ permalink raw reply related	[flat|nested] 13+ messages in thread

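Taken together, the callbacks and init functions documented above are all a
driver needs to wire the scheduler up: implement &drm_sched_backend_ops, create
one scheduler per ring with drm_sched_init(), bind each context's entity to one
of the scheduler's run queues with drm_sched_entity_init(), then feed jobs
through drm_sched_job_init() and drm_sched_entity_push_job(). A minimal
driver-side sketch of that flow (not code from this series; the my_* helpers,
the queue depth, the timeout and the NORMAL run queue choice are illustrative
assumptions):

#include <drm/gpu_scheduler.h>
#include <linux/jiffies.h>
#include <linux/slab.h>

struct my_job {
	struct drm_sched_job base;	/* must embed drm_sched_job */
	/* driver payload: command buffer, BO list, ... */
};

/* No dependencies beyond the ones the scheduler already tracks. */
static struct dma_fence *my_dependency(struct drm_sched_job *sched_job,
				       struct drm_sched_entity *s_entity)
{
	return NULL;
}

/* Hand the job to the hardware ring; the returned fence becomes
 * &drm_sched_fence.parent, and its signalling completes the job. */
static struct dma_fence *my_run_job(struct drm_sched_job *sched_job)
{
	struct my_job *job = container_of(sched_job, struct my_job, base);

	return my_hw_submit(job);	/* hypothetical hw backend */
}

/* Job ran past the timeout: start GPU recovery, typically built around
 * drm_sched_hw_job_reset() and drm_sched_job_recovery(). */
static void my_timedout_job(struct drm_sched_job *sched_job)
{
	my_gpu_recover(sched_job->sched);	/* hypothetical */
}

/* Finished fence has signalled; release the job. */
static void my_free_job(struct drm_sched_job *sched_job)
{
	kfree(container_of(sched_job, struct my_job, base));
}

static const struct drm_sched_backend_ops my_sched_ops = {
	.dependency   = my_dependency,
	.run_job      = my_run_job,
	.timedout_job = my_timedout_job,
	.free_job     = my_free_job,
};

int my_ring_setup(struct drm_gpu_scheduler *sched,
		  struct drm_sched_entity *entity)
{
	int ret;

	/* One scheduler per hardware ring. */
	ret = drm_sched_init(sched, &my_sched_ops,
			     64,			/* hw_submission */
			     0,				/* hang_limit */
			     msecs_to_jiffies(10000),	/* timeout */
			     "my-ring");
	if (ret)
		return ret;

	/* One entity per context, here on the NORMAL priority run queue. */
	return drm_sched_entity_init(sched, entity,
				     &sched->sched_rq[DRM_SCHED_PRIORITY_NORMAL],
				     NULL /* guilty */);
}

int my_submit(struct drm_gpu_scheduler *sched,
	      struct drm_sched_entity *entity, void *owner)
{
	struct my_job *job = kzalloc(sizeof(*job), GFP_KERNEL);
	int ret;

	if (!job)
		return -ENOMEM;

	ret = drm_sched_job_init(&job->base, sched, entity, owner);
	if (ret) {
		kfree(job);
		return ret;
	}

	/* job->base.s_fence->finished is valid from here on and can be
	 * installed as the out-fence before the job even runs. */
	drm_sched_entity_push_job(&job->base, entity);
	return 0;
}
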
* Re: [PATCH v2] drm/scheduler: add documentation
  2018-05-29  5:53       ` [PATCH v2] " Nayan Deshmukh
@ 2018-05-29  6:38         ` Christian König
  0 siblings, 0 replies; 13+ messages in thread
From: Christian König @ 2018-05-29  6:38 UTC (permalink / raw)
  To: Nayan Deshmukh, dri-devel; +Cc: Alex Deucher

On 29.05.2018 at 07:53, Nayan Deshmukh wrote:
> convert existing raw comments into kernel-doc format as well
> as add new documentation
>
> v2: reword the overview
>
> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
> Signed-off-by: Nayan Deshmukh <nayan26deshmukh@gmail.com>
> Reviewed-by: Alex Deucher <alexander.deucher@amd.com>

Reviewed-by: Christian König <christian.koenig@amd.com>

Going to push that into our branches later today,
Christian.

> ---
>   drivers/gpu/drm/scheduler/gpu_scheduler.c | 214 ++++++++++++++++++++++++------
>   include/drm/gpu_scheduler.h               | 153 +++++++++++++++++----
>   2 files changed, 296 insertions(+), 71 deletions(-)
>
> diff --git a/drivers/gpu/drm/scheduler/gpu_scheduler.c b/drivers/gpu/drm/scheduler/gpu_scheduler.c
> index 44d480768dfe..8c1e80c9b674 100644
> --- a/drivers/gpu/drm/scheduler/gpu_scheduler.c
> +++ b/drivers/gpu/drm/scheduler/gpu_scheduler.c
> @@ -21,6 +21,29 @@
>    *
>    */
>   
> +/**
> + * DOC: Overview
> + *
> + * The GPU scheduler provides entities which allow userspace to push jobs
> + * into software queues which are then scheduled on a hardware run queue.
> + * The software queues have a priority among them. The scheduler selects the entities
> + * from the run queue using a FIFO. The scheduler provides dependency handling
> + * features among jobs. The driver is supposed to provide callback functions for
> + * backend operations to the scheduler, like submitting a job to the hardware
> + * run queue or returning the dependencies of a job.
> + *
> + * The organisation of the scheduler is the following:
> + *
> + * 1. Each hw run queue has one scheduler
> + * 2. Each scheduler has multiple run queues with different priorities
> + *    (e.g., HIGH_HW, HIGH_SW, KERNEL, NORMAL)
> + * 3. Each scheduler run queue has a queue of entities to schedule
> + * 4. Entities themselves maintain a queue of jobs that will be scheduled on
> + *    the hardware.
> + *
> + * The jobs in an entity are always scheduled in the order that they were pushed.
> + */
> +
>   #include <linux/kthread.h>
>   #include <linux/wait.h>
>   #include <linux/sched.h>
> [snip: remaining hunks identical to v1, quoted in full above]

^ permalink raw reply	[flat|nested] 13+ messages in thread

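One point the do_release()/cleanup() split documented in the patch implies but
does not spell out is the teardown ordering on the driver side. A short sketch
under the same assumptions as the submission example earlier in the thread
(my_* names are illustrative):

/* Per-context teardown: the two stages may be called back to back, or
 * split so that all contexts first drain, then all get cleaned up. */
void my_context_fini(struct drm_gpu_scheduler *sched,
		     struct drm_sched_entity *entity)
{
	/* Waits for the entity to go idle and removes it from its run
	 * queue; bails out early if the process was killed. */
	drm_sched_entity_do_release(sched, entity);

	/* Signals any jobs still queued with an error code. */
	drm_sched_entity_cleanup(sched, entity);

	/* drm_sched_entity_fini(sched, entity) is shorthand for both. */
}

/* Ring teardown, once every entity on this scheduler is gone. */
void my_ring_fini(struct drm_gpu_scheduler *sched)
{
	drm_sched_fini(sched);
}
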
* Re: [PATCH 2/3] drm/scheduler: add documentation
  2018-05-28  8:31     ` Nayan Deshmukh
  2018-05-29  5:53       ` [PATCH v2] " Nayan Deshmukh
@ 2018-05-29  8:05       ` Daniel Vetter
  1 sibling, 0 replies; 13+ messages in thread
From: Daniel Vetter @ 2018-05-29  8:05 UTC (permalink / raw)
  To: Nayan Deshmukh; +Cc: Alex Deucher, dri-devel, Christian König

On Mon, May 28, 2018 at 02:01:49PM +0530, Nayan Deshmukh wrote:
> I have done that already and sent a patch along with this one; it is the
> last patch of this series. I have tried to take care of all the hyperlinks.

Oh great, was jumping to conclusions. Usually I do that first, so that I
can see mistakes while I write out the docs. But doing it last also works
...

Ack on the series fwiw.
-Daniel

> 
> On Mon, May 28, 2018 at 1:39 PM, Daniel Vetter <daniel@ffwll.ch> wrote:
> > On Fri, May 25, 2018 at 6:45 AM, Nayan Deshmukh
> > <nayan26deshmukh@gmail.com> wrote:
> >> convert existing raw comments into kernel-doc format as well
> >> as add new documentation
> >>
> >> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
> >> Signed-off-by: Nayan Deshmukh <nayan26deshmukh@gmail.com>
> >> ---
> >>  drivers/gpu/drm/scheduler/gpu_scheduler.c | 214 ++++++++++++++++++++++++------
> >>  include/drm/gpu_scheduler.h               | 153 +++++++++++++++++----
> >>  2 files changed, 296 insertions(+), 71 deletions(-)
> >
> > Please also include all the new scheduler docs into
> > Documentation/gpu/drm-mm.rst (I think that's the most suitable place)
> > and make sure the resulting docs look good and hyperlinks all work
> > correctly using
> >
> > $ make htmldocs
> >
> > Thanks, Daniel
> >
> >>
> >> diff --git a/drivers/gpu/drm/scheduler/gpu_scheduler.c b/drivers/gpu/drm/scheduler/gpu_scheduler.c
> >> index 44d480768dfe..c70c983e3e74 100644
> >> --- a/drivers/gpu/drm/scheduler/gpu_scheduler.c
> >> +++ b/drivers/gpu/drm/scheduler/gpu_scheduler.c
> >> @@ -21,6 +21,29 @@
> >>   *
> >>   */
> >>
> >> +/**
> >> + * DOC: Overview
> >> + *
> >> + * The GPU scheduler provides entities which allow userspace to push jobs
> >> + * into software queues which are then scheduled on a hardware run queue.
> >> + * The software queues have a priority among them. The scheduler selects the entities
> >> + * from the run queue using FIFO. The scheduler provides dependency handling
> >> + * features among jobs. The driver is supposed to provide functions for backend
> >> + * operations to the scheduler like submitting a job to hardware run queue,
> >> + * returning the dependency of a job etc.
> >> + *
> >> + * The organisation of the scheduler is the following:-
> >> + *
> >> + * 1. Each ring buffer has one scheduler
> >> + * 2. Each scheduler has multiple run queues with different priorities
> >> + *    (i.e. HIGH_HW,HIGH_SW, KERNEL, NORMAL)
> >> + * 3. Each run queue has a queue of entities to schedule
> >> + * 4. Entities themselves maintain a queue of jobs that will be scheduled on
> >> + *    the hardware.
> >> + *
> >> + * The jobs in a entity are always scheduled in the order that they were pushed.
> >> + */
> >> +
> >>  #include <linux/kthread.h>
> >>  #include <linux/wait.h>
> >>  #include <linux/sched.h>
> >> @@ -39,7 +62,13 @@ static bool drm_sched_entity_is_ready(struct drm_sched_entity *entity);
> >>  static void drm_sched_wakeup(struct drm_gpu_scheduler *sched);
> >>  static void drm_sched_process_job(struct dma_fence *f, struct dma_fence_cb *cb);
> >>
> >> -/* Initialize a given run queue struct */
> >> +/**
> >> + * drm_sched_rq_init - initialize a given run queue struct
> >> + *
> >> + * @rq: scheduler run queue
> >> + *
> >> + * Initializes a scheduler runqueue.
> >> + */
> >>  static void drm_sched_rq_init(struct drm_sched_rq *rq)
> >>  {
> >>         spin_lock_init(&rq->lock);
> >> @@ -47,6 +76,14 @@ static void drm_sched_rq_init(struct drm_sched_rq *rq)
> >>         rq->current_entity = NULL;
> >>  }
> >>
> >> +/**
> >> + * drm_sched_rq_add_entity - add an entity
> >> + *
> >> + * @rq: scheduler run queue
> >> + * @entity: scheduler entity
> >> + *
> >> + * Adds a scheduler entity to the run queue.
> >> + */
> >>  static void drm_sched_rq_add_entity(struct drm_sched_rq *rq,
> >>                                     struct drm_sched_entity *entity)
> >>  {
> >> @@ -57,6 +94,14 @@ static void drm_sched_rq_add_entity(struct drm_sched_rq *rq,
> >>         spin_unlock(&rq->lock);
> >>  }
> >>
> >> +/**
> >> + * drm_sched_rq_remove_entity - remove an entity
> >> + *
> >> + * @rq: scheduler run queue
> >> + * @entity: scheduler entity
> >> + *
> >> + * Removes a scheduler entity from the run queue.
> >> + */
> >>  static void drm_sched_rq_remove_entity(struct drm_sched_rq *rq,
> >>                                        struct drm_sched_entity *entity)
> >>  {
> >> @@ -70,9 +115,9 @@ static void drm_sched_rq_remove_entity(struct drm_sched_rq *rq,
> >>  }
> >>
> >>  /**
> >> - * Select an entity which could provide a job to run
> >> + * drm_sched_rq_select_entity - Select an entity which could provide a job to run
> >>   *
> >> - * @rq         The run queue to check.
> >> + * @rq: scheduler run queue to check.
> >>   *
> >>   * Try to find a ready entity, returns NULL if none found.
> >>   */
> >> @@ -112,15 +157,16 @@ drm_sched_rq_select_entity(struct drm_sched_rq *rq)
> >>  }
> >>
> >>  /**
> >> - * Init a context entity used by scheduler when submit to HW ring.
> >> + * drm_sched_entity_init - Init a context entity used by scheduler when
> >> + * submit to HW ring.
> >>   *
> >> - * @sched      The pointer to the scheduler
> >> - * @entity     The pointer to a valid drm_sched_entity
> >> - * @rq         The run queue this entity belongs
> >> - * @guilty      atomic_t set to 1 when a job on this queue
> >> - *              is found to be guilty causing a timeout
> >> + * @sched: scheduler instance
> >> + * @entity: scheduler entity to init
> >> + * @rq: the run queue this entity belongs
> >> + * @guilty: atomic_t set to 1 when a job on this queue
> >> + *          is found to be guilty causing a timeout
> >>   *
> >> - * return 0 if succeed. negative error code on failure
> >> + * Returns 0 on success or a negative error code on failure.
> >>  */
> >>  int drm_sched_entity_init(struct drm_gpu_scheduler *sched,
> >>                           struct drm_sched_entity *entity,
> >> @@ -149,10 +195,10 @@ int drm_sched_entity_init(struct drm_gpu_scheduler *sched,
> >>  EXPORT_SYMBOL(drm_sched_entity_init);
> >>
> >>  /**
> >> - * Query if entity is initialized
> >> + * drm_sched_entity_is_initialized - Query if entity is initialized
> >>   *
> >> - * @sched       Pointer to scheduler instance
> >> - * @entity     The pointer to a valid scheduler entity
> >> + * @sched: Pointer to scheduler instance
> >> + * @entity: The pointer to a valid scheduler entity
> >>   *
> >>   * return true if entity is initialized, false otherwise
> >>  */
> >> @@ -164,11 +210,11 @@ static bool drm_sched_entity_is_initialized(struct drm_gpu_scheduler *sched,
> >>  }
> >>
> >>  /**
> >> - * Check if entity is idle
> >> + * drm_sched_entity_is_idle - Check if entity is idle
> >>   *
> >> - * @entity     The pointer to a valid scheduler entity
> >> + * @entity: scheduler entity
> >>   *
> >> - * Return true if entity don't has any unscheduled jobs.
> >> + * Returns true if the entity does not have any unscheduled jobs.
> >>   */
> >>  static bool drm_sched_entity_is_idle(struct drm_sched_entity *entity)
> >>  {
> >> @@ -180,9 +226,9 @@ static bool drm_sched_entity_is_idle(struct drm_sched_entity *entity)
> >>  }
> >>
> >>  /**
> >> - * Check if entity is ready
> >> + * drm_sched_entity_is_ready - Check if entity is ready
> >>   *
> >> - * @entity     The pointer to a valid scheduler entity
> >> + * @entity: scheduler entity
> >>   *
> >>   * Return true if entity could provide a job.
> >>   */
> >> @@ -210,12 +256,12 @@ static void drm_sched_entity_kill_jobs_cb(struct dma_fence *f,
> >>
> >>
> >>  /**
> >> - * Destroy a context entity
> >> + * drm_sched_entity_do_release - Destroy a context entity
> >>   *
> >> - * @sched       Pointer to scheduler instance
> >> - * @entity     The pointer to a valid scheduler entity
> >> + * @sched: scheduler instance
> >> + * @entity: scheduler entity
> >>   *
> >> - * Splitting drm_sched_entity_fini() into two functions, The first one is does the waiting,
> >> + * Splitting drm_sched_entity_fini() into two functions, The first one does the waiting,
> >>   * removes the entity from the runqueue and returns an error when the process was killed.
> >>   */
> >>  void drm_sched_entity_do_release(struct drm_gpu_scheduler *sched,
> >> @@ -237,12 +283,13 @@ void drm_sched_entity_do_release(struct drm_gpu_scheduler *sched,
> >>  EXPORT_SYMBOL(drm_sched_entity_do_release);
> >>
> >>  /**
> >> - * Destroy a context entity
> >> + * drm_sched_entity_cleanup - Destroy a context entity
> >>   *
> >> - * @sched       Pointer to scheduler instance
> >> - * @entity     The pointer to a valid scheduler entity
> >> + * @sched: scheduler instance
> >> + * @entity: scheduler entity
> >>   *
> >> - * The second one then goes over the entity and signals all jobs with an error code.
> >> + * This should be called after @drm_sched_entity_do_release. It goes over the
> >> + * entity and signals all jobs with an error code if the process was killed.
> >>   */
> >>  void drm_sched_entity_cleanup(struct drm_gpu_scheduler *sched,
> >>                            struct drm_sched_entity *entity)
> >> @@ -281,6 +328,14 @@ void drm_sched_entity_cleanup(struct drm_gpu_scheduler *sched,
> >>  }
> >>  EXPORT_SYMBOL(drm_sched_entity_cleanup);
> >>
> >> +/**
> >> + * drm_sched_entity_fini - Destroy a context entity
> >> + *
> >> + * @sched: scheduler instance
> >> + * @entity: scheduler entity
> >> + *
> >> + * Calls drm_sched_entity_do_release() and drm_sched_entity_cleanup()
> >> + */
> >>  void drm_sched_entity_fini(struct drm_gpu_scheduler *sched,
> >>                                 struct drm_sched_entity *entity)
> >>  {
> >> @@ -306,6 +361,15 @@ static void drm_sched_entity_clear_dep(struct dma_fence *f, struct dma_fence_cb
> >>         dma_fence_put(f);
> >>  }
> >>
> >> +/**
> >> + * drm_sched_entity_set_rq - Sets the run queue for an entity
> >> + *
> >> + * @entity: scheduler entity
> >> + * @rq: scheduler run queue
> >> + *
> >> + * Sets the run queue for an entity and removes the entity from the previous
> >> + * run queue in which was present.
> >> + */
> >>  void drm_sched_entity_set_rq(struct drm_sched_entity *entity,
> >>                              struct drm_sched_rq *rq)
> >>  {
> >> @@ -325,6 +389,14 @@ void drm_sched_entity_set_rq(struct drm_sched_entity *entity,
> >>  }
> >>  EXPORT_SYMBOL(drm_sched_entity_set_rq);
> >>
> >> +/**
> >> + * drm_sched_dependency_optimized
> >> + *
> >> + * @fence: the dependency fence
> >> + * @entity: the entity which depends on the above fence
> >> + *
> >> + * Returns true if the dependency can be optimized and false otherwise
> >> + */
> >>  bool drm_sched_dependency_optimized(struct dma_fence* fence,
> >>                                     struct drm_sched_entity *entity)
> >>  {
> >> @@ -413,9 +485,10 @@ drm_sched_entity_pop_job(struct drm_sched_entity *entity)
> >>  }
> >>
> >>  /**
> >> - * Submit a job to the job queue
> >> + * drm_sched_entity_push_job - Submit a job to the entity's job queue
> >>   *
> >> - * @sched_job          The pointer to job required to submit
> >> + * @sched_job: job to submit
> >> + * @entity: scheduler entity
> >>   *
> >>   * Note: To guarantee that the order of insertion to queue matches
> >>   * the job's fence sequence number this function should be
> >> @@ -506,6 +579,13 @@ static void drm_sched_job_timedout(struct work_struct *work)
> >>         job->sched->ops->timedout_job(job);
> >>  }
> >>
> >> +/**
> >> + * drm_sched_hw_job_reset - stop the scheduler if it contains the bad job
> >> + *
> >> + * @sched: scheduler instance
> >> + * @bad: bad scheduler job
> >> + *
> >> + */
> >>  void drm_sched_hw_job_reset(struct drm_gpu_scheduler *sched, struct drm_sched_job *bad)
> >>  {
> >>         struct drm_sched_job *s_job;
> >> @@ -550,6 +630,12 @@ void drm_sched_hw_job_reset(struct drm_gpu_scheduler *sched, struct drm_sched_jo
> >>  }
> >>  EXPORT_SYMBOL(drm_sched_hw_job_reset);
> >>
> >> +/**
> >> + * drm_sched_job_recovery - recover jobs after a reset
> >> + *
> >> + * @sched: scheduler instance
> >> + *
> >> + */
> >>  void drm_sched_job_recovery(struct drm_gpu_scheduler *sched)
> >>  {
> >>         struct drm_sched_job *s_job, *tmp;
> >> @@ -599,10 +685,17 @@ void drm_sched_job_recovery(struct drm_gpu_scheduler *sched)
> >>  EXPORT_SYMBOL(drm_sched_job_recovery);
> >>
> >>  /**
> >> - * Init a sched_job with basic field
> >> + * drm_sched_job_init - init a scheduler job
> >>   *
> >> - * Note: Refer to drm_sched_entity_push_job documentation
> >> + * @job: scheduler job to init
> >> + * @sched: scheduler instance
> >> + * @entity: scheduler entity to use
> >> + * @owner: job owner for debugging
> >> + *
> >> + * Refer to drm_sched_entity_push_job() documentation
> >>   * for locking considerations.
> >> + *
> >> + * Returns 0 for success, negative error code otherwise.
> >>   */
> >>  int drm_sched_job_init(struct drm_sched_job *job,
> >>                        struct drm_gpu_scheduler *sched,
> >> @@ -626,7 +719,11 @@ int drm_sched_job_init(struct drm_sched_job *job,
> >>  EXPORT_SYMBOL(drm_sched_job_init);
> >>
> >>  /**
> >> - * Return ture if we can push more jobs to the hw.
> >> + * drm_sched_ready - is the scheduler ready
> >> + *
> >> + * @sched: scheduler instance
> >> + *
> >> + * Return true if we can push more jobs to the hw, otherwise false.
> >>   */
> >>  static bool drm_sched_ready(struct drm_gpu_scheduler *sched)
> >>  {
> >> @@ -635,7 +732,10 @@ static bool drm_sched_ready(struct drm_gpu_scheduler *sched)
> >>  }
> >>
> >>  /**
> >> - * Wake up the scheduler when it is ready
> >> + * drm_sched_wakeup - Wake up the scheduler when it is ready
> >> + *
> >> + * @sched: scheduler instance
> >> + *
> >>   */
> >>  static void drm_sched_wakeup(struct drm_gpu_scheduler *sched)
> >>  {
> >> @@ -644,8 +744,12 @@ static void drm_sched_wakeup(struct drm_gpu_scheduler *sched)
> >>  }
> >>
> >>  /**
> >> - * Select next entity to process
> >> -*/
> >> + * drm_sched_select_entity - Select next entity to process
> >> + *
> >> + * @sched: scheduler instance
> >> + *
> >> + * Returns the entity to process or NULL if none are found.
> >> + */
> >>  static struct drm_sched_entity *
> >>  drm_sched_select_entity(struct drm_gpu_scheduler *sched)
> >>  {
> >> @@ -665,6 +769,14 @@ drm_sched_select_entity(struct drm_gpu_scheduler *sched)
> >>         return entity;
> >>  }
> >>
> >> +/**
> >> + * drm_sched_process_job - process a job
> >> + *
> >> + * @f: fence
> >> + * @cb: fence callbacks
> >> + *
> >> + * Called after job has finished execution.
> >> + */
> >>  static void drm_sched_process_job(struct dma_fence *f, struct dma_fence_cb *cb)
> >>  {
> >>         struct drm_sched_fence *s_fence =
> >> @@ -680,6 +792,13 @@ static void drm_sched_process_job(struct dma_fence *f, struct dma_fence_cb *cb)
> >>         wake_up_interruptible(&sched->wake_up_worker);
> >>  }
> >>
> >> +/**
> >> + * drm_sched_blocked - check if the scheduler is blocked
> >> + *
> >> + * @sched: scheduler instance
> >> + *
> >> + * Returns true if blocked, otherwise false.
> >> + */
> >>  static bool drm_sched_blocked(struct drm_gpu_scheduler *sched)
> >>  {
> >>         if (kthread_should_park()) {
> >> @@ -690,6 +809,13 @@ static bool drm_sched_blocked(struct drm_gpu_scheduler *sched)
> >>         return false;
> >>  }
> >>
> >> +/**
> >> + * drm_sched_main - main scheduler thread
> >> + *
> >> + * @param: scheduler instance
> >> + *
> >> + * Returns 0.
> >> + */
> >>  static int drm_sched_main(void *param)
> >>  {
> >>         struct sched_param sparam = {.sched_priority = 1};
> >> @@ -744,15 +870,17 @@ static int drm_sched_main(void *param)
> >>  }
> >>
> >>  /**
> >> - * Init a gpu scheduler instance
> >> + * drm_sched_init - Init a gpu scheduler instance
> >>   *
> >> - * @sched              The pointer to the scheduler
> >> - * @ops                        The backend operations for this scheduler.
> >> - * @hw_submissions     Number of hw submissions to do.
> >> - * @name               Name used for debugging
> >> + * @sched: scheduler instance
> >> + * @ops: backend operations for this scheduler
> >> + * @hw_submission: number of hw submissions that can be in flight
> >> + * @hang_limit: number of times to allow a job to hang before dropping it
> >> + * @timeout: timeout value in jiffies for the scheduler
> >> + * @name: name used for debugging
> >>   *
> >>   * Return 0 on success, otherwise error code.
> >> -*/
> >> + */
> >>  int drm_sched_init(struct drm_gpu_scheduler *sched,
> >>                    const struct drm_sched_backend_ops *ops,
> >>                    unsigned hw_submission,
> >> @@ -788,9 +916,11 @@ int drm_sched_init(struct drm_gpu_scheduler *sched,
> >>  EXPORT_SYMBOL(drm_sched_init);
> >>
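With the parameters spelled out above, a usage sketch practically
writes itself; could be nice in the overview DOC. Hypothetical per-ring
bring-up (foo_sched_ops and FOO_JOB_HANG_LIMIT are made-up names):

	r = drm_sched_init(&ring->sched, &foo_sched_ops,
			   num_hw_submission, FOO_JOB_HANG_LIMIT,
			   msecs_to_jiffies(timeout_ms), ring->name);
	if (r)
		return r;

and on teardown, after the hardware has been idled:

	drm_sched_fini(&ring->sched);
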
> >>  /**
> >> - * Destroy a gpu scheduler
> >> + * drm_sched_fini - Destroy a gpu scheduler
> >> + *
> >> + * @sched: scheduler instance
> >>   *
> >> - * @sched      The pointer to the scheduler
> >> + * Tears down and cleans up the scheduler.
> >>   */
> >>  void drm_sched_fini(struct drm_gpu_scheduler *sched)
> >>  {
> >> diff --git a/include/drm/gpu_scheduler.h b/include/drm/gpu_scheduler.h
> >> index dec655894d08..496442f12bff 100644
> >> --- a/include/drm/gpu_scheduler.h
> >> +++ b/include/drm/gpu_scheduler.h
> >> @@ -43,13 +43,33 @@ enum drm_sched_priority {
> >>  };
> >>
> >>  /**
> >> - * drm_sched_entity - A wrapper around a job queue (typically attached
> >> - * to the DRM file_priv).
> >> + * struct drm_sched_entity - A wrapper around a job queue (typically
> >> + * attached to the DRM file_priv).
> >> + *
> >> + * @list: used to append this struct to the list of entities in the
> >> + *        runqueue.
> >> + * @rq: runqueue to which this entity belongs.
> >> + * @rq_lock: lock to modify the runqueue to which this entity belongs.
> >> + * @sched: the scheduler instance to which this entity is enqueued.
> >> + * @job_queue: the list of jobs of this entity.
> >> + * @fence_seq: a linearly increasing seqno incremented with each
> >> + *             new &drm_sched_fence which is part of the entity.
> >> + * @fence_context: a unique context for all the fences which belong
> >> + *                 to this entity.
> >> + *                 The &drm_sched_fence.scheduled uses the
> >> + *                 fence_context but &drm_sched_fence.finished uses
> >> + *                 fence_context + 1.
> >> + * @dependency: the dependency fence of the job which is on the top
> >> + *              of the job queue.
> >> + * @cb: callback for the dependency fence above.
> >> + * @guilty: points to ctx's guilty.
> >> + * @fini_status: contains the exit status in case the process was signalled.
> >> + * @last_scheduled: points to the finished fence of the last scheduled job.
> >>   *
> >>   * Entities will emit jobs in order to their corresponding hardware
> >>   * ring, and the scheduler will alternate between entities based on
> >>   * scheduling policy.
> >> -*/
> >> + */
> >>  struct drm_sched_entity {
> >>         struct list_head                list;
> >>         struct drm_sched_rq             *rq;
> >> @@ -63,47 +83,96 @@ struct drm_sched_entity {
> >>
> >>         struct dma_fence                *dependency;
> >>         struct dma_fence_cb             cb;
> >> -       atomic_t                        *guilty; /* points to ctx's guilty */
> >> -       int            fini_status;
> >> -       struct dma_fence    *last_scheduled;
> >> +       atomic_t                        *guilty;
> >> +       int                             fini_status;
> >> +       struct dma_fence                *last_scheduled;
> >>  };
> >>
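The fence_context vs. fence_context + 1 note is a detail well worth
having in writing. For the record, the split comes from how the entity
and its fences are initialized, along these lines (simplified sketch of
the existing helpers):

	entity->fence_context = dma_fence_context_alloc(2);
	...
	dma_fence_init(&fence->scheduled, &drm_sched_fence_ops_scheduled,
		       &fence->lock, entity->fence_context, seq);
	dma_fence_init(&fence->finished, &drm_sched_fence_ops_finished,
		       &fence->lock, entity->fence_context + 1, seq);
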
> >>  /**
> >> + * struct drm_sched_rq - queue of entities to be scheduled.
> >> + *
> >> + * @lock: to modify the entities list.
> >> + * @entities: list of the entities to be scheduled.
> >> + * @current_entity: the entity which is to be scheduled.
> >> + *
> >>   * Run queue is a set of entities scheduling command submissions for
> >>   * one specific ring. It implements the scheduling policy that selects
> >>   * the next entity to emit commands from.
> >> -*/
> >> + */
> >>  struct drm_sched_rq {
> >>         spinlock_t                      lock;
> >>         struct list_head                entities;
> >>         struct drm_sched_entity         *current_entity;
> >>  };
> >>
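@current_entity could maybe use one more sentence: it is the
round-robin bookmark, selection resumes after it so entities sharing a
run queue take turns. Conceptually (simplified, with the
resume-after-bookmark detail omitted):

	spin_lock(&rq->lock);
	list_for_each_entry(entity, &rq->entities, list) {
		if (drm_sched_entity_is_ready(entity)) {
			rq->current_entity = entity;
			break;
		}
	}
	spin_unlock(&rq->lock);
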
> >> +/**
> >> + * struct drm_sched_fence - fences corresponding to the scheduling of a job.
> >> + */
> >>  struct drm_sched_fence {
> >> +        /**
> >> +         * @scheduled: this fence is what will be signaled by the scheduler
> >> +         * when the job is scheduled.
> >> +         */
> >>         struct dma_fence                scheduled;
> >>
> >> -       /* This fence is what will be signaled by the scheduler when
> >> -        * the job is completed.
> >> -        *
> >> -        * When setting up an out fence for the job, you should use
> >> -        * this, since it's available immediately upon
> >> -        * drm_sched_job_init(), and the fence returned by the driver
> >> -        * from run_job() won't be created until the dependencies have
> >> -        * resolved.
> >> -        */
> >> +        /**
> >> +         * @finished: this fence is what will be signaled by the scheduler
> >> +         * when the job is completed.
> >> +         *
> >> +         * When setting up an out fence for the job, you should use
> >> +         * this, since it's available immediately upon
> >> +         * drm_sched_job_init(), and the fence returned by the driver
> >> +         * from run_job() won't be created until the dependencies have
> >> +         * resolved.
> >> +         */
> >>         struct dma_fence                finished;
> >>
> >> +        /**
> >> +         * @cb: the callback for the parent fence below.
> >> +         */
> >>         struct dma_fence_cb             cb;
> >> +        /**
> >> +         * @parent: the fence returned by &drm_sched_backend_ops.run_job
> >> +         * when scheduling the job on hardware. We signal the
> >> +         * &drm_sched_fence.finished fence once parent is signalled.
> >> +         */
> >>         struct dma_fence                *parent;
> >> +        /**
> >> +         * @sched: the scheduler instance to which the job having this struct
> >> +         * belongs to.
> >> +         */
> >>         struct drm_gpu_scheduler        *sched;
> >> +        /**
> >> +         * @lock: the lock used by the scheduled and the finished fences.
> >> +         */
> >>         spinlock_t                      lock;
> >> +        /**
> >> +         * @owner: job owner for debugging
> >> +         */
> >>         void                            *owner;
> >>  };
> >>
> >>  struct drm_sched_fence *to_drm_sched_fence(struct dma_fence *f);
> >>
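The @finished paragraph is the most important user-facing piece of this
struct. Concretely it describes installing the finished fence as the
job's out-fence right after init, e.g. (sketch with hypothetical names,
buffer/resv handling varies per driver):

	out_fence = dma_fence_get(&job->s_fence->finished);
	reservation_object_add_excl_fence(bo->resv, out_fence);
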
> >>  /**
> >> - * drm_sched_job - A job to be run by an entity.
> >> + * struct drm_sched_job - A job to be run by an entity.
> >> + *
> >> + * @queue_node: used to append this struct to the queue of jobs in an entity.
> >> + * @sched: the scheduler instance on which this job is scheduled.
> >> + * @s_fence: contains the fences for the scheduling of job.
> >> + * @finish_cb: the callback for the finished fence.
> >> + * @finish_work: schedules the function drm_sched_job_finish() once the job
> >> + *               has finished to remove the job from the
> >> + *               &drm_gpu_scheduler.ring_mirror_list.
> >> + * @node: used to append this struct to the &drm_gpu_scheduler.ring_mirror_list.
> >> + * @work_tdr: schedules a delayed call to drm_sched_job_timedout() after the
> >> + *            timeout interval is over.
> >> + * @id: a unique id assigned to each job scheduled on the scheduler.
> >> + * @karma: increment on every hang caused by this job. If this exceeds the hang
> >> + *         limit of the scheduler then the job is marked guilty and will not
> >> + *         be scheduled further.
> >> + * @s_priority: the priority of the job.
> >> + * @entity: the entity to which this job belongs.
> >>   *
> >>   * A job is created by the driver using drm_sched_job_init(), and
> >>   * should call drm_sched_entity_push_job() once it wants the scheduler
> >> @@ -130,38 +199,64 @@ static inline bool drm_sched_invalidate_job(struct drm_sched_job *s_job,
> >>  }
> >>
> >>  /**
> >> + * struct drm_sched_backend_ops
> >> + *
> >>   * Define the backend operations called by the scheduler,
> >> - * these functions should be implemented in driver side
> >> -*/
> >> + * these functions should be implemented on the driver side.
> >> + */
> >>  struct drm_sched_backend_ops {
> >> -       /* Called when the scheduler is considering scheduling this
> >> -        * job next, to get another struct dma_fence for this job to
> >> +       /**
> >> +         * @dependency: Called when the scheduler is considering scheduling
> >> +         * this job next, to get another struct dma_fence for this job to
> >>          * block on.  Once it returns NULL, run_job() may be called.
> >>          */
> >>         struct dma_fence *(*dependency)(struct drm_sched_job *sched_job,
> >>                                         struct drm_sched_entity *s_entity);
> >>
> >> -       /* Called to execute the job once all of the dependencies have
> >> -        * been resolved.  This may be called multiple times, if
> >> +       /**
> >> +         * @run_job: Called to execute the job once all of the dependencies
> >> +         * have been resolved.  This may be called multiple times, if
> >>          * timedout_job() has happened and drm_sched_job_recovery()
> >>          * decides to try it again.
> >>          */
> >>         struct dma_fence *(*run_job)(struct drm_sched_job *sched_job);
> >>
> >> -       /* Called when a job has taken too long to execute, to trigger
> >> -        * GPU recovery.
> >> +       /**
> >> +         * @timedout_job: Called when a job has taken too long to execute,
> >> +         * to trigger GPU recovery.
> >>          */
> >>         void (*timedout_job)(struct drm_sched_job *sched_job);
> >>
> >> -       /* Called once the job's finished fence has been signaled and
> >> -        * it's time to clean it up.
> >> +       /**
> >> +         * @free_job: Called once the job's finished fence has been signaled
> >> +         * and it's time to clean it up.
> >>          */
> >>         void (*free_job)(struct drm_sched_job *sched_job);
> >>  };
> >>
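With the ops turned into proper member docs, the driver-side hookup
they describe is just a static table, something like this (hypothetical
foo_* callbacks matching the prototypes above):

	static const struct drm_sched_backend_ops foo_sched_ops = {
		.dependency = foo_sched_dependency,
		.run_job = foo_sched_run_job,
		.timedout_job = foo_sched_timedout_job,
		.free_job = foo_sched_free_job,
	};
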
> >>  /**
> >> - * One scheduler is implemented for each hardware ring
> >> -*/
> >> + * struct drm_gpu_scheduler
> >> + *
> >> + * @ops: backend operations provided by the driver.
> >> + * @hw_submission_limit: the max size of the hardware queue.
> >> + * @timeout: the time after which a job is removed from the scheduler.
> >> + * @name: name of the ring for which this scheduler is being used.
> >> + * @sched_rq: priority wise array of run queues.
> >> + * @wake_up_worker: the wait queue on which the scheduler sleeps until a job
> >> + *                  is ready to be scheduled.
> >> + * @job_scheduled: once drm_sched_entity_do_release() is called the scheduler
> >> + *                 waits on this wait queue until all the scheduled jobs are
> >> + *                 finished.
> >> + * @hw_rq_count: the number of jobs currently in the hardware queue.
> >> + * @job_id_count: used to assign a unique id to each job.
> >> + * @thread: the kthread on which the scheduler runs.
> >> + * @ring_mirror_list: the list of jobs pushed to hardware and not yet finished.
> >> + * @job_list_lock: lock to protect the ring_mirror_list.
> >> + * @hang_limit: once the hangs caused by a job cross this limit, the job is
> >> + *              marked guilty and will not be scheduled further.
> >> + *
> >> + * One scheduler is implemented for each hardware ring.
> >> + */
> >>  struct drm_gpu_scheduler {
> >>         const struct drm_sched_backend_ops      *ops;
> >>         uint32_t                        hw_submission_limit;
> >> --
> >> 2.14.3
> >>

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch