* [RFC 0/3] drm: Add comm/cmdline fdinfo fields
@ 2023-04-17 20:12 ` Rob Clark
  0 siblings, 0 replies; 35+ messages in thread
From: Rob Clark @ 2023-04-17 20:12 UTC (permalink / raw)
  To: dri-devel
  Cc: Tvrtko Ursulin, Rob Clark, Akhil P Oommen, Chia-I Wu,
	Dmitry Baryshkov, open list:DRM DRIVER FOR MSM ADRENO GPU,
	Konrad Dybcio, open list:DRM DRIVER FOR MSM ADRENO GPU,
	open list:DOCUMENTATION, open list, Sean Paul

From: Rob Clark <robdclark@chromium.org>

When many of the things using the GPU are processes in a VM guest, the
actual client process is just a proxy.  The msm driver has a way to let
the proxy tell the kernel the actual VM client process's executable name
and command-line, which has until now been used simply for GPU crash
devcore dumps.  Let's also expose this via fdinfo so that tools can
expose who the actual user of the GPU is.

Rob Clark (3):
  drm/doc: Relax fdinfo string constraints
  drm/msm: Rework get_comm_cmdline() helper
  drm/msm: Add comm/cmdline fields

 Documentation/gpu/drm-usage-stats.rst   | 37 +++++++++++++++----------
 drivers/gpu/drm/msm/adreno/adreno_gpu.c |  4 +--
 drivers/gpu/drm/msm/msm_drv.c           |  2 ++
 drivers/gpu/drm/msm/msm_gpu.c           | 27 +++++++++++++-----
 drivers/gpu/drm/msm/msm_gpu.h           | 12 ++++++--
 drivers/gpu/drm/msm/msm_submitqueue.c   |  1 +
 6 files changed, 58 insertions(+), 25 deletions(-)

-- 
2.39.2


^ permalink raw reply	[flat|nested] 35+ messages in thread

* [RFC 1/3] drm/doc: Relax fdinfo string constraints
  2023-04-17 20:12 ` Rob Clark
@ 2023-04-17 20:12   ` Rob Clark
  -1 siblings, 0 replies; 35+ messages in thread
From: Rob Clark @ 2023-04-17 20:12 UTC (permalink / raw)
  To: dri-devel
  Cc: Tvrtko Ursulin, Rob Clark, Maarten Lankhorst, Maxime Ripard,
	Thomas Zimmermann, David Airlie, Daniel Vetter, Jonathan Corbet,
	open list:DOCUMENTATION, open list

From: Rob Clark <robdclark@chromium.org>

The restriction about no whitespace, etc, really only applies to the
usage of strings in keys.  Values can contain anything (other than
newline).

Signed-off-by: Rob Clark <robdclark@chromium.org>
---
 Documentation/gpu/drm-usage-stats.rst | 29 ++++++++++++++-------------
 1 file changed, 15 insertions(+), 14 deletions(-)

diff --git a/Documentation/gpu/drm-usage-stats.rst b/Documentation/gpu/drm-usage-stats.rst
index 258bdcc8fb86..8e00d53231e0 100644
--- a/Documentation/gpu/drm-usage-stats.rst
+++ b/Documentation/gpu/drm-usage-stats.rst
@@ -24,7 +24,7 @@ File format specification
 - All keys shall be prefixed with `drm-`.
 - Whitespace between the delimiter and first non-whitespace character shall be
   ignored when parsing.
-- Neither keys or values are allowed to contain whitespace characters.
+- Keys are not allowed to contain whitespace characters.
 - Numerical key value pairs can end with optional unit string.
 - Data type of the value is fixed as defined in the specification.
 
@@ -39,12 +39,13 @@ Data types
 ----------
 
 - <uint> - Unsigned integer without defining the maximum value.
-- <str> - String excluding any above defined reserved characters or whitespace.
+- <keystr> - String excluding any above defined reserved characters or whitespace.
+- <valstr> - String.
 
 Mandatory fully standardised keys
 ---------------------------------
 
-- drm-driver: <str>
+- drm-driver: <valstr>
 
 String shall contain the name this driver registered as via the respective
 `struct drm_driver` data structure.
@@ -69,10 +70,10 @@ scope of each device, in which case `drm-pdev` shall be present as well.
 Userspace should make sure to not double account any usage statistics by using
 the above described criteria in order to associate data to individual clients.
 
-- drm-engine-<str>: <uint> ns
+- drm-engine-<keystr>: <uint> ns
 
 GPUs usually contain multiple execution engines. Each shall be given a stable
-and unique name (str), with possible values documented in the driver specific
+and unique name (keystr), with possible values documented in the driver specific
 documentation.
 
 Value shall be in specified time units which the respective GPU engine spent
@@ -84,16 +85,16 @@ larger value within a reasonable period. Upon observing a value lower than what
 was previously read, userspace is expected to stay with that larger previous
 value until a monotonic update is seen.
 
-- drm-engine-capacity-<str>: <uint>
+- drm-engine-capacity-<keystr>: <uint>
 
 Engine identifier string must be the same as the one specified in the
-drm-engine-<str> tag and shall contain a greater than zero number in case the
+drm-engine-<keystr> tag and shall contain a greater than zero number in case the
 exported engine corresponds to a group of identical hardware engines.
 
 In the absence of this tag parser shall assume capacity of one. Zero capacity
 is not allowed.
 
-- drm-memory-<str>: <uint> [KiB|MiB]
+- drm-memory-<keystr>: <uint> [KiB|MiB]
 
 Each possible memory type which can be used to store buffer objects by the
 GPU in question shall be given a stable and unique name to be returned as the
@@ -126,10 +127,10 @@ The total size of buffers that are purgeable.
 
 The total size of buffers that are active on one or more rings.
 
-- drm-cycles-<str>: <uint>
+- drm-cycles-<keystr>: <uint>
 
 Engine identifier string must be the same as the one specified in the
-drm-engine-<str> tag and shall contain the number of busy cycles for the given
+drm-engine-<keystr> tag and shall contain the number of busy cycles for the given
 engine.
 
 Values are not required to be constantly monotonic if it makes the driver
@@ -138,12 +139,12 @@ larger value within a reasonable period. Upon observing a value lower than what
 was previously read, userspace is expected to stay with that larger previous
 value until a monotonic update is seen.
 
-- drm-maxfreq-<str>: <uint> [Hz|MHz|KHz]
+- drm-maxfreq-<keystr>: <uint> [Hz|MHz|KHz]
 
 Engine identifier string must be the same as the one specified in the
-drm-engine-<str> tag and shall contain the maximum frequency for the given
-engine.  Taken together with drm-cycles-<str>, this can be used to calculate
-percentage utilization of the engine, whereas drm-engine-<str> only reflects
+drm-engine-<keystr> tag and shall contain the maximum frequency for the given
+engine.  Taken together with drm-cycles-<keystr>, this can be used to calculate
+percentage utilization of the engine, whereas drm-engine-<keystr> only reflects
 time active without considering what frequency the engine is operating as a
 percentage of it's maximum frequency.
 
-- 
2.39.2


^ permalink raw reply related	[flat|nested] 35+ messages in thread
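
The keystr/valstr split in the patch above can be sketched as a pair of
userspace checks. This is only an illustration, not code from the series:
the helper names are hypothetical, and treating ':' as a reserved key
character is an assumption based on it being the key/value delimiter.

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: true if s is usable as a <keystr> fragment,
 * i.e. non-empty and free of whitespace. Rejecting ':' as well is an
 * assumption (it is the key/value delimiter).
 */
bool fdinfo_keystr_ok(const char *s)
{
	assert(s != NULL);

	if (*s == '\0')
		return false;

	for (; *s; s++) {
		if (isspace((unsigned char)*s) || *s == ':')
			return false;
	}
	return true;
}

/*
 * Hypothetical helper: true if s is usable as a <valstr>. Per the commit
 * message, values can contain anything other than a newline.
 */
bool fdinfo_valstr_ok(const char *s)
{
	assert(s != NULL);

	return strchr(s, '\n') == NULL;
}
```

With these rules an engine name such as "gpu" passes the key check, while a
command line containing spaces is rejected as a key but accepted as a value.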

* [RFC 2/3] drm/msm: Rework get_comm_cmdline() helper
  2023-04-17 20:12 ` Rob Clark
@ 2023-04-17 20:12   ` Rob Clark
  -1 siblings, 0 replies; 35+ messages in thread
From: Rob Clark @ 2023-04-17 20:12 UTC (permalink / raw)
  To: dri-devel
  Cc: Tvrtko Ursulin, Rob Clark, Rob Clark, Abhinav Kumar,
	Dmitry Baryshkov, Sean Paul, David Airlie, Daniel Vetter,
	Akhil P Oommen, Chia-I Wu, Konrad Dybcio,
	open list:DRM DRIVER FOR MSM ADRENO GPU,
	open list:DRM DRIVER FOR MSM ADRENO GPU, open list

From: Rob Clark <robdclark@chromium.org>

Make it work in terms of ctx so that it can be re-used for fdinfo.

Signed-off-by: Rob Clark <robdclark@chromium.org>
---
 drivers/gpu/drm/msm/adreno/adreno_gpu.c |  4 ++--
 drivers/gpu/drm/msm/msm_drv.c           |  2 ++
 drivers/gpu/drm/msm/msm_gpu.c           | 13 ++++++-------
 drivers/gpu/drm/msm/msm_gpu.h           | 12 ++++++++++--
 drivers/gpu/drm/msm/msm_submitqueue.c   |  1 +
 5 files changed, 21 insertions(+), 11 deletions(-)

diff --git a/drivers/gpu/drm/msm/adreno/adreno_gpu.c b/drivers/gpu/drm/msm/adreno/adreno_gpu.c
index bb38e728864d..43c4e1fea83f 100644
--- a/drivers/gpu/drm/msm/adreno/adreno_gpu.c
+++ b/drivers/gpu/drm/msm/adreno/adreno_gpu.c
@@ -412,7 +412,7 @@ int adreno_set_param(struct msm_gpu *gpu, struct msm_file_private *ctx,
 		/* Ensure string is null terminated: */
 		str[len] = '\0';
 
-		mutex_lock(&gpu->lock);
+		mutex_lock(&ctx->lock);
 
 		if (param == MSM_PARAM_COMM) {
 			paramp = &ctx->comm;
@@ -423,7 +423,7 @@ int adreno_set_param(struct msm_gpu *gpu, struct msm_file_private *ctx,
 		kfree(*paramp);
 		*paramp = str;
 
-		mutex_unlock(&gpu->lock);
+		mutex_unlock(&ctx->lock);
 
 		return 0;
 	}
diff --git a/drivers/gpu/drm/msm/msm_drv.c b/drivers/gpu/drm/msm/msm_drv.c
index 3d73b98d6a9c..ca0e89e46e13 100644
--- a/drivers/gpu/drm/msm/msm_drv.c
+++ b/drivers/gpu/drm/msm/msm_drv.c
@@ -581,6 +581,8 @@ static int context_init(struct drm_device *dev, struct drm_file *file)
 	rwlock_init(&ctx->queuelock);
 
 	kref_init(&ctx->ref);
+	ctx->pid = get_pid(task_pid(current));
+	mutex_init(&ctx->lock);
 	msm_submitqueue_init(dev, ctx);
 
 	ctx->aspace = msm_gpu_create_private_address_space(priv->gpu, current);
diff --git a/drivers/gpu/drm/msm/msm_gpu.c b/drivers/gpu/drm/msm/msm_gpu.c
index c403912d13ab..f0f4f845c32d 100644
--- a/drivers/gpu/drm/msm/msm_gpu.c
+++ b/drivers/gpu/drm/msm/msm_gpu.c
@@ -327,18 +327,17 @@ find_submit(struct msm_ringbuffer *ring, uint32_t fence)
 
 static void retire_submits(struct msm_gpu *gpu);
 
-static void get_comm_cmdline(struct msm_gem_submit *submit, char **comm, char **cmd)
+static void get_comm_cmdline(struct msm_file_private *ctx, char **comm, char **cmd)
 {
-	struct msm_file_private *ctx = submit->queue->ctx;
 	struct task_struct *task;
 
-	WARN_ON(!mutex_is_locked(&submit->gpu->lock));
-
 	/* Note that kstrdup will return NULL if argument is NULL: */
+	mutex_lock(&ctx->lock);
 	*comm = kstrdup(ctx->comm, GFP_KERNEL);
 	*cmd  = kstrdup(ctx->cmdline, GFP_KERNEL);
+	mutex_unlock(&ctx->lock);
 
-	task = get_pid_task(submit->pid, PIDTYPE_PID);
+	task = get_pid_task(ctx->pid, PIDTYPE_PID);
 	if (!task)
 		return;
 
@@ -372,7 +371,7 @@ static void recover_worker(struct kthread_work *work)
 		if (submit->aspace)
 			submit->aspace->faults++;
 
-		get_comm_cmdline(submit, &comm, &cmd);
+		get_comm_cmdline(submit->queue->ctx, &comm, &cmd);
 
 		if (comm && cmd) {
 			DRM_DEV_ERROR(dev->dev, "%s: offending task: %s (%s)\n",
@@ -460,7 +459,7 @@ static void fault_worker(struct kthread_work *work)
 		goto resume_smmu;
 
 	if (submit) {
-		get_comm_cmdline(submit, &comm, &cmd);
+		get_comm_cmdline(submit->queue->ctx, &comm, &cmd);
 
 		/*
 		 * When we get GPU iova faults, we can get 1000s of them,
diff --git a/drivers/gpu/drm/msm/msm_gpu.h b/drivers/gpu/drm/msm/msm_gpu.h
index 7a4fa1b8655b..b2023a42116b 100644
--- a/drivers/gpu/drm/msm/msm_gpu.h
+++ b/drivers/gpu/drm/msm/msm_gpu.h
@@ -377,17 +377,25 @@ struct msm_file_private {
 	 */
 	int sysprof;
 
+	/** @pid: Process that opened this file. */
+	struct pid *pid;
+
+	/**
+	 * lock: Protects comm and cmdline
+	 */
+	struct mutex lock;
+
 	/**
 	 * comm: Overridden task comm, see MSM_PARAM_COMM
 	 *
-	 * Accessed under msm_gpu::lock
+	 * Accessed under msm_file_private::lock
 	 */
 	char *comm;
 
 	/**
 	 * cmdline: Overridden task cmdline, see MSM_PARAM_CMDLINE
 	 *
-	 * Accessed under msm_gpu::lock
+	 * Accessed under msm_file_private::lock
 	 */
 	char *cmdline;
 
diff --git a/drivers/gpu/drm/msm/msm_submitqueue.c b/drivers/gpu/drm/msm/msm_submitqueue.c
index 0e803125a325..0444ba04fa06 100644
--- a/drivers/gpu/drm/msm/msm_submitqueue.c
+++ b/drivers/gpu/drm/msm/msm_submitqueue.c
@@ -61,6 +61,7 @@ void __msm_file_private_destroy(struct kref *kref)
 	}
 
 	msm_gem_address_space_put(ctx->aspace);
+	put_pid(ctx->pid);
 	kfree(ctx->comm);
 	kfree(ctx->cmdline);
 	kfree(ctx);
-- 
2.39.2


^ permalink raw reply related	[flat|nested] 35+ messages in thread
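
The locking change in the patch above (copying comm/cmdline under a
per-context lock instead of the GPU-wide one) can be sketched in userspace
roughly as follows. This is an analogue under stated assumptions, not the
driver code: struct client_ctx, dup_or_null(), ctx_set_comm() and
ctx_get_comm_cmdline() are hypothetical names, a pthread mutex stands in for
the kernel mutex, and the fallback to the opening task's own comm/cmdline is
left out.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the comm/cmdline part of struct msm_file_private. */
struct client_ctx {
	pthread_mutex_t lock;	/* protects comm and cmdline */
	char *comm;
	char *cmdline;
};

/* Like kstrdup(): returns NULL if s is NULL (or if allocation fails). */
static char *dup_or_null(const char *s)
{
	size_t n;
	char *copy;

	if (!s)
		return NULL;

	n = strlen(s) + 1;
	copy = malloc(n);
	if (copy)
		memcpy(copy, s, n);
	return copy;
}

/* Replace the overridden comm with a private copy of value. */
int ctx_set_comm(struct client_ctx *ctx, const char *value)
{
	char *copy;

	assert(ctx != NULL);

	copy = dup_or_null(value);
	if (value && !copy)
		return -1;

	pthread_mutex_lock(&ctx->lock);
	free(ctx->comm);
	ctx->comm = copy;
	pthread_mutex_unlock(&ctx->lock);
	return 0;
}

/* Snapshot both strings under the lock; the caller frees the copies. */
void ctx_get_comm_cmdline(struct client_ctx *ctx, char **comm, char **cmd)
{
	assert(ctx && comm && cmd);

	pthread_mutex_lock(&ctx->lock);
	*comm = dup_or_null(ctx->comm);
	*cmd = dup_or_null(ctx->cmdline);
	pthread_mutex_unlock(&ctx->lock);
}
```

Because readers only ever receive copies, a concurrent setter can free and
replace the stored string without invalidating anything a reader still holds.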

* [RFC 3/3] drm/msm: Add comm/cmdline fields
  2023-04-17 20:12 ` Rob Clark
@ 2023-04-17 20:12   ` Rob Clark
  -1 siblings, 0 replies; 35+ messages in thread
From: Rob Clark @ 2023-04-17 20:12 UTC (permalink / raw)
  To: dri-devel
  Cc: Tvrtko Ursulin, Rob Clark, David Airlie, Daniel Vetter,
	Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann,
	Jonathan Corbet, Rob Clark, Abhinav Kumar, Dmitry Baryshkov,
	Sean Paul, open list:DOCUMENTATION, open list,
	open list:DRM DRIVER FOR MSM ADRENO GPU,
	open list:DRM DRIVER FOR MSM ADRENO GPU

From: Rob Clark <robdclark@chromium.org>

Normally this would be the same information that can be obtained in
other ways.  But in some cases the process opening the drm fd is merely
a sort of proxy for the actual process using the GPU.  This is the case
for guest VM processes using the GPU via virglrenderer, in which case
the msm native-context renderer in virglrenderer overrides the comm/
cmdline to be the guest process's values.

Exposing this via fdinfo allows tools like gputop to show something more
meaningful than just a bunch of "pcivirtio-gpu" users.

Signed-off-by: Rob Clark <robdclark@chromium.org>
---
 Documentation/gpu/drm-usage-stats.rst |  8 ++++++++
 drivers/gpu/drm/msm/msm_gpu.c         | 14 ++++++++++++++
 2 files changed, 22 insertions(+)

diff --git a/Documentation/gpu/drm-usage-stats.rst b/Documentation/gpu/drm-usage-stats.rst
index 8e00d53231e0..bc90bed455e3 100644
--- a/Documentation/gpu/drm-usage-stats.rst
+++ b/Documentation/gpu/drm-usage-stats.rst
@@ -148,6 +148,14 @@ percentage utilization of the engine, whereas drm-engine-<keystr> only reflects
 time active without considering what frequency the engine is operating as a
 percentage of it's maximum frequency.
 
+- drm-comm: <valstr>
+
+Returns the clients executable path.
+
+- drm-cmdline: <valstr>
+
+Returns the clients cmdline.
+
 Implementation Details
 ======================
 
diff --git a/drivers/gpu/drm/msm/msm_gpu.c b/drivers/gpu/drm/msm/msm_gpu.c
index f0f4f845c32d..1150dcbf28aa 100644
--- a/drivers/gpu/drm/msm/msm_gpu.c
+++ b/drivers/gpu/drm/msm/msm_gpu.c
@@ -148,12 +148,26 @@ int msm_gpu_pm_suspend(struct msm_gpu *gpu)
 	return 0;
 }
 
+static void get_comm_cmdline(struct msm_file_private *ctx, char **comm, char **cmd);
+
 void msm_gpu_show_fdinfo(struct msm_gpu *gpu, struct msm_file_private *ctx,
 			 struct drm_printer *p)
 {
+	char *comm, *cmdline;
+
+	get_comm_cmdline(ctx, &comm, &cmdline);
+
 	drm_printf(p, "drm-engine-gpu:\t%llu ns\n", ctx->elapsed_ns);
 	drm_printf(p, "drm-cycles-gpu:\t%llu\n", ctx->cycles);
 	drm_printf(p, "drm-maxfreq-gpu:\t%u Hz\n", gpu->fast_rate);
+
+	if (comm)
+		drm_printf(p, "drm-comm:\t%s\n", comm);
+	if (cmdline)
+		drm_printf(p, "drm-cmdline:\t%s\n", cmdline);
+
+	kfree(comm);
+	kfree(cmdline);
 }
 
 int msm_gpu_hw_init(struct msm_gpu *gpu)
-- 
2.39.2


^ permalink raw reply related	[flat|nested] 35+ messages in thread
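
On the consumer side, a tool could pick the new fields out of the fdinfo text
along these lines. This is a minimal sketch, assuming the caller has already
read the contents of the relevant /proc/<pid>/fdinfo/<fd> file into a buffer;
fdinfo_get() is a hypothetical helper, not something gputop or the kernel
provides.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical userspace helper: look up key in a buffer holding fdinfo
 * text and copy its value into out (NUL-terminated).
 * Returns 0 on success, -1 if the key is absent or out is too small.
 */
int fdinfo_get(const char *buf, const char *key, char *out, size_t outsz)
{
	size_t klen;
	const char *line;

	assert(buf && key && out);
	klen = strlen(key);

	for (line = buf; *line; ) {
		const char *eol = strchr(line, '\n');
		size_t len = eol ? (size_t)(eol - line) : strlen(line);

		/* Require an exact key match followed by the ':' delimiter. */
		if (len > klen && strncmp(line, key, klen) == 0 &&
		    line[klen] == ':') {
			const char *val = line + klen + 1;
			const char *end = line + len;

			/* Skip whitespace between the delimiter and the value. */
			while (val < end && isspace((unsigned char)*val))
				val++;

			if ((size_t)(end - val) >= outsz)
				return -1;

			/* The value runs to end of line and may contain spaces. */
			memcpy(out, val, (size_t)(end - val));
			out[end - val] = '\0';
			return 0;
		}

		line += len;
		if (*line == '\n')
			line++;
	}
	return -1;
}
```

A monitoring tool might call fdinfo_get(buf, "drm-comm", name, sizeof(name))
and fall back to the process name it already has when the call returns -1,
since the driver only prints the field when a value is available.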

* Re: [RFC 0/3] drm: Add comm/cmdline fdinfo fields
  2023-04-17 20:12 ` Rob Clark
@ 2023-04-17 20:45   ` Rob Clark
  -1 siblings, 0 replies; 35+ messages in thread
From: Rob Clark @ 2023-04-17 20:45 UTC (permalink / raw)
  To: dri-devel
  Cc: Tvrtko Ursulin, Rob Clark, Akhil P Oommen, Chia-I Wu,
	Dmitry Baryshkov, open list:DRM DRIVER FOR MSM ADRENO GPU,
	Konrad Dybcio, open list:DRM DRIVER FOR MSM ADRENO GPU,
	open list:DOCUMENTATION, open list, Sean Paul

On Mon, Apr 17, 2023 at 1:12 PM Rob Clark <robdclark@gmail.com> wrote:
>
> From: Rob Clark <robdclark@chromium.org>
>
> When many of the things using the GPU are processes in a VM guest, the
> actual client process is just a proxy.  The msm driver has a way to let
> the proxy tell the kernel the actual VM client process's executable name
> and command-line, which has until now been used simply for GPU crash
> devcore dumps.  Let's also expose this via fdinfo so that tools can
> expose who the actual user of the GPU is.

I should have also mentioned, in the VM/proxy scenario we have a
single process with separate drm_file's for each guest VM process.  So
it isn't an option to just change the proxy process's name to match
the client.

> Rob Clark (3):
>   drm/doc: Relax fdinfo string constraints
>   drm/msm: Rework get_comm_cmdline() helper
>   drm/msm: Add comm/cmdline fields
>
>  Documentation/gpu/drm-usage-stats.rst   | 37 +++++++++++++++----------
>  drivers/gpu/drm/msm/adreno/adreno_gpu.c |  4 +--
>  drivers/gpu/drm/msm/msm_drv.c           |  2 ++
>  drivers/gpu/drm/msm/msm_gpu.c           | 27 +++++++++++++-----
>  drivers/gpu/drm/msm/msm_gpu.h           | 12 ++++++--
>  drivers/gpu/drm/msm/msm_submitqueue.c   |  1 +
>  6 files changed, 58 insertions(+), 25 deletions(-)
>
> --
> 2.39.2
>

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [RFC 1/3] drm/doc: Relax fdinfo string constraints
  2023-04-17 20:12   ` Rob Clark
@ 2023-04-18  8:19     ` Tvrtko Ursulin
  -1 siblings, 0 replies; 35+ messages in thread
From: Tvrtko Ursulin @ 2023-04-18  8:19 UTC (permalink / raw)
  To: Rob Clark, dri-devel
  Cc: Rob Clark, Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann,
	David Airlie, Daniel Vetter, Jonathan Corbet,
	open list:DOCUMENTATION, open list


On 17/04/2023 21:12, Rob Clark wrote:
> From: Rob Clark <robdclark@chromium.org>
> 
> The restriction about no whitespace, etc, really only applies to the
> usage of strings in keys.  Values can contain anything (other than
> newline).
> 
> Signed-off-by: Rob Clark <robdclark@chromium.org>
> ---
>   Documentation/gpu/drm-usage-stats.rst | 29 ++++++++++++++-------------
>   1 file changed, 15 insertions(+), 14 deletions(-)
> 
> diff --git a/Documentation/gpu/drm-usage-stats.rst b/Documentation/gpu/drm-usage-stats.rst
> index 258bdcc8fb86..8e00d53231e0 100644
> --- a/Documentation/gpu/drm-usage-stats.rst
> +++ b/Documentation/gpu/drm-usage-stats.rst
> @@ -24,7 +24,7 @@ File format specification
>   - All keys shall be prefixed with `drm-`.
>   - Whitespace between the delimiter and first non-whitespace character shall be
>     ignored when parsing.
> -- Neither keys or values are allowed to contain whitespace characters.
> +- Keys are not allowed to contain whitespace characters.
>   - Numerical key value pairs can end with optional unit string.
>   - Data type of the value is fixed as defined in the specification.
>   
> @@ -39,12 +39,13 @@ Data types
>   ----------
>   
>   - <uint> - Unsigned integer without defining the maximum value.
> -- <str> - String excluding any above defined reserved characters or whitespace.
> +- <keystr> - String excluding any above defined reserved characters or whitespace.
> +- <valstr> - String.

Makes sense, I think. At least I can't remember having a special reason
to word it as strictly as it was. Let's give it some time to marinate, so
for later:

Acked-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

Regards,

Tvrtko
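
For illustration, a minimal userspace sketch of how a parser could apply the
relaxed rule (keys contain no whitespace, values run to the newline and may
contain spaces). The helper name parse_fdinfo_line() and the example key
"drm-comm" are assumptions for this sketch, not something taken from the
patch:

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: split one fdinfo line into key and value.
 * Returns true and fills key/val for a well-formed "drm-" line,
 * false otherwise (wrong prefix, whitespace in key, buffer too small).
 */
static bool parse_fdinfo_line(const char *line, char *key, size_t keysz,
                              char *val, size_t valsz)
{
    assert(line && key && val && keysz > 0 && valsz > 0);

    const char *colon = strchr(line, ':');
    if (!colon || strncmp(line, "drm-", 4) != 0)
        return false;

    size_t klen = (size_t)(colon - line);
    if (klen >= keysz)
        return false;

    /* Keys are not allowed to contain whitespace characters. */
    for (size_t i = 0; i < klen; i++)
        if (isspace((unsigned char)line[i]))
            return false;

    memcpy(key, line, klen);
    key[klen] = '\0';

    /* Whitespace between the delimiter and the value is ignored. */
    const char *v = colon + 1;
    while (*v == ' ' || *v == '\t')
        v++;

    /* The value may contain anything other than a newline. */
    size_t vlen = strcspn(v, "\n");
    if (vlen >= valsz)
        return false;

    memcpy(val, v, vlen);
    val[vlen] = '\0';
    return true;
}
```

Called on a line such as "drm-comm:\tsome app --flag\n", this would yield the
key "drm-comm" and the value "some app --flag", while a line whose key
contains a space is rejected.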

>   
>   Mandatory fully standardised keys
>   ---------------------------------
>   
> -- drm-driver: <str>
> +- drm-driver: <valstr>
>   
>   String shall contain the name this driver registered as via the respective
>   `struct drm_driver` data structure.
> @@ -69,10 +70,10 @@ scope of each device, in which case `drm-pdev` shall be present as well.
>   Userspace should make sure to not double account any usage statistics by using
>   the above described criteria in order to associate data to individual clients.
>   
> -- drm-engine-<str>: <uint> ns
> +- drm-engine-<keystr>: <uint> ns
>   
>   GPUs usually contain multiple execution engines. Each shall be given a stable
> -and unique name (str), with possible values documented in the driver specific
> +and unique name (keystr), with possible values documented in the driver specific
>   documentation.
>   
>   Value shall be in specified time units which the respective GPU engine spent
> @@ -84,16 +85,16 @@ larger value within a reasonable period. Upon observing a value lower than what
>   was previously read, userspace is expected to stay with that larger previous
>   value until a monotonic update is seen.
>   
> -- drm-engine-capacity-<str>: <uint>
> +- drm-engine-capacity-<keystr>: <uint>
>   
>   Engine identifier string must be the same as the one specified in the
> -drm-engine-<str> tag and shall contain a greater than zero number in case the
> +drm-engine-<keystr> tag and shall contain a greater than zero number in case the
>   exported engine corresponds to a group of identical hardware engines.
>   
>   In the absence of this tag parser shall assume capacity of one. Zero capacity
>   is not allowed.
>   
> -- drm-memory-<str>: <uint> [KiB|MiB]
> +- drm-memory-<keystr>: <uint> [KiB|MiB]
>   
>   Each possible memory type which can be used to store buffer objects by the
>   GPU in question shall be given a stable and unique name to be returned as the
> @@ -126,10 +127,10 @@ The total size of buffers that are purgeable.
>   
>   The total size of buffers that are active on one or more rings.
>   
> -- drm-cycles-<str>: <uint>
> +- drm-cycles-<keystr>: <uint>
>   
>   Engine identifier string must be the same as the one specified in the
> -drm-engine-<str> tag and shall contain the number of busy cycles for the given
> +drm-engine-<keystr> tag and shall contain the number of busy cycles for the given
>   engine.
>   
>   Values are not required to be constantly monotonic if it makes the driver
> @@ -138,12 +139,12 @@ larger value within a reasonable period. Upon observing a value lower than what
>   was previously read, userspace is expected to stay with that larger previous
>   value until a monotonic update is seen.
>   
> -- drm-maxfreq-<str>: <uint> [Hz|MHz|KHz]
> +- drm-maxfreq-<keystr>: <uint> [Hz|MHz|KHz]
>   
>   Engine identifier string must be the same as the one specified in the
> -drm-engine-<str> tag and shall contain the maximum frequency for the given
> -engine.  Taken together with drm-cycles-<str>, this can be used to calculate
> -percentage utilization of the engine, whereas drm-engine-<str> only reflects
> +drm-engine-<keystr> tag and shall contain the maximum frequency for the given
> +engine.  Taken together with drm-cycles-<keystr>, this can be used to calculate
> +percentage utilization of the engine, whereas drm-engine-<keystr> only reflects
>   time active without considering what frequency the engine is operating as a
>   percentage of it's maximum frequency.
>   
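
As a worked example of the drm-cycles/drm-maxfreq text quoted above, here is a
rough sketch of how a tool might turn two samples into a percentage. The
function name and the idea of passing the frequency already converted to Hz
are assumptions of this sketch; the documentation only defines the keys:

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: utilization of one engine between two fdinfo reads.
 *
 * cycles_prev/cycles_now: two samples of drm-cycles-<keystr>
 * maxfreq_hz:             drm-maxfreq-<keystr>, converted to Hz
 * elapsed_s:              wall-clock time between the two samples
 */
static double engine_utilization_pct(uint64_t cycles_prev, uint64_t cycles_now,
                                     uint64_t maxfreq_hz, double elapsed_s)
{
    assert(maxfreq_hz > 0 && elapsed_s > 0.0);

    /*
     * Values are not required to be constantly monotonic; if the counter
     * went backwards, stay with the previous value (no progress).
     */
    if (cycles_now < cycles_prev)
        return 0.0;

    double busy = (double)(cycles_now - cycles_prev);
    double capacity = (double)maxfreq_hz * elapsed_s;

    return 100.0 * busy / capacity;
}
```

For instance, 250,000,000 busy cycles over one second on an engine with a
500 MHz maximum frequency corresponds to 50% utilization.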

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [RFC 2/3] drm/msm: Rework get_comm_cmdline() helper
  2023-04-17 20:12   ` Rob Clark
@ 2023-04-18  8:27     ` Tvrtko Ursulin
  -1 siblings, 0 replies; 35+ messages in thread
From: Tvrtko Ursulin @ 2023-04-18  8:27 UTC (permalink / raw)
  To: Rob Clark, dri-devel
  Cc: Rob Clark, Abhinav Kumar, Dmitry Baryshkov, Sean Paul,
	David Airlie, Daniel Vetter, Akhil P Oommen, Chia-I Wu,
	Konrad Dybcio, open list:DRM DRIVER FOR MSM ADRENO GPU,
	open list:DRM DRIVER FOR MSM ADRENO GPU, open list


On 17/04/2023 21:12, Rob Clark wrote:
> From: Rob Clark <robdclark@chromium.org>
> 
> Make it work in terms of ctx so that it can be re-used for fdinfo.
> 
> Signed-off-by: Rob Clark <robdclark@chromium.org>
> ---
>   drivers/gpu/drm/msm/adreno/adreno_gpu.c |  4 ++--
>   drivers/gpu/drm/msm/msm_drv.c           |  2 ++
>   drivers/gpu/drm/msm/msm_gpu.c           | 13 ++++++-------
>   drivers/gpu/drm/msm/msm_gpu.h           | 12 ++++++++++--
>   drivers/gpu/drm/msm/msm_submitqueue.c   |  1 +
>   5 files changed, 21 insertions(+), 11 deletions(-)
> 
> diff --git a/drivers/gpu/drm/msm/adreno/adreno_gpu.c b/drivers/gpu/drm/msm/adreno/adreno_gpu.c
> index bb38e728864d..43c4e1fea83f 100644
> --- a/drivers/gpu/drm/msm/adreno/adreno_gpu.c
> +++ b/drivers/gpu/drm/msm/adreno/adreno_gpu.c
> @@ -412,7 +412,7 @@ int adreno_set_param(struct msm_gpu *gpu, struct msm_file_private *ctx,
>   		/* Ensure string is null terminated: */
>   		str[len] = '\0';
>   
> -		mutex_lock(&gpu->lock);
> +		mutex_lock(&ctx->lock);
>   
>   		if (param == MSM_PARAM_COMM) {
>   			paramp = &ctx->comm;
> @@ -423,7 +423,7 @@ int adreno_set_param(struct msm_gpu *gpu, struct msm_file_private *ctx,
>   		kfree(*paramp);
>   		*paramp = str;
>   
> -		mutex_unlock(&gpu->lock);
> +		mutex_unlock(&ctx->lock);
>   
>   		return 0;
>   	}
> diff --git a/drivers/gpu/drm/msm/msm_drv.c b/drivers/gpu/drm/msm/msm_drv.c
> index 3d73b98d6a9c..ca0e89e46e13 100644
> --- a/drivers/gpu/drm/msm/msm_drv.c
> +++ b/drivers/gpu/drm/msm/msm_drv.c
> @@ -581,6 +581,8 @@ static int context_init(struct drm_device *dev, struct drm_file *file)
>   	rwlock_init(&ctx->queuelock);
>   
>   	kref_init(&ctx->ref);
> +	ctx->pid = get_pid(task_pid(current));

Would it simplify things for msm if DRM core had an up-to-date file->pid,
as proposed in
https://patchwork.freedesktop.org/patch/526752/?series=109902&rev=4 ? It
gets updated if the ioctl issuer is different from the fd opener, and
seeing context_init here reminded me of it. Maybe you wouldn't have to
track the pid in msm?

Regards,

Tvrtko

> +	mutex_init(&ctx->lock);
>   	msm_submitqueue_init(dev, ctx);
>   
>   	ctx->aspace = msm_gpu_create_private_address_space(priv->gpu, current);
> diff --git a/drivers/gpu/drm/msm/msm_gpu.c b/drivers/gpu/drm/msm/msm_gpu.c
> index c403912d13ab..f0f4f845c32d 100644
> --- a/drivers/gpu/drm/msm/msm_gpu.c
> +++ b/drivers/gpu/drm/msm/msm_gpu.c
> @@ -327,18 +327,17 @@ find_submit(struct msm_ringbuffer *ring, uint32_t fence)
>   
>   static void retire_submits(struct msm_gpu *gpu);
>   
> -static void get_comm_cmdline(struct msm_gem_submit *submit, char **comm, char **cmd)
> +static void get_comm_cmdline(struct msm_file_private *ctx, char **comm, char **cmd)
>   {
> -	struct msm_file_private *ctx = submit->queue->ctx;
>   	struct task_struct *task;
>   
> -	WARN_ON(!mutex_is_locked(&submit->gpu->lock));
> -
>   	/* Note that kstrdup will return NULL if argument is NULL: */
> +	mutex_lock(&ctx->lock);
>   	*comm = kstrdup(ctx->comm, GFP_KERNEL);
>   	*cmd  = kstrdup(ctx->cmdline, GFP_KERNEL);
> +	mutex_unlock(&ctx->lock);
>   
> -	task = get_pid_task(submit->pid, PIDTYPE_PID);
> +	task = get_pid_task(ctx->pid, PIDTYPE_PID);
>   	if (!task)
>   		return;
>   
> @@ -372,7 +371,7 @@ static void recover_worker(struct kthread_work *work)
>   		if (submit->aspace)
>   			submit->aspace->faults++;
>   
> -		get_comm_cmdline(submit, &comm, &cmd);
> +		get_comm_cmdline(submit->queue->ctx, &comm, &cmd);
>   
>   		if (comm && cmd) {
>   			DRM_DEV_ERROR(dev->dev, "%s: offending task: %s (%s)\n",
> @@ -460,7 +459,7 @@ static void fault_worker(struct kthread_work *work)
>   		goto resume_smmu;
>   
>   	if (submit) {
> -		get_comm_cmdline(submit, &comm, &cmd);
> +		get_comm_cmdline(submit->queue->ctx, &comm, &cmd);
>   
>   		/*
>   		 * When we get GPU iova faults, we can get 1000s of them,
> diff --git a/drivers/gpu/drm/msm/msm_gpu.h b/drivers/gpu/drm/msm/msm_gpu.h
> index 7a4fa1b8655b..b2023a42116b 100644
> --- a/drivers/gpu/drm/msm/msm_gpu.h
> +++ b/drivers/gpu/drm/msm/msm_gpu.h
> @@ -377,17 +377,25 @@ struct msm_file_private {
>   	 */
>   	int sysprof;
>   
> +	/** @pid: Process that opened this file. */
> +	struct pid *pid;
> +
> +	/**
> +	 * lock: Protects comm and cmdline
> +	 */
> +	struct mutex lock;
> +
>   	/**
>   	 * comm: Overridden task comm, see MSM_PARAM_COMM
>   	 *
> -	 * Accessed under msm_gpu::lock
> +	 * Accessed under msm_file_private::lock
>   	 */
>   	char *comm;
>   
>   	/**
>   	 * cmdline: Overridden task cmdline, see MSM_PARAM_CMDLINE
>   	 *
> -	 * Accessed under msm_gpu::lock
> +	 * Accessed under msm_file_private::lock
>   	 */
>   	char *cmdline;
>   
> diff --git a/drivers/gpu/drm/msm/msm_submitqueue.c b/drivers/gpu/drm/msm/msm_submitqueue.c
> index 0e803125a325..0444ba04fa06 100644
> --- a/drivers/gpu/drm/msm/msm_submitqueue.c
> +++ b/drivers/gpu/drm/msm/msm_submitqueue.c
> @@ -61,6 +61,7 @@ void __msm_file_private_destroy(struct kref *kref)
>   	}
>   
>   	msm_gem_address_space_put(ctx->aspace);
> +	put_pid(ctx->pid);
>   	kfree(ctx->comm);
>   	kfree(ctx->cmdline);
>   	kfree(ctx);
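
The locking change in the patch above (a per-context lock, with readers taking
a private copy of the string while holding it) can be sketched in userspace
roughly as follows. This is only an analogy, assuming a pthread mutex in place
of the kernel mutex and malloc/free in place of kstrdup/kfree; struct ctx and
the ctx_*() helpers are hypothetical names:

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical userspace stand-in for the per-file context. */
struct ctx {
    pthread_mutex_t lock;   /* protects comm */
    char *comm;             /* owned by the context, may be NULL */
};

static void ctx_init(struct ctx *c)
{
    assert(c);
    pthread_mutex_init(&c->lock, NULL);
    c->comm = NULL;
}

/* Writer: takes ownership of str and frees the previous override. */
static void ctx_set_comm(struct ctx *c, char *str)
{
    pthread_mutex_lock(&c->lock);
    free(c->comm);
    c->comm = str;
    pthread_mutex_unlock(&c->lock);
}

/*
 * Reader: returns a private copy (or NULL if nothing is set), so the
 * caller can keep using it after the lock is dropped. Caller frees.
 */
static char *ctx_get_comm(struct ctx *c)
{
    char *copy = NULL;

    pthread_mutex_lock(&c->lock);
    if (c->comm) {
        size_t len = strlen(c->comm) + 1;
        copy = malloc(len);
        if (copy)
            memcpy(copy, c->comm, len);
    }
    pthread_mutex_unlock(&c->lock);

    return copy;
}

static void ctx_destroy(struct ctx *c)
{
    free(c->comm);
    pthread_mutex_destroy(&c->lock);
}
```

Because the reader only ever hands out a copy, a concurrent ctx_set_comm()
that frees the old string cannot invalidate what a previous reader is still
printing, which is the property the fdinfo path would rely on.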

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [RFC 2/3] drm/msm: Rework get_comm_cmdline() helper
  2023-04-18  8:27     ` Tvrtko Ursulin
@ 2023-04-18  8:34       ` Daniel Vetter
  -1 siblings, 0 replies; 35+ messages in thread
From: Daniel Vetter @ 2023-04-18  8:34 UTC (permalink / raw)
  To: Tvrtko Ursulin
  Cc: Rob Clark, dri-devel, Rob Clark, Abhinav Kumar, Dmitry Baryshkov,
	Sean Paul, David Airlie, Daniel Vetter, Akhil P Oommen,
	Chia-I Wu, Konrad Dybcio,
	open list:DRM DRIVER FOR MSM ADRENO GPU,
	open list:DRM DRIVER FOR MSM ADRENO GPU, open list

On Tue, Apr 18, 2023 at 09:27:49AM +0100, Tvrtko Ursulin wrote:
> 
> On 17/04/2023 21:12, Rob Clark wrote:
> > From: Rob Clark <robdclark@chromium.org>
> > 
> > Make it work in terms of ctx so that it can be re-used for fdinfo.
> > 
> > Signed-off-by: Rob Clark <robdclark@chromium.org>
> > ---
> >   drivers/gpu/drm/msm/adreno/adreno_gpu.c |  4 ++--
> >   drivers/gpu/drm/msm/msm_drv.c           |  2 ++
> >   drivers/gpu/drm/msm/msm_gpu.c           | 13 ++++++-------
> >   drivers/gpu/drm/msm/msm_gpu.h           | 12 ++++++++++--
> >   drivers/gpu/drm/msm/msm_submitqueue.c   |  1 +
> >   5 files changed, 21 insertions(+), 11 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/msm/adreno/adreno_gpu.c b/drivers/gpu/drm/msm/adreno/adreno_gpu.c
> > index bb38e728864d..43c4e1fea83f 100644
> > --- a/drivers/gpu/drm/msm/adreno/adreno_gpu.c
> > +++ b/drivers/gpu/drm/msm/adreno/adreno_gpu.c
> > @@ -412,7 +412,7 @@ int adreno_set_param(struct msm_gpu *gpu, struct msm_file_private *ctx,
> >   		/* Ensure string is null terminated: */
> >   		str[len] = '\0';
> > -		mutex_lock(&gpu->lock);
> > +		mutex_lock(&ctx->lock);
> >   		if (param == MSM_PARAM_COMM) {
> >   			paramp = &ctx->comm;
> > @@ -423,7 +423,7 @@ int adreno_set_param(struct msm_gpu *gpu, struct msm_file_private *ctx,
> >   		kfree(*paramp);
> >   		*paramp = str;
> > -		mutex_unlock(&gpu->lock);
> > +		mutex_unlock(&ctx->lock);
> >   		return 0;
> >   	}
> > diff --git a/drivers/gpu/drm/msm/msm_drv.c b/drivers/gpu/drm/msm/msm_drv.c
> > index 3d73b98d6a9c..ca0e89e46e13 100644
> > --- a/drivers/gpu/drm/msm/msm_drv.c
> > +++ b/drivers/gpu/drm/msm/msm_drv.c
> > @@ -581,6 +581,8 @@ static int context_init(struct drm_device *dev, struct drm_file *file)
> >   	rwlock_init(&ctx->queuelock);
> >   	kref_init(&ctx->ref);
> > +	ctx->pid = get_pid(task_pid(current));
> 
> Would it simplify things for msm if DRM core had an up to date file->pid as
> proposed in
> https://patchwork.freedesktop.org/patch/526752/?series=109902&rev=4 ? It
> gets updated if ioctl issuer is different than fd opener and this being
> context_init here reminded me of it. Maybe you wouldn't have to track the
> pid in msm?

Can we go one step further and let the drm fdinfo stuff print these new
additions? Consistency across drivers and all that.

Also for a generic trigger I think any driver ioctl is good enough (we
only really need to avoid the auth dance when you're not on a render
node).
-Daniel

> 
> Regards,
> 
> Tvrtko
> 
> > +	mutex_init(&ctx->lock);
> >   	msm_submitqueue_init(dev, ctx);
> >   	ctx->aspace = msm_gpu_create_private_address_space(priv->gpu, current);
> > diff --git a/drivers/gpu/drm/msm/msm_gpu.c b/drivers/gpu/drm/msm/msm_gpu.c
> > index c403912d13ab..f0f4f845c32d 100644
> > --- a/drivers/gpu/drm/msm/msm_gpu.c
> > +++ b/drivers/gpu/drm/msm/msm_gpu.c
> > @@ -327,18 +327,17 @@ find_submit(struct msm_ringbuffer *ring, uint32_t fence)
> >   static void retire_submits(struct msm_gpu *gpu);
> > -static void get_comm_cmdline(struct msm_gem_submit *submit, char **comm, char **cmd)
> > +static void get_comm_cmdline(struct msm_file_private *ctx, char **comm, char **cmd)
> >   {
> > -	struct msm_file_private *ctx = submit->queue->ctx;
> >   	struct task_struct *task;
> > -	WARN_ON(!mutex_is_locked(&submit->gpu->lock));
> > -
> >   	/* Note that kstrdup will return NULL if argument is NULL: */
> > +	mutex_lock(&ctx->lock);
> >   	*comm = kstrdup(ctx->comm, GFP_KERNEL);
> >   	*cmd  = kstrdup(ctx->cmdline, GFP_KERNEL);
> > +	mutex_unlock(&ctx->lock);
> > -	task = get_pid_task(submit->pid, PIDTYPE_PID);
> > +	task = get_pid_task(ctx->pid, PIDTYPE_PID);
> >   	if (!task)
> >   		return;
> > @@ -372,7 +371,7 @@ static void recover_worker(struct kthread_work *work)
> >   		if (submit->aspace)
> >   			submit->aspace->faults++;
> > -		get_comm_cmdline(submit, &comm, &cmd);
> > +		get_comm_cmdline(submit->queue->ctx, &comm, &cmd);
> >   		if (comm && cmd) {
> >   			DRM_DEV_ERROR(dev->dev, "%s: offending task: %s (%s)\n",
> > @@ -460,7 +459,7 @@ static void fault_worker(struct kthread_work *work)
> >   		goto resume_smmu;
> >   	if (submit) {
> > -		get_comm_cmdline(submit, &comm, &cmd);
> > +		get_comm_cmdline(submit->queue->ctx, &comm, &cmd);
> >   		/*
> >   		 * When we get GPU iova faults, we can get 1000s of them,
> > diff --git a/drivers/gpu/drm/msm/msm_gpu.h b/drivers/gpu/drm/msm/msm_gpu.h
> > index 7a4fa1b8655b..b2023a42116b 100644
> > --- a/drivers/gpu/drm/msm/msm_gpu.h
> > +++ b/drivers/gpu/drm/msm/msm_gpu.h
> > @@ -377,17 +377,25 @@ struct msm_file_private {
> >   	 */
> >   	int sysprof;
> > +	/** @pid: Process that opened this file. */
> > +	struct pid *pid;
> > +
> > +	/**
> > +	 * lock: Protects comm and cmdline
> > +	 */
> > +	struct mutex lock;
> > +
> >   	/**
> >   	 * comm: Overridden task comm, see MSM_PARAM_COMM
> >   	 *
> > -	 * Accessed under msm_gpu::lock
> > +	 * Accessed under msm_file_private::lock
> >   	 */
> >   	char *comm;
> >   	/**
> >   	 * cmdline: Overridden task cmdline, see MSM_PARAM_CMDLINE
> >   	 *
> > -	 * Accessed under msm_gpu::lock
> > +	 * Accessed under msm_file_private::lock
> >   	 */
> >   	char *cmdline;
> > diff --git a/drivers/gpu/drm/msm/msm_submitqueue.c b/drivers/gpu/drm/msm/msm_submitqueue.c
> > index 0e803125a325..0444ba04fa06 100644
> > --- a/drivers/gpu/drm/msm/msm_submitqueue.c
> > +++ b/drivers/gpu/drm/msm/msm_submitqueue.c
> > @@ -61,6 +61,7 @@ void __msm_file_private_destroy(struct kref *kref)
> >   	}
> >   	msm_gem_address_space_put(ctx->aspace);
> > +	put_pid(ctx->pid);
> >   	kfree(ctx->comm);
> >   	kfree(ctx->cmdline);
> >   	kfree(ctx);

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [RFC 2/3] drm/msm: Rework get_comm_cmdline() helper
@ 2023-04-18  8:34       ` Daniel Vetter
  0 siblings, 0 replies; 35+ messages in thread
From: Daniel Vetter @ 2023-04-18  8:34 UTC (permalink / raw)
  To: Tvrtko Ursulin
  Cc: Rob Clark, open list:DRM DRIVER FOR MSM ADRENO GPU,
	Akhil P Oommen, Abhinav Kumar, dri-devel, open list,
	Konrad Dybcio, open list:DRM DRIVER FOR MSM ADRENO GPU,
	Dmitry Baryshkov, Sean Paul

On Tue, Apr 18, 2023 at 09:27:49AM +0100, Tvrtko Ursulin wrote:
> 
> On 17/04/2023 21:12, Rob Clark wrote:
> > From: Rob Clark <robdclark@chromium.org>
> > 
> > Make it work in terms of ctx so that it can be re-used for fdinfo.
> > 
> > Signed-off-by: Rob Clark <robdclark@chromium.org>
> > ---
> >   drivers/gpu/drm/msm/adreno/adreno_gpu.c |  4 ++--
> >   drivers/gpu/drm/msm/msm_drv.c           |  2 ++
> >   drivers/gpu/drm/msm/msm_gpu.c           | 13 ++++++-------
> >   drivers/gpu/drm/msm/msm_gpu.h           | 12 ++++++++++--
> >   drivers/gpu/drm/msm/msm_submitqueue.c   |  1 +
> >   5 files changed, 21 insertions(+), 11 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/msm/adreno/adreno_gpu.c b/drivers/gpu/drm/msm/adreno/adreno_gpu.c
> > index bb38e728864d..43c4e1fea83f 100644
> > --- a/drivers/gpu/drm/msm/adreno/adreno_gpu.c
> > +++ b/drivers/gpu/drm/msm/adreno/adreno_gpu.c
> > @@ -412,7 +412,7 @@ int adreno_set_param(struct msm_gpu *gpu, struct msm_file_private *ctx,
> >   		/* Ensure string is null terminated: */
> >   		str[len] = '\0';
> > -		mutex_lock(&gpu->lock);
> > +		mutex_lock(&ctx->lock);
> >   		if (param == MSM_PARAM_COMM) {
> >   			paramp = &ctx->comm;
> > @@ -423,7 +423,7 @@ int adreno_set_param(struct msm_gpu *gpu, struct msm_file_private *ctx,
> >   		kfree(*paramp);
> >   		*paramp = str;
> > -		mutex_unlock(&gpu->lock);
> > +		mutex_unlock(&ctx->lock);
> >   		return 0;
> >   	}
> > diff --git a/drivers/gpu/drm/msm/msm_drv.c b/drivers/gpu/drm/msm/msm_drv.c
> > index 3d73b98d6a9c..ca0e89e46e13 100644
> > --- a/drivers/gpu/drm/msm/msm_drv.c
> > +++ b/drivers/gpu/drm/msm/msm_drv.c
> > @@ -581,6 +581,8 @@ static int context_init(struct drm_device *dev, struct drm_file *file)
> >   	rwlock_init(&ctx->queuelock);
> >   	kref_init(&ctx->ref);
> > +	ctx->pid = get_pid(task_pid(current));
> 
> Would it simplify things for msm if DRM core had an up to date file->pid as
> proposed in
> https://patchwork.freedesktop.org/patch/526752/?series=109902&rev=4 ? It
> gets updated if ioctl issuer is different than fd opener and this being
> context_init here reminded me of it. Maybe you wouldn't have to track the
> pid in msm?

Can we go one step further and let the drm fdinfo stuff print these new
additions? Consistency across drivers and all that.

Also for a generic trigger I think any driver ioctl is good enough (we
only really need to avoid the auth dance when you're not on a render
node).
-Daniel

> 
> Regards,
> 
> Tvrtko
> 
> > +	mutex_init(&ctx->lock);
> >   	msm_submitqueue_init(dev, ctx);
> >   	ctx->aspace = msm_gpu_create_private_address_space(priv->gpu, current);
> > diff --git a/drivers/gpu/drm/msm/msm_gpu.c b/drivers/gpu/drm/msm/msm_gpu.c
> > index c403912d13ab..f0f4f845c32d 100644
> > --- a/drivers/gpu/drm/msm/msm_gpu.c
> > +++ b/drivers/gpu/drm/msm/msm_gpu.c
> > @@ -327,18 +327,17 @@ find_submit(struct msm_ringbuffer *ring, uint32_t fence)
> >   static void retire_submits(struct msm_gpu *gpu);
> > -static void get_comm_cmdline(struct msm_gem_submit *submit, char **comm, char **cmd)
> > +static void get_comm_cmdline(struct msm_file_private *ctx, char **comm, char **cmd)
> >   {
> > -	struct msm_file_private *ctx = submit->queue->ctx;
> >   	struct task_struct *task;
> > -	WARN_ON(!mutex_is_locked(&submit->gpu->lock));
> > -
> >   	/* Note that kstrdup will return NULL if argument is NULL: */
> > +	mutex_lock(&ctx->lock);
> >   	*comm = kstrdup(ctx->comm, GFP_KERNEL);
> >   	*cmd  = kstrdup(ctx->cmdline, GFP_KERNEL);
> > +	mutex_unlock(&ctx->lock);
> > -	task = get_pid_task(submit->pid, PIDTYPE_PID);
> > +	task = get_pid_task(ctx->pid, PIDTYPE_PID);
> >   	if (!task)
> >   		return;
> > @@ -372,7 +371,7 @@ static void recover_worker(struct kthread_work *work)
> >   		if (submit->aspace)
> >   			submit->aspace->faults++;
> > -		get_comm_cmdline(submit, &comm, &cmd);
> > +		get_comm_cmdline(submit->queue->ctx, &comm, &cmd);
> >   		if (comm && cmd) {
> >   			DRM_DEV_ERROR(dev->dev, "%s: offending task: %s (%s)\n",
> > @@ -460,7 +459,7 @@ static void fault_worker(struct kthread_work *work)
> >   		goto resume_smmu;
> >   	if (submit) {
> > -		get_comm_cmdline(submit, &comm, &cmd);
> > +		get_comm_cmdline(submit->queue->ctx, &comm, &cmd);
> >   		/*
> >   		 * When we get GPU iova faults, we can get 1000s of them,
> > diff --git a/drivers/gpu/drm/msm/msm_gpu.h b/drivers/gpu/drm/msm/msm_gpu.h
> > index 7a4fa1b8655b..b2023a42116b 100644
> > --- a/drivers/gpu/drm/msm/msm_gpu.h
> > +++ b/drivers/gpu/drm/msm/msm_gpu.h
> > @@ -377,17 +377,25 @@ struct msm_file_private {
> >   	 */
> >   	int sysprof;
> > +	/** @pid: Process that opened this file. */
> > +	struct pid *pid;
> > +
> > +	/**
> > +	 * lock: Protects comm and cmdline
> > +	 */
> > +	struct mutex lock;
> > +
> >   	/**
> >   	 * comm: Overridden task comm, see MSM_PARAM_COMM
> >   	 *
> > -	 * Accessed under msm_gpu::lock
> > +	 * Accessed under msm_file_private::lock
> >   	 */
> >   	char *comm;
> >   	/**
> >   	 * cmdline: Overridden task cmdline, see MSM_PARAM_CMDLINE
> >   	 *
> > -	 * Accessed under msm_gpu::lock
> > +	 * Accessed under msm_file_private::lock
> >   	 */
> >   	char *cmdline;
> > diff --git a/drivers/gpu/drm/msm/msm_submitqueue.c b/drivers/gpu/drm/msm/msm_submitqueue.c
> > index 0e803125a325..0444ba04fa06 100644
> > --- a/drivers/gpu/drm/msm/msm_submitqueue.c
> > +++ b/drivers/gpu/drm/msm/msm_submitqueue.c
> > @@ -61,6 +61,7 @@ void __msm_file_private_destroy(struct kref *kref)
> >   	}
> >   	msm_gem_address_space_put(ctx->aspace);
> > +	put_pid(ctx->pid);
> >   	kfree(ctx->comm);
> >   	kfree(ctx->cmdline);
> >   	kfree(ctx);

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [RFC 3/3] drm/msm: Add comm/cmdline fields
  2023-04-17 20:12   ` Rob Clark
@ 2023-04-18  8:53     ` Tvrtko Ursulin
  -1 siblings, 0 replies; 35+ messages in thread
From: Tvrtko Ursulin @ 2023-04-18  8:53 UTC (permalink / raw)
  To: Rob Clark, dri-devel
  Cc: Rob Clark, David Airlie, Daniel Vetter, Maarten Lankhorst,
	Maxime Ripard, Thomas Zimmermann, Jonathan Corbet, Abhinav Kumar,
	Dmitry Baryshkov, Sean Paul, open list:DOCUMENTATION, open list,
	open list:DRM DRIVER FOR MSM ADRENO GPU,
	open list:DRM DRIVER FOR MSM ADRENO GPU


On 17/04/2023 21:12, Rob Clark wrote:
> From: Rob Clark <robdclark@chromium.org>
> 
> Normally this would be the same information that can be obtained in
> other ways.  But in some cases the process opening the drm fd is merely
> a sort of proxy for the actual process using the GPU.  This is the case
> for guest VM processes using the GPU via virglrenderer, in which case
> the msm native-context renderer in virglrenderer overrides the comm/
> cmdline to be the guest process's values.
> 
> Exposing this via fdinfo allows tools like gputop to show something more
> meaningful than just a bunch of "pcivirtio-gpu" users.

You also later expanded with:

"""
I should have also mentioned, in the VM/proxy scenario we have a
single process with separate drm_file's for each guest VM process.  So
it isn't an option to just change the proxy process's name to match
the client.
"""

So how does that work - does this single process temporarily change its 
name for each drm fd it opens and creates a context on, or is it actually 
part of the native context protocol?

> 
> Signed-off-by: Rob Clark <robdclark@chromium.org>
> ---
>   Documentation/gpu/drm-usage-stats.rst |  8 ++++++++
>   drivers/gpu/drm/msm/msm_gpu.c         | 14 ++++++++++++++
>   2 files changed, 22 insertions(+)
> 
> diff --git a/Documentation/gpu/drm-usage-stats.rst b/Documentation/gpu/drm-usage-stats.rst
> index 8e00d53231e0..bc90bed455e3 100644
> --- a/Documentation/gpu/drm-usage-stats.rst
> +++ b/Documentation/gpu/drm-usage-stats.rst
> @@ -148,6 +148,14 @@ percentage utilization of the engine, whereas drm-engine-<keystr> only reflects
>   time active without considering what frequency the engine is operating as a
>   percentage of it's maximum frequency.
>   
> +- drm-comm: <valstr>
> +
> +Returns the clients executable path.

Full path and not just current->comm? In this case probably give it a 
more descriptive name here.

drm-client-executable
drm-client-command-line

So we stay in the drm-client- namespace?

Or if the former is absolute path could one key be enough for both?

drm-client-command-line: /path/to/executable --arguments

> +
> +- drm-cmdline: <valstr>
> +
> +Returns the clients cmdline.

I think drm-usage-stats.rst should say more about these two keys: 
precisely define their content and outline the use case under which 
driver authors may want to add them, and under which fdinfo consumers 
can therefore expect to see them. Just so everything is completely clear 
and people do not start adding them for drivers which do not support 
native context (or similar).

But overall it sounds reasonable to me - it would be really cool 
to not just see pcivirtio-gpu as you say. Even if the standard virtiogpu 
use case (not native context) could show real users.

Regards,

Tvrtko

> +
>   Implementation Details
>   ======================
>   
> diff --git a/drivers/gpu/drm/msm/msm_gpu.c b/drivers/gpu/drm/msm/msm_gpu.c
> index f0f4f845c32d..1150dcbf28aa 100644
> --- a/drivers/gpu/drm/msm/msm_gpu.c
> +++ b/drivers/gpu/drm/msm/msm_gpu.c
> @@ -148,12 +148,26 @@ int msm_gpu_pm_suspend(struct msm_gpu *gpu)
>   	return 0;
>   }
>   
> +static void get_comm_cmdline(struct msm_file_private *ctx, char **comm, char **cmd);
> +
>   void msm_gpu_show_fdinfo(struct msm_gpu *gpu, struct msm_file_private *ctx,
>   			 struct drm_printer *p)
>   {
> +	char *comm, *cmdline;
> +
> +	get_comm_cmdline(ctx, &comm, &cmdline);
> +
>   	drm_printf(p, "drm-engine-gpu:\t%llu ns\n", ctx->elapsed_ns);
>   	drm_printf(p, "drm-cycles-gpu:\t%llu\n", ctx->cycles);
>   	drm_printf(p, "drm-maxfreq-gpu:\t%u Hz\n", gpu->fast_rate);
> +
> +	if (comm)
> +		drm_printf(p, "drm-comm:\t%s\n", comm);
> +	if (cmdline)
> +		drm_printf(p, "drm-cmdline:\t%s\n", cmdline);
> +
> +	kfree(comm);
> +	kfree(cmdline);
>   }
>   
>   int msm_gpu_hw_init(struct msm_gpu *gpu)

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [RFC 0/3] drm: Add comm/cmdline fdinfo fields
  2023-04-17 20:12 ` Rob Clark
@ 2023-04-18  9:33   ` Konrad Dybcio
  -1 siblings, 0 replies; 35+ messages in thread
From: Konrad Dybcio @ 2023-04-18  9:33 UTC (permalink / raw)
  To: Rob Clark, dri-devel
  Cc: Tvrtko Ursulin, Rob Clark, Akhil P Oommen, Chia-I Wu,
	Dmitry Baryshkov, open list:DRM DRIVER FOR MSM ADRENO GPU,
	open list:DRM DRIVER FOR MSM ADRENO GPU, open list:DOCUMENTATION,
	open list, Sean Paul

Looks like the 'PATCH' part of your subject was cut off!

Konrad

On 17.04.2023 22:12, Rob Clark wrote:
> From: Rob Clark <robdclark@chromium.org>
> 
> When many of the things using the GPU are processes in a VM guest, the
> actual client process is just a proxy.  The msm driver has a way to let
> the proxy tell the kernel the actual VM client process's executable name
> and command-line, which has until now been used simply for GPU crash
> devcore dumps.  Lets also expose this via fdinfo so that tools can
> expose who the actual user of the GPU is.
> 
> Rob Clark (3):
>   drm/doc: Relax fdinfo string constraints
>   drm/msm: Rework get_comm_cmdline() helper
>   drm/msm: Add comm/cmdline fields
> 
>  Documentation/gpu/drm-usage-stats.rst   | 37 +++++++++++++++----------
>  drivers/gpu/drm/msm/adreno/adreno_gpu.c |  4 +--
>  drivers/gpu/drm/msm/msm_drv.c           |  2 ++
>  drivers/gpu/drm/msm/msm_gpu.c           | 27 +++++++++++++-----
>  drivers/gpu/drm/msm/msm_gpu.h           | 12 ++++++--
>  drivers/gpu/drm/msm/msm_submitqueue.c   |  1 +
>  6 files changed, 58 insertions(+), 25 deletions(-)
> 

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [RFC 2/3] drm/msm: Rework get_comm_cmdline() helper
  2023-04-18  8:34       ` Daniel Vetter
@ 2023-04-18 14:31         ` Rob Clark
  -1 siblings, 0 replies; 35+ messages in thread
From: Rob Clark @ 2023-04-18 14:31 UTC (permalink / raw)
  To: Tvrtko Ursulin, Rob Clark, dri-devel, Rob Clark, Abhinav Kumar,
	Dmitry Baryshkov, Sean Paul, David Airlie, Akhil P Oommen,
	Chia-I Wu, Konrad Dybcio,
	open list:DRM DRIVER FOR MSM ADRENO GPU,
	open list:DRM DRIVER FOR MSM ADRENO GPU, open list
  Cc: Daniel Vetter

On Tue, Apr 18, 2023 at 1:34 AM Daniel Vetter <daniel@ffwll.ch> wrote:
>
> On Tue, Apr 18, 2023 at 09:27:49AM +0100, Tvrtko Ursulin wrote:
> >
> > On 17/04/2023 21:12, Rob Clark wrote:
> > > From: Rob Clark <robdclark@chromium.org>
> > >
> > > Make it work in terms of ctx so that it can be re-used for fdinfo.
> > >
> > > Signed-off-by: Rob Clark <robdclark@chromium.org>
> > > ---
> > >   drivers/gpu/drm/msm/adreno/adreno_gpu.c |  4 ++--
> > >   drivers/gpu/drm/msm/msm_drv.c           |  2 ++
> > >   drivers/gpu/drm/msm/msm_gpu.c           | 13 ++++++-------
> > >   drivers/gpu/drm/msm/msm_gpu.h           | 12 ++++++++++--
> > >   drivers/gpu/drm/msm/msm_submitqueue.c   |  1 +
> > >   5 files changed, 21 insertions(+), 11 deletions(-)
> > >
> > > diff --git a/drivers/gpu/drm/msm/adreno/adreno_gpu.c b/drivers/gpu/drm/msm/adreno/adreno_gpu.c
> > > index bb38e728864d..43c4e1fea83f 100644
> > > --- a/drivers/gpu/drm/msm/adreno/adreno_gpu.c
> > > +++ b/drivers/gpu/drm/msm/adreno/adreno_gpu.c
> > > @@ -412,7 +412,7 @@ int adreno_set_param(struct msm_gpu *gpu, struct msm_file_private *ctx,
> > >             /* Ensure string is null terminated: */
> > >             str[len] = '\0';
> > > -           mutex_lock(&gpu->lock);
> > > +           mutex_lock(&ctx->lock);
> > >             if (param == MSM_PARAM_COMM) {
> > >                     paramp = &ctx->comm;
> > > @@ -423,7 +423,7 @@ int adreno_set_param(struct msm_gpu *gpu, struct msm_file_private *ctx,
> > >             kfree(*paramp);
> > >             *paramp = str;
> > > -           mutex_unlock(&gpu->lock);
> > > +           mutex_unlock(&ctx->lock);
> > >             return 0;
> > >     }
> > > diff --git a/drivers/gpu/drm/msm/msm_drv.c b/drivers/gpu/drm/msm/msm_drv.c
> > > index 3d73b98d6a9c..ca0e89e46e13 100644
> > > --- a/drivers/gpu/drm/msm/msm_drv.c
> > > +++ b/drivers/gpu/drm/msm/msm_drv.c
> > > @@ -581,6 +581,8 @@ static int context_init(struct drm_device *dev, struct drm_file *file)
> > >     rwlock_init(&ctx->queuelock);
> > >     kref_init(&ctx->ref);
> > > +   ctx->pid = get_pid(task_pid(current));
> >
> > Would it simplify things for msm if DRM core had an up to date file->pid as
> > proposed in
> > https://patchwork.freedesktop.org/patch/526752/?series=109902&rev=4 ? It
> > gets updated if ioctl issuer is different than fd opener and this being
> > context_init here reminded me of it. Maybe you wouldn't have to track the
> > pid in msm?

The problem is that we also need this for gpu devcore dumps, which
could happen after the drm_file is closed.  The ctx can outlive the
file.
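
To illustrate the lifetime point, here is a minimal userspace sketch of a
refcounted context that outlives the file which created it.  It is only a
model: struct ctx, ctx_new(), ctx_get() and ctx_put() are hypothetical
stand-ins for msm_file_private and its kref, not the driver code.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical userspace stand-in for msm_file_private. */
struct ctx {
	atomic_int ref;
	char *comm;		/* overridden name, may be NULL */
};

/* Bumped on the final put, only so the sketch is observable. */
static int ctx_freed;

static char *dup_str(const char *s)
{
	size_t n;
	char *d;

	if (!s)
		return NULL;
	n = strlen(s) + 1;
	d = malloc(n);
	if (d)
		memcpy(d, s, n);
	return d;
}

static struct ctx *ctx_new(const char *comm)
{
	struct ctx *c = calloc(1, sizeof(*c));

	if (!c)
		return NULL;
	atomic_init(&c->ref, 1);	/* reference held by the open file */
	c->comm = dup_str(comm);
	return c;
}

static struct ctx *ctx_get(struct ctx *c)
{
	/* Taking a reference is only valid while someone else holds one. */
	assert(atomic_load(&c->ref) > 0);
	atomic_fetch_add(&c->ref, 1);
	return c;
}

static void ctx_put(struct ctx *c)
{
	/* fetch_sub returns the previous value; 1 means this was the last ref. */
	if (atomic_fetch_sub(&c->ref, 1) == 1) {
		free(c->comm);
		free(c);
		ctx_freed++;
	}
}
```

An in-flight submit would hold its own reference via ctx_get(), so
dropping the file's reference on close leaves comm readable until the
submit drops its reference too.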

But ctx->pid has the same problem as the existing file->pid when it
comes to Xorg; hopefully over time that problem just goes away.  I
guess I could do a dance similar to your patch and update the pid
whenever (for example) a submitqueue is created.

> Can we go one step further and let the drm fdinfo stuff print these new
> additions? Consistency across drivers and all that.

Hmm, I guess I could _also_ store the overridden comm/cmdline in
drm_file.  I still need to track it in ctx (msm_file_private) because
I could need it after the file is closed.
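
Wherever the strings end up living, the access pattern from the quoted
patch stays the same: the writer swaps the string under a per-context
lock, and readers duplicate it under that lock and then use their copy
unlocked.  A rough userspace model, assuming a pthread mutex in place of
the kernel mutex (struct client and its helpers are hypothetical names):

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the comm part of msm_file_private. */
struct client {
	pthread_mutex_t lock;	/* protects comm */
	char *comm;		/* overridden name, may be NULL */
};

/* Like kstrdup(): returns NULL if the argument is NULL. */
static char *dup_or_null(const char *s)
{
	size_t n;
	char *d;

	if (!s)
		return NULL;
	n = strlen(s) + 1;
	d = malloc(n);
	if (d)
		memcpy(d, s, n);
	return d;
}

static void client_init(struct client *c)
{
	pthread_mutex_init(&c->lock, NULL);
	c->comm = NULL;
}

/* Writer (models the SET_PARAM path): takes ownership of str. */
static void client_set_comm(struct client *c, char *str)
{
	pthread_mutex_lock(&c->lock);
	free(c->comm);
	c->comm = str;
	pthread_mutex_unlock(&c->lock);
}

/* Reader: returns a private copy that the caller must free(). */
static char *client_snapshot_comm(struct client *c)
{
	char *copy;

	pthread_mutex_lock(&c->lock);
	copy = dup_or_null(c->comm);
	pthread_mutex_unlock(&c->lock);
	return copy;
}
```

The copy is what makes it safe for a reader (fdinfo or devcore dump) to
keep printing the old name while another thread overrides it.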

Maybe it could be useful to have a gl extension to let the app set a
name on the context so that this is useful beyond native-ctx (i.e.
maybe it would be nice to see that "chrome: lwn.net" is using less gpu
memory than "chrome: phoronix.com", etc.)

BR,
-R

> Also for a generic trigger I think any driver ioctl is good enough (we
> only really need to avoid the auth dance when you're not on a render
> node).
> -Daniel
>
> >
> > Regards,
> >
> > Tvrtko
> >
> > > +   mutex_init(&ctx->lock);
> > >     msm_submitqueue_init(dev, ctx);
> > >     ctx->aspace = msm_gpu_create_private_address_space(priv->gpu, current);
> > > diff --git a/drivers/gpu/drm/msm/msm_gpu.c b/drivers/gpu/drm/msm/msm_gpu.c
> > > index c403912d13ab..f0f4f845c32d 100644
> > > --- a/drivers/gpu/drm/msm/msm_gpu.c
> > > +++ b/drivers/gpu/drm/msm/msm_gpu.c
> > > @@ -327,18 +327,17 @@ find_submit(struct msm_ringbuffer *ring, uint32_t fence)
> > >   static void retire_submits(struct msm_gpu *gpu);
> > > -static void get_comm_cmdline(struct msm_gem_submit *submit, char **comm, char **cmd)
> > > +static void get_comm_cmdline(struct msm_file_private *ctx, char **comm, char **cmd)
> > >   {
> > > -   struct msm_file_private *ctx = submit->queue->ctx;
> > >     struct task_struct *task;
> > > -   WARN_ON(!mutex_is_locked(&submit->gpu->lock));
> > > -
> > >     /* Note that kstrdup will return NULL if argument is NULL: */
> > > +   mutex_lock(&ctx->lock);
> > >     *comm = kstrdup(ctx->comm, GFP_KERNEL);
> > >     *cmd  = kstrdup(ctx->cmdline, GFP_KERNEL);
> > > +   mutex_unlock(&ctx->lock);
> > > -   task = get_pid_task(submit->pid, PIDTYPE_PID);
> > > +   task = get_pid_task(ctx->pid, PIDTYPE_PID);
> > >     if (!task)
> > >             return;
> > > @@ -372,7 +371,7 @@ static void recover_worker(struct kthread_work *work)
> > >             if (submit->aspace)
> > >                     submit->aspace->faults++;
> > > -           get_comm_cmdline(submit, &comm, &cmd);
> > > +           get_comm_cmdline(submit->queue->ctx, &comm, &cmd);
> > >             if (comm && cmd) {
> > >                     DRM_DEV_ERROR(dev->dev, "%s: offending task: %s (%s)\n",
> > > @@ -460,7 +459,7 @@ static void fault_worker(struct kthread_work *work)
> > >             goto resume_smmu;
> > >     if (submit) {
> > > -           get_comm_cmdline(submit, &comm, &cmd);
> > > +           get_comm_cmdline(submit->queue->ctx, &comm, &cmd);
> > >             /*
> > >              * When we get GPU iova faults, we can get 1000s of them,
> > > diff --git a/drivers/gpu/drm/msm/msm_gpu.h b/drivers/gpu/drm/msm/msm_gpu.h
> > > index 7a4fa1b8655b..b2023a42116b 100644
> > > --- a/drivers/gpu/drm/msm/msm_gpu.h
> > > +++ b/drivers/gpu/drm/msm/msm_gpu.h
> > > @@ -377,17 +377,25 @@ struct msm_file_private {
> > >      */
> > >     int sysprof;
> > > +   /** @pid: Process that opened this file. */
> > > +   struct pid *pid;
> > > +
> > > +   /**
> > > +    * lock: Protects comm and cmdline
> > > +    */
> > > +   struct mutex lock;
> > > +
> > >     /**
> > >      * comm: Overridden task comm, see MSM_PARAM_COMM
> > >      *
> > > -    * Accessed under msm_gpu::lock
> > > +    * Accessed under msm_file_private::lock
> > >      */
> > >     char *comm;
> > >     /**
> > >      * cmdline: Overridden task cmdline, see MSM_PARAM_CMDLINE
> > >      *
> > > -    * Accessed under msm_gpu::lock
> > > +    * Accessed under msm_file_private::lock
> > >      */
> > >     char *cmdline;
> > > diff --git a/drivers/gpu/drm/msm/msm_submitqueue.c b/drivers/gpu/drm/msm/msm_submitqueue.c
> > > index 0e803125a325..0444ba04fa06 100644
> > > --- a/drivers/gpu/drm/msm/msm_submitqueue.c
> > > +++ b/drivers/gpu/drm/msm/msm_submitqueue.c
> > > @@ -61,6 +61,7 @@ void __msm_file_private_destroy(struct kref *kref)
> > >     }
> > >     msm_gem_address_space_put(ctx->aspace);
> > > +   put_pid(ctx->pid);
> > >     kfree(ctx->comm);
> > >     kfree(ctx->cmdline);
> > >     kfree(ctx);
>
> --
> Daniel Vetter
> Software Engineer, Intel Corporation
> http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [RFC 3/3] drm/msm: Add comm/cmdline fields
  2023-04-18  8:53     ` Tvrtko Ursulin
@ 2023-04-18 14:56       ` Rob Clark
  -1 siblings, 0 replies; 35+ messages in thread
From: Rob Clark @ 2023-04-18 14:56 UTC (permalink / raw)
  To: Tvrtko Ursulin
  Cc: dri-devel, Rob Clark, David Airlie, Daniel Vetter,
	Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann,
	Jonathan Corbet, Abhinav Kumar, Dmitry Baryshkov, Sean Paul,
	open list:DOCUMENTATION, open list,
	open list:DRM DRIVER FOR MSM ADRENO GPU,
	open list:DRM DRIVER FOR MSM ADRENO GPU

On Tue, Apr 18, 2023 at 1:53 AM Tvrtko Ursulin
<tvrtko.ursulin@linux.intel.com> wrote:
>
>
> On 17/04/2023 21:12, Rob Clark wrote:
> > From: Rob Clark <robdclark@chromium.org>
> >
> > Normally this would be the same information that can be obtained in
> > other ways.  But in some cases the process opening the drm fd is merely
> > a sort of proxy for the actual process using the GPU.  This is the case
> > for guest VM processes using the GPU via virglrenderer, in which case
> > the msm native-context renderer in virglrenderer overrides the comm/
> > cmdline to be the guest process's values.
> >
> > Exposing this via fdinfo allows tools like gputop to show something more
> > meaningful than just a bunch of "pcivirtio-gpu" users.
>
> You also later expanded with:
>
> """
> I should have also mentioned, in the VM/proxy scenario we have a
> single process with separate drm_file's for each guest VM process.  So
> it isn't an option to just change the proxy process's name to match
> the client.
> """
>
> So how does that work - does this single process temporarily change its
> name for each drm fd it opens and creates a context for, or is this
> actually part of the native context protocol?

It is part of the protocol, the mesa driver in the VM sends[1] this
info to the native-context "shim" in host userspace which uses the
SET_PARAM ioctl to pass this to the kernel.  In the host userspace
there is just a single process (you see the host PID below) but it
does a separate open() of the drm dev for each guest process (so that
they each have their own GPU address space for isolation):

DRM minor 128
    PID    MEM ACTIV              NAME                    gpu
    5297  200M   82M com.mojang.minecr |██████████████▏                        |
    1859  199M    0B            chrome |█▉                                     |
    5297   64M    9M    surfaceflinger |                                       |
    5297   12M    0B org.chromium.arc. |                                       |
    5297   12M    0B com.android.syste |                                       |
    5297   12M    0B org.chromium.arc. |                                       |
    5297   26M    0B com.google.androi |                                       |
    5297   65M    0B     system_server |                                       |


[1] https://gitlab.freedesktop.org/virgl/virglrenderer/-/blob/master/src/drm/msm/msm_proto.h#L326
[2] https://gitlab.freedesktop.org/virgl/virglrenderer/-/blob/master/src/drm/msm/msm_renderer.c#L1050
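
For illustration only (not part of the patch): a minimal sketch of how a
userspace fdinfo consumer, such as a gputop-style tool, might pick the name
to display for each drm client. It assumes the key names used in this RFC
(drm-comm, drm-cmdline), which may still be renamed; the helper names
fdinfo_value() and client_name() are hypothetical.

```c
#include <stddef.h>
#include <string.h>

/*
 * If `line` has the form "<key>:<blanks><value>\n", copy <value> into
 * `out` (truncating if needed) and return 1; otherwise return 0.
 * The key names are taken from the RFC patch and may change.
 */
int fdinfo_value(const char *line, const char *key, char *out, size_t out_len)
{
    size_t klen = strlen(key);
    size_t n;

    if (out_len == 0 || strncmp(line, key, klen) != 0 || line[klen] != ':')
        return 0;

    /* Skip the separator that follows "key:" (a tab in the patch). */
    line += klen + 1;
    while (*line == ' ' || *line == '\t')
        line++;

    n = strcspn(line, "\n");
    if (n >= out_len)
        n = out_len - 1;
    memcpy(out, line, n);
    out[n] = '\0';
    return 1;
}

/*
 * Name to show for one drm client: the per-file override if the driver
 * reported one, else the comm of the process that owns the fd.
 */
const char *client_name(const char *override, const char *proc_comm)
{
    return (override && override[0]) ? override : proc_comm;
}
```

With one proxy process holding many drm fds, the fallback alone would label
every row with the proxy's name; the override is what lets each row differ.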

> >
> > Signed-off-by: Rob Clark <robdclark@chromium.org>
> > ---
> >   Documentation/gpu/drm-usage-stats.rst |  8 ++++++++
> >   drivers/gpu/drm/msm/msm_gpu.c         | 14 ++++++++++++++
> >   2 files changed, 22 insertions(+)
> >
> > diff --git a/Documentation/gpu/drm-usage-stats.rst b/Documentation/gpu/drm-usage-stats.rst
> > index 8e00d53231e0..bc90bed455e3 100644
> > --- a/Documentation/gpu/drm-usage-stats.rst
> > +++ b/Documentation/gpu/drm-usage-stats.rst
> > @@ -148,6 +148,14 @@ percentage utilization of the engine, whereas drm-engine-<keystr> only reflects
> >   time active without considering what frequency the engine is operating as a
> >   percentage of it's maximum frequency.
> >
> > +- drm-comm: <valstr>
> > +
> > +Returns the clients executable path.
>
> Full path and not just current->comm? In this case probably give it a
> more descriptive name here.
>
> drm-client-executable
> drm-client-command-line
>
> So we stay in the drm-client- namespace?
>
> Or if the former is absolute path could one key be enough for both?
>
> drm-client-command-line: /path/to/executable --arguments

comm and cmdline can be different. Android seems to change the comm to
the apk name, for example (and w/ the zygote stuff cmdline isn't
really a thing)

I guess it could be drm-client-comm and drm-client-cmdline?  Although
comm/cmdline aren't the best names, they are just following what the
kernel calls them elsewhere.
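
For reference, a sketch of the distinction as procfs exposes it:
/proc/<pid>/comm holds a single short task name, while /proc/<pid>/cmdline
holds the argument vector with the arguments separated by NUL bytes. The
helper below (cmdline_to_string() is a made-up name) flattens such a buffer
into one printable string. Whether the msm override string uses the same
layout is not stated in this thread, so treat that as an assumption.

```c
#include <stddef.h>

/*
 * Turn a NUL-separated argument buffer (the /proc/<pid>/cmdline layout)
 * into one space-separated, NUL-terminated string.  Returns the length
 * of the result, which is truncated if out_len is too small.
 */
size_t cmdline_to_string(const char *raw, size_t raw_len,
                         char *out, size_t out_len)
{
    size_t n = 0;

    if (out_len == 0)
        return 0;

    /* Drop the NUL terminator(s) that follow the last argument. */
    while (raw_len > 0 && raw[raw_len - 1] == '\0')
        raw_len--;

    /* Copy, replacing each interior NUL separator with a space. */
    for (size_t i = 0; i < raw_len && n + 1 < out_len; i++)
        out[n++] = raw[i] ? raw[i] : ' ';

    out[n] = '\0';
    return n;
}
```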

> > +
> > +- drm-cmdline: <valstr>
> > +
> > +Returns the clients cmdline.
>
> I think the drm-usage-stats.rst text should say some more about these
> two: precisely define their content and outline the use case under
> which driver authors may want to add them, and fdinfo consumers may
> therefore expect to see them. Just so everything is completely clear and
> people do not start adding them for drivers which do not support native
> context (or the like).

I really was just piggy-backing on existing comm/cmdline.. but I'll
try to write up something better.

I think it maybe should not be limited just to native context.. for
ex. if the browser did somehow manage to create different displays
associated with different drm_file instances (I guess it would have to
use gbm to do this?) it would be nice to see browser tab names.

> But on the overall it sounds reasonable to me - it would be really cool
> to not just see pcivirtio-gpu as you say. Even if the standard virtiogpu
> use case (not native context) could show real users.

For vrend/virgl, we'd first need to solve the issue that there is just
a single drm_file for all guest processes.  But really, just don't use
virgl.  (I mean, like seriously, would you put a gl driver in the
kernel?  Vrend has access to all guest memory, so this is essentially
what you have with virgl.  This is just not a sane thing to do.) The
only "valid" reason for not doing native-context is if you don't have
the src code for your UMD to be able to modify it to talk
native-context to virtgpu in the guest. ;-)

BR,
-R

> Regards,
>
> Tvrtko
>
> > +
> >   Implementation Details
> >   ======================
> >
> > diff --git a/drivers/gpu/drm/msm/msm_gpu.c b/drivers/gpu/drm/msm/msm_gpu.c
> > index f0f4f845c32d..1150dcbf28aa 100644
> > --- a/drivers/gpu/drm/msm/msm_gpu.c
> > +++ b/drivers/gpu/drm/msm/msm_gpu.c
> > @@ -148,12 +148,26 @@ int msm_gpu_pm_suspend(struct msm_gpu *gpu)
> >       return 0;
> >   }
> >
> > +static void get_comm_cmdline(struct msm_file_private *ctx, char **comm, char **cmd);
> > +
> >   void msm_gpu_show_fdinfo(struct msm_gpu *gpu, struct msm_file_private *ctx,
> >                        struct drm_printer *p)
> >   {
> > +     char *comm, *cmdline;
> > +
> > +     get_comm_cmdline(ctx, &comm, &cmdline);
> > +
> >       drm_printf(p, "drm-engine-gpu:\t%llu ns\n", ctx->elapsed_ns);
> >       drm_printf(p, "drm-cycles-gpu:\t%llu\n", ctx->cycles);
> >       drm_printf(p, "drm-maxfreq-gpu:\t%u Hz\n", gpu->fast_rate);
> > +
> > +     if (comm)
> > +             drm_printf(p, "drm-comm:\t%s\n", comm);
> > +     if (cmdline)
> > +             drm_printf(p, "drm-cmdline:\t%s\n", cmdline);
> > +
> > +     kfree(comm);
> > +     kfree(cmdline);
> >   }
> >
> >   int msm_gpu_hw_init(struct msm_gpu *gpu)

* Re: [RFC 3/3] drm/msm: Add comm/cmdline fields
  2023-04-18 14:56       ` Rob Clark
@ 2023-04-19 13:36         ` Tvrtko Ursulin
  -1 siblings, 0 replies; 35+ messages in thread
From: Tvrtko Ursulin @ 2023-04-19 13:36 UTC (permalink / raw)
  To: Rob Clark
  Cc: Rob Clark, open list:DRM DRIVER FOR MSM ADRENO GPU,
	Thomas Zimmermann, Jonathan Corbet, Sean Paul,
	open list:DOCUMENTATION, Abhinav Kumar, dri-devel, open list,
	open list:DRM DRIVER FOR MSM ADRENO GPU, Dmitry Baryshkov


On 18/04/2023 15:56, Rob Clark wrote:
> On Tue, Apr 18, 2023 at 1:53 AM Tvrtko Ursulin
> <tvrtko.ursulin@linux.intel.com> wrote:
>>
>>
>> On 17/04/2023 21:12, Rob Clark wrote:
>>> From: Rob Clark <robdclark@chromium.org>
>>>
>>> Normally this would be the same information that can be obtained in
>>> other ways.  But in some cases the process opening the drm fd is merely
>>> a sort of proxy for the actual process using the GPU.  This is the case
>>> for guest VM processes using the GPU via virglrenderer, in which case
>>> the msm native-context renderer in virglrenderer overrides the comm/
>>> cmdline to be the guest process's values.
>>>
>>> Exposing this via fdinfo allows tools like gputop to show something more
>>> meaningful than just a bunch of "pcivirtio-gpu" users.
>>
>> You also later expanded with:
>>
>> """
>> I should have also mentioned, in the VM/proxy scenario we have a
>> single process with separate drm_file's for each guest VM process.  So
>> it isn't an option to just change the proxy process's name to match
>> the client.
>> """
>>
>> So how does that work - does this single process temporarily change its
>> name for each drm fd it opens and creates a context for, or is this
>> actually part of the native context protocol?
> 
> It is part of the protocol, the mesa driver in the VM sends[1] this
> info to the native-context "shim" in host userspace which uses the
> SET_PARAM ioctl to pass this to the kernel.  In the host userspace
> there is just a single process (you see the host PID below) but it
> does a separate open() of the drm dev for each guest process (so that
> they each have their own GPU address space for isolation):
> 
> DRM minor 128
>      PID    MEM ACTIV              NAME                    gpu
>      5297  200M   82M com.mojang.minecr |██████████████▏                        |
>      1859  199M    0B            chrome |█▉                                     |
>      5297   64M    9M    surfaceflinger |                                       |
>      5297   12M    0B org.chromium.arc. |                                       |
>      5297   12M    0B com.android.syste |                                       |
>      5297   12M    0B org.chromium.arc. |                                       |
>      5297   26M    0B com.google.androi |                                       |
>      5297   65M    0B     system_server |                                       |
> 
> 
> [1] https://gitlab.freedesktop.org/virgl/virglrenderer/-/blob/master/src/drm/msm/msm_proto.h#L326
> [2] https://gitlab.freedesktop.org/virgl/virglrenderer/-/blob/master/src/drm/msm/msm_renderer.c#L1050
> 
>>>
>>> Signed-off-by: Rob Clark <robdclark@chromium.org>
>>> ---
>>>    Documentation/gpu/drm-usage-stats.rst |  8 ++++++++
>>>    drivers/gpu/drm/msm/msm_gpu.c         | 14 ++++++++++++++
>>>    2 files changed, 22 insertions(+)
>>>
>>> diff --git a/Documentation/gpu/drm-usage-stats.rst b/Documentation/gpu/drm-usage-stats.rst
>>> index 8e00d53231e0..bc90bed455e3 100644
>>> --- a/Documentation/gpu/drm-usage-stats.rst
>>> +++ b/Documentation/gpu/drm-usage-stats.rst
>>> @@ -148,6 +148,14 @@ percentage utilization of the engine, whereas drm-engine-<keystr> only reflects
>>>    time active without considering what frequency the engine is operating as a
>>>    percentage of it's maximum frequency.
>>>
>>> +- drm-comm: <valstr>
>>> +
>>> +Returns the clients executable path.
>>
>> Full path and not just current->comm? In this case probably give it a
>> more descriptive name here.
>>
>> drm-client-executable
>> drm-client-command-line
>>
>> So we stay in the drm-client- namespace?
>>
>> Or if the former is absolute path could one key be enough for both?
>>
>> drm-client-command-line: /path/to/executable --arguments
> 
> comm and cmdline can be different. Android seems to change the comm to
> the apk name, for example (and w/ the zygote stuff cmdline isn't
> really a thing)
> 
> I guess it could be drm-client-comm and drm-client-cmdline?  Although
> comm/cmdline aren't the best names, they are just following what the
> kernel calls them elsewhere.

I wasn't sure what you plan to do, given the mention of a path under the
drm-comm description. If it is a path then comm would be misleading,
since comm as defined in procfs is not a path, at least as far as I
know. Which is why I was suggesting executable. But if you remove the
mention of a path from the rst and rather refer to the process's comm
value, I think that is then okay.

>>> +
>>> +- drm-cmdline: <valstr>
>>> +
>>> +Returns the clients cmdline.
>>
>> I think the drm-usage-stats.rst text should say some more about these
>> two: precisely define their content and outline the use case under
>> which driver authors may want to add them, and fdinfo consumers may
>> therefore expect to see them. Just so everything is completely clear and
>> people do not start adding them for drivers which do not support native
>> context (or the like).
> 
> I really was just piggy-backing on existing comm/cmdline.. but I'll
> try to write up something better.
> 
> I think it maybe should not be limited just to native context.. for
> ex. if the browser did somehow manage to create different displays
> associated with different drm_file instances (I guess it would have to
> use gbm to do this?) it would be nice to see browser tab names.

Would be cool yes.

My thinking behind why we maybe do not want to blanket add them is that
for the common case it is the same information which can be obtained
from procfs. Like in igt_drm_clients.c I get the pid and comm from 
/proc/$pid/stat. So I was thinking it is only interesting to add to 
fdinfo for drivers where it could differ by the explicit override like 
you have with native context.
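
As an illustration of the procfs route (a sketch only, not necessarily how
igt_drm_clients.c does it): the contents of /proc/$pid/stat begin with
"<pid> (<comm>) <state> ...", and since comm may itself contain spaces or
parentheses, a robust parser matches the first '(' against the last ')'.
The helper name parse_stat_comm() is hypothetical.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/*
 * Extract pid and comm from the text of /proc/<pid>/stat.
 * Returns 0 on success, -1 if the text does not have the expected shape.
 * comm is truncated if it does not fit in comm_len bytes.
 */
int parse_stat_comm(const char *stat, int *pid, char *comm, size_t comm_len)
{
    const char *lp = strchr(stat, '(');   /* start of comm */
    const char *rp = strrchr(stat, ')');  /* end of comm, even if it has ')' */
    size_t n;

    if (!lp || !rp || rp < lp || comm_len == 0)
        return -1;

    /* The pid is the leading decimal number before the '('. */
    *pid = atoi(stat);

    n = (size_t)(rp - lp - 1);
    if (n >= comm_len)
        n = comm_len - 1;
    memcpy(comm, lp + 1, n);
    comm[n] = '\0';
    return 0;
}
```

This yields one name per process, which is exactly why it cannot tell apart
several drm_file instances opened by the same proxy process.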

It can be added once there is a GL/whatever extension which would allow 
it? (I am not familiar with how browsers manage rendering contexts so 
maybe I am missing something.)

>> But on the overall it sounds reasonable to me - it would be really cool
>> to not just see pcivirtio-gpu as you say. Even if the standard virtiogpu
>> use case (not native context) could show real users.
> 
> For vrend/virgl, we'd first need to solve the issue that there is just
> a single drm_file for all guest processes.  But really, just don't use
> virgl.  (I mean, like seriously, would you put a gl driver in the
> kernel?  Vrend has access to all guest memory, so this is essentially
> what you have with virgl.  This is just not a sane thing to do.) The
> only "valid" reason for not doing native-context is if you don't have
> the src code for your UMD to be able to modify it to talk
> native-context to virtgpu in the guest. ;-)

I am just observing the current state of things on an Intel based 
Chromebook. :) Presumably the custom name for a context would be 
passable via the virtio-gpu protocol or something?

Regards,

Tvrtko

> 
> BR,
> -R
> 
>> Regards,
>>
>> Tvrtko
>>
>>> +
>>>    Implementation Details
>>>    ======================
>>>
>>> diff --git a/drivers/gpu/drm/msm/msm_gpu.c b/drivers/gpu/drm/msm/msm_gpu.c
>>> index f0f4f845c32d..1150dcbf28aa 100644
>>> --- a/drivers/gpu/drm/msm/msm_gpu.c
>>> +++ b/drivers/gpu/drm/msm/msm_gpu.c
>>> @@ -148,12 +148,26 @@ int msm_gpu_pm_suspend(struct msm_gpu *gpu)
>>>        return 0;
>>>    }
>>>
>>> +static void get_comm_cmdline(struct msm_file_private *ctx, char **comm, char **cmd);
>>> +
>>>    void msm_gpu_show_fdinfo(struct msm_gpu *gpu, struct msm_file_private *ctx,
>>>                         struct drm_printer *p)
>>>    {
>>> +     char *comm, *cmdline;
>>> +
>>> +     get_comm_cmdline(ctx, &comm, &cmdline);
>>> +
>>>        drm_printf(p, "drm-engine-gpu:\t%llu ns\n", ctx->elapsed_ns);
>>>        drm_printf(p, "drm-cycles-gpu:\t%llu\n", ctx->cycles);
>>>        drm_printf(p, "drm-maxfreq-gpu:\t%u Hz\n", gpu->fast_rate);
>>> +
>>> +     if (comm)
>>> +             drm_printf(p, "drm-comm:\t%s\n", comm);
>>> +     if (cmdline)
>>> +             drm_printf(p, "drm-cmdline:\t%s\n", cmdline);
>>> +
>>> +     kfree(comm);
>>> +     kfree(cmdline);
>>>    }
>>>
>>>    int msm_gpu_hw_init(struct msm_gpu *gpu)

* Re: [RFC 3/3] drm/msm: Add comm/cmdline fields
  2023-04-19 13:36         ` Tvrtko Ursulin
@ 2023-04-19 15:00           ` Rob Clark
  -1 siblings, 0 replies; 35+ messages in thread
From: Rob Clark @ 2023-04-19 15:00 UTC (permalink / raw)
  To: Tvrtko Ursulin
  Cc: dri-devel, Rob Clark, David Airlie, Daniel Vetter,
	Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann,
	Jonathan Corbet, Abhinav Kumar, Dmitry Baryshkov, Sean Paul,
	open list:DOCUMENTATION, open list,
	open list:DRM DRIVER FOR MSM ADRENO GPU,
	open list:DRM DRIVER FOR MSM ADRENO GPU

On Wed, Apr 19, 2023 at 6:36 AM Tvrtko Ursulin
<tvrtko.ursulin@linux.intel.com> wrote:
>
>
> On 18/04/2023 15:56, Rob Clark wrote:
> > On Tue, Apr 18, 2023 at 1:53 AM Tvrtko Ursulin
> > <tvrtko.ursulin@linux.intel.com> wrote:
> >>
> >>
> >> On 17/04/2023 21:12, Rob Clark wrote:
> >>> From: Rob Clark <robdclark@chromium.org>
> >>>
> >>> Normally this would be the same information that can be obtained in
> >>> other ways.  But in some cases the process opening the drm fd is merely
> >>> a sort of proxy for the actual process using the GPU.  This is the case
> >>> for guest VM processes using the GPU via virglrenderer, in which case
> >>> the msm native-context renderer in virglrenderer overrides the comm/
> >>> cmdline to be the guest process's values.
> >>>
> >>> Exposing this via fdinfo allows tools like gputop to show something more
> >>> meaningful than just a bunch of "pcivirtio-gpu" users.
> >>
> >> You also later expanded with:
> >>
> >> """
> >> I should have also mentioned, in the VM/proxy scenario we have a
> >> single process with separate drm_file's for each guest VM process.  So
> >> it isn't an option to just change the proxy process's name to match
> >> the client.
> >> """
> >>
> >> So how does that work - this single process temporarily changes it's
> >> name for each drm fd it opens and creates a context or it is actually in
> >> the native context protocol?
> >
> > It is part of the protocol, the mesa driver in the VM sends[1] this
> > info to the native-context "shim" in host userspace which uses the
> > SET_PARAM ioctl to pass this to the kernel.  In the host userspace
> > there is just a single process (you see the host PID below) but it
> > does a separate open() of the drm dev for each guest process (so that
> > they each have their own GPU address space for isolation):
> >
> > DRM minor 128
> >      PID    MEM ACTIV              NAME                    gpu
> >      5297  200M   82M com.mojang.minecr |██████████████▏                        |
> >      1859  199M    0B            chrome |█▉                                     |
> >      5297   64M    9M    surfaceflinger |                                       |
> >      5297   12M    0B org.chromium.arc. |                                       |
> >      5297   12M    0B com.android.syste |                                       |
> >      5297   12M    0B org.chromium.arc. |                                       |
> >      5297   26M    0B com.google.androi |                                       |
> >      5297   65M    0B     system_server |                                       |
> >
> >
> > [1] https://gitlab.freedesktop.org/virgl/virglrenderer/-/blob/master/src/drm/msm/msm_proto.h#L326
> > [2] https://gitlab.freedesktop.org/virgl/virglrenderer/-/blob/master/src/drm/msm/msm_renderer.c#L1050
> >
> >>>
> >>> Signed-off-by: Rob Clark <robdclark@chromium.org>
> >>> ---
> >>>    Documentation/gpu/drm-usage-stats.rst |  8 ++++++++
> >>>    drivers/gpu/drm/msm/msm_gpu.c         | 14 ++++++++++++++
> >>>    2 files changed, 22 insertions(+)
> >>>
> >>> diff --git a/Documentation/gpu/drm-usage-stats.rst b/Documentation/gpu/drm-usage-stats.rst
> >>> index 8e00d53231e0..bc90bed455e3 100644
> >>> --- a/Documentation/gpu/drm-usage-stats.rst
> >>> +++ b/Documentation/gpu/drm-usage-stats.rst
> >>> @@ -148,6 +148,14 @@ percentage utilization of the engine, whereas drm-engine-<keystr> only reflects
> >>>    time active without considering what frequency the engine is operating as a
> >>>    percentage of it's maximum frequency.
> >>>
> >>> +- drm-comm: <valstr>
> >>> +
> >>> +Returns the clients executable path.
> >>
> >> Full path and not just current->comm? In this case probably give it a
> >> more descriptive name here.
> >>
> >> drm-client-executable
> >> drm-client-command-line
> >>
> >> So we stay in the drm-client- namespace?
> >>
> >> Or if the former is absolute path could one key be enough for both?
> >>
> >> drm-client-command-line: /path/to/executable --arguments
> >
> > comm and cmdline can be different. Android seems to change the comm to
> > the apk name, for example (and w/ the zygote stuff cmdline isn't
> > really a thing)
> >
> > I guess it could be drm-client-comm and drm-client-cmdline?  Although
> > comm/cmdline aren't the best names, they are just following what the
> > kernel calls them elsewhere.
>
> I wasn't sure what do you plan to do given mention of a path under the
> drm-comm description. If it is a path then comm would be misleading,
> since comm as defined in procfs is not a path, I don't think so at
> least. Which is why I was suggesting executable. But if you remove the
> mention of a path from rst and rather refer to processes' comm value I
> think that is then okay.

Oh, whoops, the mention of "path" for comm was a mistake.  task->comm
is described as executable name without path, and that is what the
fdinfo field was intending to follow.

> >>> +
> >>> +- drm-cmdline: <valstr>
> >>> +
> >>> +Returns the clients cmdline.
> >>
> >> I think drm-usage-stats.rst text should provide some more text with
> >> these two. To precisely define their content and outline the use case
> >> under which driver authors may want to add them, and fdinfo consumer
> >> therefore expect to see them. Just so everything is completely clear and
> >> people do not start adding them for drivers which do not support native
> >> context (or like).
> >
> > I really was just piggy-backing on existing comm/cmdline.. but I'll
> > try to write up something better.
> >
> > I think it maybe should not be limited just to native context.. for
> > ex. if the browser did somehow manage to create different displays
> > associated with different drm_file instances (I guess it would have to
> > use gbm to do this?) it would be nice to see browser tab names.
>
> Would be cool yes.
>
> My thinking behind why we maybe do not want to blanket add them is
> because for common case is it the same information which can be obtained
> from procfs. Like in igt_drm_clients.c I get the pid and comm from
> /proc/$pid/stat. So I was thinking it is only interesting to add to
> fdinfo for drivers where it could differ by the explicit override like
> you have with native context.

Yeah, I suppose I could define them as drm-client-comm-override and
drm-client-cmdline-override
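
On the consumer side, a tool like gputop could then prefer the override
when present and fall back to the procfs comm otherwise.  A minimal
sketch, assuming the key ends up being named drm-client-comm-override
(which is only the proposal above, nothing is merged); fdinfo_value()
and client_name() are hypothetical helpers, not existing igt code:

```c
#include <stddef.h>
#include <string.h>

/*
 * Look up "key:<whitespace>value\n" in a buffer of fdinfo text and copy
 * the value into out. Returns 1 if the key was found, 0 otherwise.
 */
int fdinfo_value(const char *fdinfo, const char *key,
		 char *out, size_t outlen)
{
	size_t keylen = strlen(key);
	const char *line = fdinfo;

	if (outlen == 0)
		return 0;

	while (line && *line) {
		const char *eol = strchr(line, '\n');
		size_t linelen = eol ? (size_t)(eol - line) : strlen(line);

		/* A match is the key immediately followed by ':' */
		if (linelen > keylen && !strncmp(line, key, keylen) &&
		    line[keylen] == ':') {
			const char *val = line + keylen + 1;
			size_t vallen;

			/* Skip the whitespace separating key and value */
			while (val < line + linelen &&
			       (*val == ' ' || *val == '\t'))
				val++;

			/* Truncate if the value does not fit in out */
			vallen = (size_t)(line + linelen - val);
			if (vallen >= outlen)
				vallen = outlen - 1;
			memcpy(out, val, vallen);
			out[vallen] = '\0';
			return 1;
		}

		line = eol ? eol + 1 : NULL;
	}

	return 0;
}

/* Prefer the driver-provided override, else use the procfs comm. */
const char *client_name(const char *fdinfo, const char *procfs_comm,
			char *buf, size_t buflen)
{
	/* Key name is an assumption taken from the proposal in this thread */
	if (fdinfo_value(fdinfo, "drm-client-comm-override", buf, buflen))
		return buf;
	return procfs_comm;
}
```

With this shape, drivers that never set an override simply omit the key
and existing tools keep showing what they get from /proc/$pid/stat.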

> It can be added once there is a GL/whatever extension which would allow
> it? (I am not familiar with how browsers manage rendering contexts so
> maybe I am missing something.)
>
> >> But on the overall it sounds reasonable to me - it would be really cool
> >> to not just see pcivirtio-gpu as you say. Even if the standard virtiogpu
> >> use case (not native context) could show real users.
> >
> > For vrend/virgl, we'd first need to solve the issue that there is just
> > a single drm_file for all guest processes.  But really, just don't use
> > virgl.  (I mean, like seriously, would you put a gl driver in the
> > kernel?  Vrend has access to all guest memory, so this is essentially
> > what you have with virgl.  This is just not a sane thing to do.) The
> > only "valid" reason for not doing native-context is if you don't have
> > the src code for your UMD to be able to modify it to talk
> > native-context to virtgpu in the guest. ;-)
>
> I am just observing the current state of things on an Intel based
> Chromebook. :) Presumably the custom name for a context would be
> passable via the virtio-gpu protocol or something?

It is part of the context-type specific protocol.  I.e. some parts of
the protocol are "core" and dealt with in the virtgpu guest kernel driver.
But on top of that there are various context-types with their own
protocol (i.e. virgl, venus, cross-domain, msm native ctx, and some WIP
native ctx types floating around).

BR,
-R

> Regards,
>
> Tvrtko
>
> >
> > BR,
> > -R
> >
> >> Regards,
> >>
> >> Tvrtko
> >>
> >>> +
> >>>    Implementation Details
> >>>    ======================
> >>>
> >>> diff --git a/drivers/gpu/drm/msm/msm_gpu.c b/drivers/gpu/drm/msm/msm_gpu.c
> >>> index f0f4f845c32d..1150dcbf28aa 100644
> >>> --- a/drivers/gpu/drm/msm/msm_gpu.c
> >>> +++ b/drivers/gpu/drm/msm/msm_gpu.c
> >>> @@ -148,12 +148,26 @@ int msm_gpu_pm_suspend(struct msm_gpu *gpu)
> >>>        return 0;
> >>>    }
> >>>
> >>> +static void get_comm_cmdline(struct msm_file_private *ctx, char **comm, char **cmd);
> >>> +
> >>>    void msm_gpu_show_fdinfo(struct msm_gpu *gpu, struct msm_file_private *ctx,
> >>>                         struct drm_printer *p)
> >>>    {
> >>> +     char *comm, *cmdline;
> >>> +
> >>> +     get_comm_cmdline(ctx, &comm, &cmdline);
> >>> +
> >>>        drm_printf(p, "drm-engine-gpu:\t%llu ns\n", ctx->elapsed_ns);
> >>>        drm_printf(p, "drm-cycles-gpu:\t%llu\n", ctx->cycles);
> >>>        drm_printf(p, "drm-maxfreq-gpu:\t%u Hz\n", gpu->fast_rate);
> >>> +
> >>> +     if (comm)
> >>> +             drm_printf(p, "drm-comm:\t%s\n", comm);
> >>> +     if (cmdline)
> >>> +             drm_printf(p, "drm-cmdline:\t%s\n", cmdline);
> >>> +
> >>> +     kfree(comm);
> >>> +     kfree(cmdline);
> >>>    }
> >>>
> >>>    int msm_gpu_hw_init(struct msm_gpu *gpu)

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [RFC 2/3] drm/msm: Rework get_comm_cmdline() helper
  2023-04-18 14:31         ` Rob Clark
@ 2023-04-21  9:33           ` Emil Velikov
  -1 siblings, 0 replies; 35+ messages in thread
From: Emil Velikov @ 2023-04-21  9:33 UTC (permalink / raw)
  To: Rob Clark
  Cc: Tvrtko Ursulin, dri-devel, Rob Clark, Abhinav Kumar,
	Dmitry Baryshkov, Sean Paul, David Airlie, Akhil P Oommen,
	Chia-I Wu, Konrad Dybcio,
	open list:DRM DRIVER FOR MSM ADRENO GPU,
	open list:DRM DRIVER FOR MSM ADRENO GPU, open list

Greetings all,

Sorry for the delay - Easter Holidays, food coma and all that :-)

On Tue, 18 Apr 2023 at 15:31, Rob Clark <robdclark@gmail.com> wrote:
>
> On Tue, Apr 18, 2023 at 1:34 AM Daniel Vetter <daniel@ffwll.ch> wrote:
> >
> > On Tue, Apr 18, 2023 at 09:27:49AM +0100, Tvrtko Ursulin wrote:
> > >
> > > On 17/04/2023 21:12, Rob Clark wrote:
> > > > From: Rob Clark <robdclark@chromium.org>
> > > >
> > > > Make it work in terms of ctx so that it can be re-used for fdinfo.
> > > >
> > > > Signed-off-by: Rob Clark <robdclark@chromium.org>
> > > > ---
> > > >   drivers/gpu/drm/msm/adreno/adreno_gpu.c |  4 ++--
> > > >   drivers/gpu/drm/msm/msm_drv.c           |  2 ++
> > > >   drivers/gpu/drm/msm/msm_gpu.c           | 13 ++++++-------
> > > >   drivers/gpu/drm/msm/msm_gpu.h           | 12 ++++++++++--
> > > >   drivers/gpu/drm/msm/msm_submitqueue.c   |  1 +
> > > >   5 files changed, 21 insertions(+), 11 deletions(-)
> > > >
> > > > diff --git a/drivers/gpu/drm/msm/adreno/adreno_gpu.c b/drivers/gpu/drm/msm/adreno/adreno_gpu.c
> > > > index bb38e728864d..43c4e1fea83f 100644
> > > > --- a/drivers/gpu/drm/msm/adreno/adreno_gpu.c
> > > > +++ b/drivers/gpu/drm/msm/adreno/adreno_gpu.c
> > > > @@ -412,7 +412,7 @@ int adreno_set_param(struct msm_gpu *gpu, struct msm_file_private *ctx,
> > > >             /* Ensure string is null terminated: */
> > > >             str[len] = '\0';
> > > > -           mutex_lock(&gpu->lock);
> > > > +           mutex_lock(&ctx->lock);
> > > >             if (param == MSM_PARAM_COMM) {
> > > >                     paramp = &ctx->comm;
> > > > @@ -423,7 +423,7 @@ int adreno_set_param(struct msm_gpu *gpu, struct msm_file_private *ctx,
> > > >             kfree(*paramp);
> > > >             *paramp = str;
> > > > -           mutex_unlock(&gpu->lock);
> > > > +           mutex_unlock(&ctx->lock);
> > > >             return 0;
> > > >     }
> > > > diff --git a/drivers/gpu/drm/msm/msm_drv.c b/drivers/gpu/drm/msm/msm_drv.c
> > > > index 3d73b98d6a9c..ca0e89e46e13 100644
> > > > --- a/drivers/gpu/drm/msm/msm_drv.c
> > > > +++ b/drivers/gpu/drm/msm/msm_drv.c
> > > > @@ -581,6 +581,8 @@ static int context_init(struct drm_device *dev, struct drm_file *file)
> > > >     rwlock_init(&ctx->queuelock);
> > > >     kref_init(&ctx->ref);
> > > > +   ctx->pid = get_pid(task_pid(current));
> > >
> > > Would it simplify things for msm if DRM core had an up to date file->pid as
> > > proposed in
> > > https://patchwork.freedesktop.org/patch/526752/?series=109902&rev=4 ? It
> > > gets updated if ioctl issuer is different than fd opener and this being
> > > context_init here reminded me of it. Maybe you wouldn't have to track the
> > > pid in msm?
>
> The problem is that we also need this for gpu devcore dumps, which
> could happen after the drm_file is closed.  The ctx can outlive the
> file.
>
I think we all kept forgetting about that. MSM has had support for ages,
while AMDGPU is the second driver to land support - just a release
ago.

> But the ctx->pid has the same problem as the existing file->pid when
> it comes to Xorg.. hopefully over time that problem just goes away.

Out of curiosity: what do you mean by "when it comes to Xorg" - the
"was_master" handling or something else?

> guess I could do a similar dance to your patch to update the pid
> whenever (for ex) a submitqueue is created.
>
> > Can we go one step further and let the drm fdinfo stuff print these new
> > additions? Consistency across drivers and all that.
>
> Hmm, I guess I could _also_ store the overridden comm/cmdline in
> drm_file.  I still need to track it in ctx (msm_file_private) because
> I could need it after the file is closed.
>
> Maybe it could be useful to have a gl extension to let the app set a
> name on the context so that this is useful beyond native-ctx (ie.
> maybe it would be nice to see that "chrome: lwn.net" is using less gpu
> memory than "chrome: phoronix.com", etc)
>

/me awaits for the series to hit the respective websites ;-)

But seriously - the series from Tvrtko (thanks for the link, will
check in a moment) makes sense. Although given the lifespan issue
mentioned above, I don't think it's applicable here.

So if it were me, I would consider the two orthogonal for the
short/mid term. Fwiw this and patch 1/3 are:
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>

HTH
-Emil

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [RFC 2/3] drm/msm: Rework get_comm_cmdline() helper
  2023-04-21  9:33           ` Emil Velikov
@ 2023-04-21 14:47             ` Rob Clark
  -1 siblings, 0 replies; 35+ messages in thread
From: Rob Clark @ 2023-04-21 14:47 UTC (permalink / raw)
  To: Emil Velikov
  Cc: Tvrtko Ursulin, dri-devel, Rob Clark, Abhinav Kumar,
	Dmitry Baryshkov, Sean Paul, David Airlie, Akhil P Oommen,
	Chia-I Wu, Konrad Dybcio,
	open list:DRM DRIVER FOR MSM ADRENO GPU,
	open list:DRM DRIVER FOR MSM ADRENO GPU, open list

On Fri, Apr 21, 2023 at 2:33 AM Emil Velikov <emil.l.velikov@gmail.com> wrote:
>
> Greeting all,
>
> Sorry for the delay - Easter Holidays, food coma and all that :-)
>
> On Tue, 18 Apr 2023 at 15:31, Rob Clark <robdclark@gmail.com> wrote:
> >
> > On Tue, Apr 18, 2023 at 1:34 AM Daniel Vetter <daniel@ffwll.ch> wrote:
> > >
> > > On Tue, Apr 18, 2023 at 09:27:49AM +0100, Tvrtko Ursulin wrote:
> > > >
> > > > On 17/04/2023 21:12, Rob Clark wrote:
> > > > > From: Rob Clark <robdclark@chromium.org>
> > > > >
> > > > > Make it work in terms of ctx so that it can be re-used for fdinfo.
> > > > >
> > > > > Signed-off-by: Rob Clark <robdclark@chromium.org>
> > > > > ---
> > > > >   drivers/gpu/drm/msm/adreno/adreno_gpu.c |  4 ++--
> > > > >   drivers/gpu/drm/msm/msm_drv.c           |  2 ++
> > > > >   drivers/gpu/drm/msm/msm_gpu.c           | 13 ++++++-------
> > > > >   drivers/gpu/drm/msm/msm_gpu.h           | 12 ++++++++++--
> > > > >   drivers/gpu/drm/msm/msm_submitqueue.c   |  1 +
> > > > >   5 files changed, 21 insertions(+), 11 deletions(-)
> > > > >
> > > > > diff --git a/drivers/gpu/drm/msm/adreno/adreno_gpu.c b/drivers/gpu/drm/msm/adreno/adreno_gpu.c
> > > > > index bb38e728864d..43c4e1fea83f 100644
> > > > > --- a/drivers/gpu/drm/msm/adreno/adreno_gpu.c
> > > > > +++ b/drivers/gpu/drm/msm/adreno/adreno_gpu.c
> > > > > @@ -412,7 +412,7 @@ int adreno_set_param(struct msm_gpu *gpu, struct msm_file_private *ctx,
> > > > >             /* Ensure string is null terminated: */
> > > > >             str[len] = '\0';
> > > > > -           mutex_lock(&gpu->lock);
> > > > > +           mutex_lock(&ctx->lock);
> > > > >             if (param == MSM_PARAM_COMM) {
> > > > >                     paramp = &ctx->comm;
> > > > > @@ -423,7 +423,7 @@ int adreno_set_param(struct msm_gpu *gpu, struct msm_file_private *ctx,
> > > > >             kfree(*paramp);
> > > > >             *paramp = str;
> > > > > -           mutex_unlock(&gpu->lock);
> > > > > +           mutex_unlock(&ctx->lock);
> > > > >             return 0;
> > > > >     }
> > > > > diff --git a/drivers/gpu/drm/msm/msm_drv.c b/drivers/gpu/drm/msm/msm_drv.c
> > > > > index 3d73b98d6a9c..ca0e89e46e13 100644
> > > > > --- a/drivers/gpu/drm/msm/msm_drv.c
> > > > > +++ b/drivers/gpu/drm/msm/msm_drv.c
> > > > > @@ -581,6 +581,8 @@ static int context_init(struct drm_device *dev, struct drm_file *file)
> > > > >     rwlock_init(&ctx->queuelock);
> > > > >     kref_init(&ctx->ref);
> > > > > +   ctx->pid = get_pid(task_pid(current));
> > > >
> > > > Would it simplify things for msm if DRM core had an up to date file->pid as
> > > > proposed in
> > > > https://patchwork.freedesktop.org/patch/526752/?series=109902&rev=4 ? It
> > > > gets updated if ioctl issuer is different than fd opener and this being
> > > > context_init here reminded me of it. Maybe you wouldn't have to track the
> > > > pid in msm?
> >
> > The problem is that we also need this for gpu devcore dumps, which
> > could happen after the drm_file is closed.  The ctx can outlive the
> > file.
> >
> I think we all kept forgetting about that. MSM had support for ages,
> while AMDGPU is the second driver to land support - just a release
> ago.
>
> > But the ctx->pid has the same problem as the existing file->pid when
> > it comes to Xorg.. hopefully over time that problem just goes away.
>
> Out of curiosity: what do you mean by "when it comes to Xorg" - the
> "was_master" handling or something else?

The problem is that Xorg is the one to open the drm fd, and then
passes the fd to the client.. so the pid of drm_file is the Xorg pid,
not the client.  Making it not terribly informative.

Tvrtko's patch he linked above would address that for drm_file, but
not for other driver internal usages.  Maybe it could be wired up as a
helper so that drivers don't have to re-invent that dance.  Idk, I
have to think about it.
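
A minimal userspace sketch of the "update the pid when the caller differs
from the opener" idea discussed here, under a simplified model. The names
client_state, client_open() and client_note_caller() are hypothetical, and
plain integers stand in for the kernel's refcounted struct pid; this is not
the code from the linked patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for per-client state; not the real struct drm_file. */
struct client_state {
	int pid;              /* pid currently reported for this client */
	unsigned int updates; /* how many times the owner was re-assigned */
};

/* Record the opener, e.g. a display server that opens the device node. */
static void client_open(struct client_state *c, int opener_pid)
{
	assert(c != NULL);
	c->pid = opener_pid;
	c->updates = 0;
}

/*
 * Call on every ioctl-like entry point. If the caller is not the recorded
 * owner, the fd was probably handed to another process, so adopt the caller.
 * Returns true if the recorded pid changed.
 */
static bool client_note_caller(struct client_state *c, int caller_pid)
{
	assert(c != NULL);
	if (c->pid == caller_pid)
		return false;
	c->pid = caller_pid;
	c->updates++;
	return true;
}
```

In this model the opener (say, pid 100) is reported until a different
process (say, pid 200) issues its first call, after which pid 200 is
reported. A shared helper of this shape is one way drivers could avoid
re-implementing the check themselves.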

Btw, with my WIP drm sched fence signaling patch, lockdep is unhappy
when gpu devcore dumps are triggered.  I'm still pondering how to
decouple the locking so that anything coming from fs (i.e.
show_fdinfo()) is decoupled from anything that happens in the fence
signaling path.  But I will repost this series once I get that sorted
out.

BR,
-R

>
> > guess I could do a similar dance to your patch to update the pid
> > whenever (for ex) a submitqueue is created.
> >
> > > Can we go one step further and let the drm fdinfo stuff print these new
> > > additions? Consistency across drivers and all that.
> >
> > Hmm, I guess I could _also_ store the overridden comm/cmdline in
> > drm_file.  I still need to track it in ctx (msm_file_private) because
> > I could need it after the file is closed.
> >
> > Maybe it could be useful to have a gl extension to let the app set a
> > name on the context so that this is useful beyond native-ctx (ie.
> > maybe it would be nice to see that "chrome: lwn.net" is using less gpu
> > memory than "chrome: phoronix.com", etc)
> >
>
> /me awaits for the series to hit the respective websites ;-)
>
> But seriously - the series from Tvrtko (thanks for the link, will
> check in a moment) makes sense. Although given the lifespan issue
> mentioned above, I don't think it's applicable here.
>
> So if it were me, I would consider the two orthogonal for the
> short/mid term. Fwiw this and patch 1/3 are:
> Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
>
> HTH
> -Emil

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [RFC 2/3] drm/msm: Rework get_comm_cmdline() helper
  2023-04-21 14:47             ` Rob Clark
@ 2023-04-27  9:39               ` Daniel Vetter
  -1 siblings, 0 replies; 35+ messages in thread
From: Daniel Vetter @ 2023-04-27  9:39 UTC (permalink / raw)
  To: Rob Clark
  Cc: Emil Velikov, Rob Clark, Tvrtko Ursulin, Akhil P Oommen,
	Abhinav Kumar, dri-devel, open list, Konrad Dybcio,
	open list:DRM DRIVER FOR MSM ADRENO GPU, Dmitry Baryshkov,
	open list:DRM DRIVER FOR MSM ADRENO GPU, Sean Paul

On Fri, Apr 21, 2023 at 07:47:26AM -0700, Rob Clark wrote:
> On Fri, Apr 21, 2023 at 2:33 AM Emil Velikov <emil.l.velikov@gmail.com> wrote:
> >
> > Greeting all,
> >
> > Sorry for the delay - Easter Holidays, food coma and all that :-)
> >
> > On Tue, 18 Apr 2023 at 15:31, Rob Clark <robdclark@gmail.com> wrote:
> > >
> > > On Tue, Apr 18, 2023 at 1:34 AM Daniel Vetter <daniel@ffwll.ch> wrote:
> > > >
> > > > On Tue, Apr 18, 2023 at 09:27:49AM +0100, Tvrtko Ursulin wrote:
> > > > >
> > > > > On 17/04/2023 21:12, Rob Clark wrote:
> > > > > > From: Rob Clark <robdclark@chromium.org>
> > > > > >
> > > > > > Make it work in terms of ctx so that it can be re-used for fdinfo.
> > > > > >
> > > > > > Signed-off-by: Rob Clark <robdclark@chromium.org>
> > > > > > ---
> > > > > >   drivers/gpu/drm/msm/adreno/adreno_gpu.c |  4 ++--
> > > > > >   drivers/gpu/drm/msm/msm_drv.c           |  2 ++
> > > > > >   drivers/gpu/drm/msm/msm_gpu.c           | 13 ++++++-------
> > > > > >   drivers/gpu/drm/msm/msm_gpu.h           | 12 ++++++++++--
> > > > > >   drivers/gpu/drm/msm/msm_submitqueue.c   |  1 +
> > > > > >   5 files changed, 21 insertions(+), 11 deletions(-)
> > > > > >
> > > > > > diff --git a/drivers/gpu/drm/msm/adreno/adreno_gpu.c b/drivers/gpu/drm/msm/adreno/adreno_gpu.c
> > > > > > index bb38e728864d..43c4e1fea83f 100644
> > > > > > --- a/drivers/gpu/drm/msm/adreno/adreno_gpu.c
> > > > > > +++ b/drivers/gpu/drm/msm/adreno/adreno_gpu.c
> > > > > > @@ -412,7 +412,7 @@ int adreno_set_param(struct msm_gpu *gpu, struct msm_file_private *ctx,
> > > > > >             /* Ensure string is null terminated: */
> > > > > >             str[len] = '\0';
> > > > > > -           mutex_lock(&gpu->lock);
> > > > > > +           mutex_lock(&ctx->lock);
> > > > > >             if (param == MSM_PARAM_COMM) {
> > > > > >                     paramp = &ctx->comm;
> > > > > > @@ -423,7 +423,7 @@ int adreno_set_param(struct msm_gpu *gpu, struct msm_file_private *ctx,
> > > > > >             kfree(*paramp);
> > > > > >             *paramp = str;
> > > > > > -           mutex_unlock(&gpu->lock);
> > > > > > +           mutex_unlock(&ctx->lock);
> > > > > >             return 0;
> > > > > >     }
> > > > > > diff --git a/drivers/gpu/drm/msm/msm_drv.c b/drivers/gpu/drm/msm/msm_drv.c
> > > > > > index 3d73b98d6a9c..ca0e89e46e13 100644
> > > > > > --- a/drivers/gpu/drm/msm/msm_drv.c
> > > > > > +++ b/drivers/gpu/drm/msm/msm_drv.c
> > > > > > @@ -581,6 +581,8 @@ static int context_init(struct drm_device *dev, struct drm_file *file)
> > > > > >     rwlock_init(&ctx->queuelock);
> > > > > >     kref_init(&ctx->ref);
> > > > > > +   ctx->pid = get_pid(task_pid(current));
> > > > >
> > > > > Would it simplify things for msm if DRM core had an up to date file->pid as
> > > > > proposed in
> > > > > https://patchwork.freedesktop.org/patch/526752/?series=109902&rev=4 ? It
> > > > > gets updated if ioctl issuer is different than fd opener and this being
> > > > > context_init here reminded me of it. Maybe you wouldn't have to track the
> > > > > pid in msm?
> > >
> > > The problem is that we also need this for gpu devcore dumps, which
> > > could happen after the drm_file is closed.  The ctx can outlive the
> > > file.
> > >
> > I think we all kept forgetting about that. MSM had support for ages,
> > while AMDGPU is the second driver to land support - just a release
> > ago.
> >
> > > But the ctx->pid has the same problem as the existing file->pid when
> > > it comes to Xorg.. hopefully over time that problem just goes away.
> >
> > Out of curiosity: what do you mean with "when it comes to Xorg" - the
> > "was_master" handling or something else?
> 
> The problem is that Xorg is the one to open the drm fd, and then
> passes the fd to the client.. so the pid of drm_file is the Xorg pid,
> not the client.  Making it not terribly informative.
> 
> Tvrtko's patch he linked above would address that for drm_file, but
> not for other driver internal usages.  Maybe it could be wired up as a
> helper so that drivers don't have to re-invent that dance.  Idk, I
> have to think about it.
> 
> Btw, with my WIP drm sched fence signalling patch lockdep is unhappy
> when gpu devcore dumps are triggered.  I'm still pondering how to
> decouple the locking so that anything coming from fs (ie.
> show_fdinfo()) is decoupled from anything that happens in the fence
> signaling path.  But will repost this series once I get that sorted
> out.

So the cleanest imo is that you push most of the capturing into a worker
that's entirely decoupled. If you have terminal contexts (i.e. on first
hang they stop all further cmd submission, which is anyway what
vk/arb_robustness want), then you don't have to capture at tdr time,
because there's no subsequent batch that will wreck the state.

But it only works if your gpu ctx doesn't have recoverable semantics.

If you can't do that it's a _lot_ of GFP_ATOMIC and trylock and bailing
out if any fails :-/
-Daniel
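
To illustrate the "trylock and bail out" fallback mentioned above, here is
a userspace sketch. It is an analogy only: a C11 atomic_flag stands in for
a kernel trylock, a caller-provided buffer stands in for avoiding blocking
allocations, and ctx_state, ctx_trylock() and capture_comm_nonblocking()
are hypothetical names, not msm driver code.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <string.h>

#define SNAP_LEN 64

/* Hypothetical context state guarded by a simple try-only lock. */
struct ctx_state {
	atomic_flag busy;      /* set while someone holds the state */
	char comm[SNAP_LEN];
};

static bool ctx_trylock(struct ctx_state *ctx)
{
	/* test_and_set returns the previous value: false means we got it */
	return !atomic_flag_test_and_set_explicit(&ctx->busy,
						  memory_order_acquire);
}

static void ctx_unlock(struct ctx_state *ctx)
{
	atomic_flag_clear_explicit(&ctx->busy, memory_order_release);
}

/*
 * Best-effort capture: never waits for the lock and never allocates.
 * Returns false, leaving an empty string in out, if the lock is contended.
 */
static bool capture_comm_nonblocking(struct ctx_state *ctx, char *out,
				     size_t len)
{
	assert(len > 0);
	out[0] = '\0';
	if (!ctx_trylock(ctx))
		return false;      /* bail out rather than wait */
	strncpy(out, ctx->comm, len - 1);
	out[len - 1] = '\0';
	ctx_unlock(ctx);
	return true;
}
```

The cost of this style is visible even in the toy version: every caller
has to cope with a capture that silently comes back empty, which is why
the decoupled-worker approach is described as the cleaner option.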

> 
> BR,
> -R
> 
> >
> > > guess I could do a similar dance to your patch to update the pid
> > > whenever (for ex) a submitqueue is created.
> > >
> > > > Can we go one step further and let the drm fdinfo stuff print these new
> > > > additions? Consistency across drivers and all that.
> > >
> > > Hmm, I guess I could _also_ store the overridden comm/cmdline in
> > > drm_file.  I still need to track it in ctx (msm_file_private) because
> > > I could need it after the file is closed.
> > >
> > > Maybe it could be useful to have a gl extension to let the app set a
> > > name on the context so that this is useful beyond native-ctx (ie.
> > > maybe it would be nice to see that "chrome: lwn.net" is using less gpu
> > > memory than "chrome: phoronix.com", etc)
> > >
> >
> > /me awaits for the series to hit the respective websites ;-)
> >
> > But seriously - the series from Tvrtko (thanks for the link, will
> > check in a moment) makes sense. Although given the lifespan issue
> > mentioned above, I don't think it's applicable here.
> >
> > So if it were me, I would consider the two orthogonal for the
> > short/mid term. Fwiw this and patch 1/3 are:
> > Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
> >
> > HTH
> > -Emil

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [RFC 2/3] drm/msm: Rework get_comm_cmdline() helper
  2023-04-27  9:39               ` Daniel Vetter
  (?)
@ 2023-04-27 14:31               ` Rob Clark
  -1 siblings, 0 replies; 35+ messages in thread
From: Rob Clark @ 2023-04-27 14:31 UTC (permalink / raw)
  To: Rob Clark, Emil Velikov, Rob Clark, Tvrtko Ursulin,
	Akhil P Oommen, Abhinav Kumar, dri-devel, open list,
	Konrad Dybcio, open list:DRM DRIVER FOR MSM ADRENO GPU,
	Dmitry Baryshkov, open list:DRM DRIVER FOR MSM ADRENO GPU,
	Sean Paul

On Thu, Apr 27, 2023 at 2:39 AM Daniel Vetter <daniel@ffwll.ch> wrote:
>
> On Fri, Apr 21, 2023 at 07:47:26AM -0700, Rob Clark wrote:
> > On Fri, Apr 21, 2023 at 2:33 AM Emil Velikov <emil.l.velikov@gmail.com> wrote:
> > >
> > > Greeting all,
> > >
> > > Sorry for the delay - Easter Holidays, food coma and all that :-)
> > >
> > > On Tue, 18 Apr 2023 at 15:31, Rob Clark <robdclark@gmail.com> wrote:
> > > >
> > > > On Tue, Apr 18, 2023 at 1:34 AM Daniel Vetter <daniel@ffwll.ch> wrote:
> > > > >
> > > > > On Tue, Apr 18, 2023 at 09:27:49AM +0100, Tvrtko Ursulin wrote:
> > > > > >
> > > > > > On 17/04/2023 21:12, Rob Clark wrote:
> > > > > > > From: Rob Clark <robdclark@chromium.org>
> > > > > > >
> > > > > > > Make it work in terms of ctx so that it can be re-used for fdinfo.
> > > > > > >
> > > > > > > Signed-off-by: Rob Clark <robdclark@chromium.org>
> > > > > > > ---
> > > > > > >   drivers/gpu/drm/msm/adreno/adreno_gpu.c |  4 ++--
> > > > > > >   drivers/gpu/drm/msm/msm_drv.c           |  2 ++
> > > > > > >   drivers/gpu/drm/msm/msm_gpu.c           | 13 ++++++-------
> > > > > > >   drivers/gpu/drm/msm/msm_gpu.h           | 12 ++++++++++--
> > > > > > >   drivers/gpu/drm/msm/msm_submitqueue.c   |  1 +
> > > > > > >   5 files changed, 21 insertions(+), 11 deletions(-)
> > > > > > >
> > > > > > > diff --git a/drivers/gpu/drm/msm/adreno/adreno_gpu.c b/drivers/gpu/drm/msm/adreno/adreno_gpu.c
> > > > > > > index bb38e728864d..43c4e1fea83f 100644
> > > > > > > --- a/drivers/gpu/drm/msm/adreno/adreno_gpu.c
> > > > > > > +++ b/drivers/gpu/drm/msm/adreno/adreno_gpu.c
> > > > > > > @@ -412,7 +412,7 @@ int adreno_set_param(struct msm_gpu *gpu, struct msm_file_private *ctx,
> > > > > > >             /* Ensure string is null terminated: */
> > > > > > >             str[len] = '\0';
> > > > > > > -           mutex_lock(&gpu->lock);
> > > > > > > +           mutex_lock(&ctx->lock);
> > > > > > >             if (param == MSM_PARAM_COMM) {
> > > > > > >                     paramp = &ctx->comm;
> > > > > > > @@ -423,7 +423,7 @@ int adreno_set_param(struct msm_gpu *gpu, struct msm_file_private *ctx,
> > > > > > >             kfree(*paramp);
> > > > > > >             *paramp = str;
> > > > > > > -           mutex_unlock(&gpu->lock);
> > > > > > > +           mutex_unlock(&ctx->lock);
> > > > > > >             return 0;
> > > > > > >     }
> > > > > > > diff --git a/drivers/gpu/drm/msm/msm_drv.c b/drivers/gpu/drm/msm/msm_drv.c
> > > > > > > index 3d73b98d6a9c..ca0e89e46e13 100644
> > > > > > > --- a/drivers/gpu/drm/msm/msm_drv.c
> > > > > > > +++ b/drivers/gpu/drm/msm/msm_drv.c
> > > > > > > @@ -581,6 +581,8 @@ static int context_init(struct drm_device *dev, struct drm_file *file)
> > > > > > >     rwlock_init(&ctx->queuelock);
> > > > > > >     kref_init(&ctx->ref);
> > > > > > > +   ctx->pid = get_pid(task_pid(current));
> > > > > >
> > > > > > Would it simplify things for msm if DRM core had an up to date file->pid as
> > > > > > proposed in
> > > > > > https://patchwork.freedesktop.org/patch/526752/?series=109902&rev=4 ? It
> > > > > > gets updated if ioctl issuer is different than fd opener and this being
> > > > > > context_init here reminded me of it. Maybe you wouldn't have to track the
> > > > > > pid in msm?
> > > >
> > > > The problem is that we also need this for gpu devcore dumps, which
> > > > could happen after the drm_file is closed.  The ctx can outlive the
> > > > file.
> > > >
> > > I think we all kept forgetting about that. MSM had support for ages,
> > > while AMDGPU is the second driver to land support - just a release
> > > ago.
> > >
> > > > But the ctx->pid has the same problem as the existing file->pid when
> > > > it comes to Xorg.. hopefully over time that problem just goes away.
> > >
> > > Out of curiosity: what do you mean with "when it comes to Xorg" - the
> > > "was_master" handling or something else?
> >
> > The problem is that Xorg is the one to open the drm fd, and then
> > passes the fd to the client.. so the pid of drm_file is the Xorg pid,
> > not the client.  Making it not terribly informative.
> >
> > Tvrtko's patch he linked above would address that for drm_file, but
> > not for other driver internal usages.  Maybe it could be wired up as a
> > helper so that drivers don't have to re-invent that dance.  Idk, I
> > have to think about it.
> >
> > Btw, with my WIP drm sched fence signalling patch lockdep is unhappy
> > when gpu devcore dumps are triggered.  I'm still pondering how to
> > decouple the locking so that anything coming from fs (ie.
> > show_fdinfo()) is decoupled from anything that happens in the fence
> > signaling path.  But will repost this series once I get that sorted
> > out.
>
> So the cleanest imo is that you push most of the capturing into a worker
> that's entirely decoupled. If you have terminal context (i.e. on first
> hang they stop all further cmd submission, which is anyway what
> vk/arb_robustness want), then you don't have to capture at tdr time,
> because there's no subsequent batch that will wreck the state.

It is already in a worker, but we (a) need to block other contexts
from submitting while at the same time (b) using the GPU itself to
capture its state.. (yes, the way the hw works is overly complicated
in this regard)

> But it only works if your gpu ctx don't have recoverable semantics.

We do have recoverable semantics.. but that is pretty orthogonal.  We
just need a different lock.. I have a plan to move (a copy of) the
override strings to drm_file with its own locking decoupled from what
we need in the recovery path.. and hopefully will finally have time to
type it up today and post it (just before disappearing off into the
woods to go backpacking ;-))

BR,
-R
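
A rough userspace sketch of the "keep a copy with its own lock" plan
described above, assuming pthread mutexes in place of kernel mutexes. The
types gpu_ctx and file_info and the functions set_string(),
set_comm_override() and report_comm() are hypothetical illustrations, not
the eventual patch.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical per-context state; its lock is also taken during recovery. */
struct gpu_ctx {
	pthread_mutex_t lock;
	char *comm;
};

/* Hypothetical per-file state with a lock used only for reporting. */
struct file_info {
	pthread_mutex_t lock;
	char *comm;
};

/* Replace *slot with a private copy of str while holding lock. */
static int set_string(pthread_mutex_t *lock, char **slot, const char *str)
{
	size_t n = strlen(str) + 1;
	char *copy = malloc(n);    /* allocate before taking the lock */
	char *old;

	if (!copy)
		return -1;
	memcpy(copy, str, n);

	pthread_mutex_lock(lock);
	old = *slot;
	*slot = copy;
	pthread_mutex_unlock(lock);

	free(old);
	return 0;
}

/* Store the override in both places; the two locks are never nested. */
static int set_comm_override(struct gpu_ctx *ctx, struct file_info *fi,
			     const char *comm)
{
	assert(comm != NULL);
	if (set_string(&ctx->lock, &ctx->comm, comm))
		return -1;
	return set_string(&fi->lock, &fi->comm, comm);
}

/* Reporting path: only touches the file lock, never ctx->lock. */
static size_t report_comm(struct file_info *fi, char *out, size_t len)
{
	assert(len > 0);
	pthread_mutex_lock(&fi->lock);
	strncpy(out, fi->comm ? fi->comm : "", len - 1);
	out[len - 1] = '\0';
	pthread_mutex_unlock(&fi->lock);
	return strlen(out);
}
```

Because report_comm() never takes the context lock, a reader in this
model can make progress even while the context lock is held for recovery.
The trade-off is two copies of each string that must be updated together.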

> If you can't do that it's a _lot_ of GFP_ATOMIC and trylock and bailing
> out if any fails :-/
> -Daniel
>
> >
> > BR,
> > -R
> >
> > >
> > > > guess I could do a similar dance to your patch to update the pid
> > > > whenever (for ex) a submitqueue is created.
> > > >
> > > > > Can we go one step further and let the drm fdinfo stuff print these new
> > > > > additions? Consistency across drivers and all that.
> > > >
> > > > Hmm, I guess I could _also_ store the overridden comm/cmdline in
> > > > drm_file.  I still need to track it in ctx (msm_file_private) because
> > > > I could need it after the file is closed.
> > > >
> > > > Maybe it could be useful to have a gl extension to let the app set a
> > > > name on the context so that this is useful beyond native-ctx (ie.
> > > > maybe it would be nice to see that "chrome: lwn.net" is using less gpu
> > > > memory than "chrome: phoronix.com", etc)
> > > >
> > >
> > > /me awaits for the series to hit the respective websites ;-)
> > >
> > > But seriously - the series from Tvrtko (thanks for the link, will
> > > check in a moment) makes sense. Although given the livespan issue
> > > mentioned above, I don't think it's applicable here.
> > >
> > > So if it were me, I would consider the two orthogonal for the
> > > short/mid term. Fwiw this and patch 1/3 are:
> > > Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
> > >
> > > HTH
> > > -Emil
>
> --
> Daniel Vetter
> Software Engineer, Intel Corporation
> http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 35+ messages in thread

end of thread, other threads:[~2023-04-27 14:32 UTC | newest]

Thread overview: 35+ messages
2023-04-17 20:12 [RFC 0/3] drm: Add comm/cmdline fdinfo fields Rob Clark
2023-04-17 20:12 ` Rob Clark
2023-04-17 20:12 ` [RFC 1/3] drm/doc: Relax fdinfo string constraints Rob Clark
2023-04-17 20:12   ` Rob Clark
2023-04-18  8:19   ` Tvrtko Ursulin
2023-04-18  8:19     ` Tvrtko Ursulin
2023-04-17 20:12 ` [RFC 2/3] drm/msm: Rework get_comm_cmdline() helper Rob Clark
2023-04-17 20:12   ` Rob Clark
2023-04-18  8:27   ` Tvrtko Ursulin
2023-04-18  8:27     ` Tvrtko Ursulin
2023-04-18  8:34     ` Daniel Vetter
2023-04-18  8:34       ` Daniel Vetter
2023-04-18 14:31       ` Rob Clark
2023-04-18 14:31         ` Rob Clark
2023-04-21  9:33         ` Emil Velikov
2023-04-21  9:33           ` Emil Velikov
2023-04-21 14:47           ` Rob Clark
2023-04-21 14:47             ` Rob Clark
2023-04-27  9:39             ` Daniel Vetter
2023-04-27  9:39               ` Daniel Vetter
2023-04-27 14:31               ` Rob Clark
2023-04-17 20:12 ` [RFC 3/3] drm/msm: Add comm/cmdline fields Rob Clark
2023-04-17 20:12   ` Rob Clark
2023-04-18  8:53   ` Tvrtko Ursulin
2023-04-18  8:53     ` Tvrtko Ursulin
2023-04-18 14:56     ` Rob Clark
2023-04-18 14:56       ` Rob Clark
2023-04-19 13:36       ` Tvrtko Ursulin
2023-04-19 13:36         ` Tvrtko Ursulin
2023-04-19 15:00         ` Rob Clark
2023-04-19 15:00           ` Rob Clark
2023-04-17 20:45 ` [RFC 0/3] drm: Add comm/cmdline fdinfo fields Rob Clark
2023-04-17 20:45   ` Rob Clark
2023-04-18  9:33 ` Konrad Dybcio
2023-04-18  9:33   ` Konrad Dybcio
