All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v6 00/10] Fixes and cleanups for Host1x
@ 2021-03-29 13:38 ` Mikko Perttunen
  0 siblings, 0 replies; 35+ messages in thread
From: Mikko Perttunen @ 2021-03-29 13:38 UTC (permalink / raw)
  To: thierry.reding, jonathanh; +Cc: dri-devel, linux-tegra, Mikko Perttunen

This is the first part of the Host1x/TegraDRM UAPI series, split out
into a separate series that should be easier to merge. It contains
a number of Host1x-related cleanups and fixes. In addition to the
previous series there are a couple of new fixes.

Tested on Jetson TX2.

Thanks,
Mikko

Jon Hunter (1):
  gpu: host1x: Fix Tegra194 syncpt interrupt threshold

Mikko Perttunen (9):
  gpu: host1x: Use different lock classes for each client
  gpu: host1x: Allow syncpoints without associated client
  gpu: host1x: Show number of pending waiters in debugfs
  gpu: host1x: Remove cancelled waiters immediately
  gpu: host1x: Use HW-equivalent syncpoint expiration check
  gpu: host1x: Cleanup and refcounting for syncpoints
  gpu: host1x: Reset max value when freeing a syncpoint
  gpu: host1x: Reserve VBLANK syncpoints at initialization
  gpu: host1x: Assign intr waiter inside lock

 drivers/gpu/drm/tegra/dc.c             |  10 +-
 drivers/gpu/drm/tegra/drm.c            |  14 +-
 drivers/gpu/drm/tegra/gr2d.c           |   4 +-
 drivers/gpu/drm/tegra/gr3d.c           |   4 +-
 drivers/gpu/drm/tegra/vic.c            |   4 +-
 drivers/gpu/host1x/bus.c               |  10 +-
 drivers/gpu/host1x/cdma.c              |  11 +-
 drivers/gpu/host1x/debug.c             |  14 +-
 drivers/gpu/host1x/dev.c               |   6 +
 drivers/gpu/host1x/dev.h               |  13 +-
 drivers/gpu/host1x/hw/cdma_hw.c        |   2 +-
 drivers/gpu/host1x/hw/channel_hw.c     |  10 +-
 drivers/gpu/host1x/hw/debug_hw.c       |   2 +-
 drivers/gpu/host1x/hw/hw_host1x07_vm.h |   2 +-
 drivers/gpu/host1x/intr.c              |  28 +++-
 drivers/gpu/host1x/intr.h              |   4 +-
 drivers/gpu/host1x/job.c               |   5 +-
 drivers/gpu/host1x/syncpt.c            | 202 +++++++++++++++----------
 drivers/gpu/host1x/syncpt.h            |   4 +-
 drivers/staging/media/tegra-video/vi.c |   4 +-
 include/linux/host1x.h                 |  23 ++-
 21 files changed, 235 insertions(+), 141 deletions(-)

-- 
2.30.1


^ permalink raw reply	[flat|nested] 35+ messages in thread

* [PATCH v6 00/10] Fixes and cleanups for Host1x
@ 2021-03-29 13:38 ` Mikko Perttunen
  0 siblings, 0 replies; 35+ messages in thread
From: Mikko Perttunen @ 2021-03-29 13:38 UTC (permalink / raw)
  To: thierry.reding, jonathanh; +Cc: linux-tegra, dri-devel, Mikko Perttunen

This is the first part of the Host1x/TegraDRM UAPI series, split out
into a separate series that should be easier to merge. It contains
a number of Host1x-related cleanups and fixes. In addition to the
previous series there are a couple of new fixes.

Tested on Jetson TX2.

Thanks,
Mikko

Jon Hunter (1):
  gpu: host1x: Fix Tegra194 syncpt interrupt threshold

Mikko Perttunen (9):
  gpu: host1x: Use different lock classes for each client
  gpu: host1x: Allow syncpoints without associated client
  gpu: host1x: Show number of pending waiters in debugfs
  gpu: host1x: Remove cancelled waiters immediately
  gpu: host1x: Use HW-equivalent syncpoint expiration check
  gpu: host1x: Cleanup and refcounting for syncpoints
  gpu: host1x: Reset max value when freeing a syncpoint
  gpu: host1x: Reserve VBLANK syncpoints at initialization
  gpu: host1x: Assign intr waiter inside lock

 drivers/gpu/drm/tegra/dc.c             |  10 +-
 drivers/gpu/drm/tegra/drm.c            |  14 +-
 drivers/gpu/drm/tegra/gr2d.c           |   4 +-
 drivers/gpu/drm/tegra/gr3d.c           |   4 +-
 drivers/gpu/drm/tegra/vic.c            |   4 +-
 drivers/gpu/host1x/bus.c               |  10 +-
 drivers/gpu/host1x/cdma.c              |  11 +-
 drivers/gpu/host1x/debug.c             |  14 +-
 drivers/gpu/host1x/dev.c               |   6 +
 drivers/gpu/host1x/dev.h               |  13 +-
 drivers/gpu/host1x/hw/cdma_hw.c        |   2 +-
 drivers/gpu/host1x/hw/channel_hw.c     |  10 +-
 drivers/gpu/host1x/hw/debug_hw.c       |   2 +-
 drivers/gpu/host1x/hw/hw_host1x07_vm.h |   2 +-
 drivers/gpu/host1x/intr.c              |  28 +++-
 drivers/gpu/host1x/intr.h              |   4 +-
 drivers/gpu/host1x/job.c               |   5 +-
 drivers/gpu/host1x/syncpt.c            | 202 +++++++++++++++----------
 drivers/gpu/host1x/syncpt.h            |   4 +-
 drivers/staging/media/tegra-video/vi.c |   4 +-
 include/linux/host1x.h                 |  23 ++-
 21 files changed, 235 insertions(+), 141 deletions(-)

-- 
2.30.1

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 35+ messages in thread

* [PATCH v6 01/10] gpu: host1x: Use different lock classes for each client
  2021-03-29 13:38 ` Mikko Perttunen
@ 2021-03-29 13:38   ` Mikko Perttunen
  -1 siblings, 0 replies; 35+ messages in thread
From: Mikko Perttunen @ 2021-03-29 13:38 UTC (permalink / raw)
  To: thierry.reding, jonathanh; +Cc: dri-devel, linux-tegra, Mikko Perttunen

To avoid false lockdep warnings, give each client lock a different
lock class, passed from the initialization site by macro.

Signed-off-by: Mikko Perttunen <mperttunen@nvidia.com>
---
v6:
- Update kerneldoc
---
 drivers/gpu/host1x/bus.c | 10 ++++++----
 include/linux/host1x.h   |  9 ++++++++-
 2 files changed, 14 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/host1x/bus.c b/drivers/gpu/host1x/bus.c
index 347fb962b6c9..68a766ff0e9d 100644
--- a/drivers/gpu/host1x/bus.c
+++ b/drivers/gpu/host1x/bus.c
@@ -705,8 +705,9 @@ void host1x_driver_unregister(struct host1x_driver *driver)
 EXPORT_SYMBOL(host1x_driver_unregister);
 
 /**
- * host1x_client_register() - register a host1x client
+ * __host1x_client_register() - register a host1x client
  * @client: host1x client
+ * @key: lock class key for the client-specific mutex
  *
  * Registers a host1x client with each host1x controller instance. Note that
  * each client will only match their parent host1x controller and will only be
@@ -715,13 +716,14 @@ EXPORT_SYMBOL(host1x_driver_unregister);
  * device and call host1x_device_init(), which will in turn call each client's
  * &host1x_client_ops.init implementation.
  */
-int host1x_client_register(struct host1x_client *client)
+int __host1x_client_register(struct host1x_client *client,
+			     struct lock_class_key *key)
 {
 	struct host1x *host1x;
 	int err;
 
 	INIT_LIST_HEAD(&client->list);
-	mutex_init(&client->lock);
+	__mutex_init(&client->lock, "host1x client lock", key);
 	client->usecount = 0;
 
 	mutex_lock(&devices_lock);
@@ -742,7 +744,7 @@ int host1x_client_register(struct host1x_client *client)
 
 	return 0;
 }
-EXPORT_SYMBOL(host1x_client_register);
+EXPORT_SYMBOL(__host1x_client_register);
 
 /**
  * host1x_client_unregister() - unregister a host1x client
diff --git a/include/linux/host1x.h b/include/linux/host1x.h
index ce59a6a6a008..9eb77c87a83b 100644
--- a/include/linux/host1x.h
+++ b/include/linux/host1x.h
@@ -320,7 +320,14 @@ static inline struct host1x_device *to_host1x_device(struct device *dev)
 int host1x_device_init(struct host1x_device *device);
 int host1x_device_exit(struct host1x_device *device);
 
-int host1x_client_register(struct host1x_client *client);
+int __host1x_client_register(struct host1x_client *client,
+			     struct lock_class_key *key);
+#define host1x_client_register(class) \
+	({ \
+		static struct lock_class_key __key; \
+		__host1x_client_register(class, &__key); \
+	})
+
 int host1x_client_unregister(struct host1x_client *client);
 
 int host1x_client_suspend(struct host1x_client *client);
-- 
2.30.1


^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [PATCH v6 01/10] gpu: host1x: Use different lock classes for each client
@ 2021-03-29 13:38   ` Mikko Perttunen
  0 siblings, 0 replies; 35+ messages in thread
From: Mikko Perttunen @ 2021-03-29 13:38 UTC (permalink / raw)
  To: thierry.reding, jonathanh; +Cc: linux-tegra, dri-devel, Mikko Perttunen

To avoid false lockdep warnings, give each client lock a different
lock class, passed from the initialization site by macro.

Signed-off-by: Mikko Perttunen <mperttunen@nvidia.com>
---
v6:
- Update kerneldoc
---
 drivers/gpu/host1x/bus.c | 10 ++++++----
 include/linux/host1x.h   |  9 ++++++++-
 2 files changed, 14 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/host1x/bus.c b/drivers/gpu/host1x/bus.c
index 347fb962b6c9..68a766ff0e9d 100644
--- a/drivers/gpu/host1x/bus.c
+++ b/drivers/gpu/host1x/bus.c
@@ -705,8 +705,9 @@ void host1x_driver_unregister(struct host1x_driver *driver)
 EXPORT_SYMBOL(host1x_driver_unregister);
 
 /**
- * host1x_client_register() - register a host1x client
+ * __host1x_client_register() - register a host1x client
  * @client: host1x client
+ * @key: lock class key for the client-specific mutex
  *
  * Registers a host1x client with each host1x controller instance. Note that
  * each client will only match their parent host1x controller and will only be
@@ -715,13 +716,14 @@ EXPORT_SYMBOL(host1x_driver_unregister);
  * device and call host1x_device_init(), which will in turn call each client's
  * &host1x_client_ops.init implementation.
  */
-int host1x_client_register(struct host1x_client *client)
+int __host1x_client_register(struct host1x_client *client,
+			     struct lock_class_key *key)
 {
 	struct host1x *host1x;
 	int err;
 
 	INIT_LIST_HEAD(&client->list);
-	mutex_init(&client->lock);
+	__mutex_init(&client->lock, "host1x client lock", key);
 	client->usecount = 0;
 
 	mutex_lock(&devices_lock);
@@ -742,7 +744,7 @@ int host1x_client_register(struct host1x_client *client)
 
 	return 0;
 }
-EXPORT_SYMBOL(host1x_client_register);
+EXPORT_SYMBOL(__host1x_client_register);
 
 /**
  * host1x_client_unregister() - unregister a host1x client
diff --git a/include/linux/host1x.h b/include/linux/host1x.h
index ce59a6a6a008..9eb77c87a83b 100644
--- a/include/linux/host1x.h
+++ b/include/linux/host1x.h
@@ -320,7 +320,14 @@ static inline struct host1x_device *to_host1x_device(struct device *dev)
 int host1x_device_init(struct host1x_device *device);
 int host1x_device_exit(struct host1x_device *device);
 
-int host1x_client_register(struct host1x_client *client);
+int __host1x_client_register(struct host1x_client *client,
+			     struct lock_class_key *key);
+#define host1x_client_register(class) \
+	({ \
+		static struct lock_class_key __key; \
+		__host1x_client_register(class, &__key); \
+	})
+
 int host1x_client_unregister(struct host1x_client *client);
 
 int host1x_client_suspend(struct host1x_client *client);
-- 
2.30.1

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [PATCH v6 02/10] gpu: host1x: Allow syncpoints without associated client
  2021-03-29 13:38 ` Mikko Perttunen
@ 2021-03-29 13:38   ` Mikko Perttunen
  -1 siblings, 0 replies; 35+ messages in thread
From: Mikko Perttunen @ 2021-03-29 13:38 UTC (permalink / raw)
  To: thierry.reding, jonathanh; +Cc: dri-devel, linux-tegra, Mikko Perttunen

Syncpoints don't need to be associated with any client,
so remove the property, and expose host1x_syncpt_alloc.
This will allow allocating syncpoints without prior knowledge
of the engine that it will be used with.

Signed-off-by: Mikko Perttunen <mperttunen@nvidia.com>
---
v6:
* Add kerneldoc
* Return error if a NULL name was specified
v3:
* Clean up host1x_syncpt_alloc signature to allow specifying
  a name for the syncpoint.
* Export the function.
---
 drivers/gpu/host1x/syncpt.c | 37 +++++++++++++++++++++++++------------
 drivers/gpu/host1x/syncpt.h |  1 -
 include/linux/host1x.h      |  3 +++
 3 files changed, 28 insertions(+), 13 deletions(-)

diff --git a/drivers/gpu/host1x/syncpt.c b/drivers/gpu/host1x/syncpt.c
index fce7892d5137..9a113016d482 100644
--- a/drivers/gpu/host1x/syncpt.c
+++ b/drivers/gpu/host1x/syncpt.c
@@ -42,13 +42,28 @@ static void host1x_syncpt_base_free(struct host1x_syncpt_base *base)
 		base->requested = false;
 }
 
-static struct host1x_syncpt *host1x_syncpt_alloc(struct host1x *host,
-						 struct host1x_client *client,
-						 unsigned long flags)
+/**
+ * host1x_syncpt_alloc() - allocate a syncpoint
+ * @host: host1x device data
+ * @flags: bitfield of HOST1X_SYNCPT_* flags
+ * @name: name for the syncpoint for use in debug prints
+ *
+ * Allocates a hardware syncpoint for the caller's use. The caller then has
+ * the sole authority to mutate the syncpoint's value until it is freed again.
+ *
+ * If no free syncpoints are available, or a NULL name was specified, returns
+ * NULL.
+ */
+struct host1x_syncpt *host1x_syncpt_alloc(struct host1x *host,
+					  unsigned long flags,
+					  const char *name)
 {
 	struct host1x_syncpt *sp = host->syncpt;
+	char *full_name;
 	unsigned int i;
-	char *name;
+
+	if (!name)
+		return NULL;
 
 	mutex_lock(&host->syncpt_mutex);
 
@@ -64,13 +79,11 @@ static struct host1x_syncpt *host1x_syncpt_alloc(struct host1x *host,
 			goto unlock;
 	}
 
-	name = kasprintf(GFP_KERNEL, "%02u-%s", sp->id,
-			 client ? dev_name(client->dev) : NULL);
-	if (!name)
+	full_name = kasprintf(GFP_KERNEL, "%u-%s", sp->id, name);
+	if (!full_name)
 		goto free_base;
 
-	sp->client = client;
-	sp->name = name;
+	sp->name = full_name;
 
 	if (flags & HOST1X_SYNCPT_CLIENT_MANAGED)
 		sp->client_managed = true;
@@ -87,6 +100,7 @@ static struct host1x_syncpt *host1x_syncpt_alloc(struct host1x *host,
 	mutex_unlock(&host->syncpt_mutex);
 	return NULL;
 }
+EXPORT_SYMBOL(host1x_syncpt_alloc);
 
 /**
  * host1x_syncpt_id() - retrieve syncpoint ID
@@ -401,7 +415,7 @@ int host1x_syncpt_init(struct host1x *host)
 	host1x_hw_syncpt_enable_protection(host);
 
 	/* Allocate sync point to use for clearing waits for expired fences */
-	host->nop_sp = host1x_syncpt_alloc(host, NULL, 0);
+	host->nop_sp = host1x_syncpt_alloc(host, 0, "reserved-nop");
 	if (!host->nop_sp)
 		return -ENOMEM;
 
@@ -423,7 +437,7 @@ struct host1x_syncpt *host1x_syncpt_request(struct host1x_client *client,
 {
 	struct host1x *host = dev_get_drvdata(client->host->parent);
 
-	return host1x_syncpt_alloc(host, client, flags);
+	return host1x_syncpt_alloc(host, flags, dev_name(client->dev));
 }
 EXPORT_SYMBOL(host1x_syncpt_request);
 
@@ -447,7 +461,6 @@ void host1x_syncpt_free(struct host1x_syncpt *sp)
 	host1x_syncpt_base_free(sp->base);
 	kfree(sp->name);
 	sp->base = NULL;
-	sp->client = NULL;
 	sp->name = NULL;
 	sp->client_managed = false;
 
diff --git a/drivers/gpu/host1x/syncpt.h b/drivers/gpu/host1x/syncpt.h
index 8e1d04dacaa0..3aa6b25b1b9c 100644
--- a/drivers/gpu/host1x/syncpt.h
+++ b/drivers/gpu/host1x/syncpt.h
@@ -33,7 +33,6 @@ struct host1x_syncpt {
 	const char *name;
 	bool client_managed;
 	struct host1x *host;
-	struct host1x_client *client;
 	struct host1x_syncpt_base *base;
 
 	/* interrupt data */
diff --git a/include/linux/host1x.h b/include/linux/host1x.h
index 9eb77c87a83b..7137ce0e35d4 100644
--- a/include/linux/host1x.h
+++ b/include/linux/host1x.h
@@ -154,6 +154,9 @@ int host1x_syncpt_wait(struct host1x_syncpt *sp, u32 thresh, long timeout,
 struct host1x_syncpt *host1x_syncpt_request(struct host1x_client *client,
 					    unsigned long flags);
 void host1x_syncpt_free(struct host1x_syncpt *sp);
+struct host1x_syncpt *host1x_syncpt_alloc(struct host1x *host,
+					  unsigned long flags,
+					  const char *name);
 
 struct host1x_syncpt_base *host1x_syncpt_get_base(struct host1x_syncpt *sp);
 u32 host1x_syncpt_base_id(struct host1x_syncpt_base *base);
-- 
2.30.1


^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [PATCH v6 02/10] gpu: host1x: Allow syncpoints without associated client
@ 2021-03-29 13:38   ` Mikko Perttunen
  0 siblings, 0 replies; 35+ messages in thread
From: Mikko Perttunen @ 2021-03-29 13:38 UTC (permalink / raw)
  To: thierry.reding, jonathanh; +Cc: linux-tegra, dri-devel, Mikko Perttunen

Syncpoints don't need to be associated with any client,
so remove the property, and expose host1x_syncpt_alloc.
This will allow allocating syncpoints without prior knowledge
of the engine that it will be used with.

Signed-off-by: Mikko Perttunen <mperttunen@nvidia.com>
---
v6:
* Add kerneldoc
* Return error if a NULL name was specified
v3:
* Clean up host1x_syncpt_alloc signature to allow specifying
  a name for the syncpoint.
* Export the function.
---
 drivers/gpu/host1x/syncpt.c | 37 +++++++++++++++++++++++++------------
 drivers/gpu/host1x/syncpt.h |  1 -
 include/linux/host1x.h      |  3 +++
 3 files changed, 28 insertions(+), 13 deletions(-)

diff --git a/drivers/gpu/host1x/syncpt.c b/drivers/gpu/host1x/syncpt.c
index fce7892d5137..9a113016d482 100644
--- a/drivers/gpu/host1x/syncpt.c
+++ b/drivers/gpu/host1x/syncpt.c
@@ -42,13 +42,28 @@ static void host1x_syncpt_base_free(struct host1x_syncpt_base *base)
 		base->requested = false;
 }
 
-static struct host1x_syncpt *host1x_syncpt_alloc(struct host1x *host,
-						 struct host1x_client *client,
-						 unsigned long flags)
+/**
+ * host1x_syncpt_alloc() - allocate a syncpoint
+ * @host: host1x device data
+ * @flags: bitfield of HOST1X_SYNCPT_* flags
+ * @name: name for the syncpoint for use in debug prints
+ *
+ * Allocates a hardware syncpoint for the caller's use. The caller then has
+ * the sole authority to mutate the syncpoint's value until it is freed again.
+ *
+ * If no free syncpoints are available, or a NULL name was specified, returns
+ * NULL.
+ */
+struct host1x_syncpt *host1x_syncpt_alloc(struct host1x *host,
+					  unsigned long flags,
+					  const char *name)
 {
 	struct host1x_syncpt *sp = host->syncpt;
+	char *full_name;
 	unsigned int i;
-	char *name;
+
+	if (!name)
+		return NULL;
 
 	mutex_lock(&host->syncpt_mutex);
 
@@ -64,13 +79,11 @@ static struct host1x_syncpt *host1x_syncpt_alloc(struct host1x *host,
 			goto unlock;
 	}
 
-	name = kasprintf(GFP_KERNEL, "%02u-%s", sp->id,
-			 client ? dev_name(client->dev) : NULL);
-	if (!name)
+	full_name = kasprintf(GFP_KERNEL, "%u-%s", sp->id, name);
+	if (!full_name)
 		goto free_base;
 
-	sp->client = client;
-	sp->name = name;
+	sp->name = full_name;
 
 	if (flags & HOST1X_SYNCPT_CLIENT_MANAGED)
 		sp->client_managed = true;
@@ -87,6 +100,7 @@ static struct host1x_syncpt *host1x_syncpt_alloc(struct host1x *host,
 	mutex_unlock(&host->syncpt_mutex);
 	return NULL;
 }
+EXPORT_SYMBOL(host1x_syncpt_alloc);
 
 /**
  * host1x_syncpt_id() - retrieve syncpoint ID
@@ -401,7 +415,7 @@ int host1x_syncpt_init(struct host1x *host)
 	host1x_hw_syncpt_enable_protection(host);
 
 	/* Allocate sync point to use for clearing waits for expired fences */
-	host->nop_sp = host1x_syncpt_alloc(host, NULL, 0);
+	host->nop_sp = host1x_syncpt_alloc(host, 0, "reserved-nop");
 	if (!host->nop_sp)
 		return -ENOMEM;
 
@@ -423,7 +437,7 @@ struct host1x_syncpt *host1x_syncpt_request(struct host1x_client *client,
 {
 	struct host1x *host = dev_get_drvdata(client->host->parent);
 
-	return host1x_syncpt_alloc(host, client, flags);
+	return host1x_syncpt_alloc(host, flags, dev_name(client->dev));
 }
 EXPORT_SYMBOL(host1x_syncpt_request);
 
@@ -447,7 +461,6 @@ void host1x_syncpt_free(struct host1x_syncpt *sp)
 	host1x_syncpt_base_free(sp->base);
 	kfree(sp->name);
 	sp->base = NULL;
-	sp->client = NULL;
 	sp->name = NULL;
 	sp->client_managed = false;
 
diff --git a/drivers/gpu/host1x/syncpt.h b/drivers/gpu/host1x/syncpt.h
index 8e1d04dacaa0..3aa6b25b1b9c 100644
--- a/drivers/gpu/host1x/syncpt.h
+++ b/drivers/gpu/host1x/syncpt.h
@@ -33,7 +33,6 @@ struct host1x_syncpt {
 	const char *name;
 	bool client_managed;
 	struct host1x *host;
-	struct host1x_client *client;
 	struct host1x_syncpt_base *base;
 
 	/* interrupt data */
diff --git a/include/linux/host1x.h b/include/linux/host1x.h
index 9eb77c87a83b..7137ce0e35d4 100644
--- a/include/linux/host1x.h
+++ b/include/linux/host1x.h
@@ -154,6 +154,9 @@ int host1x_syncpt_wait(struct host1x_syncpt *sp, u32 thresh, long timeout,
 struct host1x_syncpt *host1x_syncpt_request(struct host1x_client *client,
 					    unsigned long flags);
 void host1x_syncpt_free(struct host1x_syncpt *sp);
+struct host1x_syncpt *host1x_syncpt_alloc(struct host1x *host,
+					  unsigned long flags,
+					  const char *name);
 
 struct host1x_syncpt_base *host1x_syncpt_get_base(struct host1x_syncpt *sp);
 u32 host1x_syncpt_base_id(struct host1x_syncpt_base *base);
-- 
2.30.1

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [PATCH v6 03/10] gpu: host1x: Show number of pending waiters in debugfs
  2021-03-29 13:38 ` Mikko Perttunen
@ 2021-03-29 13:38   ` Mikko Perttunen
  -1 siblings, 0 replies; 35+ messages in thread
From: Mikko Perttunen @ 2021-03-29 13:38 UTC (permalink / raw)
  To: thierry.reding, jonathanh; +Cc: dri-devel, linux-tegra, Mikko Perttunen

Show the number of pending waiters in the debugfs status file.
This is useful for testing to verify that waiters do not leak
or accumulate incorrectly.

Signed-off-by: Mikko Perttunen <mperttunen@nvidia.com>
---
 drivers/gpu/host1x/debug.c | 14 +++++++++++---
 1 file changed, 11 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/host1x/debug.c b/drivers/gpu/host1x/debug.c
index 1b4997bda1c7..8a14880c61bb 100644
--- a/drivers/gpu/host1x/debug.c
+++ b/drivers/gpu/host1x/debug.c
@@ -69,6 +69,7 @@ static int show_channel(struct host1x_channel *ch, void *data, bool show_fifo)
 
 static void show_syncpts(struct host1x *m, struct output *o)
 {
+	struct list_head *pos;
 	unsigned int i;
 
 	host1x_debug_output(o, "---- syncpts ----\n");
@@ -76,12 +77,19 @@ static void show_syncpts(struct host1x *m, struct output *o)
 	for (i = 0; i < host1x_syncpt_nb_pts(m); i++) {
 		u32 max = host1x_syncpt_read_max(m->syncpt + i);
 		u32 min = host1x_syncpt_load(m->syncpt + i);
+		unsigned int waiters = 0;
 
-		if (!min && !max)
+		spin_lock(&m->syncpt[i].intr.lock);
+		list_for_each(pos, &m->syncpt[i].intr.wait_head)
+			waiters++;
+		spin_unlock(&m->syncpt[i].intr.lock);
+
+		if (!min && !max && !waiters)
 			continue;
 
-		host1x_debug_output(o, "id %u (%s) min %d max %d\n",
-				    i, m->syncpt[i].name, min, max);
+		host1x_debug_output(o,
+				    "id %u (%s) min %d max %d (%d waiters)\n",
+				    i, m->syncpt[i].name, min, max, waiters);
 	}
 
 	for (i = 0; i < host1x_syncpt_nb_bases(m); i++) {
-- 
2.30.1


^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [PATCH v6 03/10] gpu: host1x: Show number of pending waiters in debugfs
@ 2021-03-29 13:38   ` Mikko Perttunen
  0 siblings, 0 replies; 35+ messages in thread
From: Mikko Perttunen @ 2021-03-29 13:38 UTC (permalink / raw)
  To: thierry.reding, jonathanh; +Cc: linux-tegra, dri-devel, Mikko Perttunen

Show the number of pending waiters in the debugfs status file.
This is useful for testing to verify that waiters do not leak
or accumulate incorrectly.

Signed-off-by: Mikko Perttunen <mperttunen@nvidia.com>
---
 drivers/gpu/host1x/debug.c | 14 +++++++++++---
 1 file changed, 11 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/host1x/debug.c b/drivers/gpu/host1x/debug.c
index 1b4997bda1c7..8a14880c61bb 100644
--- a/drivers/gpu/host1x/debug.c
+++ b/drivers/gpu/host1x/debug.c
@@ -69,6 +69,7 @@ static int show_channel(struct host1x_channel *ch, void *data, bool show_fifo)
 
 static void show_syncpts(struct host1x *m, struct output *o)
 {
+	struct list_head *pos;
 	unsigned int i;
 
 	host1x_debug_output(o, "---- syncpts ----\n");
@@ -76,12 +77,19 @@ static void show_syncpts(struct host1x *m, struct output *o)
 	for (i = 0; i < host1x_syncpt_nb_pts(m); i++) {
 		u32 max = host1x_syncpt_read_max(m->syncpt + i);
 		u32 min = host1x_syncpt_load(m->syncpt + i);
+		unsigned int waiters = 0;
 
-		if (!min && !max)
+		spin_lock(&m->syncpt[i].intr.lock);
+		list_for_each(pos, &m->syncpt[i].intr.wait_head)
+			waiters++;
+		spin_unlock(&m->syncpt[i].intr.lock);
+
+		if (!min && !max && !waiters)
 			continue;
 
-		host1x_debug_output(o, "id %u (%s) min %d max %d\n",
-				    i, m->syncpt[i].name, min, max);
+		host1x_debug_output(o,
+				    "id %u (%s) min %d max %d (%d waiters)\n",
+				    i, m->syncpt[i].name, min, max, waiters);
 	}
 
 	for (i = 0; i < host1x_syncpt_nb_bases(m); i++) {
-- 
2.30.1

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [PATCH v6 04/10] gpu: host1x: Remove cancelled waiters immediately
  2021-03-29 13:38 ` Mikko Perttunen
@ 2021-03-29 13:38   ` Mikko Perttunen
  -1 siblings, 0 replies; 35+ messages in thread
From: Mikko Perttunen @ 2021-03-29 13:38 UTC (permalink / raw)
  To: thierry.reding, jonathanh; +Cc: dri-devel, linux-tegra, Mikko Perttunen

Before this patch, cancelled waiters would only be cleaned up
once their threshold value was reached. Make host1x_intr_put_ref
process the cancellation immediately to fix this.

Signed-off-by: Mikko Perttunen <mperttunen@nvidia.com>
---
v6:
* Call schedule instead of cpu_relax while waiting for pending
  interrupt processing
v5:
* Add parameter to flush, i.e. wait for all pending waiters to
  complete before returning. The reason this is not always true
  is that the pending waiter might be the place that is calling
  the put_ref.
---
 drivers/gpu/host1x/intr.c   | 23 +++++++++++++++++------
 drivers/gpu/host1x/intr.h   |  4 +++-
 drivers/gpu/host1x/syncpt.c |  2 +-
 3 files changed, 21 insertions(+), 8 deletions(-)

diff --git a/drivers/gpu/host1x/intr.c b/drivers/gpu/host1x/intr.c
index 9245add23b5d..69b0e8e41466 100644
--- a/drivers/gpu/host1x/intr.c
+++ b/drivers/gpu/host1x/intr.c
@@ -242,18 +242,29 @@ int host1x_intr_add_action(struct host1x *host, struct host1x_syncpt *syncpt,
 	return 0;
 }
 
-void host1x_intr_put_ref(struct host1x *host, unsigned int id, void *ref)
+void host1x_intr_put_ref(struct host1x *host, unsigned int id, void *ref,
+			 bool flush)
 {
 	struct host1x_waitlist *waiter = ref;
 	struct host1x_syncpt *syncpt;
 
-	while (atomic_cmpxchg(&waiter->state, WLS_PENDING, WLS_CANCELLED) ==
-	       WLS_REMOVED)
-		schedule();
+	atomic_cmpxchg(&waiter->state, WLS_PENDING, WLS_CANCELLED);
 
 	syncpt = host->syncpt + id;
-	(void)process_wait_list(host, syncpt,
-				host1x_syncpt_load(host->syncpt + id));
+
+	spin_lock(&syncpt->intr.lock);
+	if (atomic_cmpxchg(&waiter->state, WLS_CANCELLED, WLS_HANDLED) ==
+	    WLS_CANCELLED) {
+		list_del(&waiter->list);
+		kref_put(&waiter->refcount, waiter_release);
+	}
+	spin_unlock(&syncpt->intr.lock);
+
+	if (flush) {
+		/* Wait until any concurrently executing handler has finished. */
+		while (atomic_read(&waiter->state) != WLS_HANDLED)
+			schedule();
+	}
 
 	kref_put(&waiter->refcount, waiter_release);
 }
diff --git a/drivers/gpu/host1x/intr.h b/drivers/gpu/host1x/intr.h
index aac38194398f..6ea55e615e3a 100644
--- a/drivers/gpu/host1x/intr.h
+++ b/drivers/gpu/host1x/intr.h
@@ -74,8 +74,10 @@ int host1x_intr_add_action(struct host1x *host, struct host1x_syncpt *syncpt,
  * Unreference an action submitted to host1x_intr_add_action().
  * You must call this if you passed non-NULL as ref.
  * @ref the ref returned from host1x_intr_add_action()
+ * @flush wait until any pending handlers have completed before returning.
  */
-void host1x_intr_put_ref(struct host1x *host, unsigned int id, void *ref);
+void host1x_intr_put_ref(struct host1x *host, unsigned int id, void *ref,
+			 bool flush);
 
 /* Initialize host1x sync point interrupt */
 int host1x_intr_init(struct host1x *host, unsigned int irq_sync);
diff --git a/drivers/gpu/host1x/syncpt.c b/drivers/gpu/host1x/syncpt.c
index 9a113016d482..f061dfd5bbc7 100644
--- a/drivers/gpu/host1x/syncpt.c
+++ b/drivers/gpu/host1x/syncpt.c
@@ -308,7 +308,7 @@ int host1x_syncpt_wait(struct host1x_syncpt *sp, u32 thresh, long timeout,
 		}
 	}
 
-	host1x_intr_put_ref(sp->host, sp->id, ref);
+	host1x_intr_put_ref(sp->host, sp->id, ref, true);
 
 done:
 	return err;
-- 
2.30.1


^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [PATCH v6 04/10] gpu: host1x: Remove cancelled waiters immediately
@ 2021-03-29 13:38   ` Mikko Perttunen
  0 siblings, 0 replies; 35+ messages in thread
From: Mikko Perttunen @ 2021-03-29 13:38 UTC (permalink / raw)
  To: thierry.reding, jonathanh; +Cc: linux-tegra, dri-devel, Mikko Perttunen

Before this patch, cancelled waiters would only be cleaned up
once their threshold value was reached. Make host1x_intr_put_ref
process the cancellation immediately to fix this.

Signed-off-by: Mikko Perttunen <mperttunen@nvidia.com>
---
v6:
* Call schedule instead of cpu_relax while waiting for pending
  interrupt processing
v5:
* Add parameter to flush, i.e. wait for all pending waiters to
  complete before returning. The reason this is not always true
  is that the pending waiter might be the place that is calling
  the put_ref.
---
 drivers/gpu/host1x/intr.c   | 23 +++++++++++++++++------
 drivers/gpu/host1x/intr.h   |  4 +++-
 drivers/gpu/host1x/syncpt.c |  2 +-
 3 files changed, 21 insertions(+), 8 deletions(-)

diff --git a/drivers/gpu/host1x/intr.c b/drivers/gpu/host1x/intr.c
index 9245add23b5d..69b0e8e41466 100644
--- a/drivers/gpu/host1x/intr.c
+++ b/drivers/gpu/host1x/intr.c
@@ -242,18 +242,29 @@ int host1x_intr_add_action(struct host1x *host, struct host1x_syncpt *syncpt,
 	return 0;
 }
 
-void host1x_intr_put_ref(struct host1x *host, unsigned int id, void *ref)
+void host1x_intr_put_ref(struct host1x *host, unsigned int id, void *ref,
+			 bool flush)
 {
 	struct host1x_waitlist *waiter = ref;
 	struct host1x_syncpt *syncpt;
 
-	while (atomic_cmpxchg(&waiter->state, WLS_PENDING, WLS_CANCELLED) ==
-	       WLS_REMOVED)
-		schedule();
+	atomic_cmpxchg(&waiter->state, WLS_PENDING, WLS_CANCELLED);
 
 	syncpt = host->syncpt + id;
-	(void)process_wait_list(host, syncpt,
-				host1x_syncpt_load(host->syncpt + id));
+
+	spin_lock(&syncpt->intr.lock);
+	if (atomic_cmpxchg(&waiter->state, WLS_CANCELLED, WLS_HANDLED) ==
+	    WLS_CANCELLED) {
+		list_del(&waiter->list);
+		kref_put(&waiter->refcount, waiter_release);
+	}
+	spin_unlock(&syncpt->intr.lock);
+
+	if (flush) {
+		/* Wait until any concurrently executing handler has finished. */
+		while (atomic_read(&waiter->state) != WLS_HANDLED)
+			schedule();
+	}
 
 	kref_put(&waiter->refcount, waiter_release);
 }
diff --git a/drivers/gpu/host1x/intr.h b/drivers/gpu/host1x/intr.h
index aac38194398f..6ea55e615e3a 100644
--- a/drivers/gpu/host1x/intr.h
+++ b/drivers/gpu/host1x/intr.h
@@ -74,8 +74,10 @@ int host1x_intr_add_action(struct host1x *host, struct host1x_syncpt *syncpt,
  * Unreference an action submitted to host1x_intr_add_action().
  * You must call this if you passed non-NULL as ref.
  * @ref the ref returned from host1x_intr_add_action()
+ * @flush wait until any pending handlers have completed before returning.
  */
-void host1x_intr_put_ref(struct host1x *host, unsigned int id, void *ref);
+void host1x_intr_put_ref(struct host1x *host, unsigned int id, void *ref,
+			 bool flush);
 
 /* Initialize host1x sync point interrupt */
 int host1x_intr_init(struct host1x *host, unsigned int irq_sync);
diff --git a/drivers/gpu/host1x/syncpt.c b/drivers/gpu/host1x/syncpt.c
index 9a113016d482..f061dfd5bbc7 100644
--- a/drivers/gpu/host1x/syncpt.c
+++ b/drivers/gpu/host1x/syncpt.c
@@ -308,7 +308,7 @@ int host1x_syncpt_wait(struct host1x_syncpt *sp, u32 thresh, long timeout,
 		}
 	}
 
-	host1x_intr_put_ref(sp->host, sp->id, ref);
+	host1x_intr_put_ref(sp->host, sp->id, ref, true);
 
 done:
 	return err;
-- 
2.30.1

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [PATCH v6 05/10] gpu: host1x: Use HW-equivalent syncpoint expiration check
  2021-03-29 13:38 ` Mikko Perttunen
@ 2021-03-29 13:38   ` Mikko Perttunen
  -1 siblings, 0 replies; 35+ messages in thread
From: Mikko Perttunen @ 2021-03-29 13:38 UTC (permalink / raw)
  To: thierry.reding, jonathanh; +Cc: dri-devel, linux-tegra, Mikko Perttunen

Make syncpoint expiration checks always use the same logic used by
the hardware. This ensures that there are no race conditions that
could occur because of the hardware triggering a syncpoint interrupt
and then the driver disagreeing.

One situation where this could occur is if a job incremented a
syncpoint too many times -- then the hardware would trigger an
interrupt, but the driver would assume that a syncpoint value
greater than the syncpoint's max value is in the future, and not
clean up the job.

Signed-off-by: Mikko Perttunen <mperttunen@nvidia.com>
---
 drivers/gpu/host1x/syncpt.c | 51 ++-----------------------------------
 1 file changed, 2 insertions(+), 49 deletions(-)

diff --git a/drivers/gpu/host1x/syncpt.c b/drivers/gpu/host1x/syncpt.c
index f061dfd5bbc7..8da4bbce8b9d 100644
--- a/drivers/gpu/host1x/syncpt.c
+++ b/drivers/gpu/host1x/syncpt.c
@@ -321,59 +321,12 @@ EXPORT_SYMBOL(host1x_syncpt_wait);
 bool host1x_syncpt_is_expired(struct host1x_syncpt *sp, u32 thresh)
 {
 	u32 current_val;
-	u32 future_val;
 
 	smp_rmb();
 
 	current_val = (u32)atomic_read(&sp->min_val);
-	future_val = (u32)atomic_read(&sp->max_val);
-
-	/* Note the use of unsigned arithmetic here (mod 1<<32).
-	 *
-	 * c = current_val = min_val	= the current value of the syncpoint.
-	 * t = thresh			= the value we are checking
-	 * f = future_val  = max_val	= the value c will reach when all
-	 *				  outstanding increments have completed.
-	 *
-	 * Note that c always chases f until it reaches f.
-	 *
-	 * Dtf = (f - t)
-	 * Dtc = (c - t)
-	 *
-	 *  Consider all cases:
-	 *
-	 *	A) .....c..t..f.....	Dtf < Dtc	need to wait
-	 *	B) .....c.....f..t..	Dtf > Dtc	expired
-	 *	C) ..t..c.....f.....	Dtf > Dtc	expired	   (Dct very large)
-	 *
-	 *  Any case where f==c: always expired (for any t).	Dtf == Dcf
-	 *  Any case where t==c: always expired (for any f).	Dtf >= Dtc (because Dtc==0)
-	 *  Any case where t==f!=c: always wait.		Dtf <  Dtc (because Dtf==0,
-	 *							Dtc!=0)
-	 *
-	 *  Other cases:
-	 *
-	 *	A) .....t..f..c.....	Dtf < Dtc	need to wait
-	 *	A) .....f..c..t.....	Dtf < Dtc	need to wait
-	 *	A) .....f..t..c.....	Dtf > Dtc	expired
-	 *
-	 *   So:
-	 *	   Dtf >= Dtc implies EXPIRED	(return true)
-	 *	   Dtf <  Dtc implies WAIT	(return false)
-	 *
-	 * Note: If t is expired then we *cannot* wait on it. We would wait
-	 * forever (hang the system).
-	 *
-	 * Note: do NOT get clever and remove the -thresh from both sides. It
-	 * is NOT the same.
-	 *
-	 * If future valueis zero, we have a client managed sync point. In that
-	 * case we do a direct comparison.
-	 */
-	if (!host1x_syncpt_client_managed(sp))
-		return future_val - thresh >= current_val - thresh;
-	else
-		return (s32)(current_val - thresh) >= 0;
+
+	return ((current_val - thresh) & 0x80000000U) == 0U;
 }
 
 int host1x_syncpt_init(struct host1x *host)
-- 
2.30.1


^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [PATCH v6 05/10] gpu: host1x: Use HW-equivalent syncpoint expiration check
@ 2021-03-29 13:38   ` Mikko Perttunen
  0 siblings, 0 replies; 35+ messages in thread
From: Mikko Perttunen @ 2021-03-29 13:38 UTC (permalink / raw)
  To: thierry.reding, jonathanh; +Cc: linux-tegra, dri-devel, Mikko Perttunen

Make syncpoint expiration checks always use the same logic used by
the hardware. This ensures that there are no race conditions that
could occur because of the hardware triggering a syncpoint interrupt
and then the driver disagreeing.

One situation where this could occur is if a job incremented a
syncpoint too many times -- then the hardware would trigger an
interrupt, but the driver would assume that a syncpoint value
greater than the syncpoint's max value is in the future, and not
clean up the job.

Signed-off-by: Mikko Perttunen <mperttunen@nvidia.com>
---
 drivers/gpu/host1x/syncpt.c | 51 ++-----------------------------------
 1 file changed, 2 insertions(+), 49 deletions(-)

diff --git a/drivers/gpu/host1x/syncpt.c b/drivers/gpu/host1x/syncpt.c
index f061dfd5bbc7..8da4bbce8b9d 100644
--- a/drivers/gpu/host1x/syncpt.c
+++ b/drivers/gpu/host1x/syncpt.c
@@ -321,59 +321,12 @@ EXPORT_SYMBOL(host1x_syncpt_wait);
 bool host1x_syncpt_is_expired(struct host1x_syncpt *sp, u32 thresh)
 {
 	u32 current_val;
-	u32 future_val;
 
 	smp_rmb();
 
 	current_val = (u32)atomic_read(&sp->min_val);
-	future_val = (u32)atomic_read(&sp->max_val);
-
-	/* Note the use of unsigned arithmetic here (mod 1<<32).
-	 *
-	 * c = current_val = min_val	= the current value of the syncpoint.
-	 * t = thresh			= the value we are checking
-	 * f = future_val  = max_val	= the value c will reach when all
-	 *				  outstanding increments have completed.
-	 *
-	 * Note that c always chases f until it reaches f.
-	 *
-	 * Dtf = (f - t)
-	 * Dtc = (c - t)
-	 *
-	 *  Consider all cases:
-	 *
-	 *	A) .....c..t..f.....	Dtf < Dtc	need to wait
-	 *	B) .....c.....f..t..	Dtf > Dtc	expired
-	 *	C) ..t..c.....f.....	Dtf > Dtc	expired	   (Dct very large)
-	 *
-	 *  Any case where f==c: always expired (for any t).	Dtf == Dcf
-	 *  Any case where t==c: always expired (for any f).	Dtf >= Dtc (because Dtc==0)
-	 *  Any case where t==f!=c: always wait.		Dtf <  Dtc (because Dtf==0,
-	 *							Dtc!=0)
-	 *
-	 *  Other cases:
-	 *
-	 *	A) .....t..f..c.....	Dtf < Dtc	need to wait
-	 *	A) .....f..c..t.....	Dtf < Dtc	need to wait
-	 *	A) .....f..t..c.....	Dtf > Dtc	expired
-	 *
-	 *   So:
-	 *	   Dtf >= Dtc implies EXPIRED	(return true)
-	 *	   Dtf <  Dtc implies WAIT	(return false)
-	 *
-	 * Note: If t is expired then we *cannot* wait on it. We would wait
-	 * forever (hang the system).
-	 *
-	 * Note: do NOT get clever and remove the -thresh from both sides. It
-	 * is NOT the same.
-	 *
-	 * If future valueis zero, we have a client managed sync point. In that
-	 * case we do a direct comparison.
-	 */
-	if (!host1x_syncpt_client_managed(sp))
-		return future_val - thresh >= current_val - thresh;
-	else
-		return (s32)(current_val - thresh) >= 0;
+
+	return ((current_val - thresh) & 0x80000000U) == 0U;
 }
 
 int host1x_syncpt_init(struct host1x *host)
-- 
2.30.1

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [PATCH v6 06/10] gpu: host1x: Cleanup and refcounting for syncpoints
  2021-03-29 13:38 ` Mikko Perttunen
@ 2021-03-29 13:38   ` Mikko Perttunen
  -1 siblings, 0 replies; 35+ messages in thread
From: Mikko Perttunen @ 2021-03-29 13:38 UTC (permalink / raw)
  To: thierry.reding, jonathanh; +Cc: dri-devel, linux-tegra, Mikko Perttunen

Add reference counting for allocated syncpoints to allow keeping
them allocated while jobs are referencing them. Additionally,
clean up various places using syncpoint IDs to use host1x_syncpt
pointers instead.

Signed-off-by: Mikko Perttunen <mperttunen@nvidia.com>
---
v5:
- Remove host1x_syncpt_put in submit code, as job_put already
  puts the syncpoint.
- Changes due to rebase in VI driver.
v4:
- Update from _free to _put in VI driver as well
---
 drivers/gpu/drm/tegra/dc.c             |  4 +-
 drivers/gpu/drm/tegra/drm.c            | 14 ++---
 drivers/gpu/drm/tegra/gr2d.c           |  4 +-
 drivers/gpu/drm/tegra/gr3d.c           |  4 +-
 drivers/gpu/drm/tegra/vic.c            |  4 +-
 drivers/gpu/host1x/cdma.c              | 11 ++--
 drivers/gpu/host1x/dev.h               |  7 ++-
 drivers/gpu/host1x/hw/cdma_hw.c        |  2 +-
 drivers/gpu/host1x/hw/channel_hw.c     | 10 ++--
 drivers/gpu/host1x/hw/debug_hw.c       |  2 +-
 drivers/gpu/host1x/job.c               |  5 +-
 drivers/gpu/host1x/syncpt.c            | 75 +++++++++++++++++++-------
 drivers/gpu/host1x/syncpt.h            |  3 ++
 drivers/staging/media/tegra-video/vi.c |  4 +-
 include/linux/host1x.h                 |  8 +--
 15 files changed, 98 insertions(+), 59 deletions(-)

diff --git a/drivers/gpu/drm/tegra/dc.c b/drivers/gpu/drm/tegra/dc.c
index 5635fac01e3e..416d6036cf47 100644
--- a/drivers/gpu/drm/tegra/dc.c
+++ b/drivers/gpu/drm/tegra/dc.c
@@ -2141,7 +2141,7 @@ static int tegra_dc_init(struct host1x_client *client)
 		drm_plane_cleanup(primary);
 
 	host1x_client_iommu_detach(client);
-	host1x_syncpt_free(dc->syncpt);
+	host1x_syncpt_put(dc->syncpt);
 
 	return err;
 }
@@ -2166,7 +2166,7 @@ static int tegra_dc_exit(struct host1x_client *client)
 	}
 
 	host1x_client_iommu_detach(client);
-	host1x_syncpt_free(dc->syncpt);
+	host1x_syncpt_put(dc->syncpt);
 
 	return 0;
 }
diff --git a/drivers/gpu/drm/tegra/drm.c b/drivers/gpu/drm/tegra/drm.c
index 90709c38c993..ce5bdc58d315 100644
--- a/drivers/gpu/drm/tegra/drm.c
+++ b/drivers/gpu/drm/tegra/drm.c
@@ -174,7 +174,7 @@ int tegra_drm_submit(struct tegra_drm_context *context,
 	struct drm_tegra_syncpt syncpt;
 	struct host1x *host1x = dev_get_drvdata(drm->dev->parent);
 	struct drm_gem_object **refs;
-	struct host1x_syncpt *sp;
+	struct host1x_syncpt *sp = NULL;
 	struct host1x_job *job;
 	unsigned int num_refs;
 	int err;
@@ -301,8 +301,8 @@ int tegra_drm_submit(struct tegra_drm_context *context,
 		goto fail;
 	}
 
-	/* check whether syncpoint ID is valid */
-	sp = host1x_syncpt_get(host1x, syncpt.id);
+	/* Syncpoint ref will be dropped on job release. */
+	sp = host1x_syncpt_get_by_id(host1x, syncpt.id);
 	if (!sp) {
 		err = -ENOENT;
 		goto fail;
@@ -311,7 +311,7 @@ int tegra_drm_submit(struct tegra_drm_context *context,
 	job->is_addr_reg = context->client->ops->is_addr_reg;
 	job->is_valid_class = context->client->ops->is_valid_class;
 	job->syncpt_incrs = syncpt.incrs;
-	job->syncpt_id = syncpt.id;
+	job->syncpt = sp;
 	job->timeout = 10000;
 
 	if (args->timeout && args->timeout < 10000)
@@ -383,7 +383,7 @@ static int tegra_syncpt_read(struct drm_device *drm, void *data,
 	struct drm_tegra_syncpt_read *args = data;
 	struct host1x_syncpt *sp;
 
-	sp = host1x_syncpt_get(host, args->id);
+	sp = host1x_syncpt_get_by_id_noref(host, args->id);
 	if (!sp)
 		return -EINVAL;
 
@@ -398,7 +398,7 @@ static int tegra_syncpt_incr(struct drm_device *drm, void *data,
 	struct drm_tegra_syncpt_incr *args = data;
 	struct host1x_syncpt *sp;
 
-	sp = host1x_syncpt_get(host1x, args->id);
+	sp = host1x_syncpt_get_by_id_noref(host1x, args->id);
 	if (!sp)
 		return -EINVAL;
 
@@ -412,7 +412,7 @@ static int tegra_syncpt_wait(struct drm_device *drm, void *data,
 	struct drm_tegra_syncpt_wait *args = data;
 	struct host1x_syncpt *sp;
 
-	sp = host1x_syncpt_get(host1x, args->id);
+	sp = host1x_syncpt_get_by_id_noref(host1x, args->id);
 	if (!sp)
 		return -EINVAL;
 
diff --git a/drivers/gpu/drm/tegra/gr2d.c b/drivers/gpu/drm/tegra/gr2d.c
index adbe2ddcda19..de288cba3905 100644
--- a/drivers/gpu/drm/tegra/gr2d.c
+++ b/drivers/gpu/drm/tegra/gr2d.c
@@ -67,7 +67,7 @@ static int gr2d_init(struct host1x_client *client)
 detach:
 	host1x_client_iommu_detach(client);
 free:
-	host1x_syncpt_free(client->syncpts[0]);
+	host1x_syncpt_put(client->syncpts[0]);
 put:
 	host1x_channel_put(gr2d->channel);
 	return err;
@@ -86,7 +86,7 @@ static int gr2d_exit(struct host1x_client *client)
 		return err;
 
 	host1x_client_iommu_detach(client);
-	host1x_syncpt_free(client->syncpts[0]);
+	host1x_syncpt_put(client->syncpts[0]);
 	host1x_channel_put(gr2d->channel);
 
 	return 0;
diff --git a/drivers/gpu/drm/tegra/gr3d.c b/drivers/gpu/drm/tegra/gr3d.c
index b0b8154e8104..24442ade0da3 100644
--- a/drivers/gpu/drm/tegra/gr3d.c
+++ b/drivers/gpu/drm/tegra/gr3d.c
@@ -76,7 +76,7 @@ static int gr3d_init(struct host1x_client *client)
 detach:
 	host1x_client_iommu_detach(client);
 free:
-	host1x_syncpt_free(client->syncpts[0]);
+	host1x_syncpt_put(client->syncpts[0]);
 put:
 	host1x_channel_put(gr3d->channel);
 	return err;
@@ -94,7 +94,7 @@ static int gr3d_exit(struct host1x_client *client)
 		return err;
 
 	host1x_client_iommu_detach(client);
-	host1x_syncpt_free(client->syncpts[0]);
+	host1x_syncpt_put(client->syncpts[0]);
 	host1x_channel_put(gr3d->channel);
 
 	return 0;
diff --git a/drivers/gpu/drm/tegra/vic.c b/drivers/gpu/drm/tegra/vic.c
index 77e128832920..72aea1cc0cfa 100644
--- a/drivers/gpu/drm/tegra/vic.c
+++ b/drivers/gpu/drm/tegra/vic.c
@@ -214,7 +214,7 @@ static int vic_init(struct host1x_client *client)
 	return 0;
 
 free_syncpt:
-	host1x_syncpt_free(client->syncpts[0]);
+	host1x_syncpt_put(client->syncpts[0]);
 free_channel:
 	host1x_channel_put(vic->channel);
 detach:
@@ -238,7 +238,7 @@ static int vic_exit(struct host1x_client *client)
 	if (err < 0)
 		return err;
 
-	host1x_syncpt_free(client->syncpts[0]);
+	host1x_syncpt_put(client->syncpts[0]);
 	host1x_channel_put(vic->channel);
 	host1x_client_iommu_detach(client);
 
diff --git a/drivers/gpu/host1x/cdma.c b/drivers/gpu/host1x/cdma.c
index e8d3fda91d8a..6e6ca774f68d 100644
--- a/drivers/gpu/host1x/cdma.c
+++ b/drivers/gpu/host1x/cdma.c
@@ -273,15 +273,13 @@ static int host1x_cdma_wait_pushbuffer_space(struct host1x *host1x,
 static void cdma_start_timer_locked(struct host1x_cdma *cdma,
 				    struct host1x_job *job)
 {
-	struct host1x *host = cdma_to_host1x(cdma);
-
 	if (cdma->timeout.client) {
 		/* timer already started */
 		return;
 	}
 
 	cdma->timeout.client = job->client;
-	cdma->timeout.syncpt = host1x_syncpt_get(host, job->syncpt_id);
+	cdma->timeout.syncpt = job->syncpt;
 	cdma->timeout.syncpt_val = job->syncpt_end;
 	cdma->timeout.start_ktime = ktime_get();
 
@@ -312,7 +310,6 @@ static void stop_cdma_timer_locked(struct host1x_cdma *cdma)
 static void update_cdma_locked(struct host1x_cdma *cdma)
 {
 	bool signal = false;
-	struct host1x *host1x = cdma_to_host1x(cdma);
 	struct host1x_job *job, *n;
 
 	/* If CDMA is stopped, queue is cleared and we can return */
@@ -324,8 +321,7 @@ static void update_cdma_locked(struct host1x_cdma *cdma)
 	 * to consume as many sync queue entries as possible without blocking
 	 */
 	list_for_each_entry_safe(job, n, &cdma->sync_queue, list) {
-		struct host1x_syncpt *sp =
-			host1x_syncpt_get(host1x, job->syncpt_id);
+		struct host1x_syncpt *sp = job->syncpt;
 
 		/* Check whether this syncpt has completed, and bail if not */
 		if (!host1x_syncpt_is_expired(sp, job->syncpt_end)) {
@@ -499,8 +495,7 @@ int host1x_cdma_begin(struct host1x_cdma *cdma, struct host1x_job *job)
 		if (!cdma->timeout.initialized) {
 			int err;
 
-			err = host1x_hw_cdma_timeout_init(host1x, cdma,
-							  job->syncpt_id);
+			err = host1x_hw_cdma_timeout_init(host1x, cdma);
 			if (err) {
 				mutex_unlock(&cdma->lock);
 				return err;
diff --git a/drivers/gpu/host1x/dev.h b/drivers/gpu/host1x/dev.h
index f781a9b0f39d..63010ae37a97 100644
--- a/drivers/gpu/host1x/dev.h
+++ b/drivers/gpu/host1x/dev.h
@@ -37,7 +37,7 @@ struct host1x_cdma_ops {
 	void (*start)(struct host1x_cdma *cdma);
 	void (*stop)(struct host1x_cdma *cdma);
 	void (*flush)(struct  host1x_cdma *cdma);
-	int (*timeout_init)(struct host1x_cdma *cdma, unsigned int syncpt);
+	int (*timeout_init)(struct host1x_cdma *cdma);
 	void (*timeout_destroy)(struct host1x_cdma *cdma);
 	void (*freeze)(struct host1x_cdma *cdma);
 	void (*resume)(struct host1x_cdma *cdma, u32 getptr);
@@ -261,10 +261,9 @@ static inline void host1x_hw_cdma_flush(struct host1x *host,
 }
 
 static inline int host1x_hw_cdma_timeout_init(struct host1x *host,
-					      struct host1x_cdma *cdma,
-					      unsigned int syncpt)
+					      struct host1x_cdma *cdma)
 {
-	return host->cdma_op->timeout_init(cdma, syncpt);
+	return host->cdma_op->timeout_init(cdma);
 }
 
 static inline void host1x_hw_cdma_timeout_destroy(struct host1x *host,
diff --git a/drivers/gpu/host1x/hw/cdma_hw.c b/drivers/gpu/host1x/hw/cdma_hw.c
index 2f3bf94cf365..e49cd5b8f735 100644
--- a/drivers/gpu/host1x/hw/cdma_hw.c
+++ b/drivers/gpu/host1x/hw/cdma_hw.c
@@ -295,7 +295,7 @@ static void cdma_timeout_handler(struct work_struct *work)
 /*
  * Init timeout resources
  */
-static int cdma_timeout_init(struct host1x_cdma *cdma, unsigned int syncpt)
+static int cdma_timeout_init(struct host1x_cdma *cdma)
 {
 	INIT_DELAYED_WORK(&cdma->timeout.wq, cdma_timeout_handler);
 	cdma->timeout.initialized = true;
diff --git a/drivers/gpu/host1x/hw/channel_hw.c b/drivers/gpu/host1x/hw/channel_hw.c
index 5eaa29d171c9..d4c28faf27d1 100644
--- a/drivers/gpu/host1x/hw/channel_hw.c
+++ b/drivers/gpu/host1x/hw/channel_hw.c
@@ -86,8 +86,7 @@ static void submit_gathers(struct host1x_job *job)
 
 static inline void synchronize_syncpt_base(struct host1x_job *job)
 {
-	struct host1x *host = dev_get_drvdata(job->channel->dev->parent);
-	struct host1x_syncpt *sp = host->syncpt + job->syncpt_id;
+	struct host1x_syncpt *sp = job->syncpt;
 	unsigned int id;
 	u32 value;
 
@@ -118,7 +117,7 @@ static void host1x_channel_set_streamid(struct host1x_channel *channel)
 static int channel_submit(struct host1x_job *job)
 {
 	struct host1x_channel *ch = job->channel;
-	struct host1x_syncpt *sp;
+	struct host1x_syncpt *sp = job->syncpt;
 	u32 user_syncpt_incrs = job->syncpt_incrs;
 	u32 prev_max = 0;
 	u32 syncval;
@@ -126,10 +125,9 @@ static int channel_submit(struct host1x_job *job)
 	struct host1x_waitlist *completed_waiter = NULL;
 	struct host1x *host = dev_get_drvdata(ch->dev->parent);
 
-	sp = host->syncpt + job->syncpt_id;
 	trace_host1x_channel_submit(dev_name(ch->dev),
 				    job->num_gathers, job->num_relocs,
-				    job->syncpt_id, job->syncpt_incrs);
+				    job->syncpt->id, job->syncpt_incrs);
 
 	/* before error checks, return current max */
 	prev_max = job->syncpt_end = host1x_syncpt_read_max(sp);
@@ -163,7 +161,7 @@ static int channel_submit(struct host1x_job *job)
 		host1x_cdma_push(&ch->cdma,
 				 host1x_opcode_setclass(HOST1X_CLASS_HOST1X,
 					host1x_uclass_wait_syncpt_r(), 1),
-				 host1x_class_host_wait_syncpt(job->syncpt_id,
+				 host1x_class_host_wait_syncpt(job->syncpt->id,
 					host1x_syncpt_read_max(sp)));
 	}
 
diff --git a/drivers/gpu/host1x/hw/debug_hw.c b/drivers/gpu/host1x/hw/debug_hw.c
index f31bcfa1b837..ceb48229d14b 100644
--- a/drivers/gpu/host1x/hw/debug_hw.c
+++ b/drivers/gpu/host1x/hw/debug_hw.c
@@ -204,7 +204,7 @@ static void show_channel_gathers(struct output *o, struct host1x_cdma *cdma)
 		unsigned int i;
 
 		host1x_debug_output(o, "\n%p: JOB, syncpt_id=%d, syncpt_val=%d, first_get=%08x, timeout=%d num_slots=%d, num_handles=%d\n",
-				    job, job->syncpt_id, job->syncpt_end,
+				    job, job->syncpt->id, job->syncpt_end,
 				    job->first_get, job->timeout,
 				    job->num_slots, job->num_unpins);
 
diff --git a/drivers/gpu/host1x/job.c b/drivers/gpu/host1x/job.c
index 82d0a60ba3f7..adbdc225de8d 100644
--- a/drivers/gpu/host1x/job.c
+++ b/drivers/gpu/host1x/job.c
@@ -79,6 +79,9 @@ static void job_free(struct kref *ref)
 {
 	struct host1x_job *job = container_of(ref, struct host1x_job, ref);
 
+	if (job->syncpt)
+		host1x_syncpt_put(job->syncpt);
+
 	kfree(job);
 }
 
@@ -674,7 +677,7 @@ EXPORT_SYMBOL(host1x_job_unpin);
  */
 void host1x_job_dump(struct device *dev, struct host1x_job *job)
 {
-	dev_dbg(dev, "    SYNCPT_ID   %d\n", job->syncpt_id);
+	dev_dbg(dev, "    SYNCPT_ID   %d\n", job->syncpt->id);
 	dev_dbg(dev, "    SYNCPT_VAL  %d\n", job->syncpt_end);
 	dev_dbg(dev, "    FIRST_GET   0x%x\n", job->first_get);
 	dev_dbg(dev, "    TIMEOUT     %d\n", job->timeout);
diff --git a/drivers/gpu/host1x/syncpt.c b/drivers/gpu/host1x/syncpt.c
index 8da4bbce8b9d..7bb5de8c3d63 100644
--- a/drivers/gpu/host1x/syncpt.c
+++ b/drivers/gpu/host1x/syncpt.c
@@ -90,6 +90,8 @@ struct host1x_syncpt *host1x_syncpt_alloc(struct host1x *host,
 	else
 		sp->client_managed = false;
 
+	kref_init(&sp->ref);
+
 	mutex_unlock(&host->syncpt_mutex);
 	return sp;
 
@@ -383,7 +385,7 @@ int host1x_syncpt_init(struct host1x *host)
  * host1x client drivers can use this function to allocate a syncpoint for
  * subsequent use. A syncpoint returned by this function will be reserved for
  * use by the client exclusively. When no longer using a syncpoint, a host1x
- * client driver needs to release it using host1x_syncpt_free().
+ * client driver needs to release it using host1x_syncpt_put().
  */
 struct host1x_syncpt *host1x_syncpt_request(struct host1x_client *client,
 					    unsigned long flags)
@@ -394,20 +396,9 @@ struct host1x_syncpt *host1x_syncpt_request(struct host1x_client *client,
 }
 EXPORT_SYMBOL(host1x_syncpt_request);
 
-/**
- * host1x_syncpt_free() - free a requested syncpoint
- * @sp: host1x syncpoint
- *
- * Release a syncpoint previously allocated using host1x_syncpt_request(). A
- * host1x client driver should call this when the syncpoint is no longer in
- * use. Note that client drivers must ensure that the syncpoint doesn't remain
- * under the control of hardware after calling this function, otherwise two
- * clients may end up trying to access the same syncpoint concurrently.
- */
-void host1x_syncpt_free(struct host1x_syncpt *sp)
+static void syncpt_release(struct kref *ref)
 {
-	if (!sp)
-		return;
+	struct host1x_syncpt *sp = container_of(ref, struct host1x_syncpt, ref);
 
 	mutex_lock(&sp->host->syncpt_mutex);
 
@@ -419,7 +410,23 @@ void host1x_syncpt_free(struct host1x_syncpt *sp)
 
 	mutex_unlock(&sp->host->syncpt_mutex);
 }
-EXPORT_SYMBOL(host1x_syncpt_free);
+
+/**
+ * host1x_syncpt_put() - free a requested syncpoint
+ * @sp: host1x syncpoint
+ *
+ * Release a syncpoint previously allocated using host1x_syncpt_request(). A
+ * host1x client driver should call this when the syncpoint is no longer in
+ * use.
+ */
+void host1x_syncpt_put(struct host1x_syncpt *sp)
+{
+	if (!sp)
+		return;
+
+	kref_put(&sp->ref, syncpt_release);
+}
+EXPORT_SYMBOL(host1x_syncpt_put);
 
 void host1x_syncpt_deinit(struct host1x *host)
 {
@@ -486,16 +493,48 @@ unsigned int host1x_syncpt_nb_mlocks(struct host1x *host)
 }
 
 /**
- * host1x_syncpt_get() - obtain a syncpoint by ID
+ * host1x_syncpt_get_by_id() - obtain a syncpoint by ID
+ * @host: host1x controller
+ * @id: syncpoint ID
+ */
+struct host1x_syncpt *host1x_syncpt_get_by_id(struct host1x *host,
+					      unsigned int id)
+{
+	if (id >= host->info->nb_pts)
+		return NULL;
+
+	if (kref_get_unless_zero(&host->syncpt[id].ref))
+		return &host->syncpt[id];
+	else
+		return NULL;
+}
+EXPORT_SYMBOL(host1x_syncpt_get_by_id);
+
+/**
+ * host1x_syncpt_get_by_id_noref() - obtain a syncpoint by ID but don't
+ * 	increase the refcount.
  * @host: host1x controller
  * @id: syncpoint ID
  */
-struct host1x_syncpt *host1x_syncpt_get(struct host1x *host, unsigned int id)
+struct host1x_syncpt *host1x_syncpt_get_by_id_noref(struct host1x *host,
+						    unsigned int id)
 {
 	if (id >= host->info->nb_pts)
 		return NULL;
 
-	return host->syncpt + id;
+	return &host->syncpt[id];
+}
+EXPORT_SYMBOL(host1x_syncpt_get_by_id_noref);
+
+/**
+ * host1x_syncpt_get() - increment syncpoint refcount
+ * @sp: syncpoint
+ */
+struct host1x_syncpt *host1x_syncpt_get(struct host1x_syncpt *sp)
+{
+	kref_get(&sp->ref);
+
+	return sp;
 }
 EXPORT_SYMBOL(host1x_syncpt_get);
 
diff --git a/drivers/gpu/host1x/syncpt.h b/drivers/gpu/host1x/syncpt.h
index 3aa6b25b1b9c..a6766f8d55ee 100644
--- a/drivers/gpu/host1x/syncpt.h
+++ b/drivers/gpu/host1x/syncpt.h
@@ -11,6 +11,7 @@
 #include <linux/atomic.h>
 #include <linux/host1x.h>
 #include <linux/kernel.h>
+#include <linux/kref.h>
 #include <linux/sched.h>
 
 #include "intr.h"
@@ -26,6 +27,8 @@ struct host1x_syncpt_base {
 };
 
 struct host1x_syncpt {
+	struct kref ref;
+
 	unsigned int id;
 	atomic_t min_val;
 	atomic_t max_val;
diff --git a/drivers/staging/media/tegra-video/vi.c b/drivers/staging/media/tegra-video/vi.c
index 7a09061cda57..7e0cb5529b49 100644
--- a/drivers/staging/media/tegra-video/vi.c
+++ b/drivers/staging/media/tegra-video/vi.c
@@ -1131,8 +1131,8 @@ static void tegra_channel_host1x_syncpts_free(struct tegra_vi_channel *chan)
 	int i;
 
 	for (i = 0; i < chan->numgangports; i++) {
-		host1x_syncpt_free(chan->mw_ack_sp[i]);
-		host1x_syncpt_free(chan->frame_start_sp[i]);
+		host1x_syncpt_put(chan->mw_ack_sp[i]);
+		host1x_syncpt_put(chan->frame_start_sp[i]);
 	}
 }
 
diff --git a/include/linux/host1x.h b/include/linux/host1x.h
index 7137ce0e35d4..107aea29bccb 100644
--- a/include/linux/host1x.h
+++ b/include/linux/host1x.h
@@ -142,7 +142,9 @@ struct host1x_syncpt_base;
 struct host1x_syncpt;
 struct host1x;
 
-struct host1x_syncpt *host1x_syncpt_get(struct host1x *host, u32 id);
+struct host1x_syncpt *host1x_syncpt_get_by_id(struct host1x *host, u32 id);
+struct host1x_syncpt *host1x_syncpt_get_by_id_noref(struct host1x *host, u32 id);
+struct host1x_syncpt *host1x_syncpt_get(struct host1x_syncpt *sp);
 u32 host1x_syncpt_id(struct host1x_syncpt *sp);
 u32 host1x_syncpt_read_min(struct host1x_syncpt *sp);
 u32 host1x_syncpt_read_max(struct host1x_syncpt *sp);
@@ -153,7 +155,7 @@ int host1x_syncpt_wait(struct host1x_syncpt *sp, u32 thresh, long timeout,
 		       u32 *value);
 struct host1x_syncpt *host1x_syncpt_request(struct host1x_client *client,
 					    unsigned long flags);
-void host1x_syncpt_free(struct host1x_syncpt *sp);
+void host1x_syncpt_put(struct host1x_syncpt *sp);
 struct host1x_syncpt *host1x_syncpt_alloc(struct host1x *host,
 					  unsigned long flags,
 					  const char *name);
@@ -221,7 +223,7 @@ struct host1x_job {
 	dma_addr_t *reloc_addr_phys;
 
 	/* Sync point id, number of increments and end related to the submit */
-	u32 syncpt_id;
+	struct host1x_syncpt *syncpt;
 	u32 syncpt_incrs;
 	u32 syncpt_end;
 
-- 
2.30.1


^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [PATCH v6 06/10] gpu: host1x: Cleanup and refcounting for syncpoints
@ 2021-03-29 13:38   ` Mikko Perttunen
  0 siblings, 0 replies; 35+ messages in thread
From: Mikko Perttunen @ 2021-03-29 13:38 UTC (permalink / raw)
  To: thierry.reding, jonathanh; +Cc: linux-tegra, dri-devel, Mikko Perttunen

Add reference counting for allocated syncpoints to allow keeping
them allocated while jobs are referencing them. Additionally,
clean up various places using syncpoint IDs to use host1x_syncpt
pointers instead.

Signed-off-by: Mikko Perttunen <mperttunen@nvidia.com>
---
v5:
- Remove host1x_syncpt_put in submit code, as job_put already
  puts the syncpoint.
- Changes due to rebase in VI driver.
v4:
- Update from _free to _put in VI driver as well
---
 drivers/gpu/drm/tegra/dc.c             |  4 +-
 drivers/gpu/drm/tegra/drm.c            | 14 ++---
 drivers/gpu/drm/tegra/gr2d.c           |  4 +-
 drivers/gpu/drm/tegra/gr3d.c           |  4 +-
 drivers/gpu/drm/tegra/vic.c            |  4 +-
 drivers/gpu/host1x/cdma.c              | 11 ++--
 drivers/gpu/host1x/dev.h               |  7 ++-
 drivers/gpu/host1x/hw/cdma_hw.c        |  2 +-
 drivers/gpu/host1x/hw/channel_hw.c     | 10 ++--
 drivers/gpu/host1x/hw/debug_hw.c       |  2 +-
 drivers/gpu/host1x/job.c               |  5 +-
 drivers/gpu/host1x/syncpt.c            | 75 +++++++++++++++++++-------
 drivers/gpu/host1x/syncpt.h            |  3 ++
 drivers/staging/media/tegra-video/vi.c |  4 +-
 include/linux/host1x.h                 |  8 +--
 15 files changed, 98 insertions(+), 59 deletions(-)

diff --git a/drivers/gpu/drm/tegra/dc.c b/drivers/gpu/drm/tegra/dc.c
index 5635fac01e3e..416d6036cf47 100644
--- a/drivers/gpu/drm/tegra/dc.c
+++ b/drivers/gpu/drm/tegra/dc.c
@@ -2141,7 +2141,7 @@ static int tegra_dc_init(struct host1x_client *client)
 		drm_plane_cleanup(primary);
 
 	host1x_client_iommu_detach(client);
-	host1x_syncpt_free(dc->syncpt);
+	host1x_syncpt_put(dc->syncpt);
 
 	return err;
 }
@@ -2166,7 +2166,7 @@ static int tegra_dc_exit(struct host1x_client *client)
 	}
 
 	host1x_client_iommu_detach(client);
-	host1x_syncpt_free(dc->syncpt);
+	host1x_syncpt_put(dc->syncpt);
 
 	return 0;
 }
diff --git a/drivers/gpu/drm/tegra/drm.c b/drivers/gpu/drm/tegra/drm.c
index 90709c38c993..ce5bdc58d315 100644
--- a/drivers/gpu/drm/tegra/drm.c
+++ b/drivers/gpu/drm/tegra/drm.c
@@ -174,7 +174,7 @@ int tegra_drm_submit(struct tegra_drm_context *context,
 	struct drm_tegra_syncpt syncpt;
 	struct host1x *host1x = dev_get_drvdata(drm->dev->parent);
 	struct drm_gem_object **refs;
-	struct host1x_syncpt *sp;
+	struct host1x_syncpt *sp = NULL;
 	struct host1x_job *job;
 	unsigned int num_refs;
 	int err;
@@ -301,8 +301,8 @@ int tegra_drm_submit(struct tegra_drm_context *context,
 		goto fail;
 	}
 
-	/* check whether syncpoint ID is valid */
-	sp = host1x_syncpt_get(host1x, syncpt.id);
+	/* Syncpoint ref will be dropped on job release. */
+	sp = host1x_syncpt_get_by_id(host1x, syncpt.id);
 	if (!sp) {
 		err = -ENOENT;
 		goto fail;
@@ -311,7 +311,7 @@ int tegra_drm_submit(struct tegra_drm_context *context,
 	job->is_addr_reg = context->client->ops->is_addr_reg;
 	job->is_valid_class = context->client->ops->is_valid_class;
 	job->syncpt_incrs = syncpt.incrs;
-	job->syncpt_id = syncpt.id;
+	job->syncpt = sp;
 	job->timeout = 10000;
 
 	if (args->timeout && args->timeout < 10000)
@@ -383,7 +383,7 @@ static int tegra_syncpt_read(struct drm_device *drm, void *data,
 	struct drm_tegra_syncpt_read *args = data;
 	struct host1x_syncpt *sp;
 
-	sp = host1x_syncpt_get(host, args->id);
+	sp = host1x_syncpt_get_by_id_noref(host, args->id);
 	if (!sp)
 		return -EINVAL;
 
@@ -398,7 +398,7 @@ static int tegra_syncpt_incr(struct drm_device *drm, void *data,
 	struct drm_tegra_syncpt_incr *args = data;
 	struct host1x_syncpt *sp;
 
-	sp = host1x_syncpt_get(host1x, args->id);
+	sp = host1x_syncpt_get_by_id_noref(host1x, args->id);
 	if (!sp)
 		return -EINVAL;
 
@@ -412,7 +412,7 @@ static int tegra_syncpt_wait(struct drm_device *drm, void *data,
 	struct drm_tegra_syncpt_wait *args = data;
 	struct host1x_syncpt *sp;
 
-	sp = host1x_syncpt_get(host1x, args->id);
+	sp = host1x_syncpt_get_by_id_noref(host1x, args->id);
 	if (!sp)
 		return -EINVAL;
 
diff --git a/drivers/gpu/drm/tegra/gr2d.c b/drivers/gpu/drm/tegra/gr2d.c
index adbe2ddcda19..de288cba3905 100644
--- a/drivers/gpu/drm/tegra/gr2d.c
+++ b/drivers/gpu/drm/tegra/gr2d.c
@@ -67,7 +67,7 @@ static int gr2d_init(struct host1x_client *client)
 detach:
 	host1x_client_iommu_detach(client);
 free:
-	host1x_syncpt_free(client->syncpts[0]);
+	host1x_syncpt_put(client->syncpts[0]);
 put:
 	host1x_channel_put(gr2d->channel);
 	return err;
@@ -86,7 +86,7 @@ static int gr2d_exit(struct host1x_client *client)
 		return err;
 
 	host1x_client_iommu_detach(client);
-	host1x_syncpt_free(client->syncpts[0]);
+	host1x_syncpt_put(client->syncpts[0]);
 	host1x_channel_put(gr2d->channel);
 
 	return 0;
diff --git a/drivers/gpu/drm/tegra/gr3d.c b/drivers/gpu/drm/tegra/gr3d.c
index b0b8154e8104..24442ade0da3 100644
--- a/drivers/gpu/drm/tegra/gr3d.c
+++ b/drivers/gpu/drm/tegra/gr3d.c
@@ -76,7 +76,7 @@ static int gr3d_init(struct host1x_client *client)
 detach:
 	host1x_client_iommu_detach(client);
 free:
-	host1x_syncpt_free(client->syncpts[0]);
+	host1x_syncpt_put(client->syncpts[0]);
 put:
 	host1x_channel_put(gr3d->channel);
 	return err;
@@ -94,7 +94,7 @@ static int gr3d_exit(struct host1x_client *client)
 		return err;
 
 	host1x_client_iommu_detach(client);
-	host1x_syncpt_free(client->syncpts[0]);
+	host1x_syncpt_put(client->syncpts[0]);
 	host1x_channel_put(gr3d->channel);
 
 	return 0;
diff --git a/drivers/gpu/drm/tegra/vic.c b/drivers/gpu/drm/tegra/vic.c
index 77e128832920..72aea1cc0cfa 100644
--- a/drivers/gpu/drm/tegra/vic.c
+++ b/drivers/gpu/drm/tegra/vic.c
@@ -214,7 +214,7 @@ static int vic_init(struct host1x_client *client)
 	return 0;
 
 free_syncpt:
-	host1x_syncpt_free(client->syncpts[0]);
+	host1x_syncpt_put(client->syncpts[0]);
 free_channel:
 	host1x_channel_put(vic->channel);
 detach:
@@ -238,7 +238,7 @@ static int vic_exit(struct host1x_client *client)
 	if (err < 0)
 		return err;
 
-	host1x_syncpt_free(client->syncpts[0]);
+	host1x_syncpt_put(client->syncpts[0]);
 	host1x_channel_put(vic->channel);
 	host1x_client_iommu_detach(client);
 
diff --git a/drivers/gpu/host1x/cdma.c b/drivers/gpu/host1x/cdma.c
index e8d3fda91d8a..6e6ca774f68d 100644
--- a/drivers/gpu/host1x/cdma.c
+++ b/drivers/gpu/host1x/cdma.c
@@ -273,15 +273,13 @@ static int host1x_cdma_wait_pushbuffer_space(struct host1x *host1x,
 static void cdma_start_timer_locked(struct host1x_cdma *cdma,
 				    struct host1x_job *job)
 {
-	struct host1x *host = cdma_to_host1x(cdma);
-
 	if (cdma->timeout.client) {
 		/* timer already started */
 		return;
 	}
 
 	cdma->timeout.client = job->client;
-	cdma->timeout.syncpt = host1x_syncpt_get(host, job->syncpt_id);
+	cdma->timeout.syncpt = job->syncpt;
 	cdma->timeout.syncpt_val = job->syncpt_end;
 	cdma->timeout.start_ktime = ktime_get();
 
@@ -312,7 +310,6 @@ static void stop_cdma_timer_locked(struct host1x_cdma *cdma)
 static void update_cdma_locked(struct host1x_cdma *cdma)
 {
 	bool signal = false;
-	struct host1x *host1x = cdma_to_host1x(cdma);
 	struct host1x_job *job, *n;
 
 	/* If CDMA is stopped, queue is cleared and we can return */
@@ -324,8 +321,7 @@ static void update_cdma_locked(struct host1x_cdma *cdma)
 	 * to consume as many sync queue entries as possible without blocking
 	 */
 	list_for_each_entry_safe(job, n, &cdma->sync_queue, list) {
-		struct host1x_syncpt *sp =
-			host1x_syncpt_get(host1x, job->syncpt_id);
+		struct host1x_syncpt *sp = job->syncpt;
 
 		/* Check whether this syncpt has completed, and bail if not */
 		if (!host1x_syncpt_is_expired(sp, job->syncpt_end)) {
@@ -499,8 +495,7 @@ int host1x_cdma_begin(struct host1x_cdma *cdma, struct host1x_job *job)
 		if (!cdma->timeout.initialized) {
 			int err;
 
-			err = host1x_hw_cdma_timeout_init(host1x, cdma,
-							  job->syncpt_id);
+			err = host1x_hw_cdma_timeout_init(host1x, cdma);
 			if (err) {
 				mutex_unlock(&cdma->lock);
 				return err;
diff --git a/drivers/gpu/host1x/dev.h b/drivers/gpu/host1x/dev.h
index f781a9b0f39d..63010ae37a97 100644
--- a/drivers/gpu/host1x/dev.h
+++ b/drivers/gpu/host1x/dev.h
@@ -37,7 +37,7 @@ struct host1x_cdma_ops {
 	void (*start)(struct host1x_cdma *cdma);
 	void (*stop)(struct host1x_cdma *cdma);
 	void (*flush)(struct  host1x_cdma *cdma);
-	int (*timeout_init)(struct host1x_cdma *cdma, unsigned int syncpt);
+	int (*timeout_init)(struct host1x_cdma *cdma);
 	void (*timeout_destroy)(struct host1x_cdma *cdma);
 	void (*freeze)(struct host1x_cdma *cdma);
 	void (*resume)(struct host1x_cdma *cdma, u32 getptr);
@@ -261,10 +261,9 @@ static inline void host1x_hw_cdma_flush(struct host1x *host,
 }
 
 static inline int host1x_hw_cdma_timeout_init(struct host1x *host,
-					      struct host1x_cdma *cdma,
-					      unsigned int syncpt)
+					      struct host1x_cdma *cdma)
 {
-	return host->cdma_op->timeout_init(cdma, syncpt);
+	return host->cdma_op->timeout_init(cdma);
 }
 
 static inline void host1x_hw_cdma_timeout_destroy(struct host1x *host,
diff --git a/drivers/gpu/host1x/hw/cdma_hw.c b/drivers/gpu/host1x/hw/cdma_hw.c
index 2f3bf94cf365..e49cd5b8f735 100644
--- a/drivers/gpu/host1x/hw/cdma_hw.c
+++ b/drivers/gpu/host1x/hw/cdma_hw.c
@@ -295,7 +295,7 @@ static void cdma_timeout_handler(struct work_struct *work)
 /*
  * Init timeout resources
  */
-static int cdma_timeout_init(struct host1x_cdma *cdma, unsigned int syncpt)
+static int cdma_timeout_init(struct host1x_cdma *cdma)
 {
 	INIT_DELAYED_WORK(&cdma->timeout.wq, cdma_timeout_handler);
 	cdma->timeout.initialized = true;
diff --git a/drivers/gpu/host1x/hw/channel_hw.c b/drivers/gpu/host1x/hw/channel_hw.c
index 5eaa29d171c9..d4c28faf27d1 100644
--- a/drivers/gpu/host1x/hw/channel_hw.c
+++ b/drivers/gpu/host1x/hw/channel_hw.c
@@ -86,8 +86,7 @@ static void submit_gathers(struct host1x_job *job)
 
 static inline void synchronize_syncpt_base(struct host1x_job *job)
 {
-	struct host1x *host = dev_get_drvdata(job->channel->dev->parent);
-	struct host1x_syncpt *sp = host->syncpt + job->syncpt_id;
+	struct host1x_syncpt *sp = job->syncpt;
 	unsigned int id;
 	u32 value;
 
@@ -118,7 +117,7 @@ static void host1x_channel_set_streamid(struct host1x_channel *channel)
 static int channel_submit(struct host1x_job *job)
 {
 	struct host1x_channel *ch = job->channel;
-	struct host1x_syncpt *sp;
+	struct host1x_syncpt *sp = job->syncpt;
 	u32 user_syncpt_incrs = job->syncpt_incrs;
 	u32 prev_max = 0;
 	u32 syncval;
@@ -126,10 +125,9 @@ static int channel_submit(struct host1x_job *job)
 	struct host1x_waitlist *completed_waiter = NULL;
 	struct host1x *host = dev_get_drvdata(ch->dev->parent);
 
-	sp = host->syncpt + job->syncpt_id;
 	trace_host1x_channel_submit(dev_name(ch->dev),
 				    job->num_gathers, job->num_relocs,
-				    job->syncpt_id, job->syncpt_incrs);
+				    job->syncpt->id, job->syncpt_incrs);
 
 	/* before error checks, return current max */
 	prev_max = job->syncpt_end = host1x_syncpt_read_max(sp);
@@ -163,7 +161,7 @@ static int channel_submit(struct host1x_job *job)
 		host1x_cdma_push(&ch->cdma,
 				 host1x_opcode_setclass(HOST1X_CLASS_HOST1X,
 					host1x_uclass_wait_syncpt_r(), 1),
-				 host1x_class_host_wait_syncpt(job->syncpt_id,
+				 host1x_class_host_wait_syncpt(job->syncpt->id,
 					host1x_syncpt_read_max(sp)));
 	}
 
diff --git a/drivers/gpu/host1x/hw/debug_hw.c b/drivers/gpu/host1x/hw/debug_hw.c
index f31bcfa1b837..ceb48229d14b 100644
--- a/drivers/gpu/host1x/hw/debug_hw.c
+++ b/drivers/gpu/host1x/hw/debug_hw.c
@@ -204,7 +204,7 @@ static void show_channel_gathers(struct output *o, struct host1x_cdma *cdma)
 		unsigned int i;
 
 		host1x_debug_output(o, "\n%p: JOB, syncpt_id=%d, syncpt_val=%d, first_get=%08x, timeout=%d num_slots=%d, num_handles=%d\n",
-				    job, job->syncpt_id, job->syncpt_end,
+				    job, job->syncpt->id, job->syncpt_end,
 				    job->first_get, job->timeout,
 				    job->num_slots, job->num_unpins);
 
diff --git a/drivers/gpu/host1x/job.c b/drivers/gpu/host1x/job.c
index 82d0a60ba3f7..adbdc225de8d 100644
--- a/drivers/gpu/host1x/job.c
+++ b/drivers/gpu/host1x/job.c
@@ -79,6 +79,9 @@ static void job_free(struct kref *ref)
 {
 	struct host1x_job *job = container_of(ref, struct host1x_job, ref);
 
+	if (job->syncpt)
+		host1x_syncpt_put(job->syncpt);
+
 	kfree(job);
 }
 
@@ -674,7 +677,7 @@ EXPORT_SYMBOL(host1x_job_unpin);
  */
 void host1x_job_dump(struct device *dev, struct host1x_job *job)
 {
-	dev_dbg(dev, "    SYNCPT_ID   %d\n", job->syncpt_id);
+	dev_dbg(dev, "    SYNCPT_ID   %d\n", job->syncpt->id);
 	dev_dbg(dev, "    SYNCPT_VAL  %d\n", job->syncpt_end);
 	dev_dbg(dev, "    FIRST_GET   0x%x\n", job->first_get);
 	dev_dbg(dev, "    TIMEOUT     %d\n", job->timeout);
diff --git a/drivers/gpu/host1x/syncpt.c b/drivers/gpu/host1x/syncpt.c
index 8da4bbce8b9d..7bb5de8c3d63 100644
--- a/drivers/gpu/host1x/syncpt.c
+++ b/drivers/gpu/host1x/syncpt.c
@@ -90,6 +90,8 @@ struct host1x_syncpt *host1x_syncpt_alloc(struct host1x *host,
 	else
 		sp->client_managed = false;
 
+	kref_init(&sp->ref);
+
 	mutex_unlock(&host->syncpt_mutex);
 	return sp;
 
@@ -383,7 +385,7 @@ int host1x_syncpt_init(struct host1x *host)
  * host1x client drivers can use this function to allocate a syncpoint for
  * subsequent use. A syncpoint returned by this function will be reserved for
  * use by the client exclusively. When no longer using a syncpoint, a host1x
- * client driver needs to release it using host1x_syncpt_free().
+ * client driver needs to release it using host1x_syncpt_put().
  */
 struct host1x_syncpt *host1x_syncpt_request(struct host1x_client *client,
 					    unsigned long flags)
@@ -394,20 +396,9 @@ struct host1x_syncpt *host1x_syncpt_request(struct host1x_client *client,
 }
 EXPORT_SYMBOL(host1x_syncpt_request);
 
-/**
- * host1x_syncpt_free() - free a requested syncpoint
- * @sp: host1x syncpoint
- *
- * Release a syncpoint previously allocated using host1x_syncpt_request(). A
- * host1x client driver should call this when the syncpoint is no longer in
- * use. Note that client drivers must ensure that the syncpoint doesn't remain
- * under the control of hardware after calling this function, otherwise two
- * clients may end up trying to access the same syncpoint concurrently.
- */
-void host1x_syncpt_free(struct host1x_syncpt *sp)
+static void syncpt_release(struct kref *ref)
 {
-	if (!sp)
-		return;
+	struct host1x_syncpt *sp = container_of(ref, struct host1x_syncpt, ref);
 
 	mutex_lock(&sp->host->syncpt_mutex);
 
@@ -419,7 +410,23 @@ void host1x_syncpt_free(struct host1x_syncpt *sp)
 
 	mutex_unlock(&sp->host->syncpt_mutex);
 }
-EXPORT_SYMBOL(host1x_syncpt_free);
+
+/**
+ * host1x_syncpt_put() - free a requested syncpoint
+ * @sp: host1x syncpoint
+ *
+ * Release a syncpoint previously allocated using host1x_syncpt_request(). A
+ * host1x client driver should call this when the syncpoint is no longer in
+ * use.
+ */
+void host1x_syncpt_put(struct host1x_syncpt *sp)
+{
+	if (!sp)
+		return;
+
+	kref_put(&sp->ref, syncpt_release);
+}
+EXPORT_SYMBOL(host1x_syncpt_put);
 
 void host1x_syncpt_deinit(struct host1x *host)
 {
@@ -486,16 +493,48 @@ unsigned int host1x_syncpt_nb_mlocks(struct host1x *host)
 }
 
 /**
- * host1x_syncpt_get() - obtain a syncpoint by ID
+ * host1x_syncpt_get_by_id() - obtain a syncpoint by ID
+ * @host: host1x controller
+ * @id: syncpoint ID
+ */
+struct host1x_syncpt *host1x_syncpt_get_by_id(struct host1x *host,
+					      unsigned int id)
+{
+	if (id >= host->info->nb_pts)
+		return NULL;
+
+	if (kref_get_unless_zero(&host->syncpt[id].ref))
+		return &host->syncpt[id];
+	else
+		return NULL;
+}
+EXPORT_SYMBOL(host1x_syncpt_get_by_id);
+
+/**
+ * host1x_syncpt_get_by_id_noref() - obtain a syncpoint by ID but don't
+ * 	increase the refcount.
  * @host: host1x controller
  * @id: syncpoint ID
  */
-struct host1x_syncpt *host1x_syncpt_get(struct host1x *host, unsigned int id)
+struct host1x_syncpt *host1x_syncpt_get_by_id_noref(struct host1x *host,
+						    unsigned int id)
 {
 	if (id >= host->info->nb_pts)
 		return NULL;
 
-	return host->syncpt + id;
+	return &host->syncpt[id];
+}
+EXPORT_SYMBOL(host1x_syncpt_get_by_id_noref);
+
+/**
+ * host1x_syncpt_get() - increment syncpoint refcount
+ * @sp: syncpoint
+ */
+struct host1x_syncpt *host1x_syncpt_get(struct host1x_syncpt *sp)
+{
+	kref_get(&sp->ref);
+
+	return sp;
 }
 EXPORT_SYMBOL(host1x_syncpt_get);
 
diff --git a/drivers/gpu/host1x/syncpt.h b/drivers/gpu/host1x/syncpt.h
index 3aa6b25b1b9c..a6766f8d55ee 100644
--- a/drivers/gpu/host1x/syncpt.h
+++ b/drivers/gpu/host1x/syncpt.h
@@ -11,6 +11,7 @@
 #include <linux/atomic.h>
 #include <linux/host1x.h>
 #include <linux/kernel.h>
+#include <linux/kref.h>
 #include <linux/sched.h>
 
 #include "intr.h"
@@ -26,6 +27,8 @@ struct host1x_syncpt_base {
 };
 
 struct host1x_syncpt {
+	struct kref ref;
+
 	unsigned int id;
 	atomic_t min_val;
 	atomic_t max_val;
diff --git a/drivers/staging/media/tegra-video/vi.c b/drivers/staging/media/tegra-video/vi.c
index 7a09061cda57..7e0cb5529b49 100644
--- a/drivers/staging/media/tegra-video/vi.c
+++ b/drivers/staging/media/tegra-video/vi.c
@@ -1131,8 +1131,8 @@ static void tegra_channel_host1x_syncpts_free(struct tegra_vi_channel *chan)
 	int i;
 
 	for (i = 0; i < chan->numgangports; i++) {
-		host1x_syncpt_free(chan->mw_ack_sp[i]);
-		host1x_syncpt_free(chan->frame_start_sp[i]);
+		host1x_syncpt_put(chan->mw_ack_sp[i]);
+		host1x_syncpt_put(chan->frame_start_sp[i]);
 	}
 }
 
diff --git a/include/linux/host1x.h b/include/linux/host1x.h
index 7137ce0e35d4..107aea29bccb 100644
--- a/include/linux/host1x.h
+++ b/include/linux/host1x.h
@@ -142,7 +142,9 @@ struct host1x_syncpt_base;
 struct host1x_syncpt;
 struct host1x;
 
-struct host1x_syncpt *host1x_syncpt_get(struct host1x *host, u32 id);
+struct host1x_syncpt *host1x_syncpt_get_by_id(struct host1x *host, u32 id);
+struct host1x_syncpt *host1x_syncpt_get_by_id_noref(struct host1x *host, u32 id);
+struct host1x_syncpt *host1x_syncpt_get(struct host1x_syncpt *sp);
 u32 host1x_syncpt_id(struct host1x_syncpt *sp);
 u32 host1x_syncpt_read_min(struct host1x_syncpt *sp);
 u32 host1x_syncpt_read_max(struct host1x_syncpt *sp);
@@ -153,7 +155,7 @@ int host1x_syncpt_wait(struct host1x_syncpt *sp, u32 thresh, long timeout,
 		       u32 *value);
 struct host1x_syncpt *host1x_syncpt_request(struct host1x_client *client,
 					    unsigned long flags);
-void host1x_syncpt_free(struct host1x_syncpt *sp);
+void host1x_syncpt_put(struct host1x_syncpt *sp);
 struct host1x_syncpt *host1x_syncpt_alloc(struct host1x *host,
 					  unsigned long flags,
 					  const char *name);
@@ -221,7 +223,7 @@ struct host1x_job {
 	dma_addr_t *reloc_addr_phys;
 
 	/* Sync point id, number of increments and end related to the submit */
-	u32 syncpt_id;
+	struct host1x_syncpt *syncpt;
 	u32 syncpt_incrs;
 	u32 syncpt_end;
 
-- 
2.30.1

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [PATCH v6 07/10] gpu: host1x: Reset max value when freeing a syncpoint
  2021-03-29 13:38 ` Mikko Perttunen
@ 2021-03-29 13:38   ` Mikko Perttunen
  -1 siblings, 0 replies; 35+ messages in thread
From: Mikko Perttunen @ 2021-03-29 13:38 UTC (permalink / raw)
  To: thierry.reding, jonathanh; +Cc: dri-devel, linux-tegra, Mikko Perttunen

With job recovery becoming optional, syncpoints may have a mismatch
between their value and max value when freed. As such, when freeing,
set the max value to the current value of the syncpoint so that it
is in a sane state for the next user.

Signed-off-by: Mikko Perttunen <mperttunen@nvidia.com>
---
v3:
* Use host1x_syncpt_read instead of read_min to ensure syncpoint
  value is current.
---
 drivers/gpu/host1x/syncpt.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/gpu/host1x/syncpt.c b/drivers/gpu/host1x/syncpt.c
index 7bb5de8c3d63..877c5ab40cbd 100644
--- a/drivers/gpu/host1x/syncpt.c
+++ b/drivers/gpu/host1x/syncpt.c
@@ -400,6 +400,8 @@ static void syncpt_release(struct kref *ref)
 {
 	struct host1x_syncpt *sp = container_of(ref, struct host1x_syncpt, ref);
 
+	atomic_set(&sp->max_val, host1x_syncpt_read(sp));
+
 	mutex_lock(&sp->host->syncpt_mutex);
 
 	host1x_syncpt_base_free(sp->base);
-- 
2.30.1


^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [PATCH v6 07/10] gpu: host1x: Reset max value when freeing a syncpoint
@ 2021-03-29 13:38   ` Mikko Perttunen
  0 siblings, 0 replies; 35+ messages in thread
From: Mikko Perttunen @ 2021-03-29 13:38 UTC (permalink / raw)
  To: thierry.reding, jonathanh; +Cc: linux-tegra, dri-devel, Mikko Perttunen

With job recovery becoming optional, syncpoints may have a mismatch
between their value and max value when freed. As such, when freeing,
set the max value to the current value of the syncpoint so that it
is in a sane state for the next user.

Signed-off-by: Mikko Perttunen <mperttunen@nvidia.com>
---
v3:
* Use host1x_syncpt_read instead of read_min to ensure syncpoint
  value is current.
---
 drivers/gpu/host1x/syncpt.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/gpu/host1x/syncpt.c b/drivers/gpu/host1x/syncpt.c
index 7bb5de8c3d63..877c5ab40cbd 100644
--- a/drivers/gpu/host1x/syncpt.c
+++ b/drivers/gpu/host1x/syncpt.c
@@ -400,6 +400,8 @@ static void syncpt_release(struct kref *ref)
 {
 	struct host1x_syncpt *sp = container_of(ref, struct host1x_syncpt, ref);
 
+	atomic_set(&sp->max_val, host1x_syncpt_read(sp));
+
 	mutex_lock(&sp->host->syncpt_mutex);
 
 	host1x_syncpt_base_free(sp->base);
-- 
2.30.1

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [PATCH v6 08/10] gpu: host1x: Reserve VBLANK syncpoints at initialization
  2021-03-29 13:38 ` Mikko Perttunen
@ 2021-03-29 13:38   ` Mikko Perttunen
  -1 siblings, 0 replies; 35+ messages in thread
From: Mikko Perttunen @ 2021-03-29 13:38 UTC (permalink / raw)
  To: thierry.reding, jonathanh; +Cc: dri-devel, linux-tegra, Mikko Perttunen

On T20-T148 chips, the bootloader can set up a boot splash
screen with DC configured to increment syncpoint 26/27
at VBLANK. Because of this we shouldn't allow these syncpoints
to be allocated until DC has been reset and will no longer
increment them in the background.

As such, on these chips, reserve those two syncpoints at
initialization, and only mark them free once the DC
driver has indicated it's safe to do so.

Signed-off-by: Mikko Perttunen <mperttunen@nvidia.com>
---
v6:
* Updated kerneldoc
v3:
* New patch
---
 drivers/gpu/drm/tegra/dc.c  |  6 ++++++
 drivers/gpu/host1x/dev.c    |  6 ++++++
 drivers/gpu/host1x/dev.h    |  6 ++++++
 drivers/gpu/host1x/syncpt.c | 35 ++++++++++++++++++++++++++++++++++-
 include/linux/host1x.h      |  3 +++
 5 files changed, 55 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/tegra/dc.c b/drivers/gpu/drm/tegra/dc.c
index 416d6036cf47..71b9658c39e6 100644
--- a/drivers/gpu/drm/tegra/dc.c
+++ b/drivers/gpu/drm/tegra/dc.c
@@ -2045,6 +2045,12 @@ static int tegra_dc_init(struct host1x_client *client)
 	struct drm_plane *cursor = NULL;
 	int err;
 
+	/*
+	 * DC has been reset by now, so VBLANK syncpoint can be released
+	 * for general use.
+	 */
+	host1x_syncpt_release_vblank_reservation(client, 26 + dc->pipe);
+
 	/*
 	 * XXX do not register DCs with no window groups because we cannot
 	 * assign a primary plane to them, which in turn will cause KMS to
diff --git a/drivers/gpu/host1x/dev.c b/drivers/gpu/host1x/dev.c
index d0ebb70e2fdd..fbb6447b8659 100644
--- a/drivers/gpu/host1x/dev.c
+++ b/drivers/gpu/host1x/dev.c
@@ -77,6 +77,7 @@ static const struct host1x_info host1x01_info = {
 	.has_hypervisor = false,
 	.num_sid_entries = 0,
 	.sid_table = NULL,
+	.reserve_vblank_syncpts = true,
 };
 
 static const struct host1x_info host1x02_info = {
@@ -91,6 +92,7 @@ static const struct host1x_info host1x02_info = {
 	.has_hypervisor = false,
 	.num_sid_entries = 0,
 	.sid_table = NULL,
+	.reserve_vblank_syncpts = true,
 };
 
 static const struct host1x_info host1x04_info = {
@@ -105,6 +107,7 @@ static const struct host1x_info host1x04_info = {
 	.has_hypervisor = false,
 	.num_sid_entries = 0,
 	.sid_table = NULL,
+	.reserve_vblank_syncpts = false,
 };
 
 static const struct host1x_info host1x05_info = {
@@ -119,6 +122,7 @@ static const struct host1x_info host1x05_info = {
 	.has_hypervisor = false,
 	.num_sid_entries = 0,
 	.sid_table = NULL,
+	.reserve_vblank_syncpts = false,
 };
 
 static const struct host1x_sid_entry tegra186_sid_table[] = {
@@ -142,6 +146,7 @@ static const struct host1x_info host1x06_info = {
 	.has_hypervisor = true,
 	.num_sid_entries = ARRAY_SIZE(tegra186_sid_table),
 	.sid_table = tegra186_sid_table,
+	.reserve_vblank_syncpts = false,
 };
 
 static const struct host1x_sid_entry tegra194_sid_table[] = {
@@ -165,6 +170,7 @@ static const struct host1x_info host1x07_info = {
 	.has_hypervisor = true,
 	.num_sid_entries = ARRAY_SIZE(tegra194_sid_table),
 	.sid_table = tegra194_sid_table,
+	.reserve_vblank_syncpts = false,
 };
 
 static const struct of_device_id host1x_of_match[] = {
diff --git a/drivers/gpu/host1x/dev.h b/drivers/gpu/host1x/dev.h
index 63010ae37a97..fa6d4bc46e98 100644
--- a/drivers/gpu/host1x/dev.h
+++ b/drivers/gpu/host1x/dev.h
@@ -101,6 +101,12 @@ struct host1x_info {
 	bool has_hypervisor; /* has hypervisor registers */
 	unsigned int num_sid_entries;
 	const struct host1x_sid_entry *sid_table;
+	/*
+	 * On T20-T148, the boot chain may setup DC to increment syncpoints
+	 * 26/27 on VBLANK. As such we cannot use these syncpoints until
+	 * the display driver disables VBLANK increments.
+	 */
+	bool reserve_vblank_syncpts;
 };
 
 struct host1x {
diff --git a/drivers/gpu/host1x/syncpt.c b/drivers/gpu/host1x/syncpt.c
index 877c5ab40cbd..e648ebbb2027 100644
--- a/drivers/gpu/host1x/syncpt.c
+++ b/drivers/gpu/host1x/syncpt.c
@@ -67,7 +67,7 @@ struct host1x_syncpt *host1x_syncpt_alloc(struct host1x *host,
 
 	mutex_lock(&host->syncpt_mutex);
 
-	for (i = 0; i < host->info->nb_pts && sp->name; i++, sp++)
+	for (i = 0; i < host->info->nb_pts && kref_read(&sp->ref); i++, sp++)
 		;
 
 	if (i >= host->info->nb_pts)
@@ -374,6 +374,11 @@ int host1x_syncpt_init(struct host1x *host)
 	if (!host->nop_sp)
 		return -ENOMEM;
 
+	if (host->info->reserve_vblank_syncpts) {
+		kref_init(&host->syncpt[26].ref);
+		kref_init(&host->syncpt[27].ref);
+	}
+
 	return 0;
 }
 
@@ -559,3 +564,31 @@ u32 host1x_syncpt_base_id(struct host1x_syncpt_base *base)
 	return base->id;
 }
 EXPORT_SYMBOL(host1x_syncpt_base_id);
+
+static void do_nothing(struct kref *ref)
+{
+}
+
+/**
+ * host1x_syncpt_release_vblank_reservation() - Make VBLANK syncpoint
+ *   available for allocation
+ *
+ * @client: host1x bus client
+ * @syncpt_id: syncpoint ID to make available
+ *
+ * Makes VBLANK<i> syncpoint available for allocatation if it was
+ * reserved at initialization time. This should be called by the display
+ * driver after it has ensured that any VBLANK increment programming configured
+ * by the boot chain has been disabled.
+ */
+void host1x_syncpt_release_vblank_reservation(struct host1x_client *client,
+					      u32 syncpt_id)
+{
+	struct host1x *host = dev_get_drvdata(client->host->parent);
+
+	if (!host->info->reserve_vblank_syncpts)
+		return;
+
+	kref_put(&host->syncpt[syncpt_id].ref, do_nothing);
+}
+EXPORT_SYMBOL(host1x_syncpt_release_vblank_reservation);
diff --git a/include/linux/host1x.h b/include/linux/host1x.h
index 107aea29bccb..e0a41c2b4c7a 100644
--- a/include/linux/host1x.h
+++ b/include/linux/host1x.h
@@ -163,6 +163,9 @@ struct host1x_syncpt *host1x_syncpt_alloc(struct host1x *host,
 struct host1x_syncpt_base *host1x_syncpt_get_base(struct host1x_syncpt *sp);
 u32 host1x_syncpt_base_id(struct host1x_syncpt_base *base);
 
+void host1x_syncpt_release_vblank_reservation(struct host1x_client *client,
+					      u32 syncpt_id);
+
 /*
  * host1x channel
  */
-- 
2.30.1


^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [PATCH v6 08/10] gpu: host1x: Reserve VBLANK syncpoints at initialization
@ 2021-03-29 13:38   ` Mikko Perttunen
  0 siblings, 0 replies; 35+ messages in thread
From: Mikko Perttunen @ 2021-03-29 13:38 UTC (permalink / raw)
  To: thierry.reding, jonathanh; +Cc: linux-tegra, dri-devel, Mikko Perttunen

On T20-T148 chips, the bootloader can set up a boot splash
screen with DC configured to increment syncpoint 26/27
at VBLANK. Because of this we shouldn't allow these syncpoints
to be allocated until DC has been reset and will no longer
increment them in the background.

As such, on these chips, reserve those two syncpoints at
initialization, and only mark them free once the DC
driver has indicated it's safe to do so.

Signed-off-by: Mikko Perttunen <mperttunen@nvidia.com>
---
v6:
* Updated kerneldoc
v3:
* New patch
---
 drivers/gpu/drm/tegra/dc.c  |  6 ++++++
 drivers/gpu/host1x/dev.c    |  6 ++++++
 drivers/gpu/host1x/dev.h    |  6 ++++++
 drivers/gpu/host1x/syncpt.c | 35 ++++++++++++++++++++++++++++++++++-
 include/linux/host1x.h      |  3 +++
 5 files changed, 55 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/tegra/dc.c b/drivers/gpu/drm/tegra/dc.c
index 416d6036cf47..71b9658c39e6 100644
--- a/drivers/gpu/drm/tegra/dc.c
+++ b/drivers/gpu/drm/tegra/dc.c
@@ -2045,6 +2045,12 @@ static int tegra_dc_init(struct host1x_client *client)
 	struct drm_plane *cursor = NULL;
 	int err;
 
+	/*
+	 * DC has been reset by now, so VBLANK syncpoint can be released
+	 * for general use.
+	 */
+	host1x_syncpt_release_vblank_reservation(client, 26 + dc->pipe);
+
 	/*
 	 * XXX do not register DCs with no window groups because we cannot
 	 * assign a primary plane to them, which in turn will cause KMS to
diff --git a/drivers/gpu/host1x/dev.c b/drivers/gpu/host1x/dev.c
index d0ebb70e2fdd..fbb6447b8659 100644
--- a/drivers/gpu/host1x/dev.c
+++ b/drivers/gpu/host1x/dev.c
@@ -77,6 +77,7 @@ static const struct host1x_info host1x01_info = {
 	.has_hypervisor = false,
 	.num_sid_entries = 0,
 	.sid_table = NULL,
+	.reserve_vblank_syncpts = true,
 };
 
 static const struct host1x_info host1x02_info = {
@@ -91,6 +92,7 @@ static const struct host1x_info host1x02_info = {
 	.has_hypervisor = false,
 	.num_sid_entries = 0,
 	.sid_table = NULL,
+	.reserve_vblank_syncpts = true,
 };
 
 static const struct host1x_info host1x04_info = {
@@ -105,6 +107,7 @@ static const struct host1x_info host1x04_info = {
 	.has_hypervisor = false,
 	.num_sid_entries = 0,
 	.sid_table = NULL,
+	.reserve_vblank_syncpts = false,
 };
 
 static const struct host1x_info host1x05_info = {
@@ -119,6 +122,7 @@ static const struct host1x_info host1x05_info = {
 	.has_hypervisor = false,
 	.num_sid_entries = 0,
 	.sid_table = NULL,
+	.reserve_vblank_syncpts = false,
 };
 
 static const struct host1x_sid_entry tegra186_sid_table[] = {
@@ -142,6 +146,7 @@ static const struct host1x_info host1x06_info = {
 	.has_hypervisor = true,
 	.num_sid_entries = ARRAY_SIZE(tegra186_sid_table),
 	.sid_table = tegra186_sid_table,
+	.reserve_vblank_syncpts = false,
 };
 
 static const struct host1x_sid_entry tegra194_sid_table[] = {
@@ -165,6 +170,7 @@ static const struct host1x_info host1x07_info = {
 	.has_hypervisor = true,
 	.num_sid_entries = ARRAY_SIZE(tegra194_sid_table),
 	.sid_table = tegra194_sid_table,
+	.reserve_vblank_syncpts = false,
 };
 
 static const struct of_device_id host1x_of_match[] = {
diff --git a/drivers/gpu/host1x/dev.h b/drivers/gpu/host1x/dev.h
index 63010ae37a97..fa6d4bc46e98 100644
--- a/drivers/gpu/host1x/dev.h
+++ b/drivers/gpu/host1x/dev.h
@@ -101,6 +101,12 @@ struct host1x_info {
 	bool has_hypervisor; /* has hypervisor registers */
 	unsigned int num_sid_entries;
 	const struct host1x_sid_entry *sid_table;
+	/*
+	 * On T20-T148, the boot chain may setup DC to increment syncpoints
+	 * 26/27 on VBLANK. As such we cannot use these syncpoints until
+	 * the display driver disables VBLANK increments.
+	 */
+	bool reserve_vblank_syncpts;
 };
 
 struct host1x {
diff --git a/drivers/gpu/host1x/syncpt.c b/drivers/gpu/host1x/syncpt.c
index 877c5ab40cbd..e648ebbb2027 100644
--- a/drivers/gpu/host1x/syncpt.c
+++ b/drivers/gpu/host1x/syncpt.c
@@ -67,7 +67,7 @@ struct host1x_syncpt *host1x_syncpt_alloc(struct host1x *host,
 
 	mutex_lock(&host->syncpt_mutex);
 
-	for (i = 0; i < host->info->nb_pts && sp->name; i++, sp++)
+	for (i = 0; i < host->info->nb_pts && kref_read(&sp->ref); i++, sp++)
 		;
 
 	if (i >= host->info->nb_pts)
@@ -374,6 +374,11 @@ int host1x_syncpt_init(struct host1x *host)
 	if (!host->nop_sp)
 		return -ENOMEM;
 
+	if (host->info->reserve_vblank_syncpts) {
+		kref_init(&host->syncpt[26].ref);
+		kref_init(&host->syncpt[27].ref);
+	}
+
 	return 0;
 }
 
@@ -559,3 +564,31 @@ u32 host1x_syncpt_base_id(struct host1x_syncpt_base *base)
 	return base->id;
 }
 EXPORT_SYMBOL(host1x_syncpt_base_id);
+
+static void do_nothing(struct kref *ref)
+{
+}
+
+/**
+ * host1x_syncpt_release_vblank_reservation() - Make VBLANK syncpoint
+ *   available for allocation
+ *
+ * @client: host1x bus client
+ * @syncpt_id: syncpoint ID to make available
+ *
+ * Makes VBLANK<i> syncpoint available for allocatation if it was
+ * reserved at initialization time. This should be called by the display
+ * driver after it has ensured that any VBLANK increment programming configured
+ * by the boot chain has been disabled.
+ */
+void host1x_syncpt_release_vblank_reservation(struct host1x_client *client,
+					      u32 syncpt_id)
+{
+	struct host1x *host = dev_get_drvdata(client->host->parent);
+
+	if (!host->info->reserve_vblank_syncpts)
+		return;
+
+	kref_put(&host->syncpt[syncpt_id].ref, do_nothing);
+}
+EXPORT_SYMBOL(host1x_syncpt_release_vblank_reservation);
diff --git a/include/linux/host1x.h b/include/linux/host1x.h
index 107aea29bccb..e0a41c2b4c7a 100644
--- a/include/linux/host1x.h
+++ b/include/linux/host1x.h
@@ -163,6 +163,9 @@ struct host1x_syncpt *host1x_syncpt_alloc(struct host1x *host,
 struct host1x_syncpt_base *host1x_syncpt_get_base(struct host1x_syncpt *sp);
 u32 host1x_syncpt_base_id(struct host1x_syncpt_base *base);
 
+void host1x_syncpt_release_vblank_reservation(struct host1x_client *client,
+					      u32 syncpt_id);
+
 /*
  * host1x channel
  */
-- 
2.30.1

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [PATCH v6 09/10] gpu: host1x: Assign intr waiter inside lock
  2021-03-29 13:38 ` Mikko Perttunen
@ 2021-03-29 13:38   ` Mikko Perttunen
  -1 siblings, 0 replies; 35+ messages in thread
From: Mikko Perttunen @ 2021-03-29 13:38 UTC (permalink / raw)
  To: thierry.reding, jonathanh; +Cc: dri-devel, linux-tegra, Mikko Perttunen

Move the assignment of the ref out-pointer in host1x_intr_add_action
to happen within the spinlock. With the current arrangement,
it is possible for the waiter to complete before the assignment
has happened, which breaks horribly if the waiter completion
callback tries to use the reference.

In practice, there is currently no situation where this issue can
manifest -- it was first noticed with the upcoming DMA fence
implementation patches. As such this doesn't need to be backported.

Signed-off-by: Mikko Perttunen <mperttunen@nvidia.com>
---
 drivers/gpu/host1x/intr.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/host1x/intr.c b/drivers/gpu/host1x/intr.c
index 69b0e8e41466..6d1f3c0fdbe7 100644
--- a/drivers/gpu/host1x/intr.c
+++ b/drivers/gpu/host1x/intr.c
@@ -235,10 +235,11 @@ int host1x_intr_add_action(struct host1x *host, struct host1x_syncpt *syncpt,
 			host1x_hw_intr_enable_syncpt_intr(host, syncpt->id);
 	}
 
-	spin_unlock(&syncpt->intr.lock);
-
 	if (ref)
 		*ref = waiter;
+
+	spin_unlock(&syncpt->intr.lock);
+
 	return 0;
 }
 
-- 
2.30.1


^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [PATCH v6 09/10] gpu: host1x: Assign intr waiter inside lock
@ 2021-03-29 13:38   ` Mikko Perttunen
  0 siblings, 0 replies; 35+ messages in thread
From: Mikko Perttunen @ 2021-03-29 13:38 UTC (permalink / raw)
  To: thierry.reding, jonathanh; +Cc: linux-tegra, dri-devel, Mikko Perttunen

Move the assignment of the ref out-pointer in host1x_intr_add_action
to happen within the spinlock. With the current arrangement,
it is possible for the waiter to complete before the assignment
has happened, which breaks horribly if the waiter completion
callback tries to use the reference.

In practice, there is currently no situation where this issue can
manifest -- it was first noticed with the upcoming DMA fence
implementation patches. As such this doesn't need to be backported.

Signed-off-by: Mikko Perttunen <mperttunen@nvidia.com>
---
 drivers/gpu/host1x/intr.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/host1x/intr.c b/drivers/gpu/host1x/intr.c
index 69b0e8e41466..6d1f3c0fdbe7 100644
--- a/drivers/gpu/host1x/intr.c
+++ b/drivers/gpu/host1x/intr.c
@@ -235,10 +235,11 @@ int host1x_intr_add_action(struct host1x *host, struct host1x_syncpt *syncpt,
 			host1x_hw_intr_enable_syncpt_intr(host, syncpt->id);
 	}
 
-	spin_unlock(&syncpt->intr.lock);
-
 	if (ref)
 		*ref = waiter;
+
+	spin_unlock(&syncpt->intr.lock);
+
 	return 0;
 }
 
-- 
2.30.1

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [PATCH v6 10/10] gpu: host1x: Fix Tegra194 syncpt interrupt threshold
  2021-03-29 13:38 ` Mikko Perttunen
@ 2021-03-29 13:38   ` Mikko Perttunen
  -1 siblings, 0 replies; 35+ messages in thread
From: Mikko Perttunen @ 2021-03-29 13:38 UTC (permalink / raw)
  To: thierry.reding, jonathanh; +Cc: dri-devel, linux-tegra, Mikko Perttunen

From: Jon Hunter <jonathanh@nvidia.com>

Syncpoint interrupts are not working as expected on Tegra194. The
problem is that the syncpoint interrupt threshold being used is the
global interrupt threshold and not the virtual interrupt threshold.
Fix this by using the virtual interrupt threshold which aligns with
downstream.

Signed-off-by: Jon Hunter <jonathanh@nvidia.com>
Signed-off-by: Mikko Perttunen <mperttunen@nvidia.com>
---
 drivers/gpu/host1x/hw/hw_host1x07_vm.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/host1x/hw/hw_host1x07_vm.h b/drivers/gpu/host1x/hw/hw_host1x07_vm.h
index 3058b3c9a91d..b766851d5b83 100644
--- a/drivers/gpu/host1x/hw/hw_host1x07_vm.h
+++ b/drivers/gpu/host1x/hw/hw_host1x07_vm.h
@@ -29,6 +29,6 @@
 #define HOST1X_SYNC_SYNCPT_THRESH_INT_ENABLE_CPU0(x)	(0x652c + 4 * (x))
 #define HOST1X_SYNC_SYNCPT_THRESH_INT_DISABLE(x)	(0x6590 + 4 * (x))
 #define HOST1X_SYNC_SYNCPT(x)				(0x8080 + 4 * (x))
-#define HOST1X_SYNC_SYNCPT_INT_THRESH(x)		(0x8d00 + 4 * (x))
+#define HOST1X_SYNC_SYNCPT_INT_THRESH(x)		(0x9980 + 4 * (x))
 #define HOST1X_SYNC_SYNCPT_CH_APP(x)			(0xa604 + 4 * (x))
 #define HOST1X_SYNC_SYNCPT_CH_APP_CH(v)			(((v) & 0x3f) << 8)
-- 
2.30.1


^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [PATCH v6 10/10] gpu: host1x: Fix Tegra194 syncpt interrupt threshold
@ 2021-03-29 13:38   ` Mikko Perttunen
  0 siblings, 0 replies; 35+ messages in thread
From: Mikko Perttunen @ 2021-03-29 13:38 UTC (permalink / raw)
  To: thierry.reding, jonathanh; +Cc: linux-tegra, dri-devel, Mikko Perttunen

From: Jon Hunter <jonathanh@nvidia.com>

Syncpoint interrupts are not working as expected on Tegra194. The
problem is that the syncpoint interrupt threshold being used is the
global interrupt threshold and not the virtual interrupt threshold.
Fix this by using the virtual interrupt threshold which aligns with
downstream.

Signed-off-by: Jon Hunter <jonathanh@nvidia.com>
Signed-off-by: Mikko Perttunen <mperttunen@nvidia.com>
---
 drivers/gpu/host1x/hw/hw_host1x07_vm.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/host1x/hw/hw_host1x07_vm.h b/drivers/gpu/host1x/hw/hw_host1x07_vm.h
index 3058b3c9a91d..b766851d5b83 100644
--- a/drivers/gpu/host1x/hw/hw_host1x07_vm.h
+++ b/drivers/gpu/host1x/hw/hw_host1x07_vm.h
@@ -29,6 +29,6 @@
 #define HOST1X_SYNC_SYNCPT_THRESH_INT_ENABLE_CPU0(x)	(0x652c + 4 * (x))
 #define HOST1X_SYNC_SYNCPT_THRESH_INT_DISABLE(x)	(0x6590 + 4 * (x))
 #define HOST1X_SYNC_SYNCPT(x)				(0x8080 + 4 * (x))
-#define HOST1X_SYNC_SYNCPT_INT_THRESH(x)		(0x8d00 + 4 * (x))
+#define HOST1X_SYNC_SYNCPT_INT_THRESH(x)		(0x9980 + 4 * (x))
 #define HOST1X_SYNC_SYNCPT_CH_APP(x)			(0xa604 + 4 * (x))
 #define HOST1X_SYNC_SYNCPT_CH_APP_CH(v)			(((v) & 0x3f) << 8)
-- 
2.30.1

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply related	[flat|nested] 35+ messages in thread

* Re: [PATCH v6 04/10] gpu: host1x: Remove cancelled waiters immediately
  2021-03-29 13:38   ` Mikko Perttunen
@ 2021-03-29 20:27     ` Dmitry Osipenko
  -1 siblings, 0 replies; 35+ messages in thread
From: Dmitry Osipenko @ 2021-03-29 20:27 UTC (permalink / raw)
  To: Mikko Perttunen, thierry.reding, jonathanh; +Cc: linux-tegra, dri-devel

29.03.2021 16:38, Mikko Perttunen пишет:
> Before this patch, cancelled waiters would only be cleaned up
> once their threshold value was reached. Make host1x_intr_put_ref
> process the cancellation immediately to fix this.
> 
> Signed-off-by: Mikko Perttunen <mperttunen@nvidia.com>
> ---
> v6:
> * Call schedule instead of cpu_relax while waiting for pending
>   interrupt processing
> v5:
> * Add parameter to flush, i.e. wait for all pending waiters to
>   complete before returning. The reason this is not always true
>   is that the pending waiter might be the place that is calling
>   the put_ref.
> ---
>  drivers/gpu/host1x/intr.c   | 23 +++++++++++++++++------
>  drivers/gpu/host1x/intr.h   |  4 +++-
>  drivers/gpu/host1x/syncpt.c |  2 +-
>  3 files changed, 21 insertions(+), 8 deletions(-)
> 
> diff --git a/drivers/gpu/host1x/intr.c b/drivers/gpu/host1x/intr.c
> index 9245add23b5d..69b0e8e41466 100644
> --- a/drivers/gpu/host1x/intr.c
> +++ b/drivers/gpu/host1x/intr.c
> @@ -242,18 +242,29 @@ int host1x_intr_add_action(struct host1x *host, struct host1x_syncpt *syncpt,
>  	return 0;
>  }
>  
> -void host1x_intr_put_ref(struct host1x *host, unsigned int id, void *ref)
> +void host1x_intr_put_ref(struct host1x *host, unsigned int id, void *ref,
> +			 bool flush)
>  {
>  	struct host1x_waitlist *waiter = ref;
>  	struct host1x_syncpt *syncpt;
>  
> -	while (atomic_cmpxchg(&waiter->state, WLS_PENDING, WLS_CANCELLED) ==
> -	       WLS_REMOVED)
> -		schedule();
> +	atomic_cmpxchg(&waiter->state, WLS_PENDING, WLS_CANCELLED);
>  
>  	syncpt = host->syncpt + id;
> -	(void)process_wait_list(host, syncpt,
> -				host1x_syncpt_load(host->syncpt + id));
> +
> +	spin_lock(&syncpt->intr.lock);
> +	if (atomic_cmpxchg(&waiter->state, WLS_CANCELLED, WLS_HANDLED) ==
> +	    WLS_CANCELLED) {
> +		list_del(&waiter->list);
> +		kref_put(&waiter->refcount, waiter_release);
> +	}
> +	spin_unlock(&syncpt->intr.lock);

Looks like we need to use IRQ-safe version of the locking here in order
not to race with the interrupt handler(?), preventing lockup.

But what real bug is fixed by this patch? If no real problem is fixed,
then maybe will be better to defer touching this code till we will just
replace it all with a proper dma-fence handlers?

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH v6 04/10] gpu: host1x: Remove cancelled waiters immediately
@ 2021-03-29 20:27     ` Dmitry Osipenko
  0 siblings, 0 replies; 35+ messages in thread
From: Dmitry Osipenko @ 2021-03-29 20:27 UTC (permalink / raw)
  To: Mikko Perttunen, thierry.reding, jonathanh; +Cc: linux-tegra, dri-devel

29.03.2021 16:38, Mikko Perttunen пишет:
> Before this patch, cancelled waiters would only be cleaned up
> once their threshold value was reached. Make host1x_intr_put_ref
> process the cancellation immediately to fix this.
> 
> Signed-off-by: Mikko Perttunen <mperttunen@nvidia.com>
> ---
> v6:
> * Call schedule instead of cpu_relax while waiting for pending
>   interrupt processing
> v5:
> * Add parameter to flush, i.e. wait for all pending waiters to
>   complete before returning. The reason this is not always true
>   is that the pending waiter might be the place that is calling
>   the put_ref.
> ---
>  drivers/gpu/host1x/intr.c   | 23 +++++++++++++++++------
>  drivers/gpu/host1x/intr.h   |  4 +++-
>  drivers/gpu/host1x/syncpt.c |  2 +-
>  3 files changed, 21 insertions(+), 8 deletions(-)
> 
> diff --git a/drivers/gpu/host1x/intr.c b/drivers/gpu/host1x/intr.c
> index 9245add23b5d..69b0e8e41466 100644
> --- a/drivers/gpu/host1x/intr.c
> +++ b/drivers/gpu/host1x/intr.c
> @@ -242,18 +242,29 @@ int host1x_intr_add_action(struct host1x *host, struct host1x_syncpt *syncpt,
>  	return 0;
>  }
>  
> -void host1x_intr_put_ref(struct host1x *host, unsigned int id, void *ref)
> +void host1x_intr_put_ref(struct host1x *host, unsigned int id, void *ref,
> +			 bool flush)
>  {
>  	struct host1x_waitlist *waiter = ref;
>  	struct host1x_syncpt *syncpt;
>  
> -	while (atomic_cmpxchg(&waiter->state, WLS_PENDING, WLS_CANCELLED) ==
> -	       WLS_REMOVED)
> -		schedule();
> +	atomic_cmpxchg(&waiter->state, WLS_PENDING, WLS_CANCELLED);
>  
>  	syncpt = host->syncpt + id;
> -	(void)process_wait_list(host, syncpt,
> -				host1x_syncpt_load(host->syncpt + id));
> +
> +	spin_lock(&syncpt->intr.lock);
> +	if (atomic_cmpxchg(&waiter->state, WLS_CANCELLED, WLS_HANDLED) ==
> +	    WLS_CANCELLED) {
> +		list_del(&waiter->list);
> +		kref_put(&waiter->refcount, waiter_release);
> +	}
> +	spin_unlock(&syncpt->intr.lock);

Looks like we need to use IRQ-safe version of the locking here in order
not to race with the interrupt handler(?), preventing lockup.

But what real bug is fixed by this patch? If no real problem is fixed,
then maybe will be better to defer touching this code till we will just
replace it all with a proper dma-fence handlers?
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH v6 04/10] gpu: host1x: Remove cancelled waiters immediately
  2021-03-29 20:27     ` Dmitry Osipenko
@ 2021-03-29 21:36       ` Mikko Perttunen
  -1 siblings, 0 replies; 35+ messages in thread
From: Mikko Perttunen @ 2021-03-29 21:36 UTC (permalink / raw)
  To: Dmitry Osipenko, Mikko Perttunen, thierry.reding, jonathanh
  Cc: linux-tegra, dri-devel

On 3/29/21 11:27 PM, Dmitry Osipenko wrote:
> 29.03.2021 16:38, Mikko Perttunen пишет:
>> Before this patch, cancelled waiters would only be cleaned up
>> once their threshold value was reached. Make host1x_intr_put_ref
>> process the cancellation immediately to fix this.
>>
>> Signed-off-by: Mikko Perttunen <mperttunen@nvidia.com>
>> ---
>> v6:
>> * Call schedule instead of cpu_relax while waiting for pending
>>    interrupt processing
>> v5:
>> * Add parameter to flush, i.e. wait for all pending waiters to
>>    complete before returning. The reason this is not always true
>>    is that the pending waiter might be the place that is calling
>>    the put_ref.
>> ---
>>   drivers/gpu/host1x/intr.c   | 23 +++++++++++++++++------
>>   drivers/gpu/host1x/intr.h   |  4 +++-
>>   drivers/gpu/host1x/syncpt.c |  2 +-
>>   3 files changed, 21 insertions(+), 8 deletions(-)
>>
>> diff --git a/drivers/gpu/host1x/intr.c b/drivers/gpu/host1x/intr.c
>> index 9245add23b5d..69b0e8e41466 100644
>> --- a/drivers/gpu/host1x/intr.c
>> +++ b/drivers/gpu/host1x/intr.c
>> @@ -242,18 +242,29 @@ int host1x_intr_add_action(struct host1x *host, struct host1x_syncpt *syncpt,
>>   	return 0;
>>   }
>>   
>> -void host1x_intr_put_ref(struct host1x *host, unsigned int id, void *ref)
>> +void host1x_intr_put_ref(struct host1x *host, unsigned int id, void *ref,
>> +			 bool flush)
>>   {
>>   	struct host1x_waitlist *waiter = ref;
>>   	struct host1x_syncpt *syncpt;
>>   
>> -	while (atomic_cmpxchg(&waiter->state, WLS_PENDING, WLS_CANCELLED) ==
>> -	       WLS_REMOVED)
>> -		schedule();
>> +	atomic_cmpxchg(&waiter->state, WLS_PENDING, WLS_CANCELLED);
>>   
>>   	syncpt = host->syncpt + id;
>> -	(void)process_wait_list(host, syncpt,
>> -				host1x_syncpt_load(host->syncpt + id));
>> +
>> +	spin_lock(&syncpt->intr.lock);
>> +	if (atomic_cmpxchg(&waiter->state, WLS_CANCELLED, WLS_HANDLED) ==
>> +	    WLS_CANCELLED) {
>> +		list_del(&waiter->list);
>> +		kref_put(&waiter->refcount, waiter_release);
>> +	}
>> +	spin_unlock(&syncpt->intr.lock);
> 
> Looks like we need to use IRQ-safe version of the locking here in order
> not to race with the interrupt handler(?), preventing lockup.

The potential contention is with the syncpt_thresh_work scheduled work, 
and not the actual interrupt handler, so there is no issue.

> 
> But what real bug is fixed by this patch? If no real problem is fixed,
> then maybe will be better to defer touching this code till we will just
> replace it all with a proper dma-fence handlers?
> 

It improves things in that we won't litter the waiter data structures 
with unbounded waiter entries when waits are cancelled. Also, I prefer 
working in steps when possible - next is writing dma_fences on top of 
this (which is already done) and then eventually phasing/refactoring 
code from intr.c to fence.c so eventually only dma_fences remain. In my 
experience that works better than big rewrites.

Mikko

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH v6 04/10] gpu: host1x: Remove cancelled waiters immediately
@ 2021-03-29 21:36       ` Mikko Perttunen
  0 siblings, 0 replies; 35+ messages in thread
From: Mikko Perttunen @ 2021-03-29 21:36 UTC (permalink / raw)
  To: Dmitry Osipenko, Mikko Perttunen, thierry.reding, jonathanh
  Cc: linux-tegra, dri-devel

On 3/29/21 11:27 PM, Dmitry Osipenko wrote:
> 29.03.2021 16:38, Mikko Perttunen пишет:
>> Before this patch, cancelled waiters would only be cleaned up
>> once their threshold value was reached. Make host1x_intr_put_ref
>> process the cancellation immediately to fix this.
>>
>> Signed-off-by: Mikko Perttunen <mperttunen@nvidia.com>
>> ---
>> v6:
>> * Call schedule instead of cpu_relax while waiting for pending
>>    interrupt processing
>> v5:
>> * Add parameter to flush, i.e. wait for all pending waiters to
>>    complete before returning. The reason this is not always true
>>    is that the pending waiter might be the place that is calling
>>    the put_ref.
>> ---
>>   drivers/gpu/host1x/intr.c   | 23 +++++++++++++++++------
>>   drivers/gpu/host1x/intr.h   |  4 +++-
>>   drivers/gpu/host1x/syncpt.c |  2 +-
>>   3 files changed, 21 insertions(+), 8 deletions(-)
>>
>> diff --git a/drivers/gpu/host1x/intr.c b/drivers/gpu/host1x/intr.c
>> index 9245add23b5d..69b0e8e41466 100644
>> --- a/drivers/gpu/host1x/intr.c
>> +++ b/drivers/gpu/host1x/intr.c
>> @@ -242,18 +242,29 @@ int host1x_intr_add_action(struct host1x *host, struct host1x_syncpt *syncpt,
>>   	return 0;
>>   }
>>   
>> -void host1x_intr_put_ref(struct host1x *host, unsigned int id, void *ref)
>> +void host1x_intr_put_ref(struct host1x *host, unsigned int id, void *ref,
>> +			 bool flush)
>>   {
>>   	struct host1x_waitlist *waiter = ref;
>>   	struct host1x_syncpt *syncpt;
>>   
>> -	while (atomic_cmpxchg(&waiter->state, WLS_PENDING, WLS_CANCELLED) ==
>> -	       WLS_REMOVED)
>> -		schedule();
>> +	atomic_cmpxchg(&waiter->state, WLS_PENDING, WLS_CANCELLED);
>>   
>>   	syncpt = host->syncpt + id;
>> -	(void)process_wait_list(host, syncpt,
>> -				host1x_syncpt_load(host->syncpt + id));
>> +
>> +	spin_lock(&syncpt->intr.lock);
>> +	if (atomic_cmpxchg(&waiter->state, WLS_CANCELLED, WLS_HANDLED) ==
>> +	    WLS_CANCELLED) {
>> +		list_del(&waiter->list);
>> +		kref_put(&waiter->refcount, waiter_release);
>> +	}
>> +	spin_unlock(&syncpt->intr.lock);
> 
> Looks like we need to use IRQ-safe version of the locking here in order
> not to race with the interrupt handler(?), preventing lockup.

The potential contention is with the syncpt_thresh_work scheduled work, 
and not the actual interrupt handler, so there is no issue.

> 
> But what real bug is fixed by this patch? If no real problem is fixed,
> then maybe will be better to defer touching this code till we will just
> replace it all with a proper dma-fence handlers?
> 

It improves things in that we won't litter the waiter data structures 
with unbounded waiter entries when waits are cancelled. Also, I prefer 
working in steps when possible - next is writing dma_fences on top of 
this (which is already done) and then eventually phasing/refactoring 
code from intr.c to fence.c so eventually only dma_fences remain. In my 
experience that works better than big rewrites.

Mikko
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH v6 06/10] gpu: host1x: Cleanup and refcounting for syncpoints
  2021-03-29 13:38   ` Mikko Perttunen
  (?)
@ 2021-03-29 23:42   ` kernel test robot
  -1 siblings, 0 replies; 35+ messages in thread
From: kernel test robot @ 2021-03-29 23:42 UTC (permalink / raw)
  To: kbuild-all

[-- Attachment #1: Type: text/plain, Size: 4938 bytes --]

Hi Mikko,

Thank you for the patch! Yet something to improve:

[auto build test ERROR on tegra-drm/drm/tegra/for-next]
[also build test ERROR on tegra/for-next v5.12-rc5 next-20210329]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch]

url:    https://github.com/0day-ci/linux/commits/Mikko-Perttunen/Fixes-and-cleanups-for-Host1x/20210329-214010
base:   git://anongit.freedesktop.org/tegra/linux.git drm/tegra/for-next
config: arm64-allyesconfig (attached as .config)
compiler: aarch64-linux-gcc (GCC) 9.3.0
reproduce (this is a W=1 build):
        wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # https://github.com/0day-ci/linux/commit/6f97c34dc7baa7890453c71f62a9eed8f29d89e9
        git remote add linux-review https://github.com/0day-ci/linux
        git fetch --no-tags linux-review Mikko-Perttunen/Fixes-and-cleanups-for-Host1x/20210329-214010
        git checkout 6f97c34dc7baa7890453c71f62a9eed8f29d89e9
        # save the attached .config to linux build tree
        COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-9.3.0 make.cross ARCH=arm64 

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot <lkp@intel.com>

All errors (new ones prefixed by >>):

   drivers/staging/media/tegra-video/vi.c: In function 'tegra_channel_host1x_syncpt_init':
>> drivers/staging/media/tegra-video/vi.c:1180:4: error: implicit declaration of function 'host1x_syncpt_free'; did you mean 'host1x_syncpt_read'? [-Werror=implicit-function-declaration]
    1180 |    host1x_syncpt_free(fs_sp);
         |    ^~~~~~~~~~~~~~~~~~
         |    host1x_syncpt_read
   cc1: some warnings being treated as errors


vim +1180 drivers/staging/media/tegra-video/vi.c

3d8a97eabef088 Sowjanya Komatineni 2020-05-04  1160  
2ac4035a78c933 Sowjanya Komatineni 2020-12-11  1161  static int tegra_channel_host1x_syncpt_init(struct tegra_vi_channel *chan)
2ac4035a78c933 Sowjanya Komatineni 2020-12-11  1162  {
2ac4035a78c933 Sowjanya Komatineni 2020-12-11  1163  	struct tegra_vi *vi = chan->vi;
2ac4035a78c933 Sowjanya Komatineni 2020-12-11  1164  	unsigned long flags = HOST1X_SYNCPT_CLIENT_MANAGED;
2ac4035a78c933 Sowjanya Komatineni 2020-12-11  1165  	struct host1x_syncpt *fs_sp;
2ac4035a78c933 Sowjanya Komatineni 2020-12-11  1166  	struct host1x_syncpt *mw_sp;
2ac4035a78c933 Sowjanya Komatineni 2020-12-11  1167  	int ret, i;
2ac4035a78c933 Sowjanya Komatineni 2020-12-11  1168  
2ac4035a78c933 Sowjanya Komatineni 2020-12-11  1169  	for (i = 0; i < chan->numgangports; i++) {
2ac4035a78c933 Sowjanya Komatineni 2020-12-11  1170  		fs_sp = host1x_syncpt_request(&vi->client, flags);
2ac4035a78c933 Sowjanya Komatineni 2020-12-11  1171  		if (!fs_sp) {
2ac4035a78c933 Sowjanya Komatineni 2020-12-11  1172  			dev_err(vi->dev, "failed to request frame start syncpoint\n");
2ac4035a78c933 Sowjanya Komatineni 2020-12-11  1173  			ret = -ENOMEM;
2ac4035a78c933 Sowjanya Komatineni 2020-12-11  1174  			goto free_syncpts;
2ac4035a78c933 Sowjanya Komatineni 2020-12-11  1175  		}
2ac4035a78c933 Sowjanya Komatineni 2020-12-11  1176  
2ac4035a78c933 Sowjanya Komatineni 2020-12-11  1177  		mw_sp = host1x_syncpt_request(&vi->client, flags);
2ac4035a78c933 Sowjanya Komatineni 2020-12-11  1178  		if (!mw_sp) {
2ac4035a78c933 Sowjanya Komatineni 2020-12-11  1179  			dev_err(vi->dev, "failed to request memory ack syncpoint\n");
2ac4035a78c933 Sowjanya Komatineni 2020-12-11 @1180  			host1x_syncpt_free(fs_sp);
2ac4035a78c933 Sowjanya Komatineni 2020-12-11  1181  			ret = -ENOMEM;
2ac4035a78c933 Sowjanya Komatineni 2020-12-11  1182  			goto free_syncpts;
2ac4035a78c933 Sowjanya Komatineni 2020-12-11  1183  		}
2ac4035a78c933 Sowjanya Komatineni 2020-12-11  1184  
2ac4035a78c933 Sowjanya Komatineni 2020-12-11  1185  		chan->frame_start_sp[i] = fs_sp;
2ac4035a78c933 Sowjanya Komatineni 2020-12-11  1186  		chan->mw_ack_sp[i] = mw_sp;
2ac4035a78c933 Sowjanya Komatineni 2020-12-11  1187  		spin_lock_init(&chan->sp_incr_lock[i]);
2ac4035a78c933 Sowjanya Komatineni 2020-12-11  1188  	}
2ac4035a78c933 Sowjanya Komatineni 2020-12-11  1189  
2ac4035a78c933 Sowjanya Komatineni 2020-12-11  1190  	return 0;
2ac4035a78c933 Sowjanya Komatineni 2020-12-11  1191  
2ac4035a78c933 Sowjanya Komatineni 2020-12-11  1192  free_syncpts:
2ac4035a78c933 Sowjanya Komatineni 2020-12-11  1193  	tegra_channel_host1x_syncpts_free(chan);
2ac4035a78c933 Sowjanya Komatineni 2020-12-11  1194  	return ret;
2ac4035a78c933 Sowjanya Komatineni 2020-12-11  1195  }
2ac4035a78c933 Sowjanya Komatineni 2020-12-11  1196  

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-all(a)lists.01.org

[-- Attachment #2: config.gz --]
[-- Type: application/gzip, Size: 76804 bytes --]

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH v6 04/10] gpu: host1x: Remove cancelled waiters immediately
  2021-03-29 21:36       ` Mikko Perttunen
@ 2021-03-30  1:35         ` Dmitry Osipenko
  -1 siblings, 0 replies; 35+ messages in thread
From: Dmitry Osipenko @ 2021-03-30  1:35 UTC (permalink / raw)
  To: Mikko Perttunen, Mikko Perttunen, thierry.reding, jonathanh
  Cc: linux-tegra, dri-devel

30.03.2021 00:36, Mikko Perttunen пишет:
> On 3/29/21 11:27 PM, Dmitry Osipenko wrote:
>> 29.03.2021 16:38, Mikko Perttunen пишет:
>>> Before this patch, cancelled waiters would only be cleaned up
>>> once their threshold value was reached. Make host1x_intr_put_ref
>>> process the cancellation immediately to fix this.
>>>
>>> Signed-off-by: Mikko Perttunen <mperttunen@nvidia.com>
>>> ---
>>> v6:
>>> * Call schedule instead of cpu_relax while waiting for pending
>>>    interrupt processing
>>> v5:
>>> * Add parameter to flush, i.e. wait for all pending waiters to
>>>    complete before returning. The reason this is not always true
>>>    is that the pending waiter might be the place that is calling
>>>    the put_ref.
>>> ---
>>>   drivers/gpu/host1x/intr.c   | 23 +++++++++++++++++------
>>>   drivers/gpu/host1x/intr.h   |  4 +++-
>>>   drivers/gpu/host1x/syncpt.c |  2 +-
>>>   3 files changed, 21 insertions(+), 8 deletions(-)
>>>
>>> diff --git a/drivers/gpu/host1x/intr.c b/drivers/gpu/host1x/intr.c
>>> index 9245add23b5d..69b0e8e41466 100644
>>> --- a/drivers/gpu/host1x/intr.c
>>> +++ b/drivers/gpu/host1x/intr.c
>>> @@ -242,18 +242,29 @@ int host1x_intr_add_action(struct host1x *host,
>>> struct host1x_syncpt *syncpt,
>>>       return 0;
>>>   }
>>>   -void host1x_intr_put_ref(struct host1x *host, unsigned int id,
>>> void *ref)
>>> +void host1x_intr_put_ref(struct host1x *host, unsigned int id, void
>>> *ref,
>>> +             bool flush)
>>>   {
>>>       struct host1x_waitlist *waiter = ref;
>>>       struct host1x_syncpt *syncpt;
>>>   -    while (atomic_cmpxchg(&waiter->state, WLS_PENDING,
>>> WLS_CANCELLED) ==
>>> -           WLS_REMOVED)
>>> -        schedule();
>>> +    atomic_cmpxchg(&waiter->state, WLS_PENDING, WLS_CANCELLED);
>>>         syncpt = host->syncpt + id;
>>> -    (void)process_wait_list(host, syncpt,
>>> -                host1x_syncpt_load(host->syncpt + id));
>>> +
>>> +    spin_lock(&syncpt->intr.lock);
>>> +    if (atomic_cmpxchg(&waiter->state, WLS_CANCELLED, WLS_HANDLED) ==
>>> +        WLS_CANCELLED) {
>>> +        list_del(&waiter->list);
>>> +        kref_put(&waiter->refcount, waiter_release);
>>> +    }
>>> +    spin_unlock(&syncpt->intr.lock);
>>
>> Looks like we need to use IRQ-safe version of the locking here in order
>> not to race with the interrupt handler(?), preventing lockup.
> 
> The potential contention is with the syncpt_thresh_work scheduled work,
> and not the actual interrupt handler, so there is no issue.

I see now, thanks.

>> But what real bug is fixed by this patch? If no real problem is fixed,
>> then maybe will be better to defer touching this code till we will just
>> replace it all with a proper dma-fence handlers?
>>
> 
> It improves things in that we won't litter the waiter data structures
> with unbounded waiter entries when waits are cancelled. Also, I prefer
> working in steps when possible - next is writing dma_fences on top of
> this (which is already done) and then eventually phasing/refactoring
> code from intr.c to fence.c so eventually only dma_fences remain. In my
> experience that works better than big rewrites.

So this change is a cleanup and not a bugfix, which wasn't clear from
the commit description. In my experience it usually tends to help with a
review if commit message explicitly states whether it is a minor cleanup
or a critical bugfix.

The small cleanups should be okay. It could be better if you could
explicitly separate the fixes from cleanups since there is a better
chance that fixes will be picked up immediately.

I'd also suggest to try to group the cleanup changes with the new
features that benefit from them, where possible. This should make
patches to look more logical, not like it's some random change, helping
with a review.

I'll try to give a test to this series later today.

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH v6 04/10] gpu: host1x: Remove cancelled waiters immediately
@ 2021-03-30  1:35         ` Dmitry Osipenko
  0 siblings, 0 replies; 35+ messages in thread
From: Dmitry Osipenko @ 2021-03-30  1:35 UTC (permalink / raw)
  To: Mikko Perttunen, Mikko Perttunen, thierry.reding, jonathanh
  Cc: linux-tegra, dri-devel

30.03.2021 00:36, Mikko Perttunen пишет:
> On 3/29/21 11:27 PM, Dmitry Osipenko wrote:
>> 29.03.2021 16:38, Mikko Perttunen пишет:
>>> Before this patch, cancelled waiters would only be cleaned up
>>> once their threshold value was reached. Make host1x_intr_put_ref
>>> process the cancellation immediately to fix this.
>>>
>>> Signed-off-by: Mikko Perttunen <mperttunen@nvidia.com>
>>> ---
>>> v6:
>>> * Call schedule instead of cpu_relax while waiting for pending
>>>    interrupt processing
>>> v5:
>>> * Add parameter to flush, i.e. wait for all pending waiters to
>>>    complete before returning. The reason this is not always true
>>>    is that the pending waiter might be the place that is calling
>>>    the put_ref.
>>> ---
>>>   drivers/gpu/host1x/intr.c   | 23 +++++++++++++++++------
>>>   drivers/gpu/host1x/intr.h   |  4 +++-
>>>   drivers/gpu/host1x/syncpt.c |  2 +-
>>>   3 files changed, 21 insertions(+), 8 deletions(-)
>>>
>>> diff --git a/drivers/gpu/host1x/intr.c b/drivers/gpu/host1x/intr.c
>>> index 9245add23b5d..69b0e8e41466 100644
>>> --- a/drivers/gpu/host1x/intr.c
>>> +++ b/drivers/gpu/host1x/intr.c
>>> @@ -242,18 +242,29 @@ int host1x_intr_add_action(struct host1x *host,
>>> struct host1x_syncpt *syncpt,
>>>       return 0;
>>>   }
>>>   -void host1x_intr_put_ref(struct host1x *host, unsigned int id,
>>> void *ref)
>>> +void host1x_intr_put_ref(struct host1x *host, unsigned int id, void
>>> *ref,
>>> +             bool flush)
>>>   {
>>>       struct host1x_waitlist *waiter = ref;
>>>       struct host1x_syncpt *syncpt;
>>>   -    while (atomic_cmpxchg(&waiter->state, WLS_PENDING,
>>> WLS_CANCELLED) ==
>>> -           WLS_REMOVED)
>>> -        schedule();
>>> +    atomic_cmpxchg(&waiter->state, WLS_PENDING, WLS_CANCELLED);
>>>         syncpt = host->syncpt + id;
>>> -    (void)process_wait_list(host, syncpt,
>>> -                host1x_syncpt_load(host->syncpt + id));
>>> +
>>> +    spin_lock(&syncpt->intr.lock);
>>> +    if (atomic_cmpxchg(&waiter->state, WLS_CANCELLED, WLS_HANDLED) ==
>>> +        WLS_CANCELLED) {
>>> +        list_del(&waiter->list);
>>> +        kref_put(&waiter->refcount, waiter_release);
>>> +    }
>>> +    spin_unlock(&syncpt->intr.lock);
>>
>> Looks like we need to use IRQ-safe version of the locking here in order
>> not to race with the interrupt handler(?), preventing lockup.
> 
> The potential contention is with the syncpt_thresh_work scheduled work,
> and not the actual interrupt handler, so there is no issue.

I see now, thanks.

>> But what real bug is fixed by this patch? If no real problem is fixed,
>> then maybe will be better to defer touching this code till we will just
>> replace it all with a proper dma-fence handlers?
>>
> 
> It improves things in that we won't litter the waiter data structures
> with unbounded waiter entries when waits are cancelled. Also, I prefer
> working in steps when possible - next is writing dma_fences on top of
> this (which is already done) and then eventually phasing/refactoring
> code from intr.c to fence.c so eventually only dma_fences remain. In my
> experience that works better than big rewrites.

So this change is a cleanup and not a bugfix, which wasn't clear from
the commit description. In my experience it usually tends to help with a
review if commit message explicitly states whether it is a minor cleanup
or a critical bugfix.

The small cleanups should be okay. It could be better if you could
explicitly separate the fixes from cleanups since there is a better
chance that fixes will be picked up immediately.

I'd also suggest to try to group the cleanup changes with the new
features that benefit from them, where possible. This should make
patches to look more logical, not like it's some random change, helping
with a review.

I'll try to give a test to this series later today.
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH v6 06/10] gpu: host1x: Cleanup and refcounting for syncpoints
  2021-03-29 13:38   ` Mikko Perttunen
@ 2021-03-30 21:12     ` Dmitry Osipenko
  -1 siblings, 0 replies; 35+ messages in thread
From: Dmitry Osipenko @ 2021-03-30 21:12 UTC (permalink / raw)
  To: Mikko Perttunen, thierry.reding, jonathanh; +Cc: linux-tegra, dri-devel

29.03.2021 16:38, Mikko Perttunen пишет:
> Add reference counting for allocated syncpoints to allow keeping
> them allocated while jobs are referencing them. Additionally,
> clean up various places using syncpoint IDs to use host1x_syncpt
> pointers instead.
> 
> Signed-off-by: Mikko Perttunen <mperttunen@nvidia.com>
> ---
> v5:
> - Remove host1x_syncpt_put in submit code, as job_put already
>   puts the syncpoint.
> - Changes due to rebase in VI driver.
> v4:
> - Update from _free to _put in VI driver as well
> ---
>  drivers/gpu/drm/tegra/dc.c             |  4 +-
>  drivers/gpu/drm/tegra/drm.c            | 14 ++---
>  drivers/gpu/drm/tegra/gr2d.c           |  4 +-
>  drivers/gpu/drm/tegra/gr3d.c           |  4 +-
>  drivers/gpu/drm/tegra/vic.c            |  4 +-
>  drivers/gpu/host1x/cdma.c              | 11 ++--
>  drivers/gpu/host1x/dev.h               |  7 ++-
>  drivers/gpu/host1x/hw/cdma_hw.c        |  2 +-
>  drivers/gpu/host1x/hw/channel_hw.c     | 10 ++--
>  drivers/gpu/host1x/hw/debug_hw.c       |  2 +-
>  drivers/gpu/host1x/job.c               |  5 +-
>  drivers/gpu/host1x/syncpt.c            | 75 +++++++++++++++++++-------
>  drivers/gpu/host1x/syncpt.h            |  3 ++
>  drivers/staging/media/tegra-video/vi.c |  4 +-
>  include/linux/host1x.h                 |  8 +--
>  15 files changed, 98 insertions(+), 59 deletions(-)
> 
> diff --git a/drivers/gpu/drm/tegra/dc.c b/drivers/gpu/drm/tegra/dc.c
> index 5635fac01e3e..416d6036cf47 100644
> --- a/drivers/gpu/drm/tegra/dc.c
> +++ b/drivers/gpu/drm/tegra/dc.c
> @@ -2141,7 +2141,7 @@ static int tegra_dc_init(struct host1x_client *client)
>  		drm_plane_cleanup(primary);
>  
>  	host1x_client_iommu_detach(client);
> -	host1x_syncpt_free(dc->syncpt);
> +	host1x_syncpt_put(dc->syncpt);
>  
>  	return err;
>  }
> @@ -2166,7 +2166,7 @@ static int tegra_dc_exit(struct host1x_client *client)
>  	}
>  
>  	host1x_client_iommu_detach(client);
> -	host1x_syncpt_free(dc->syncpt);
> +	host1x_syncpt_put(dc->syncpt);
>  
>  	return 0;
>  }
> diff --git a/drivers/gpu/drm/tegra/drm.c b/drivers/gpu/drm/tegra/drm.c
> index 90709c38c993..ce5bdc58d315 100644
> --- a/drivers/gpu/drm/tegra/drm.c
> +++ b/drivers/gpu/drm/tegra/drm.c
> @@ -174,7 +174,7 @@ int tegra_drm_submit(struct tegra_drm_context *context,
>  	struct drm_tegra_syncpt syncpt;
>  	struct host1x *host1x = dev_get_drvdata(drm->dev->parent);
>  	struct drm_gem_object **refs;
> -	struct host1x_syncpt *sp;
> +	struct host1x_syncpt *sp = NULL;
>  	struct host1x_job *job;
>  	unsigned int num_refs;
>  	int err;
> @@ -301,8 +301,8 @@ int tegra_drm_submit(struct tegra_drm_context *context,
>  		goto fail;
>  	}
>  
> -	/* check whether syncpoint ID is valid */
> -	sp = host1x_syncpt_get(host1x, syncpt.id);
> +	/* Syncpoint ref will be dropped on job release. */
> +	sp = host1x_syncpt_get_by_id(host1x, syncpt.id);
>  	if (!sp) {
>  		err = -ENOENT;
>  		goto fail;
> @@ -311,7 +311,7 @@ int tegra_drm_submit(struct tegra_drm_context *context,
>  	job->is_addr_reg = context->client->ops->is_addr_reg;
>  	job->is_valid_class = context->client->ops->is_valid_class;
>  	job->syncpt_incrs = syncpt.incrs;
> -	job->syncpt_id = syncpt.id;
> +	job->syncpt = sp;
>  	job->timeout = 10000;
>  
>  	if (args->timeout && args->timeout < 10000)
> @@ -383,7 +383,7 @@ static int tegra_syncpt_read(struct drm_device *drm, void *data,
>  	struct drm_tegra_syncpt_read *args = data;
>  	struct host1x_syncpt *sp;
>  
> -	sp = host1x_syncpt_get(host, args->id);
> +	sp = host1x_syncpt_get_by_id_noref(host, args->id);
>  	if (!sp)
>  		return -EINVAL;
>  
> @@ -398,7 +398,7 @@ static int tegra_syncpt_incr(struct drm_device *drm, void *data,
>  	struct drm_tegra_syncpt_incr *args = data;
>  	struct host1x_syncpt *sp;
>  
> -	sp = host1x_syncpt_get(host1x, args->id);
> +	sp = host1x_syncpt_get_by_id_noref(host1x, args->id);
>  	if (!sp)
>  		return -EINVAL;
>  
> @@ -412,7 +412,7 @@ static int tegra_syncpt_wait(struct drm_device *drm, void *data,
>  	struct drm_tegra_syncpt_wait *args = data;
>  	struct host1x_syncpt *sp;
>  
> -	sp = host1x_syncpt_get(host1x, args->id);
> +	sp = host1x_syncpt_get_by_id_noref(host1x, args->id);
>  	if (!sp)
>  		return -EINVAL;
>  
> diff --git a/drivers/gpu/drm/tegra/gr2d.c b/drivers/gpu/drm/tegra/gr2d.c
> index adbe2ddcda19..de288cba3905 100644
> --- a/drivers/gpu/drm/tegra/gr2d.c
> +++ b/drivers/gpu/drm/tegra/gr2d.c
> @@ -67,7 +67,7 @@ static int gr2d_init(struct host1x_client *client)
>  detach:
>  	host1x_client_iommu_detach(client);
>  free:
> -	host1x_syncpt_free(client->syncpts[0]);
> +	host1x_syncpt_put(client->syncpts[0]);
>  put:
>  	host1x_channel_put(gr2d->channel);
>  	return err;
> @@ -86,7 +86,7 @@ static int gr2d_exit(struct host1x_client *client)
>  		return err;
>  
>  	host1x_client_iommu_detach(client);
> -	host1x_syncpt_free(client->syncpts[0]);
> +	host1x_syncpt_put(client->syncpts[0]);
>  	host1x_channel_put(gr2d->channel);
>  
>  	return 0;
> diff --git a/drivers/gpu/drm/tegra/gr3d.c b/drivers/gpu/drm/tegra/gr3d.c
> index b0b8154e8104..24442ade0da3 100644
> --- a/drivers/gpu/drm/tegra/gr3d.c
> +++ b/drivers/gpu/drm/tegra/gr3d.c
> @@ -76,7 +76,7 @@ static int gr3d_init(struct host1x_client *client)
>  detach:
>  	host1x_client_iommu_detach(client);
>  free:
> -	host1x_syncpt_free(client->syncpts[0]);
> +	host1x_syncpt_put(client->syncpts[0]);
>  put:
>  	host1x_channel_put(gr3d->channel);
>  	return err;
> @@ -94,7 +94,7 @@ static int gr3d_exit(struct host1x_client *client)
>  		return err;
>  
>  	host1x_client_iommu_detach(client);
> -	host1x_syncpt_free(client->syncpts[0]);
> +	host1x_syncpt_put(client->syncpts[0]);
>  	host1x_channel_put(gr3d->channel);
>  
>  	return 0;
> diff --git a/drivers/gpu/drm/tegra/vic.c b/drivers/gpu/drm/tegra/vic.c
> index 77e128832920..72aea1cc0cfa 100644
> --- a/drivers/gpu/drm/tegra/vic.c
> +++ b/drivers/gpu/drm/tegra/vic.c
> @@ -214,7 +214,7 @@ static int vic_init(struct host1x_client *client)
>  	return 0;
>  
>  free_syncpt:
> -	host1x_syncpt_free(client->syncpts[0]);
> +	host1x_syncpt_put(client->syncpts[0]);
>  free_channel:
>  	host1x_channel_put(vic->channel);
>  detach:
> @@ -238,7 +238,7 @@ static int vic_exit(struct host1x_client *client)
>  	if (err < 0)
>  		return err;
>  
> -	host1x_syncpt_free(client->syncpts[0]);
> +	host1x_syncpt_put(client->syncpts[0]);
>  	host1x_channel_put(vic->channel);
>  	host1x_client_iommu_detach(client);
>  
> diff --git a/drivers/gpu/host1x/cdma.c b/drivers/gpu/host1x/cdma.c
> index e8d3fda91d8a..6e6ca774f68d 100644
> --- a/drivers/gpu/host1x/cdma.c
> +++ b/drivers/gpu/host1x/cdma.c
> @@ -273,15 +273,13 @@ static int host1x_cdma_wait_pushbuffer_space(struct host1x *host1x,
>  static void cdma_start_timer_locked(struct host1x_cdma *cdma,
>  				    struct host1x_job *job)
>  {
> -	struct host1x *host = cdma_to_host1x(cdma);
> -
>  	if (cdma->timeout.client) {
>  		/* timer already started */
>  		return;
>  	}
>  
>  	cdma->timeout.client = job->client;
> -	cdma->timeout.syncpt = host1x_syncpt_get(host, job->syncpt_id);
> +	cdma->timeout.syncpt = job->syncpt;
>  	cdma->timeout.syncpt_val = job->syncpt_end;
>  	cdma->timeout.start_ktime = ktime_get();
>  
> @@ -312,7 +310,6 @@ static void stop_cdma_timer_locked(struct host1x_cdma *cdma)
>  static void update_cdma_locked(struct host1x_cdma *cdma)
>  {
>  	bool signal = false;
> -	struct host1x *host1x = cdma_to_host1x(cdma);
>  	struct host1x_job *job, *n;
>  
>  	/* If CDMA is stopped, queue is cleared and we can return */
> @@ -324,8 +321,7 @@ static void update_cdma_locked(struct host1x_cdma *cdma)
>  	 * to consume as many sync queue entries as possible without blocking
>  	 */
>  	list_for_each_entry_safe(job, n, &cdma->sync_queue, list) {
> -		struct host1x_syncpt *sp =
> -			host1x_syncpt_get(host1x, job->syncpt_id);
> +		struct host1x_syncpt *sp = job->syncpt;
>  
>  		/* Check whether this syncpt has completed, and bail if not */
>  		if (!host1x_syncpt_is_expired(sp, job->syncpt_end)) {
> @@ -499,8 +495,7 @@ int host1x_cdma_begin(struct host1x_cdma *cdma, struct host1x_job *job)
>  		if (!cdma->timeout.initialized) {
>  			int err;
>  
> -			err = host1x_hw_cdma_timeout_init(host1x, cdma,
> -							  job->syncpt_id);
> +			err = host1x_hw_cdma_timeout_init(host1x, cdma);
>  			if (err) {
>  				mutex_unlock(&cdma->lock);
>  				return err;
> diff --git a/drivers/gpu/host1x/dev.h b/drivers/gpu/host1x/dev.h
> index f781a9b0f39d..63010ae37a97 100644
> --- a/drivers/gpu/host1x/dev.h
> +++ b/drivers/gpu/host1x/dev.h
> @@ -37,7 +37,7 @@ struct host1x_cdma_ops {
>  	void (*start)(struct host1x_cdma *cdma);
>  	void (*stop)(struct host1x_cdma *cdma);
>  	void (*flush)(struct  host1x_cdma *cdma);
> -	int (*timeout_init)(struct host1x_cdma *cdma, unsigned int syncpt);
> +	int (*timeout_init)(struct host1x_cdma *cdma);
>  	void (*timeout_destroy)(struct host1x_cdma *cdma);
>  	void (*freeze)(struct host1x_cdma *cdma);
>  	void (*resume)(struct host1x_cdma *cdma, u32 getptr);
> @@ -261,10 +261,9 @@ static inline void host1x_hw_cdma_flush(struct host1x *host,
>  }
>  
>  static inline int host1x_hw_cdma_timeout_init(struct host1x *host,
> -					      struct host1x_cdma *cdma,
> -					      unsigned int syncpt)
> +					      struct host1x_cdma *cdma)
>  {
> -	return host->cdma_op->timeout_init(cdma, syncpt);
> +	return host->cdma_op->timeout_init(cdma);
>  }
>  
>  static inline void host1x_hw_cdma_timeout_destroy(struct host1x *host,
> diff --git a/drivers/gpu/host1x/hw/cdma_hw.c b/drivers/gpu/host1x/hw/cdma_hw.c
> index 2f3bf94cf365..e49cd5b8f735 100644
> --- a/drivers/gpu/host1x/hw/cdma_hw.c
> +++ b/drivers/gpu/host1x/hw/cdma_hw.c
> @@ -295,7 +295,7 @@ static void cdma_timeout_handler(struct work_struct *work)
>  /*
>   * Init timeout resources
>   */
> -static int cdma_timeout_init(struct host1x_cdma *cdma, unsigned int syncpt)
> +static int cdma_timeout_init(struct host1x_cdma *cdma)
>  {
>  	INIT_DELAYED_WORK(&cdma->timeout.wq, cdma_timeout_handler);
>  	cdma->timeout.initialized = true;
> diff --git a/drivers/gpu/host1x/hw/channel_hw.c b/drivers/gpu/host1x/hw/channel_hw.c
> index 5eaa29d171c9..d4c28faf27d1 100644
> --- a/drivers/gpu/host1x/hw/channel_hw.c
> +++ b/drivers/gpu/host1x/hw/channel_hw.c
> @@ -86,8 +86,7 @@ static void submit_gathers(struct host1x_job *job)
>  
>  static inline void synchronize_syncpt_base(struct host1x_job *job)
>  {
> -	struct host1x *host = dev_get_drvdata(job->channel->dev->parent);
> -	struct host1x_syncpt *sp = host->syncpt + job->syncpt_id;
> +	struct host1x_syncpt *sp = job->syncpt;
>  	unsigned int id;
>  	u32 value;
>  
> @@ -118,7 +117,7 @@ static void host1x_channel_set_streamid(struct host1x_channel *channel)
>  static int channel_submit(struct host1x_job *job)
>  {
>  	struct host1x_channel *ch = job->channel;
> -	struct host1x_syncpt *sp;
> +	struct host1x_syncpt *sp = job->syncpt;
>  	u32 user_syncpt_incrs = job->syncpt_incrs;
>  	u32 prev_max = 0;
>  	u32 syncval;
> @@ -126,10 +125,9 @@ static int channel_submit(struct host1x_job *job)
>  	struct host1x_waitlist *completed_waiter = NULL;
>  	struct host1x *host = dev_get_drvdata(ch->dev->parent);
>  
> -	sp = host->syncpt + job->syncpt_id;
>  	trace_host1x_channel_submit(dev_name(ch->dev),
>  				    job->num_gathers, job->num_relocs,
> -				    job->syncpt_id, job->syncpt_incrs);
> +				    job->syncpt->id, job->syncpt_incrs);
>  
>  	/* before error checks, return current max */
>  	prev_max = job->syncpt_end = host1x_syncpt_read_max(sp);
> @@ -163,7 +161,7 @@ static int channel_submit(struct host1x_job *job)
>  		host1x_cdma_push(&ch->cdma,
>  				 host1x_opcode_setclass(HOST1X_CLASS_HOST1X,
>  					host1x_uclass_wait_syncpt_r(), 1),
> -				 host1x_class_host_wait_syncpt(job->syncpt_id,
> +				 host1x_class_host_wait_syncpt(job->syncpt->id,
>  					host1x_syncpt_read_max(sp)));
>  	}
>  
> diff --git a/drivers/gpu/host1x/hw/debug_hw.c b/drivers/gpu/host1x/hw/debug_hw.c
> index f31bcfa1b837..ceb48229d14b 100644
> --- a/drivers/gpu/host1x/hw/debug_hw.c
> +++ b/drivers/gpu/host1x/hw/debug_hw.c
> @@ -204,7 +204,7 @@ static void show_channel_gathers(struct output *o, struct host1x_cdma *cdma)
>  		unsigned int i;
>  
>  		host1x_debug_output(o, "\n%p: JOB, syncpt_id=%d, syncpt_val=%d, first_get=%08x, timeout=%d num_slots=%d, num_handles=%d\n",
> -				    job, job->syncpt_id, job->syncpt_end,
> +				    job, job->syncpt->id, job->syncpt_end,
>  				    job->first_get, job->timeout,
>  				    job->num_slots, job->num_unpins);
>  
> diff --git a/drivers/gpu/host1x/job.c b/drivers/gpu/host1x/job.c
> index 82d0a60ba3f7..adbdc225de8d 100644
> --- a/drivers/gpu/host1x/job.c
> +++ b/drivers/gpu/host1x/job.c
> @@ -79,6 +79,9 @@ static void job_free(struct kref *ref)
>  {
>  	struct host1x_job *job = container_of(ref, struct host1x_job, ref);
>  
> +	if (job->syncpt)
> +		host1x_syncpt_put(job->syncpt);
> +
>  	kfree(job);
>  }
>  
> @@ -674,7 +677,7 @@ EXPORT_SYMBOL(host1x_job_unpin);
>   */
>  void host1x_job_dump(struct device *dev, struct host1x_job *job)
>  {
> -	dev_dbg(dev, "    SYNCPT_ID   %d\n", job->syncpt_id);
> +	dev_dbg(dev, "    SYNCPT_ID   %d\n", job->syncpt->id);
>  	dev_dbg(dev, "    SYNCPT_VAL  %d\n", job->syncpt_end);
>  	dev_dbg(dev, "    FIRST_GET   0x%x\n", job->first_get);
>  	dev_dbg(dev, "    TIMEOUT     %d\n", job->timeout);
> diff --git a/drivers/gpu/host1x/syncpt.c b/drivers/gpu/host1x/syncpt.c
> index 8da4bbce8b9d..7bb5de8c3d63 100644
> --- a/drivers/gpu/host1x/syncpt.c
> +++ b/drivers/gpu/host1x/syncpt.c
> @@ -90,6 +90,8 @@ struct host1x_syncpt *host1x_syncpt_alloc(struct host1x *host,
>  	else
>  		sp->client_managed = false;
>  
> +	kref_init(&sp->ref);
> +
>  	mutex_unlock(&host->syncpt_mutex);
>  	return sp;
>  
> @@ -383,7 +385,7 @@ int host1x_syncpt_init(struct host1x *host)
>   * host1x client drivers can use this function to allocate a syncpoint for
>   * subsequent use. A syncpoint returned by this function will be reserved for
>   * use by the client exclusively. When no longer using a syncpoint, a host1x
> - * client driver needs to release it using host1x_syncpt_free().
> + * client driver needs to release it using host1x_syncpt_put().
>   */
>  struct host1x_syncpt *host1x_syncpt_request(struct host1x_client *client,
>  					    unsigned long flags)
> @@ -394,20 +396,9 @@ struct host1x_syncpt *host1x_syncpt_request(struct host1x_client *client,
>  }
>  EXPORT_SYMBOL(host1x_syncpt_request);
>  
> -/**
> - * host1x_syncpt_free() - free a requested syncpoint
> - * @sp: host1x syncpoint
> - *
> - * Release a syncpoint previously allocated using host1x_syncpt_request(). A
> - * host1x client driver should call this when the syncpoint is no longer in
> - * use. Note that client drivers must ensure that the syncpoint doesn't remain
> - * under the control of hardware after calling this function, otherwise two
> - * clients may end up trying to access the same syncpoint concurrently.
> - */
> -void host1x_syncpt_free(struct host1x_syncpt *sp)
> +static void syncpt_release(struct kref *ref)
>  {
> -	if (!sp)
> -		return;
> +	struct host1x_syncpt *sp = container_of(ref, struct host1x_syncpt, ref);
>  
>  	mutex_lock(&sp->host->syncpt_mutex);
>  
> @@ -419,7 +410,23 @@ void host1x_syncpt_free(struct host1x_syncpt *sp)
>  
>  	mutex_unlock(&sp->host->syncpt_mutex);
>  }
> -EXPORT_SYMBOL(host1x_syncpt_free);
> +
> +/**
> + * host1x_syncpt_put() - free a requested syncpoint
> + * @sp: host1x syncpoint
> + *
> + * Release a syncpoint previously allocated using host1x_syncpt_request(). A
> + * host1x client driver should call this when the syncpoint is no longer in
> + * use.
> + */
> +void host1x_syncpt_put(struct host1x_syncpt *sp)
> +{
> +	if (!sp)
> +		return;
> +
> +	kref_put(&sp->ref, syncpt_release);
> +}
> +EXPORT_SYMBOL(host1x_syncpt_put);
>  
>  void host1x_syncpt_deinit(struct host1x *host)
>  {
> @@ -486,16 +493,48 @@ unsigned int host1x_syncpt_nb_mlocks(struct host1x *host)
>  }
>  
>  /**
> - * host1x_syncpt_get() - obtain a syncpoint by ID
> + * host1x_syncpt_get_by_id() - obtain a syncpoint by ID
> + * @host: host1x controller
> + * @id: syncpoint ID
> + */
> +struct host1x_syncpt *host1x_syncpt_get_by_id(struct host1x *host,
> +					      unsigned int id)
> +{
> +	if (id >= host->info->nb_pts)
> +		return NULL;
> +
> +	if (kref_get_unless_zero(&host->syncpt[id].ref))
> +		return &host->syncpt[id];
> +	else
> +		return NULL;
> +}
> +EXPORT_SYMBOL(host1x_syncpt_get_by_id);
> +
> +/**
> + * host1x_syncpt_get_by_id_noref() - obtain a syncpoint by ID but don't
> + * 	increase the refcount.
>   * @host: host1x controller
>   * @id: syncpoint ID
>   */
> -struct host1x_syncpt *host1x_syncpt_get(struct host1x *host, unsigned int id)
> +struct host1x_syncpt *host1x_syncpt_get_by_id_noref(struct host1x *host,
> +						    unsigned int id)
>  {
>  	if (id >= host->info->nb_pts)
>  		return NULL;
>  
> -	return host->syncpt + id;
> +	return &host->syncpt[id];
> +}
> +EXPORT_SYMBOL(host1x_syncpt_get_by_id_noref);
> +
> +/**
> + * host1x_syncpt_get() - increment syncpoint refcount
> + * @sp: syncpoint
> + */
> +struct host1x_syncpt *host1x_syncpt_get(struct host1x_syncpt *sp)
> +{
> +	kref_get(&sp->ref);
> +
> +	return sp;
>  }
>  EXPORT_SYMBOL(host1x_syncpt_get);
>  
> diff --git a/drivers/gpu/host1x/syncpt.h b/drivers/gpu/host1x/syncpt.h
> index 3aa6b25b1b9c..a6766f8d55ee 100644
> --- a/drivers/gpu/host1x/syncpt.h
> +++ b/drivers/gpu/host1x/syncpt.h
> @@ -11,6 +11,7 @@
>  #include <linux/atomic.h>
>  #include <linux/host1x.h>
>  #include <linux/kernel.h>
> +#include <linux/kref.h>
>  #include <linux/sched.h>
>  
>  #include "intr.h"
> @@ -26,6 +27,8 @@ struct host1x_syncpt_base {
>  };
>  
>  struct host1x_syncpt {
> +	struct kref ref;
> +
>  	unsigned int id;
>  	atomic_t min_val;
>  	atomic_t max_val;
> diff --git a/drivers/staging/media/tegra-video/vi.c b/drivers/staging/media/tegra-video/vi.c
> index 7a09061cda57..7e0cb5529b49 100644
> --- a/drivers/staging/media/tegra-video/vi.c
> +++ b/drivers/staging/media/tegra-video/vi.c
> @@ -1131,8 +1131,8 @@ static void tegra_channel_host1x_syncpts_free(struct tegra_vi_channel *chan)
>  	int i;
>  
>  	for (i = 0; i < chan->numgangports; i++) {
> -		host1x_syncpt_free(chan->mw_ack_sp[i]);
> -		host1x_syncpt_free(chan->frame_start_sp[i]);
> +		host1x_syncpt_put(chan->mw_ack_sp[i]);
> +		host1x_syncpt_put(chan->frame_start_sp[i]);
>  	}
>  }
>  
> diff --git a/include/linux/host1x.h b/include/linux/host1x.h
> index 7137ce0e35d4..107aea29bccb 100644
> --- a/include/linux/host1x.h
> +++ b/include/linux/host1x.h
> @@ -142,7 +142,9 @@ struct host1x_syncpt_base;
>  struct host1x_syncpt;
>  struct host1x;
>  
> -struct host1x_syncpt *host1x_syncpt_get(struct host1x *host, u32 id);
> +struct host1x_syncpt *host1x_syncpt_get_by_id(struct host1x *host, u32 id);
> +struct host1x_syncpt *host1x_syncpt_get_by_id_noref(struct host1x *host, u32 id);
> +struct host1x_syncpt *host1x_syncpt_get(struct host1x_syncpt *sp);
>  u32 host1x_syncpt_id(struct host1x_syncpt *sp);
>  u32 host1x_syncpt_read_min(struct host1x_syncpt *sp);
>  u32 host1x_syncpt_read_max(struct host1x_syncpt *sp);
> @@ -153,7 +155,7 @@ int host1x_syncpt_wait(struct host1x_syncpt *sp, u32 thresh, long timeout,
>  		       u32 *value);
>  struct host1x_syncpt *host1x_syncpt_request(struct host1x_client *client,
>  					    unsigned long flags);
> -void host1x_syncpt_free(struct host1x_syncpt *sp);
> +void host1x_syncpt_put(struct host1x_syncpt *sp);
>  struct host1x_syncpt *host1x_syncpt_alloc(struct host1x *host,
>  					  unsigned long flags,
>  					  const char *name);
> @@ -221,7 +223,7 @@ struct host1x_job {
>  	dma_addr_t *reloc_addr_phys;
>  
>  	/* Sync point id, number of increments and end related to the submit */
> -	u32 syncpt_id;
> +	struct host1x_syncpt *syncpt;
>  	u32 syncpt_incrs;
>  	u32 syncpt_end;
>  
> 

Tested-by: Dmitry Osipenko <digetx@gmail.com>

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH v6 06/10] gpu: host1x: Cleanup and refcounting for syncpoints
@ 2021-03-30 21:12     ` Dmitry Osipenko
  0 siblings, 0 replies; 35+ messages in thread
From: Dmitry Osipenko @ 2021-03-30 21:12 UTC (permalink / raw)
  To: Mikko Perttunen, thierry.reding, jonathanh; +Cc: linux-tegra, dri-devel

29.03.2021 16:38, Mikko Perttunen пишет:
> Add reference counting for allocated syncpoints to allow keeping
> them allocated while jobs are referencing them. Additionally,
> clean up various places using syncpoint IDs to use host1x_syncpt
> pointers instead.
> 
> Signed-off-by: Mikko Perttunen <mperttunen@nvidia.com>
> ---
> v5:
> - Remove host1x_syncpt_put in submit code, as job_put already
>   puts the syncpoint.
> - Changes due to rebase in VI driver.
> v4:
> - Update from _free to _put in VI driver as well
> ---
>  drivers/gpu/drm/tegra/dc.c             |  4 +-
>  drivers/gpu/drm/tegra/drm.c            | 14 ++---
>  drivers/gpu/drm/tegra/gr2d.c           |  4 +-
>  drivers/gpu/drm/tegra/gr3d.c           |  4 +-
>  drivers/gpu/drm/tegra/vic.c            |  4 +-
>  drivers/gpu/host1x/cdma.c              | 11 ++--
>  drivers/gpu/host1x/dev.h               |  7 ++-
>  drivers/gpu/host1x/hw/cdma_hw.c        |  2 +-
>  drivers/gpu/host1x/hw/channel_hw.c     | 10 ++--
>  drivers/gpu/host1x/hw/debug_hw.c       |  2 +-
>  drivers/gpu/host1x/job.c               |  5 +-
>  drivers/gpu/host1x/syncpt.c            | 75 +++++++++++++++++++-------
>  drivers/gpu/host1x/syncpt.h            |  3 ++
>  drivers/staging/media/tegra-video/vi.c |  4 +-
>  include/linux/host1x.h                 |  8 +--
>  15 files changed, 98 insertions(+), 59 deletions(-)
> 
> diff --git a/drivers/gpu/drm/tegra/dc.c b/drivers/gpu/drm/tegra/dc.c
> index 5635fac01e3e..416d6036cf47 100644
> --- a/drivers/gpu/drm/tegra/dc.c
> +++ b/drivers/gpu/drm/tegra/dc.c
> @@ -2141,7 +2141,7 @@ static int tegra_dc_init(struct host1x_client *client)
>  		drm_plane_cleanup(primary);
>  
>  	host1x_client_iommu_detach(client);
> -	host1x_syncpt_free(dc->syncpt);
> +	host1x_syncpt_put(dc->syncpt);
>  
>  	return err;
>  }
> @@ -2166,7 +2166,7 @@ static int tegra_dc_exit(struct host1x_client *client)
>  	}
>  
>  	host1x_client_iommu_detach(client);
> -	host1x_syncpt_free(dc->syncpt);
> +	host1x_syncpt_put(dc->syncpt);
>  
>  	return 0;
>  }
> diff --git a/drivers/gpu/drm/tegra/drm.c b/drivers/gpu/drm/tegra/drm.c
> index 90709c38c993..ce5bdc58d315 100644
> --- a/drivers/gpu/drm/tegra/drm.c
> +++ b/drivers/gpu/drm/tegra/drm.c
> @@ -174,7 +174,7 @@ int tegra_drm_submit(struct tegra_drm_context *context,
>  	struct drm_tegra_syncpt syncpt;
>  	struct host1x *host1x = dev_get_drvdata(drm->dev->parent);
>  	struct drm_gem_object **refs;
> -	struct host1x_syncpt *sp;
> +	struct host1x_syncpt *sp = NULL;
>  	struct host1x_job *job;
>  	unsigned int num_refs;
>  	int err;
> @@ -301,8 +301,8 @@ int tegra_drm_submit(struct tegra_drm_context *context,
>  		goto fail;
>  	}
>  
> -	/* check whether syncpoint ID is valid */
> -	sp = host1x_syncpt_get(host1x, syncpt.id);
> +	/* Syncpoint ref will be dropped on job release. */
> +	sp = host1x_syncpt_get_by_id(host1x, syncpt.id);
>  	if (!sp) {
>  		err = -ENOENT;
>  		goto fail;
> @@ -311,7 +311,7 @@ int tegra_drm_submit(struct tegra_drm_context *context,
>  	job->is_addr_reg = context->client->ops->is_addr_reg;
>  	job->is_valid_class = context->client->ops->is_valid_class;
>  	job->syncpt_incrs = syncpt.incrs;
> -	job->syncpt_id = syncpt.id;
> +	job->syncpt = sp;
>  	job->timeout = 10000;
>  
>  	if (args->timeout && args->timeout < 10000)
> @@ -383,7 +383,7 @@ static int tegra_syncpt_read(struct drm_device *drm, void *data,
>  	struct drm_tegra_syncpt_read *args = data;
>  	struct host1x_syncpt *sp;
>  
> -	sp = host1x_syncpt_get(host, args->id);
> +	sp = host1x_syncpt_get_by_id_noref(host, args->id);
>  	if (!sp)
>  		return -EINVAL;
>  
> @@ -398,7 +398,7 @@ static int tegra_syncpt_incr(struct drm_device *drm, void *data,
>  	struct drm_tegra_syncpt_incr *args = data;
>  	struct host1x_syncpt *sp;
>  
> -	sp = host1x_syncpt_get(host1x, args->id);
> +	sp = host1x_syncpt_get_by_id_noref(host1x, args->id);
>  	if (!sp)
>  		return -EINVAL;
>  
> @@ -412,7 +412,7 @@ static int tegra_syncpt_wait(struct drm_device *drm, void *data,
>  	struct drm_tegra_syncpt_wait *args = data;
>  	struct host1x_syncpt *sp;
>  
> -	sp = host1x_syncpt_get(host1x, args->id);
> +	sp = host1x_syncpt_get_by_id_noref(host1x, args->id);
>  	if (!sp)
>  		return -EINVAL;
>  
> diff --git a/drivers/gpu/drm/tegra/gr2d.c b/drivers/gpu/drm/tegra/gr2d.c
> index adbe2ddcda19..de288cba3905 100644
> --- a/drivers/gpu/drm/tegra/gr2d.c
> +++ b/drivers/gpu/drm/tegra/gr2d.c
> @@ -67,7 +67,7 @@ static int gr2d_init(struct host1x_client *client)
>  detach:
>  	host1x_client_iommu_detach(client);
>  free:
> -	host1x_syncpt_free(client->syncpts[0]);
> +	host1x_syncpt_put(client->syncpts[0]);
>  put:
>  	host1x_channel_put(gr2d->channel);
>  	return err;
> @@ -86,7 +86,7 @@ static int gr2d_exit(struct host1x_client *client)
>  		return err;
>  
>  	host1x_client_iommu_detach(client);
> -	host1x_syncpt_free(client->syncpts[0]);
> +	host1x_syncpt_put(client->syncpts[0]);
>  	host1x_channel_put(gr2d->channel);
>  
>  	return 0;
> diff --git a/drivers/gpu/drm/tegra/gr3d.c b/drivers/gpu/drm/tegra/gr3d.c
> index b0b8154e8104..24442ade0da3 100644
> --- a/drivers/gpu/drm/tegra/gr3d.c
> +++ b/drivers/gpu/drm/tegra/gr3d.c
> @@ -76,7 +76,7 @@ static int gr3d_init(struct host1x_client *client)
>  detach:
>  	host1x_client_iommu_detach(client);
>  free:
> -	host1x_syncpt_free(client->syncpts[0]);
> +	host1x_syncpt_put(client->syncpts[0]);
>  put:
>  	host1x_channel_put(gr3d->channel);
>  	return err;
> @@ -94,7 +94,7 @@ static int gr3d_exit(struct host1x_client *client)
>  		return err;
>  
>  	host1x_client_iommu_detach(client);
> -	host1x_syncpt_free(client->syncpts[0]);
> +	host1x_syncpt_put(client->syncpts[0]);
>  	host1x_channel_put(gr3d->channel);
>  
>  	return 0;
> diff --git a/drivers/gpu/drm/tegra/vic.c b/drivers/gpu/drm/tegra/vic.c
> index 77e128832920..72aea1cc0cfa 100644
> --- a/drivers/gpu/drm/tegra/vic.c
> +++ b/drivers/gpu/drm/tegra/vic.c
> @@ -214,7 +214,7 @@ static int vic_init(struct host1x_client *client)
>  	return 0;
>  
>  free_syncpt:
> -	host1x_syncpt_free(client->syncpts[0]);
> +	host1x_syncpt_put(client->syncpts[0]);
>  free_channel:
>  	host1x_channel_put(vic->channel);
>  detach:
> @@ -238,7 +238,7 @@ static int vic_exit(struct host1x_client *client)
>  	if (err < 0)
>  		return err;
>  
> -	host1x_syncpt_free(client->syncpts[0]);
> +	host1x_syncpt_put(client->syncpts[0]);
>  	host1x_channel_put(vic->channel);
>  	host1x_client_iommu_detach(client);
>  
> diff --git a/drivers/gpu/host1x/cdma.c b/drivers/gpu/host1x/cdma.c
> index e8d3fda91d8a..6e6ca774f68d 100644
> --- a/drivers/gpu/host1x/cdma.c
> +++ b/drivers/gpu/host1x/cdma.c
> @@ -273,15 +273,13 @@ static int host1x_cdma_wait_pushbuffer_space(struct host1x *host1x,
>  static void cdma_start_timer_locked(struct host1x_cdma *cdma,
>  				    struct host1x_job *job)
>  {
> -	struct host1x *host = cdma_to_host1x(cdma);
> -
>  	if (cdma->timeout.client) {
>  		/* timer already started */
>  		return;
>  	}
>  
>  	cdma->timeout.client = job->client;
> -	cdma->timeout.syncpt = host1x_syncpt_get(host, job->syncpt_id);
> +	cdma->timeout.syncpt = job->syncpt;
>  	cdma->timeout.syncpt_val = job->syncpt_end;
>  	cdma->timeout.start_ktime = ktime_get();
>  
> @@ -312,7 +310,6 @@ static void stop_cdma_timer_locked(struct host1x_cdma *cdma)
>  static void update_cdma_locked(struct host1x_cdma *cdma)
>  {
>  	bool signal = false;
> -	struct host1x *host1x = cdma_to_host1x(cdma);
>  	struct host1x_job *job, *n;
>  
>  	/* If CDMA is stopped, queue is cleared and we can return */
> @@ -324,8 +321,7 @@ static void update_cdma_locked(struct host1x_cdma *cdma)
>  	 * to consume as many sync queue entries as possible without blocking
>  	 */
>  	list_for_each_entry_safe(job, n, &cdma->sync_queue, list) {
> -		struct host1x_syncpt *sp =
> -			host1x_syncpt_get(host1x, job->syncpt_id);
> +		struct host1x_syncpt *sp = job->syncpt;
>  
>  		/* Check whether this syncpt has completed, and bail if not */
>  		if (!host1x_syncpt_is_expired(sp, job->syncpt_end)) {
> @@ -499,8 +495,7 @@ int host1x_cdma_begin(struct host1x_cdma *cdma, struct host1x_job *job)
>  		if (!cdma->timeout.initialized) {
>  			int err;
>  
> -			err = host1x_hw_cdma_timeout_init(host1x, cdma,
> -							  job->syncpt_id);
> +			err = host1x_hw_cdma_timeout_init(host1x, cdma);
>  			if (err) {
>  				mutex_unlock(&cdma->lock);
>  				return err;
> diff --git a/drivers/gpu/host1x/dev.h b/drivers/gpu/host1x/dev.h
> index f781a9b0f39d..63010ae37a97 100644
> --- a/drivers/gpu/host1x/dev.h
> +++ b/drivers/gpu/host1x/dev.h
> @@ -37,7 +37,7 @@ struct host1x_cdma_ops {
>  	void (*start)(struct host1x_cdma *cdma);
>  	void (*stop)(struct host1x_cdma *cdma);
>  	void (*flush)(struct  host1x_cdma *cdma);
> -	int (*timeout_init)(struct host1x_cdma *cdma, unsigned int syncpt);
> +	int (*timeout_init)(struct host1x_cdma *cdma);
>  	void (*timeout_destroy)(struct host1x_cdma *cdma);
>  	void (*freeze)(struct host1x_cdma *cdma);
>  	void (*resume)(struct host1x_cdma *cdma, u32 getptr);
> @@ -261,10 +261,9 @@ static inline void host1x_hw_cdma_flush(struct host1x *host,
>  }
>  
>  static inline int host1x_hw_cdma_timeout_init(struct host1x *host,
> -					      struct host1x_cdma *cdma,
> -					      unsigned int syncpt)
> +					      struct host1x_cdma *cdma)
>  {
> -	return host->cdma_op->timeout_init(cdma, syncpt);
> +	return host->cdma_op->timeout_init(cdma);
>  }
>  
>  static inline void host1x_hw_cdma_timeout_destroy(struct host1x *host,
> diff --git a/drivers/gpu/host1x/hw/cdma_hw.c b/drivers/gpu/host1x/hw/cdma_hw.c
> index 2f3bf94cf365..e49cd5b8f735 100644
> --- a/drivers/gpu/host1x/hw/cdma_hw.c
> +++ b/drivers/gpu/host1x/hw/cdma_hw.c
> @@ -295,7 +295,7 @@ static void cdma_timeout_handler(struct work_struct *work)
>  /*
>   * Init timeout resources
>   */
> -static int cdma_timeout_init(struct host1x_cdma *cdma, unsigned int syncpt)
> +static int cdma_timeout_init(struct host1x_cdma *cdma)
>  {
>  	INIT_DELAYED_WORK(&cdma->timeout.wq, cdma_timeout_handler);
>  	cdma->timeout.initialized = true;
> diff --git a/drivers/gpu/host1x/hw/channel_hw.c b/drivers/gpu/host1x/hw/channel_hw.c
> index 5eaa29d171c9..d4c28faf27d1 100644
> --- a/drivers/gpu/host1x/hw/channel_hw.c
> +++ b/drivers/gpu/host1x/hw/channel_hw.c
> @@ -86,8 +86,7 @@ static void submit_gathers(struct host1x_job *job)
>  
>  static inline void synchronize_syncpt_base(struct host1x_job *job)
>  {
> -	struct host1x *host = dev_get_drvdata(job->channel->dev->parent);
> -	struct host1x_syncpt *sp = host->syncpt + job->syncpt_id;
> +	struct host1x_syncpt *sp = job->syncpt;
>  	unsigned int id;
>  	u32 value;
>  
> @@ -118,7 +117,7 @@ static void host1x_channel_set_streamid(struct host1x_channel *channel)
>  static int channel_submit(struct host1x_job *job)
>  {
>  	struct host1x_channel *ch = job->channel;
> -	struct host1x_syncpt *sp;
> +	struct host1x_syncpt *sp = job->syncpt;
>  	u32 user_syncpt_incrs = job->syncpt_incrs;
>  	u32 prev_max = 0;
>  	u32 syncval;
> @@ -126,10 +125,9 @@ static int channel_submit(struct host1x_job *job)
>  	struct host1x_waitlist *completed_waiter = NULL;
>  	struct host1x *host = dev_get_drvdata(ch->dev->parent);
>  
> -	sp = host->syncpt + job->syncpt_id;
>  	trace_host1x_channel_submit(dev_name(ch->dev),
>  				    job->num_gathers, job->num_relocs,
> -				    job->syncpt_id, job->syncpt_incrs);
> +				    job->syncpt->id, job->syncpt_incrs);
>  
>  	/* before error checks, return current max */
>  	prev_max = job->syncpt_end = host1x_syncpt_read_max(sp);
> @@ -163,7 +161,7 @@ static int channel_submit(struct host1x_job *job)
>  		host1x_cdma_push(&ch->cdma,
>  				 host1x_opcode_setclass(HOST1X_CLASS_HOST1X,
>  					host1x_uclass_wait_syncpt_r(), 1),
> -				 host1x_class_host_wait_syncpt(job->syncpt_id,
> +				 host1x_class_host_wait_syncpt(job->syncpt->id,
>  					host1x_syncpt_read_max(sp)));
>  	}
>  
> diff --git a/drivers/gpu/host1x/hw/debug_hw.c b/drivers/gpu/host1x/hw/debug_hw.c
> index f31bcfa1b837..ceb48229d14b 100644
> --- a/drivers/gpu/host1x/hw/debug_hw.c
> +++ b/drivers/gpu/host1x/hw/debug_hw.c
> @@ -204,7 +204,7 @@ static void show_channel_gathers(struct output *o, struct host1x_cdma *cdma)
>  		unsigned int i;
>  
>  		host1x_debug_output(o, "\n%p: JOB, syncpt_id=%d, syncpt_val=%d, first_get=%08x, timeout=%d num_slots=%d, num_handles=%d\n",
> -				    job, job->syncpt_id, job->syncpt_end,
> +				    job, job->syncpt->id, job->syncpt_end,
>  				    job->first_get, job->timeout,
>  				    job->num_slots, job->num_unpins);
>  
> diff --git a/drivers/gpu/host1x/job.c b/drivers/gpu/host1x/job.c
> index 82d0a60ba3f7..adbdc225de8d 100644
> --- a/drivers/gpu/host1x/job.c
> +++ b/drivers/gpu/host1x/job.c
> @@ -79,6 +79,9 @@ static void job_free(struct kref *ref)
>  {
>  	struct host1x_job *job = container_of(ref, struct host1x_job, ref);
>  
> +	if (job->syncpt)
> +		host1x_syncpt_put(job->syncpt);
> +
>  	kfree(job);
>  }
>  
> @@ -674,7 +677,7 @@ EXPORT_SYMBOL(host1x_job_unpin);
>   */
>  void host1x_job_dump(struct device *dev, struct host1x_job *job)
>  {
> -	dev_dbg(dev, "    SYNCPT_ID   %d\n", job->syncpt_id);
> +	dev_dbg(dev, "    SYNCPT_ID   %d\n", job->syncpt->id);
>  	dev_dbg(dev, "    SYNCPT_VAL  %d\n", job->syncpt_end);
>  	dev_dbg(dev, "    FIRST_GET   0x%x\n", job->first_get);
>  	dev_dbg(dev, "    TIMEOUT     %d\n", job->timeout);
> diff --git a/drivers/gpu/host1x/syncpt.c b/drivers/gpu/host1x/syncpt.c
> index 8da4bbce8b9d..7bb5de8c3d63 100644
> --- a/drivers/gpu/host1x/syncpt.c
> +++ b/drivers/gpu/host1x/syncpt.c
> @@ -90,6 +90,8 @@ struct host1x_syncpt *host1x_syncpt_alloc(struct host1x *host,
>  	else
>  		sp->client_managed = false;
>  
> +	kref_init(&sp->ref);
> +
>  	mutex_unlock(&host->syncpt_mutex);
>  	return sp;
>  
> @@ -383,7 +385,7 @@ int host1x_syncpt_init(struct host1x *host)
>   * host1x client drivers can use this function to allocate a syncpoint for
>   * subsequent use. A syncpoint returned by this function will be reserved for
>   * use by the client exclusively. When no longer using a syncpoint, a host1x
> - * client driver needs to release it using host1x_syncpt_free().
> + * client driver needs to release it using host1x_syncpt_put().
>   */
>  struct host1x_syncpt *host1x_syncpt_request(struct host1x_client *client,
>  					    unsigned long flags)
> @@ -394,20 +396,9 @@ struct host1x_syncpt *host1x_syncpt_request(struct host1x_client *client,
>  }
>  EXPORT_SYMBOL(host1x_syncpt_request);
>  
> -/**
> - * host1x_syncpt_free() - free a requested syncpoint
> - * @sp: host1x syncpoint
> - *
> - * Release a syncpoint previously allocated using host1x_syncpt_request(). A
> - * host1x client driver should call this when the syncpoint is no longer in
> - * use. Note that client drivers must ensure that the syncpoint doesn't remain
> - * under the control of hardware after calling this function, otherwise two
> - * clients may end up trying to access the same syncpoint concurrently.
> - */
> -void host1x_syncpt_free(struct host1x_syncpt *sp)
> +static void syncpt_release(struct kref *ref)
>  {
> -	if (!sp)
> -		return;
> +	struct host1x_syncpt *sp = container_of(ref, struct host1x_syncpt, ref);
>  
>  	mutex_lock(&sp->host->syncpt_mutex);
>  
> @@ -419,7 +410,23 @@ void host1x_syncpt_free(struct host1x_syncpt *sp)
>  
>  	mutex_unlock(&sp->host->syncpt_mutex);
>  }
> -EXPORT_SYMBOL(host1x_syncpt_free);
> +
> +/**
> + * host1x_syncpt_put() - free a requested syncpoint
> + * @sp: host1x syncpoint
> + *
> + * Release a syncpoint previously allocated using host1x_syncpt_request(). A
> + * host1x client driver should call this when the syncpoint is no longer in
> + * use.
> + */
> +void host1x_syncpt_put(struct host1x_syncpt *sp)
> +{
> +	if (!sp)
> +		return;
> +
> +	kref_put(&sp->ref, syncpt_release);
> +}
> +EXPORT_SYMBOL(host1x_syncpt_put);
>  
>  void host1x_syncpt_deinit(struct host1x *host)
>  {
> @@ -486,16 +493,48 @@ unsigned int host1x_syncpt_nb_mlocks(struct host1x *host)
>  }
>  
>  /**
> - * host1x_syncpt_get() - obtain a syncpoint by ID
> + * host1x_syncpt_get_by_id() - obtain a syncpoint by ID
> + * @host: host1x controller
> + * @id: syncpoint ID
> + */
> +struct host1x_syncpt *host1x_syncpt_get_by_id(struct host1x *host,
> +					      unsigned int id)
> +{
> +	if (id >= host->info->nb_pts)
> +		return NULL;
> +
> +	if (kref_get_unless_zero(&host->syncpt[id].ref))
> +		return &host->syncpt[id];
> +	else
> +		return NULL;
> +}
> +EXPORT_SYMBOL(host1x_syncpt_get_by_id);
> +
> +/**
> + * host1x_syncpt_get_by_id_noref() - obtain a syncpoint by ID but don't
> + * 	increase the refcount.
>   * @host: host1x controller
>   * @id: syncpoint ID
>   */
> -struct host1x_syncpt *host1x_syncpt_get(struct host1x *host, unsigned int id)
> +struct host1x_syncpt *host1x_syncpt_get_by_id_noref(struct host1x *host,
> +						    unsigned int id)
>  {
>  	if (id >= host->info->nb_pts)
>  		return NULL;
>  
> -	return host->syncpt + id;
> +	return &host->syncpt[id];
> +}
> +EXPORT_SYMBOL(host1x_syncpt_get_by_id_noref);
> +
> +/**
> + * host1x_syncpt_get() - increment syncpoint refcount
> + * @sp: syncpoint
> + */
> +struct host1x_syncpt *host1x_syncpt_get(struct host1x_syncpt *sp)
> +{
> +	kref_get(&sp->ref);
> +
> +	return sp;
>  }
>  EXPORT_SYMBOL(host1x_syncpt_get);
>  
> diff --git a/drivers/gpu/host1x/syncpt.h b/drivers/gpu/host1x/syncpt.h
> index 3aa6b25b1b9c..a6766f8d55ee 100644
> --- a/drivers/gpu/host1x/syncpt.h
> +++ b/drivers/gpu/host1x/syncpt.h
> @@ -11,6 +11,7 @@
>  #include <linux/atomic.h>
>  #include <linux/host1x.h>
>  #include <linux/kernel.h>
> +#include <linux/kref.h>
>  #include <linux/sched.h>
>  
>  #include "intr.h"
> @@ -26,6 +27,8 @@ struct host1x_syncpt_base {
>  };
>  
>  struct host1x_syncpt {
> +	struct kref ref;
> +
>  	unsigned int id;
>  	atomic_t min_val;
>  	atomic_t max_val;
> diff --git a/drivers/staging/media/tegra-video/vi.c b/drivers/staging/media/tegra-video/vi.c
> index 7a09061cda57..7e0cb5529b49 100644
> --- a/drivers/staging/media/tegra-video/vi.c
> +++ b/drivers/staging/media/tegra-video/vi.c
> @@ -1131,8 +1131,8 @@ static void tegra_channel_host1x_syncpts_free(struct tegra_vi_channel *chan)
>  	int i;
>  
>  	for (i = 0; i < chan->numgangports; i++) {
> -		host1x_syncpt_free(chan->mw_ack_sp[i]);
> -		host1x_syncpt_free(chan->frame_start_sp[i]);
> +		host1x_syncpt_put(chan->mw_ack_sp[i]);
> +		host1x_syncpt_put(chan->frame_start_sp[i]);
>  	}
>  }
>  
> diff --git a/include/linux/host1x.h b/include/linux/host1x.h
> index 7137ce0e35d4..107aea29bccb 100644
> --- a/include/linux/host1x.h
> +++ b/include/linux/host1x.h
> @@ -142,7 +142,9 @@ struct host1x_syncpt_base;
>  struct host1x_syncpt;
>  struct host1x;
>  
> -struct host1x_syncpt *host1x_syncpt_get(struct host1x *host, u32 id);
> +struct host1x_syncpt *host1x_syncpt_get_by_id(struct host1x *host, u32 id);
> +struct host1x_syncpt *host1x_syncpt_get_by_id_noref(struct host1x *host, u32 id);
> +struct host1x_syncpt *host1x_syncpt_get(struct host1x_syncpt *sp);
>  u32 host1x_syncpt_id(struct host1x_syncpt *sp);
>  u32 host1x_syncpt_read_min(struct host1x_syncpt *sp);
>  u32 host1x_syncpt_read_max(struct host1x_syncpt *sp);
> @@ -153,7 +155,7 @@ int host1x_syncpt_wait(struct host1x_syncpt *sp, u32 thresh, long timeout,
>  		       u32 *value);
>  struct host1x_syncpt *host1x_syncpt_request(struct host1x_client *client,
>  					    unsigned long flags);
> -void host1x_syncpt_free(struct host1x_syncpt *sp);
> +void host1x_syncpt_put(struct host1x_syncpt *sp);
>  struct host1x_syncpt *host1x_syncpt_alloc(struct host1x *host,
>  					  unsigned long flags,
>  					  const char *name);
> @@ -221,7 +223,7 @@ struct host1x_job {
>  	dma_addr_t *reloc_addr_phys;
>  
>  	/* Sync point id, number of increments and end related to the submit */
> -	u32 syncpt_id;
> +	struct host1x_syncpt *syncpt;
>  	u32 syncpt_incrs;
>  	u32 syncpt_end;
>  
> 

Tested-by: Dmitry Osipenko <digetx@gmail.com>
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH v6 05/10] gpu: host1x: Use HW-equivalent syncpoint expiration check
  2021-03-29 13:38   ` Mikko Perttunen
@ 2021-03-30 21:12     ` Dmitry Osipenko
  -1 siblings, 0 replies; 35+ messages in thread
From: Dmitry Osipenko @ 2021-03-30 21:12 UTC (permalink / raw)
  To: Mikko Perttunen, thierry.reding, jonathanh; +Cc: linux-tegra, dri-devel

29.03.2021 16:38, Mikko Perttunen пишет:
> Make syncpoint expiration checks always use the same logic used by
> the hardware. This ensures that there are no race conditions that
> could occur because of the hardware triggering a syncpoint interrupt
> and then the driver disagreeing.
> 
> One situation where this could occur is if a job incremented a
> syncpoint too many times -- then the hardware would trigger an
> interrupt, but the driver would assume that a syncpoint value
> greater than the syncpoint's max value is in the future, and not
> clean up the job.
> 
> Signed-off-by: Mikko Perttunen <mperttunen@nvidia.com>
> ---
>  drivers/gpu/host1x/syncpt.c | 51 ++-----------------------------------
>  1 file changed, 2 insertions(+), 49 deletions(-)
> 
> diff --git a/drivers/gpu/host1x/syncpt.c b/drivers/gpu/host1x/syncpt.c
> index f061dfd5bbc7..8da4bbce8b9d 100644
> --- a/drivers/gpu/host1x/syncpt.c
> +++ b/drivers/gpu/host1x/syncpt.c
> @@ -321,59 +321,12 @@ EXPORT_SYMBOL(host1x_syncpt_wait);
>  bool host1x_syncpt_is_expired(struct host1x_syncpt *sp, u32 thresh)
>  {
>  	u32 current_val;
> -	u32 future_val;
>  
>  	smp_rmb();
>  
>  	current_val = (u32)atomic_read(&sp->min_val);
> -	future_val = (u32)atomic_read(&sp->max_val);
> -
> -	/* Note the use of unsigned arithmetic here (mod 1<<32).
> -	 *
> -	 * c = current_val = min_val	= the current value of the syncpoint.
> -	 * t = thresh			= the value we are checking
> -	 * f = future_val  = max_val	= the value c will reach when all
> -	 *				  outstanding increments have completed.
> -	 *
> -	 * Note that c always chases f until it reaches f.
> -	 *
> -	 * Dtf = (f - t)
> -	 * Dtc = (c - t)
> -	 *
> -	 *  Consider all cases:
> -	 *
> -	 *	A) .....c..t..f.....	Dtf < Dtc	need to wait
> -	 *	B) .....c.....f..t..	Dtf > Dtc	expired
> -	 *	C) ..t..c.....f.....	Dtf > Dtc	expired	   (Dct very large)
> -	 *
> -	 *  Any case where f==c: always expired (for any t).	Dtf == Dcf
> -	 *  Any case where t==c: always expired (for any f).	Dtf >= Dtc (because Dtc==0)
> -	 *  Any case where t==f!=c: always wait.		Dtf <  Dtc (because Dtf==0,
> -	 *							Dtc!=0)
> -	 *
> -	 *  Other cases:
> -	 *
> -	 *	A) .....t..f..c.....	Dtf < Dtc	need to wait
> -	 *	A) .....f..c..t.....	Dtf < Dtc	need to wait
> -	 *	A) .....f..t..c.....	Dtf > Dtc	expired
> -	 *
> -	 *   So:
> -	 *	   Dtf >= Dtc implies EXPIRED	(return true)
> -	 *	   Dtf <  Dtc implies WAIT	(return false)
> -	 *
> -	 * Note: If t is expired then we *cannot* wait on it. We would wait
> -	 * forever (hang the system).
> -	 *
> -	 * Note: do NOT get clever and remove the -thresh from both sides. It
> -	 * is NOT the same.
> -	 *
> -	 * If future valueis zero, we have a client managed sync point. In that
> -	 * case we do a direct comparison.
> -	 */
> -	if (!host1x_syncpt_client_managed(sp))
> -		return future_val - thresh >= current_val - thresh;
> -	else
> -		return (s32)(current_val - thresh) >= 0;
> +
> +	return ((current_val - thresh) & 0x80000000U) == 0U;
>  }
>  
>  int host1x_syncpt_init(struct host1x *host)
> 

Tested-by: Dmitry Osipenko <digetx@gmail.com>

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH v6 05/10] gpu: host1x: Use HW-equivalent syncpoint expiration check
@ 2021-03-30 21:12     ` Dmitry Osipenko
  0 siblings, 0 replies; 35+ messages in thread
From: Dmitry Osipenko @ 2021-03-30 21:12 UTC (permalink / raw)
  To: Mikko Perttunen, thierry.reding, jonathanh; +Cc: linux-tegra, dri-devel

29.03.2021 16:38, Mikko Perttunen пишет:
> Make syncpoint expiration checks always use the same logic used by
> the hardware. This ensures that there are no race conditions that
> could occur because of the hardware triggering a syncpoint interrupt
> and then the driver disagreeing.
> 
> One situation where this could occur is if a job incremented a
> syncpoint too many times -- then the hardware would trigger an
> interrupt, but the driver would assume that a syncpoint value
> greater than the syncpoint's max value is in the future, and not
> clean up the job.
> 
> Signed-off-by: Mikko Perttunen <mperttunen@nvidia.com>
> ---
>  drivers/gpu/host1x/syncpt.c | 51 ++-----------------------------------
>  1 file changed, 2 insertions(+), 49 deletions(-)
> 
> diff --git a/drivers/gpu/host1x/syncpt.c b/drivers/gpu/host1x/syncpt.c
> index f061dfd5bbc7..8da4bbce8b9d 100644
> --- a/drivers/gpu/host1x/syncpt.c
> +++ b/drivers/gpu/host1x/syncpt.c
> @@ -321,59 +321,12 @@ EXPORT_SYMBOL(host1x_syncpt_wait);
>  bool host1x_syncpt_is_expired(struct host1x_syncpt *sp, u32 thresh)
>  {
>  	u32 current_val;
> -	u32 future_val;
>  
>  	smp_rmb();
>  
>  	current_val = (u32)atomic_read(&sp->min_val);
> -	future_val = (u32)atomic_read(&sp->max_val);
> -
> -	/* Note the use of unsigned arithmetic here (mod 1<<32).
> -	 *
> -	 * c = current_val = min_val	= the current value of the syncpoint.
> -	 * t = thresh			= the value we are checking
> -	 * f = future_val  = max_val	= the value c will reach when all
> -	 *				  outstanding increments have completed.
> -	 *
> -	 * Note that c always chases f until it reaches f.
> -	 *
> -	 * Dtf = (f - t)
> -	 * Dtc = (c - t)
> -	 *
> -	 *  Consider all cases:
> -	 *
> -	 *	A) .....c..t..f.....	Dtf < Dtc	need to wait
> -	 *	B) .....c.....f..t..	Dtf > Dtc	expired
> -	 *	C) ..t..c.....f.....	Dtf > Dtc	expired	   (Dct very large)
> -	 *
> -	 *  Any case where f==c: always expired (for any t).	Dtf == Dcf
> -	 *  Any case where t==c: always expired (for any f).	Dtf >= Dtc (because Dtc==0)
> -	 *  Any case where t==f!=c: always wait.		Dtf <  Dtc (because Dtf==0,
> -	 *							Dtc!=0)
> -	 *
> -	 *  Other cases:
> -	 *
> -	 *	A) .....t..f..c.....	Dtf < Dtc	need to wait
> -	 *	A) .....f..c..t.....	Dtf < Dtc	need to wait
> -	 *	A) .....f..t..c.....	Dtf > Dtc	expired
> -	 *
> -	 *   So:
> -	 *	   Dtf >= Dtc implies EXPIRED	(return true)
> -	 *	   Dtf <  Dtc implies WAIT	(return false)
> -	 *
> -	 * Note: If t is expired then we *cannot* wait on it. We would wait
> -	 * forever (hang the system).
> -	 *
> -	 * Note: do NOT get clever and remove the -thresh from both sides. It
> -	 * is NOT the same.
> -	 *
> -	 * If future valueis zero, we have a client managed sync point. In that
> -	 * case we do a direct comparison.
> -	 */
> -	if (!host1x_syncpt_client_managed(sp))
> -		return future_val - thresh >= current_val - thresh;
> -	else
> -		return (s32)(current_val - thresh) >= 0;
> +
> +	return ((current_val - thresh) & 0x80000000U) == 0U;
>  }
>  
>  int host1x_syncpt_init(struct host1x *host)
> 

Tested-by: Dmitry Osipenko <digetx@gmail.com>
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH v6 03/10] gpu: host1x: Show number of pending waiters in debugfs
  2021-03-29 13:38   ` Mikko Perttunen
@ 2021-03-30 21:12     ` Dmitry Osipenko
  -1 siblings, 0 replies; 35+ messages in thread
From: Dmitry Osipenko @ 2021-03-30 21:12 UTC (permalink / raw)
  To: Mikko Perttunen, thierry.reding, jonathanh; +Cc: linux-tegra, dri-devel

29.03.2021 16:38, Mikko Perttunen пишет:
> Show the number of pending waiters in the debugfs status file.
> This is useful for testing to verify that waiters do not leak
> or accumulate incorrectly.
> 
> Signed-off-by: Mikko Perttunen <mperttunen@nvidia.com>
> ---
>  drivers/gpu/host1x/debug.c | 14 +++++++++++---
>  1 file changed, 11 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/gpu/host1x/debug.c b/drivers/gpu/host1x/debug.c
> index 1b4997bda1c7..8a14880c61bb 100644
> --- a/drivers/gpu/host1x/debug.c
> +++ b/drivers/gpu/host1x/debug.c
> @@ -69,6 +69,7 @@ static int show_channel(struct host1x_channel *ch, void *data, bool show_fifo)
>  
>  static void show_syncpts(struct host1x *m, struct output *o)
>  {
> +	struct list_head *pos;
>  	unsigned int i;
>  
>  	host1x_debug_output(o, "---- syncpts ----\n");
> @@ -76,12 +77,19 @@ static void show_syncpts(struct host1x *m, struct output *o)
>  	for (i = 0; i < host1x_syncpt_nb_pts(m); i++) {
>  		u32 max = host1x_syncpt_read_max(m->syncpt + i);
>  		u32 min = host1x_syncpt_load(m->syncpt + i);
> +		unsigned int waiters = 0;
>  
> -		if (!min && !max)
> +		spin_lock(&m->syncpt[i].intr.lock);
> +		list_for_each(pos, &m->syncpt[i].intr.wait_head)
> +			waiters++;
> +		spin_unlock(&m->syncpt[i].intr.lock);
> +
> +		if (!min && !max && !waiters)
>  			continue;
>  
> -		host1x_debug_output(o, "id %u (%s) min %d max %d\n",
> -				    i, m->syncpt[i].name, min, max);
> +		host1x_debug_output(o,
> +				    "id %u (%s) min %d max %d (%d waiters)\n",
> +				    i, m->syncpt[i].name, min, max, waiters);
>  	}
>  
>  	for (i = 0; i < host1x_syncpt_nb_bases(m); i++) {
> 

Tested-by: Dmitry Osipenko <digetx@gmail.com>

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH v6 03/10] gpu: host1x: Show number of pending waiters in debugfs
@ 2021-03-30 21:12     ` Dmitry Osipenko
  0 siblings, 0 replies; 35+ messages in thread
From: Dmitry Osipenko @ 2021-03-30 21:12 UTC (permalink / raw)
  To: Mikko Perttunen, thierry.reding, jonathanh; +Cc: linux-tegra, dri-devel

29.03.2021 16:38, Mikko Perttunen пишет:
> Show the number of pending waiters in the debugfs status file.
> This is useful for testing to verify that waiters do not leak
> or accumulate incorrectly.
> 
> Signed-off-by: Mikko Perttunen <mperttunen@nvidia.com>
> ---
>  drivers/gpu/host1x/debug.c | 14 +++++++++++---
>  1 file changed, 11 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/gpu/host1x/debug.c b/drivers/gpu/host1x/debug.c
> index 1b4997bda1c7..8a14880c61bb 100644
> --- a/drivers/gpu/host1x/debug.c
> +++ b/drivers/gpu/host1x/debug.c
> @@ -69,6 +69,7 @@ static int show_channel(struct host1x_channel *ch, void *data, bool show_fifo)
>  
>  static void show_syncpts(struct host1x *m, struct output *o)
>  {
> +	struct list_head *pos;
>  	unsigned int i;
>  
>  	host1x_debug_output(o, "---- syncpts ----\n");
> @@ -76,12 +77,19 @@ static void show_syncpts(struct host1x *m, struct output *o)
>  	for (i = 0; i < host1x_syncpt_nb_pts(m); i++) {
>  		u32 max = host1x_syncpt_read_max(m->syncpt + i);
>  		u32 min = host1x_syncpt_load(m->syncpt + i);
> +		unsigned int waiters = 0;
>  
> -		if (!min && !max)
> +		spin_lock(&m->syncpt[i].intr.lock);
> +		list_for_each(pos, &m->syncpt[i].intr.wait_head)
> +			waiters++;
> +		spin_unlock(&m->syncpt[i].intr.lock);
> +
> +		if (!min && !max && !waiters)
>  			continue;
>  
> -		host1x_debug_output(o, "id %u (%s) min %d max %d\n",
> -				    i, m->syncpt[i].name, min, max);
> +		host1x_debug_output(o,
> +				    "id %u (%s) min %d max %d (%d waiters)\n",
> +				    i, m->syncpt[i].name, min, max, waiters);
>  	}
>  
>  	for (i = 0; i < host1x_syncpt_nb_bases(m); i++) {
> 

Tested-by: Dmitry Osipenko <digetx@gmail.com>
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 35+ messages in thread

end of thread, other threads:[~2021-03-30 21:13 UTC | newest]

Thread overview: 35+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-03-29 13:38 [PATCH v6 00/10] Fixes and cleanups for Host1x Mikko Perttunen
2021-03-29 13:38 ` Mikko Perttunen
2021-03-29 13:38 ` [PATCH v6 01/10] gpu: host1x: Use different lock classes for each client Mikko Perttunen
2021-03-29 13:38   ` Mikko Perttunen
2021-03-29 13:38 ` [PATCH v6 02/10] gpu: host1x: Allow syncpoints without associated client Mikko Perttunen
2021-03-29 13:38   ` Mikko Perttunen
2021-03-29 13:38 ` [PATCH v6 03/10] gpu: host1x: Show number of pending waiters in debugfs Mikko Perttunen
2021-03-29 13:38   ` Mikko Perttunen
2021-03-30 21:12   ` Dmitry Osipenko
2021-03-30 21:12     ` Dmitry Osipenko
2021-03-29 13:38 ` [PATCH v6 04/10] gpu: host1x: Remove cancelled waiters immediately Mikko Perttunen
2021-03-29 13:38   ` Mikko Perttunen
2021-03-29 20:27   ` Dmitry Osipenko
2021-03-29 20:27     ` Dmitry Osipenko
2021-03-29 21:36     ` Mikko Perttunen
2021-03-29 21:36       ` Mikko Perttunen
2021-03-30  1:35       ` Dmitry Osipenko
2021-03-30  1:35         ` Dmitry Osipenko
2021-03-29 13:38 ` [PATCH v6 05/10] gpu: host1x: Use HW-equivalent syncpoint expiration check Mikko Perttunen
2021-03-29 13:38   ` Mikko Perttunen
2021-03-30 21:12   ` Dmitry Osipenko
2021-03-30 21:12     ` Dmitry Osipenko
2021-03-29 13:38 ` [PATCH v6 06/10] gpu: host1x: Cleanup and refcounting for syncpoints Mikko Perttunen
2021-03-29 13:38   ` Mikko Perttunen
2021-03-29 23:42   ` kernel test robot
2021-03-30 21:12   ` Dmitry Osipenko
2021-03-30 21:12     ` Dmitry Osipenko
2021-03-29 13:38 ` [PATCH v6 07/10] gpu: host1x: Reset max value when freeing a syncpoint Mikko Perttunen
2021-03-29 13:38   ` Mikko Perttunen
2021-03-29 13:38 ` [PATCH v6 08/10] gpu: host1x: Reserve VBLANK syncpoints at initialization Mikko Perttunen
2021-03-29 13:38   ` Mikko Perttunen
2021-03-29 13:38 ` [PATCH v6 09/10] gpu: host1x: Assign intr waiter inside lock Mikko Perttunen
2021-03-29 13:38   ` Mikko Perttunen
2021-03-29 13:38 ` [PATCH v6 10/10] gpu: host1x: Fix Tegra194 syncpt interrupt threshold Mikko Perttunen
2021-03-29 13:38   ` Mikko Perttunen

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.