linux-kernel.vger.kernel.org archive mirror
* [PATCH v7 0/4] drm/vc4: Binner BO management improvements
@ 2019-04-25 12:29 Paul Kocialkowski
  2019-04-25 12:29 ` [PATCH v7 1/4] drm/vc4: Reformat and the binner bo allocation helper Paul Kocialkowski
                   ` (3 more replies)
  0 siblings, 4 replies; 8+ messages in thread
From: Paul Kocialkowski @ 2019-04-25 12:29 UTC (permalink / raw)
  To: dri-devel, linux-kernel
  Cc: Eric Anholt, David Airlie, Daniel Vetter, Maxime Ripard,
	Eben Upton, Paul Kocialkowski

Changes since v6:
* Removed vc4_v3d_bin_bo_put from error paths;
* Added WARN_ON_ONCE when no bin BO at refcount release.

Changes since v5:
* Fix more locking mistakes;
* Introduce get/put helpers;
* Grabbed a reference when submitting an exec job with a binner slot;
* Addressed misc comments.

Changes since v4:
* Used a kref on the binner bo instead of firstopen/lastclose;
* Added a mutex to prevent race conditions;
* Took care of enabling the OOM interrupt when we have a binner BO allocated.

Changes since v3:
* Split changes into more commits when possible;
* Reworked binner bo alloc condition as discussed.

Changes since v2:
* Removed deprecated sentence about firstopen;
* Added collected Reviewed-By tags.

Changes since v1:
* Squashed the two final patches into one.

Paul Kocialkowski (4):
  drm/vc4: Reformat and the binner bo allocation helper
  drm/vc4: Check for V3D before binner bo alloc
  drm/vc4: Check for the binner bo before handling OOM interrupt
  drm/vc4: Allocate binner bo when starting to use the V3D

 drivers/gpu/drm/vc4/vc4_bo.c  | 33 ++++++++++++++++-
 drivers/gpu/drm/vc4/vc4_drv.c |  6 ++++
 drivers/gpu/drm/vc4/vc4_drv.h | 14 ++++++++
 drivers/gpu/drm/vc4/vc4_gem.c | 13 +++++++
 drivers/gpu/drm/vc4/vc4_irq.c | 20 ++++++++---
 drivers/gpu/drm/vc4/vc4_v3d.c | 68 ++++++++++++++++++++++++++---------
 6 files changed, 132 insertions(+), 22 deletions(-)

-- 
2.21.0



* [PATCH v7 1/4] drm/vc4: Reformat and the binner bo allocation helper
  2019-04-25 12:29 [PATCH v7 0/4] drm/vc4: Binner BO management improvements Paul Kocialkowski
@ 2019-04-25 12:29 ` Paul Kocialkowski
  2019-04-25 12:29 ` [PATCH v7 2/4] drm/vc4: Check for V3D before binner bo alloc Paul Kocialkowski
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 8+ messages in thread
From: Paul Kocialkowski @ 2019-04-25 12:29 UTC (permalink / raw)
  To: dri-devel, linux-kernel
  Cc: Eric Anholt, David Airlie, Daniel Vetter, Maxime Ripard,
	Eben Upton, Paul Kocialkowski

In preparation for wrapping the binner bo allocation helper with
put/get helpers, pass the vc4 dev directly and drop the vc4 prefix.

Signed-off-by: Paul Kocialkowski <paul.kocialkowski@bootlin.com>
---
 drivers/gpu/drm/vc4/vc4_v3d.c | 11 +++++------
 1 file changed, 5 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/vc4/vc4_v3d.c b/drivers/gpu/drm/vc4/vc4_v3d.c
index a4b6859e3af6..7c490106e185 100644
--- a/drivers/gpu/drm/vc4/vc4_v3d.c
+++ b/drivers/gpu/drm/vc4/vc4_v3d.c
@@ -213,7 +213,7 @@ int vc4_v3d_get_bin_slot(struct vc4_dev *vc4)
 }
 
 /**
- * vc4_allocate_bin_bo() - allocates the memory that will be used for
+ * bin_bo_alloc() - allocates the memory that will be used for
  * tile binning.
  *
  * The binner has a limitation that the addresses in the tile state
@@ -234,9 +234,8 @@ int vc4_v3d_get_bin_slot(struct vc4_dev *vc4)
  * overall CMA pool before they make scenes complicated enough to run
  * out of bin space.
  */
-static int vc4_allocate_bin_bo(struct drm_device *drm)
+static int bin_bo_alloc(struct vc4_dev *vc4)
 {
-	struct vc4_dev *vc4 = to_vc4_dev(drm);
 	struct vc4_v3d *v3d = vc4->v3d;
 	uint32_t size = 16 * 1024 * 1024;
 	int ret = 0;
@@ -251,7 +250,7 @@ static int vc4_allocate_bin_bo(struct drm_device *drm)
 	INIT_LIST_HEAD(&list);
 
 	while (true) {
-		struct vc4_bo *bo = vc4_bo_create(drm, size, true,
+		struct vc4_bo *bo = vc4_bo_create(vc4->dev, size, true,
 						  VC4_BO_TYPE_BIN);
 
 		if (IS_ERR(bo)) {
@@ -333,7 +332,7 @@ static int vc4_v3d_runtime_resume(struct device *dev)
 	struct vc4_dev *vc4 = v3d->vc4;
 	int ret;
 
-	ret = vc4_allocate_bin_bo(vc4->dev);
+	ret = bin_bo_alloc(vc4);
 	if (ret)
 		return ret;
 
@@ -403,7 +402,7 @@ static int vc4_v3d_bind(struct device *dev, struct device *master, void *data)
 	if (ret != 0)
 		return ret;
 
-	ret = vc4_allocate_bin_bo(drm);
+	ret = bin_bo_alloc(vc4);
 	if (ret) {
 		clk_disable_unprepare(v3d->clk);
 		return ret;
-- 
2.21.0



* [PATCH v7 2/4] drm/vc4: Check for V3D before binner bo alloc
  2019-04-25 12:29 [PATCH v7 0/4] drm/vc4: Binner BO management improvements Paul Kocialkowski
  2019-04-25 12:29 ` [PATCH v7 1/4] drm/vc4: Reformat and the binner bo allocation helper Paul Kocialkowski
@ 2019-04-25 12:29 ` Paul Kocialkowski
  2019-04-25 12:29 ` [PATCH v7 3/4] drm/vc4: Check for the binner bo before handling OOM interrupt Paul Kocialkowski
  2019-04-25 12:29 ` [PATCH v7 4/4] drm/vc4: Allocate binner bo when starting to use the V3D Paul Kocialkowski
  3 siblings, 0 replies; 8+ messages in thread
From: Paul Kocialkowski @ 2019-04-25 12:29 UTC (permalink / raw)
  To: dri-devel, linux-kernel
  Cc: Eric Anholt, David Airlie, Daniel Vetter, Maxime Ripard,
	Eben Upton, Paul Kocialkowski

Check that we have a V3D device registered before attempting to
allocate a binner buffer object.

Signed-off-by: Paul Kocialkowski <paul.kocialkowski@bootlin.com>
---
 drivers/gpu/drm/vc4/vc4_v3d.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/gpu/drm/vc4/vc4_v3d.c b/drivers/gpu/drm/vc4/vc4_v3d.c
index 7c490106e185..c16db4665af6 100644
--- a/drivers/gpu/drm/vc4/vc4_v3d.c
+++ b/drivers/gpu/drm/vc4/vc4_v3d.c
@@ -241,6 +241,9 @@ static int bin_bo_alloc(struct vc4_dev *vc4)
 	int ret = 0;
 	struct list_head list;
 
+	if (!v3d)
+		return -ENODEV;
+
 	/* We may need to try allocating more than once to get a BO
 	 * that doesn't cross 256MB.  Track the ones we've allocated
 	 * that failed so far, so that we can free them when we've got
-- 
2.21.0
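
The comment in the hunk above mentions retrying until the BO "doesn't cross 256MB". A minimal sketch of how such a check could be written, assuming 32-bit bus addresses; the names fits_in_256mb_window and WINDOW_MASK are hypothetical and are not taken from the driver:

```c
#include <stdbool.h>
#include <stdint.h>

/* The top four address bits select one of sixteen 256MB windows. */
#define WINDOW_MASK 0xf0000000u

/*
 * Returns true when [paddr, paddr + size) lies entirely inside one
 * 256MB-aligned window. size must be non-zero and the range must not
 * wrap around the 32-bit address space.
 */
static bool fits_in_256mb_window(uint32_t paddr, uint32_t size)
{
	uint32_t last = paddr + size - 1;

	/* First and last byte share a window iff their top bits match. */
	return (paddr & WINDOW_MASK) == (last & WINDOW_MASK);
}
```

A 16MB buffer starting at 0x0f000000 ends at 0x0fffffff and fits; the same buffer starting at 0x0ff00000 would end at 0x10efffff and cross into the next window, so an allocator following the comment's approach would set it aside and try again.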



* [PATCH v7 3/4] drm/vc4: Check for the binner bo before handling OOM interrupt
  2019-04-25 12:29 [PATCH v7 0/4] drm/vc4: Binner BO management improvements Paul Kocialkowski
  2019-04-25 12:29 ` [PATCH v7 1/4] drm/vc4: Reformat and the binner bo allocation helper Paul Kocialkowski
  2019-04-25 12:29 ` [PATCH v7 2/4] drm/vc4: Check for V3D before binner bo alloc Paul Kocialkowski
@ 2019-04-25 12:29 ` Paul Kocialkowski
  2019-04-25 12:29 ` [PATCH v7 4/4] drm/vc4: Allocate binner bo when starting to use the V3D Paul Kocialkowski
  3 siblings, 0 replies; 8+ messages in thread
From: Paul Kocialkowski @ 2019-04-25 12:29 UTC (permalink / raw)
  To: dri-devel, linux-kernel
  Cc: Eric Anholt, David Airlie, Daniel Vetter, Maxime Ripard,
	Eben Upton, Paul Kocialkowski

Since the OOM interrupt directly deals with the binner bo, it doesn't
make sense to try and handle it without a binner buffer registered.

Signed-off-by: Paul Kocialkowski <paul.kocialkowski@bootlin.com>
---
 drivers/gpu/drm/vc4/vc4_irq.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/gpu/drm/vc4/vc4_irq.c b/drivers/gpu/drm/vc4/vc4_irq.c
index ffd0a4388752..723dc86b4511 100644
--- a/drivers/gpu/drm/vc4/vc4_irq.c
+++ b/drivers/gpu/drm/vc4/vc4_irq.c
@@ -64,6 +64,9 @@ vc4_overflow_mem_work(struct work_struct *work)
 	struct vc4_exec_info *exec;
 	unsigned long irqflags;
 
+	if (!bo)
+		return;
+
 	bin_bo_slot = vc4_v3d_get_bin_slot(vc4);
 	if (bin_bo_slot < 0) {
 		DRM_ERROR("Couldn't allocate binner overflow mem\n");
-- 
2.21.0



* [PATCH v7 4/4] drm/vc4: Allocate binner bo when starting to use the V3D
  2019-04-25 12:29 [PATCH v7 0/4] drm/vc4: Binner BO management improvements Paul Kocialkowski
                   ` (2 preceding siblings ...)
  2019-04-25 12:29 ` [PATCH v7 3/4] drm/vc4: Check for the binner bo before handling OOM interrupt Paul Kocialkowski
@ 2019-04-25 12:29 ` Paul Kocialkowski
  2019-04-25 17:42   ` Eric Anholt
  3 siblings, 1 reply; 8+ messages in thread
From: Paul Kocialkowski @ 2019-04-25 12:29 UTC (permalink / raw)
  To: dri-devel, linux-kernel
  Cc: Eric Anholt, David Airlie, Daniel Vetter, Maxime Ripard,
	Eben Upton, Paul Kocialkowski

The binner BO is not required until the V3D is in use, so avoid
allocating it at probe and do it on the first non-dumb BO allocation.

Keep track of which clients are using the V3D and liberate the buffer
when there is none left, using a kref. Protect the logic with a
mutex to avoid race conditions.

The binner BO is created at the time of the first render ioctl and is
destroyed when there is no client and no exec job using it left.

The Out-Of-Memory (OOM) interrupt also gets some tweaking, to avoid
enabling it before having allocated a binner bo.

We also want to keep the BO alive during runtime suspend/resume to avoid
failing to allocate it at resume. This happens when the CMA pool is
full at that point and results in a hard crash.

Signed-off-by: Paul Kocialkowski <paul.kocialkowski@bootlin.com>
---
 drivers/gpu/drm/vc4/vc4_bo.c  | 33 +++++++++++++++++++-
 drivers/gpu/drm/vc4/vc4_drv.c |  6 ++++
 drivers/gpu/drm/vc4/vc4_drv.h | 14 +++++++++
 drivers/gpu/drm/vc4/vc4_gem.c | 13 ++++++++
 drivers/gpu/drm/vc4/vc4_irq.c | 21 +++++++++----
 drivers/gpu/drm/vc4/vc4_v3d.c | 58 +++++++++++++++++++++++++++--------
 6 files changed, 125 insertions(+), 20 deletions(-)

diff --git a/drivers/gpu/drm/vc4/vc4_bo.c b/drivers/gpu/drm/vc4/vc4_bo.c
index 88ebd681d7eb..2b3ec5926fe2 100644
--- a/drivers/gpu/drm/vc4/vc4_bo.c
+++ b/drivers/gpu/drm/vc4/vc4_bo.c
@@ -799,13 +799,38 @@ vc4_prime_import_sg_table(struct drm_device *dev,
 	return obj;
 }
 
+static int vc4_grab_bin_bo(struct vc4_dev *vc4, struct vc4_file *vc4file)
+{
+	int ret;
+
+	if (!vc4->v3d)
+		return -ENODEV;
+
+	if (vc4file->bin_bo_used)
+		return 0;
+
+	ret = vc4_v3d_bin_bo_get(vc4);
+	if (ret)
+		return ret;
+
+	vc4file->bin_bo_used = true;
+
+	return 0;
+}
+
 int vc4_create_bo_ioctl(struct drm_device *dev, void *data,
 			struct drm_file *file_priv)
 {
 	struct drm_vc4_create_bo *args = data;
+	struct vc4_file *vc4file = file_priv->driver_priv;
+	struct vc4_dev *vc4 = to_vc4_dev(dev);
 	struct vc4_bo *bo = NULL;
 	int ret;
 
+	ret = vc4_grab_bin_bo(vc4, vc4file);
+	if (ret)
+		return ret;
+
 	/*
 	 * We can't allocate from the BO cache, because the BOs don't
 	 * get zeroed, and that might leak data between users.
@@ -846,6 +871,8 @@ vc4_create_shader_bo_ioctl(struct drm_device *dev, void *data,
 			   struct drm_file *file_priv)
 {
 	struct drm_vc4_create_shader_bo *args = data;
+	struct vc4_file *vc4file = file_priv->driver_priv;
+	struct vc4_dev *vc4 = to_vc4_dev(dev);
 	struct vc4_bo *bo = NULL;
 	int ret;
 
@@ -865,6 +892,10 @@ vc4_create_shader_bo_ioctl(struct drm_device *dev, void *data,
 		return -EINVAL;
 	}
 
+	ret = vc4_grab_bin_bo(vc4, vc4file);
+	if (ret)
+		return ret;
+
 	bo = vc4_bo_create(dev, args->size, true, VC4_BO_TYPE_V3D_SHADER);
 	if (IS_ERR(bo))
 		return PTR_ERR(bo);
@@ -894,7 +925,7 @@ vc4_create_shader_bo_ioctl(struct drm_device *dev, void *data,
 	 */
 	ret = drm_gem_handle_create(file_priv, &bo->base.base, &args->handle);
 
- fail:
+fail:
 	drm_gem_object_put_unlocked(&bo->base.base);
 
 	return ret;
diff --git a/drivers/gpu/drm/vc4/vc4_drv.c b/drivers/gpu/drm/vc4/vc4_drv.c
index 6d9be20a32be..0f99ad03614e 100644
--- a/drivers/gpu/drm/vc4/vc4_drv.c
+++ b/drivers/gpu/drm/vc4/vc4_drv.c
@@ -128,8 +128,12 @@ static int vc4_open(struct drm_device *dev, struct drm_file *file)
 
 static void vc4_close(struct drm_device *dev, struct drm_file *file)
 {
+	struct vc4_dev *vc4 = to_vc4_dev(dev);
 	struct vc4_file *vc4file = file->driver_priv;
 
+	if (vc4file->bin_bo_used)
+		vc4_v3d_bin_bo_put(vc4);
+
 	vc4_perfmon_close_file(vc4file);
 	kfree(vc4file);
 }
@@ -274,6 +278,8 @@ static int vc4_drm_bind(struct device *dev)
 	drm->dev_private = vc4;
 	INIT_LIST_HEAD(&vc4->debugfs_list);
 
+	mutex_init(&vc4->bin_bo_lock);
+
 	ret = vc4_bo_cache_init(drm);
 	if (ret)
 		goto dev_put;
diff --git a/drivers/gpu/drm/vc4/vc4_drv.h b/drivers/gpu/drm/vc4/vc4_drv.h
index 4f13f6262491..5bfca83deb8e 100644
--- a/drivers/gpu/drm/vc4/vc4_drv.h
+++ b/drivers/gpu/drm/vc4/vc4_drv.h
@@ -216,6 +216,11 @@ struct vc4_dev {
 	 * the minor is available (after drm_dev_register()).
 	 */
 	struct list_head debugfs_list;
+
+	/* Mutex for binner bo allocation. */
+	struct mutex bin_bo_lock;
+	/* Reference count for our binner bo. */
+	struct kref bin_bo_kref;
 };
 
 static inline struct vc4_dev *
@@ -584,6 +589,11 @@ struct vc4_exec_info {
 	 * NULL otherwise.
 	 */
 	struct vc4_perfmon *perfmon;
+
+	/* Whether the exec has taken a reference to the binner BO, which should
+	 * happen with a VC4_PACKET_TILE_BINNING_MODE_CONFIG packet.
+	 */
+	bool bin_bo_used;
 };
 
 /* Per-open file private data. Any driver-specific resource that has to be
@@ -594,6 +604,8 @@ struct vc4_file {
 		struct idr idr;
 		struct mutex lock;
 	} perfmon;
+
+	bool bin_bo_used;
 };
 
 static inline struct vc4_exec_info *
@@ -833,6 +845,8 @@ void vc4_plane_async_set_fb(struct drm_plane *plane,
 extern struct platform_driver vc4_v3d_driver;
 extern const struct of_device_id vc4_v3d_dt_match[];
 int vc4_v3d_get_bin_slot(struct vc4_dev *vc4);
+int vc4_v3d_bin_bo_get(struct vc4_dev *vc4);
+void vc4_v3d_bin_bo_put(struct vc4_dev *vc4);
 int vc4_v3d_pm_get(struct vc4_dev *vc4);
 void vc4_v3d_pm_put(struct vc4_dev *vc4);
 
diff --git a/drivers/gpu/drm/vc4/vc4_gem.c b/drivers/gpu/drm/vc4/vc4_gem.c
index d9311be32a4f..6219e9368ef2 100644
--- a/drivers/gpu/drm/vc4/vc4_gem.c
+++ b/drivers/gpu/drm/vc4/vc4_gem.c
@@ -820,6 +820,7 @@ static int
 vc4_get_bcl(struct drm_device *dev, struct vc4_exec_info *exec)
 {
 	struct drm_vc4_submit_cl *args = exec->args;
+	struct vc4_dev *vc4 = to_vc4_dev(dev);
 	void *temp = NULL;
 	void *bin;
 	int ret = 0;
@@ -918,6 +919,14 @@ vc4_get_bcl(struct drm_device *dev, struct vc4_exec_info *exec)
 	if (ret)
 		goto fail;
 
+	if (exec->found_tile_binning_mode_config_packet) {
+		ret = vc4_v3d_bin_bo_get(vc4);
+		if (ret)
+			goto fail;
+
+		exec->bin_bo_used = true;
+	}
+
 	/* Block waiting on any previous rendering into the CS's VBO,
 	 * IB, or textures, so that pixels are actually written by the
 	 * time we try to read them.
@@ -966,6 +975,10 @@ vc4_complete_exec(struct drm_device *dev, struct vc4_exec_info *exec)
 	vc4->bin_alloc_used &= ~exec->bin_slots;
 	spin_unlock_irqrestore(&vc4->job_lock, irqflags);
 
+	/* Release the reference on the binner BO if needed. */
+	if (exec->bin_bo_used)
+		vc4_v3d_bin_bo_put(vc4);
+
 	/* Release the reference we had on the perf monitor. */
 	vc4_perfmon_put(exec->perfmon);
 
diff --git a/drivers/gpu/drm/vc4/vc4_irq.c b/drivers/gpu/drm/vc4/vc4_irq.c
index 723dc86b4511..e226c24e543f 100644
--- a/drivers/gpu/drm/vc4/vc4_irq.c
+++ b/drivers/gpu/drm/vc4/vc4_irq.c
@@ -59,18 +59,22 @@ vc4_overflow_mem_work(struct work_struct *work)
 {
 	struct vc4_dev *vc4 =
 		container_of(work, struct vc4_dev, overflow_mem_work);
-	struct vc4_bo *bo = vc4->bin_bo;
+	struct vc4_bo *bo;
 	int bin_bo_slot;
 	struct vc4_exec_info *exec;
 	unsigned long irqflags;
 
-	if (!bo)
-		return;
+	mutex_lock(&vc4->bin_bo_lock);
+
+	if (!vc4->bin_bo)
+		goto complete;
+
+	bo = vc4->bin_bo;
 
 	bin_bo_slot = vc4_v3d_get_bin_slot(vc4);
 	if (bin_bo_slot < 0) {
 		DRM_ERROR("Couldn't allocate binner overflow mem\n");
-		return;
+		goto complete;
 	}
 
 	spin_lock_irqsave(&vc4->job_lock, irqflags);
@@ -101,6 +105,9 @@ vc4_overflow_mem_work(struct work_struct *work)
 	V3D_WRITE(V3D_INTCTL, V3D_INT_OUTOMEM);
 	V3D_WRITE(V3D_INTENA, V3D_INT_OUTOMEM);
 	spin_unlock_irqrestore(&vc4->job_lock, irqflags);
+
+complete:
+	mutex_unlock(&vc4->bin_bo_lock);
 }
 
 static void
@@ -252,8 +259,10 @@ vc4_irq_postinstall(struct drm_device *dev)
 	if (!vc4->v3d)
 		return 0;
 
-	/* Enable both the render done and out of memory interrupts. */
-	V3D_WRITE(V3D_INTENA, V3D_DRIVER_IRQS);
+	/* Enable the render done interrupts. The out-of-memory interrupt is
+	 * enabled as soon as we have a binner BO allocated.
+	 */
+	V3D_WRITE(V3D_INTENA, V3D_INT_FLDONE | V3D_INT_FRDONE);
 
 	return 0;
 }
diff --git a/drivers/gpu/drm/vc4/vc4_v3d.c b/drivers/gpu/drm/vc4/vc4_v3d.c
index c16db4665af6..4ede8a920aa8 100644
--- a/drivers/gpu/drm/vc4/vc4_v3d.c
+++ b/drivers/gpu/drm/vc4/vc4_v3d.c
@@ -294,6 +294,14 @@ static int bin_bo_alloc(struct vc4_dev *vc4)
 			WARN_ON_ONCE(sizeof(vc4->bin_alloc_used) * 8 !=
 				     bo->base.base.size / vc4->bin_alloc_size);
 
+			kref_init(&vc4->bin_bo_kref);
+
+			/* Enable the out-of-memory interrupt to set our
+			 * newly-allocated binner BO, potentially from an
+			 * already-pending-but-masked interrupt.
+			 */
+			V3D_WRITE(V3D_INTENA, V3D_INT_OUTOMEM);
+
 			break;
 		}
 
@@ -313,6 +321,43 @@ static int bin_bo_alloc(struct vc4_dev *vc4)
 	return ret;
 }
 
+int vc4_v3d_bin_bo_get(struct vc4_dev *vc4)
+{
+	int ret = 0;
+
+	mutex_lock(&vc4->bin_bo_lock);
+
+	if (vc4->bin_bo) {
+		kref_get(&vc4->bin_bo_kref);
+		goto complete;
+	}
+
+	ret = bin_bo_alloc(vc4);
+
+complete:
+	mutex_unlock(&vc4->bin_bo_lock);
+
+	return ret;
+}
+
+static void bin_bo_release(struct kref *ref)
+{
+	struct vc4_dev *vc4 = container_of(ref, struct vc4_dev, bin_bo_kref);
+
+	if (WARN_ON_ONCE(!vc4->bin_bo))
+		return;
+
+	drm_gem_object_put_unlocked(&vc4->bin_bo->base.base);
+	vc4->bin_bo = NULL;
+}
+
+void vc4_v3d_bin_bo_put(struct vc4_dev *vc4)
+{
+	mutex_lock(&vc4->bin_bo_lock);
+	kref_put(&vc4->bin_bo_kref, bin_bo_release);
+	mutex_unlock(&vc4->bin_bo_lock);
+}
+
 #ifdef CONFIG_PM
 static int vc4_v3d_runtime_suspend(struct device *dev)
 {
@@ -321,9 +366,6 @@ static int vc4_v3d_runtime_suspend(struct device *dev)
 
 	vc4_irq_uninstall(vc4->dev);
 
-	drm_gem_object_put_unlocked(&vc4->bin_bo->base.base);
-	vc4->bin_bo = NULL;
-
 	clk_disable_unprepare(v3d->clk);
 
 	return 0;
@@ -335,10 +377,6 @@ static int vc4_v3d_runtime_resume(struct device *dev)
 	struct vc4_dev *vc4 = v3d->vc4;
 	int ret;
 
-	ret = bin_bo_alloc(vc4);
-	if (ret)
-		return ret;
-
 	ret = clk_prepare_enable(v3d->clk);
 	if (ret != 0)
 		return ret;
@@ -405,12 +443,6 @@ static int vc4_v3d_bind(struct device *dev, struct device *master, void *data)
 	if (ret != 0)
 		return ret;
 
-	ret = bin_bo_alloc(vc4);
-	if (ret) {
-		clk_disable_unprepare(v3d->clk);
-		return ret;
-	}
-
 	/* Reset the binner overflow address/size at setup, to be sure
 	 * we don't reuse an old one.
 	 */
-- 
2.21.0



* Re: [PATCH v7 4/4] drm/vc4: Allocate binner bo when starting to use the V3D
  2019-04-25 12:29 ` [PATCH v7 4/4] drm/vc4: Allocate binner bo when starting to use the V3D Paul Kocialkowski
@ 2019-04-25 17:42   ` Eric Anholt
  2019-05-02 12:02     ` Paul Kocialkowski
  0 siblings, 1 reply; 8+ messages in thread
From: Eric Anholt @ 2019-04-25 17:42 UTC (permalink / raw)
  To: Paul Kocialkowski, dri-devel, linux-kernel
  Cc: David Airlie, Daniel Vetter, Maxime Ripard, Eben Upton,
	Paul Kocialkowski


Paul Kocialkowski <paul.kocialkowski@bootlin.com> writes:

> The binner BO is not required until the V3D is in use, so avoid
> allocating it at probe and do it on the first non-dumb BO allocation.
>
> Keep track of which clients are using the V3D and liberate the buffer
> when there is none left, using a kref. Protect the logic with a
> mutex to avoid race conditions.
>
> The binner BO is created at the time of the first render ioctl and is
> destroyed when there is no client and no exec job using it left.
>
> The Out-Of-Memory (OOM) interrupt also gets some tweaking, to avoid
> enabling it before having allocated a binner bo.
>
> We also want to keep the BO alive during runtime suspend/resume to avoid
> failing to allocate it at resume. This happens when the CMA pool is
> full at that point and results in a hard crash.
>
> Signed-off-by: Paul Kocialkowski <paul.kocialkowski@bootlin.com>
> ---
>  drivers/gpu/drm/vc4/vc4_bo.c  | 33 +++++++++++++++++++-
>  drivers/gpu/drm/vc4/vc4_drv.c |  6 ++++
>  drivers/gpu/drm/vc4/vc4_drv.h | 14 +++++++++
>  drivers/gpu/drm/vc4/vc4_gem.c | 13 ++++++++
>  drivers/gpu/drm/vc4/vc4_irq.c | 21 +++++++++----
>  drivers/gpu/drm/vc4/vc4_v3d.c | 58 +++++++++++++++++++++++++++--------
>  6 files changed, 125 insertions(+), 20 deletions(-)
>
> diff --git a/drivers/gpu/drm/vc4/vc4_bo.c b/drivers/gpu/drm/vc4/vc4_bo.c
> index 88ebd681d7eb..2b3ec5926fe2 100644
> --- a/drivers/gpu/drm/vc4/vc4_bo.c
> +++ b/drivers/gpu/drm/vc4/vc4_bo.c
> @@ -799,13 +799,38 @@ vc4_prime_import_sg_table(struct drm_device *dev,
>  	return obj;
>  }
>  
> +static int vc4_grab_bin_bo(struct vc4_dev *vc4, struct vc4_file *vc4file)
> +{
> +	int ret;
> +
> +	if (!vc4->v3d)
> +		return -ENODEV;
> +
> +	if (vc4file->bin_bo_used)
> +		return 0;
> +
> +	ret = vc4_v3d_bin_bo_get(vc4);
> +	if (ret)
> +		return ret;
> +


> +	vc4file->bin_bo_used = true;

I think I found one last race.  Multiple threads could be in an ioctl
trying to grab the bin BO at the same time (while this is only during
app startup, since the fd only needs to get the ref once, it's
particularly plausible given that allocating the bin BO is slow).  I
think if you replace this line with:

	mutex_lock(&vc4->bin_bo_lock);
	if (vc4file->bin_bo_used) {
		mutex_unlock(&vc4->bin_bo_lock);
		vc4_v3d_bin_bo_put(vc4);
	} else {
		vc4file->bin_bo_used = true;
		mutex_unlock(&vc4->bin_bo_lock);
	}

that will be the last change we need.  If you agree with this, feel free
to squash it in and apply the series with:

Reviewed-by: Eric Anholt <eric@anholt.net>
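
The race described above can be replayed in a small userspace model. This is only a sketch under stated assumptions: pthread mutexes and a plain counter stand in for the kernel mutex and kref, malloc stands in for the BO allocation, and every race_* name is hypothetical. The "take a reference" and "mark the file as user" steps are split into separate functions so that the interleaving of two threads can be reproduced deterministically:

```c
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>
#include <stdlib.h>

struct race_dev {
	pthread_mutex_t lock;	/* models vc4->bin_bo_lock */
	void *bin_bo;		/* models vc4->bin_bo */
	unsigned int refs;	/* models vc4->bin_bo_kref */
};

struct race_file {
	bool bin_bo_used;	/* models vc4file->bin_bo_used */
};

#define RACE_DEV_INIT { .lock = PTHREAD_MUTEX_INITIALIZER }

/* Allocate on first use, otherwise take one more reference. */
static int race_get(struct race_dev *dev)
{
	int ret = 0;

	pthread_mutex_lock(&dev->lock);
	if (dev->bin_bo) {
		dev->refs++;
	} else {
		dev->bin_bo = malloc(64);
		if (dev->bin_bo)
			dev->refs = 1;
		else
			ret = -ENOMEM;
	}
	pthread_mutex_unlock(&dev->lock);

	return ret;
}

/* Drop one reference and free the buffer when none are left. */
static void race_put(struct race_dev *dev)
{
	pthread_mutex_lock(&dev->lock);
	if (dev->refs > 0 && --dev->refs == 0) {
		free(dev->bin_bo);
		dev->bin_bo = NULL;
	}
	pthread_mutex_unlock(&dev->lock);
}

/*
 * The suggested fix: re-check the flag under the lock. A thread that
 * lost the race finds the flag already set and gives back its extra
 * reference instead of leaking it.
 */
static void race_mark_used(struct race_dev *dev, struct race_file *file)
{
	pthread_mutex_lock(&dev->lock);
	if (file->bin_bo_used) {
		pthread_mutex_unlock(&dev->lock);
		race_put(dev);
	} else {
		file->bin_bo_used = true;
		pthread_mutex_unlock(&dev->lock);
	}
}
```

If two threads on the same fd both call race_get() before either sets the flag, the count reaches 2. The first race_mark_used() sets the flag, the second drops the surplus reference, and the single put at close time then frees the buffer.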

> +
> +	return 0;
> +}
> +
>  int vc4_create_bo_ioctl(struct drm_device *dev, void *data,
>  			struct drm_file *file_priv)
>  {
>  	struct drm_vc4_create_bo *args = data;
> +	struct vc4_file *vc4file = file_priv->driver_priv;
> +	struct vc4_dev *vc4 = to_vc4_dev(dev);
>  	struct vc4_bo *bo = NULL;
>  	int ret;
>  
> +	ret = vc4_grab_bin_bo(vc4, vc4file);
> +	if (ret)
> +		return ret;
> +
>  	/*
>  	 * We can't allocate from the BO cache, because the BOs don't
>  	 * get zeroed, and that might leak data between users.
> @@ -846,6 +871,8 @@ vc4_create_shader_bo_ioctl(struct drm_device *dev, void *data,
>  			   struct drm_file *file_priv)
>  {
>  	struct drm_vc4_create_shader_bo *args = data;
> +	struct vc4_file *vc4file = file_priv->driver_priv;
> +	struct vc4_dev *vc4 = to_vc4_dev(dev);
>  	struct vc4_bo *bo = NULL;
>  	int ret;
>  
> @@ -865,6 +892,10 @@ vc4_create_shader_bo_ioctl(struct drm_device *dev, void *data,
>  		return -EINVAL;
>  	}
>  
> +	ret = vc4_grab_bin_bo(vc4, vc4file);
> +	if (ret)
> +		return ret;
> +
>  	bo = vc4_bo_create(dev, args->size, true, VC4_BO_TYPE_V3D_SHADER);
>  	if (IS_ERR(bo))
>  		return PTR_ERR(bo);
> @@ -894,7 +925,7 @@ vc4_create_shader_bo_ioctl(struct drm_device *dev, void *data,
>  	 */
>  	ret = drm_gem_handle_create(file_priv, &bo->base.base, &args->handle);
>  
> - fail:
> +fail:
>  	drm_gem_object_put_unlocked(&bo->base.base);
>  
>  	return ret;

Extraneous whitespace change?



* Re: [PATCH v7 4/4] drm/vc4: Allocate binner bo when starting to use the V3D
  2019-04-25 17:42   ` Eric Anholt
@ 2019-05-02 12:02     ` Paul Kocialkowski
  2019-05-02 18:27       ` Eric Anholt
  0 siblings, 1 reply; 8+ messages in thread
From: Paul Kocialkowski @ 2019-05-02 12:02 UTC (permalink / raw)
  To: Eric Anholt, dri-devel, linux-kernel
  Cc: David Airlie, Daniel Vetter, Maxime Ripard, Eben Upton

Hi,

On Thu, 2019-04-25 at 10:42 -0700, Eric Anholt wrote:
> Paul Kocialkowski <paul.kocialkowski@bootlin.com> writes:
> 
> > The binner BO is not required until the V3D is in use, so avoid
> > allocating it at probe and do it on the first non-dumb BO allocation.
> > 
> > Keep track of which clients are using the V3D and liberate the buffer
> > when there is none left, using a kref. Protect the logic with a
> > mutex to avoid race conditions.
> > 
> > The binner BO is created at the time of the first render ioctl and is
> > destroyed when there is no client and no exec job using it left.
> > 
> > The Out-Of-Memory (OOM) interrupt also gets some tweaking, to avoid
> > enabling it before having allocated a binner bo.
> > 
> > We also want to keep the BO alive during runtime suspend/resume to avoid
> > failing to allocate it at resume. This happens when the CMA pool is
> > full at that point and results in a hard crash.
> > 
> > Signed-off-by: Paul Kocialkowski <paul.kocialkowski@bootlin.com>
> > ---
> >  drivers/gpu/drm/vc4/vc4_bo.c  | 33 +++++++++++++++++++-
> >  drivers/gpu/drm/vc4/vc4_drv.c |  6 ++++
> >  drivers/gpu/drm/vc4/vc4_drv.h | 14 +++++++++
> >  drivers/gpu/drm/vc4/vc4_gem.c | 13 ++++++++
> >  drivers/gpu/drm/vc4/vc4_irq.c | 21 +++++++++----
> >  drivers/gpu/drm/vc4/vc4_v3d.c | 58 +++++++++++++++++++++++++++--------
> >  6 files changed, 125 insertions(+), 20 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/vc4/vc4_bo.c b/drivers/gpu/drm/vc4/vc4_bo.c
> > index 88ebd681d7eb..2b3ec5926fe2 100644
> > --- a/drivers/gpu/drm/vc4/vc4_bo.c
> > +++ b/drivers/gpu/drm/vc4/vc4_bo.c
> > @@ -799,13 +799,38 @@ vc4_prime_import_sg_table(struct drm_device *dev,
> >  	return obj;
> >  }
> >  
> > +static int vc4_grab_bin_bo(struct vc4_dev *vc4, struct vc4_file *vc4file)
> > +{
> > +	int ret;
> > +
> > +	if (!vc4->v3d)
> > +		return -ENODEV;
> > +
> > +	if (vc4file->bin_bo_used)
> > +		return 0;
> > +
> > +	ret = vc4_v3d_bin_bo_get(vc4);
> > +	if (ret)
> > +		return ret;
> > +
> > +	vc4file->bin_bo_used = true;
> 
> I think I found one last race.  Multiple threads could be in an ioctl
> trying to grab the bin BO at the same time (while this is only during
> app startup, since the fd only needs to get the ref once, it's
> particularly plausible given that allocating the bin BO is slow).  I
> think if you replace this line with:
> 
> 	mutex_lock(&vc4->bin_bo_lock);
>         if (vc4file->bin_bo_used) {
>         	mutex_unlock(&vc4->bin_bo_lock);
>                 vc4_v3d_bin_bo_put(vc4);
>         } else {
>         	vc4file->bin_bo_used = true;
>         	mutex_unlock(&vc4->bin_bo_lock);
>         }

Huh, very good catch once again, thanks! It took me some time to grasp
this one, but as far as I understand, the risk is that we could ref our
bin bo twice (although it would only be allocated once) since
bin_bo_used is not protected.

I'd like to suggest another solution, which would avoid re-locking and
doing an extra put if we got an extra ref: adding a "bool *used"
argument to vc4_v3d_bin_bo_get, which only gets dereferenced with
the bin_bo lock held. Then we can skip obtaining a new reference if
(used && *used) in vc4_v3d_bin_bo_get.

So we could pass a pointer to vc4file->bin_bo_used for vc4_grab_bin_bo
and exec->bin_bo_used for the exec case (where there is no such issue
since we'll only ever try to _get the bin bo once there anyway).
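
A rough userspace sketch of this proposal, not the eventual driver code: a pthread mutex and a plain counter stand in for bin_bo_lock and the kref, malloc stands in for bin_bo_alloc, and all sketch_* names are hypothetical. The point it illustrates is that the caller's flag is read and written only while the lock is held:

```c
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>
#include <stdlib.h>

struct sketch_dev {
	pthread_mutex_t lock;	/* models vc4->bin_bo_lock */
	void *bin_bo;		/* models vc4->bin_bo */
	unsigned int refs;	/* models vc4->bin_bo_kref */
};

#define SKETCH_DEV_INIT { .lock = PTHREAD_MUTEX_INITIALIZER }

/*
 * Take a reference on the binner BO unless *used says this caller
 * already holds one. used may be NULL for callers that do not track
 * their reference with a flag.
 */
static int sketch_bin_bo_get(struct sketch_dev *dev, bool *used)
{
	int ret = 0;

	pthread_mutex_lock(&dev->lock);

	/* Flag checked under the lock: no second reference is taken. */
	if (used && *used)
		goto complete;

	if (dev->bin_bo) {
		dev->refs++;
	} else {
		dev->bin_bo = malloc(64);
		if (dev->bin_bo)
			dev->refs = 1;
		else
			ret = -ENOMEM;
	}

	/* Flag set under the same lock, and only on success. */
	if (ret == 0 && used)
		*used = true;

complete:
	pthread_mutex_unlock(&dev->lock);

	return ret;
}
```

With this shape, two threads racing on the same fd serialize on the lock: the first takes the reference and sets the flag, and the second sees the flag and returns without touching the count, so no compensating put is needed.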

What do you think?

Cheers,

Paul

> that will be the last change we need.  If you agree with this, feel free
> to squash it in and apply the series with:
> 
> Reviewed-by: Eric Anholt <eric@anholt.net>
> 
> > +
> > +	return 0;
> > +}
> > +
> >  int vc4_create_bo_ioctl(struct drm_device *dev, void *data,
> >  			struct drm_file *file_priv)
> >  {
> >  	struct drm_vc4_create_bo *args = data;
> > +	struct vc4_file *vc4file = file_priv->driver_priv;
> > +	struct vc4_dev *vc4 = to_vc4_dev(dev);
> >  	struct vc4_bo *bo = NULL;
> >  	int ret;
> >  
> > +	ret = vc4_grab_bin_bo(vc4, vc4file);
> > +	if (ret)
> > +		return ret;
> > +
> >  	/*
> >  	 * We can't allocate from the BO cache, because the BOs don't
> >  	 * get zeroed, and that might leak data between users.
> > @@ -846,6 +871,8 @@ vc4_create_shader_bo_ioctl(struct drm_device *dev, void *data,
> >  			   struct drm_file *file_priv)
> >  {
> >  	struct drm_vc4_create_shader_bo *args = data;
> > +	struct vc4_file *vc4file = file_priv->driver_priv;
> > +	struct vc4_dev *vc4 = to_vc4_dev(dev);
> >  	struct vc4_bo *bo = NULL;
> >  	int ret;
> >  
> > @@ -865,6 +892,10 @@ vc4_create_shader_bo_ioctl(struct drm_device *dev, void *data,
> >  		return -EINVAL;
> >  	}
> >  
> > +	ret = vc4_grab_bin_bo(vc4, vc4file);
> > +	if (ret)
> > +		return ret;
> > +
> >  	bo = vc4_bo_create(dev, args->size, true, VC4_BO_TYPE_V3D_SHADER);
> >  	if (IS_ERR(bo))
> >  		return PTR_ERR(bo);
> > @@ -894,7 +925,7 @@ vc4_create_shader_bo_ioctl(struct drm_device *dev, void *data,
> >  	 */
> >  	ret = drm_gem_handle_create(file_priv, &bo->base.base, &args->handle);
> >  
> > - fail:
> > +fail:
> >  	drm_gem_object_put_unlocked(&bo->base.base);
> >  
> >  	return ret;
> 
> Extraneous whitespace change?
-- 
Paul Kocialkowski, Bootlin
Embedded Linux and kernel engineering
https://bootlin.com



* Re: [PATCH v7 4/4] drm/vc4: Allocate binner bo when starting to use the V3D
  2019-05-02 12:02     ` Paul Kocialkowski
@ 2019-05-02 18:27       ` Eric Anholt
  0 siblings, 0 replies; 8+ messages in thread
From: Eric Anholt @ 2019-05-02 18:27 UTC (permalink / raw)
  To: Paul Kocialkowski, dri-devel, linux-kernel
  Cc: David Airlie, Daniel Vetter, Maxime Ripard, Eben Upton


Paul Kocialkowski <paul.kocialkowski@bootlin.com> writes:

> Hi,
>
> On Thu, 2019-04-25 at 10:42 -0700, Eric Anholt wrote:
>> Paul Kocialkowski <paul.kocialkowski@bootlin.com> writes:
>> 
>> > The binner BO is not required until the V3D is in use, so avoid
>> > allocating it at probe and do it on the first non-dumb BO allocation.
>> > 
>> > Keep track of which clients are using the V3D and liberate the buffer
>> > when there is none left, using a kref. Protect the logic with a
>> > mutex to avoid race conditions.
>> > 
>> > The binner BO is created at the time of the first render ioctl and is
>> > destroyed when there is no client and no exec job using it left.
>> > 
>> > The Out-Of-Memory (OOM) interrupt also gets some tweaking, to avoid
>> > enabling it before having allocated a binner bo.
>> > 
>> > We also want to keep the BO alive during runtime suspend/resume to avoid
>> > failing to allocate it at resume. This happens when the CMA pool is
>> > full at that point and results in a hard crash.
>> > 
>> > Signed-off-by: Paul Kocialkowski <paul.kocialkowski@bootlin.com>
>> > ---
>> >  drivers/gpu/drm/vc4/vc4_bo.c  | 33 +++++++++++++++++++-
>> >  drivers/gpu/drm/vc4/vc4_drv.c |  6 ++++
>> >  drivers/gpu/drm/vc4/vc4_drv.h | 14 +++++++++
>> >  drivers/gpu/drm/vc4/vc4_gem.c | 13 ++++++++
>> >  drivers/gpu/drm/vc4/vc4_irq.c | 21 +++++++++----
>> >  drivers/gpu/drm/vc4/vc4_v3d.c | 58 +++++++++++++++++++++++++++--------
>> >  6 files changed, 125 insertions(+), 20 deletions(-)
>> > 
>> > diff --git a/drivers/gpu/drm/vc4/vc4_bo.c b/drivers/gpu/drm/vc4/vc4_bo.c
>> > index 88ebd681d7eb..2b3ec5926fe2 100644
>> > --- a/drivers/gpu/drm/vc4/vc4_bo.c
>> > +++ b/drivers/gpu/drm/vc4/vc4_bo.c
>> > @@ -799,13 +799,38 @@ vc4_prime_import_sg_table(struct drm_device *dev,
>> >  	return obj;
>> >  }
>> >  
>> > +static int vc4_grab_bin_bo(struct vc4_dev *vc4, struct vc4_file *vc4file)
>> > +{
>> > +	int ret;
>> > +
>> > +	if (!vc4->v3d)
>> > +		return -ENODEV;
>> > +
>> > +	if (vc4file->bin_bo_used)
>> > +		return 0;
>> > +
>> > +	ret = vc4_v3d_bin_bo_get(vc4);
>> > +	if (ret)
>> > +		return ret;
>> > +
>> > +	vc4file->bin_bo_used = true;
>> 
>> I think I found one last race.  Multiple threads could be in an ioctl
>> trying to grab the bin BO at the same time (while this is only during
>> app startup, since the fd only needs to get the ref once, it's
>> particularly plausible given that allocating the bin BO is slow).  I
>> think if you replace this line with:
>> 
>> 	mutex_lock(&vc4->bin_bo_lock);
>>         if (vc4file->bin_bo_used) {
>>         	mutex_unlock(&vc4->bin_bo_lock);
>>                 vc4_v3d_bin_bo_put(vc4);
>>         } else {
>>         	vc4file->bin_bo_used = true;
>>         	mutex_unlock(&vc4->bin_bo_lock);
>>         }
>
> Huh, very good catch once again, thanks! It took me some time to grasp
> this one, but as far as I understand, the risk is that we could ref our
> bin bo twice (although it would only be allocated once) since
> bin_bo_used is not protected.
>
> I'd like to suggest another solution, which would avoid re-locking and
> doing an extra put if we got an extra ref: adding a "bool *used"
> argument to vc4_v3d_bin_bo_get, which only gets dereferenced with
> the bin_bo lock held. Then we can skip obtaining a new reference if
> (used && *used) in vc4_v3d_bin_bo_get.
>
> So we could pass a pointer to vc4file->bin_bo_used for vc4_grab_bin_bo
> and exec->bin_bo_used for the exec case (where there is no such issue
> since we'll only ever try to _get the bin bo once there anyway).
>
> What do you think?

I like it!


