dri-devel.lists.freedesktop.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 0/9] drm/nouveau: Introduce CRC support for gf119+
@ 2020-03-18  0:40 Lyude Paul
  2020-03-18  0:40 ` [PATCH 1/9] drm/vblank: Add vblank works Lyude Paul
                   ` (8 more replies)
  0 siblings, 9 replies; 22+ messages in thread
From: Lyude Paul @ 2020-03-18  0:40 UTC (permalink / raw)
  To: nouveau, dri-devel
  Cc: Kate Stewart, David Airlie, Gerd Hoffmann, Sam Ravnborg,
	Ben Skeggs, Thomas Zimmermann, Jani Nikula, Peteris Rudzusiks,
	Sean Paul, Greg Kroah-Hartman, linux-kernel,
	Christian König, Pankaj Bharadiya, Alex Deucher,
	Nicholas Kazlauskas

Nvidia released some documentation on how CRC support works on their
GPUs, hooray!

So: this patch series implements said CRC support in nouveau, along with
adding some special debugfs interfaces for some relevant igt-gpu-tools
tests that we'll be sending in just a short bit.

This additionally adds a feature that Ville Syrjälä came up with: vblank
works. Basically, this is just a generic DRM interface that allows for
scheduling high-priority workers that start on a given vblank interrupt.
Note that while we're currently only using this in nouveau, Intel has
plans to use this for i915 as well (hence why they came up with it!).

Anyway-welcome to the future! :)

Lyude Paul (8):
  drm/nouveau/kms/nv50-: Unroll error cleanup in nv50_head_create()
  drm/nouveau/kms/nv140-: Don't modify depth in state during atomic
    commit
  drm/nouveau/kms/nv50-: Fix disabling dithering
  drm/nouveau/kms/nv50-: s/harm/armh/g
  drm/nouveau/kms/nv140-: Track wndw mappings in nv50_head_atom
  drm/nouveau/kms/nv50-: Expose nv50_outp_atom in disp.h
  drm/nouveau/kms/nv50-: Move hard-coded object handles into header
  drm/nouveau/kms/nvd9-: Add CRC support

Ville Syrjälä (1):
  drm/vblank: Add vblank works

 drivers/gpu/drm/drm_vblank.c                | 322 +++++++++
 drivers/gpu/drm/nouveau/dispnv04/crtc.c     |  25 +-
 drivers/gpu/drm/nouveau/dispnv50/Kbuild     |   4 +
 drivers/gpu/drm/nouveau/dispnv50/atom.h     |  21 +
 drivers/gpu/drm/nouveau/dispnv50/core.h     |   4 +
 drivers/gpu/drm/nouveau/dispnv50/core907d.c |   3 +
 drivers/gpu/drm/nouveau/dispnv50/core917d.c |   3 +
 drivers/gpu/drm/nouveau/dispnv50/corec37d.c |   3 +
 drivers/gpu/drm/nouveau/dispnv50/corec57d.c |   3 +
 drivers/gpu/drm/nouveau/dispnv50/crc.c      | 716 ++++++++++++++++++++
 drivers/gpu/drm/nouveau/dispnv50/crc.h      | 125 ++++
 drivers/gpu/drm/nouveau/dispnv50/crc907d.c  | 139 ++++
 drivers/gpu/drm/nouveau/dispnv50/crcc37d.c  | 153 +++++
 drivers/gpu/drm/nouveau/dispnv50/disp.c     |  65 +-
 drivers/gpu/drm/nouveau/dispnv50/disp.h     |  24 +
 drivers/gpu/drm/nouveau/dispnv50/handles.h  |  16 +
 drivers/gpu/drm/nouveau/dispnv50/head.c     | 142 +++-
 drivers/gpu/drm/nouveau/dispnv50/head.h     |  13 +-
 drivers/gpu/drm/nouveau/dispnv50/head907d.c |  14 +-
 drivers/gpu/drm/nouveau/dispnv50/headc37d.c |  27 +-
 drivers/gpu/drm/nouveau/dispnv50/headc57d.c |  20 +-
 drivers/gpu/drm/nouveau/dispnv50/wndw.c     |  15 +-
 drivers/gpu/drm/nouveau/nouveau_display.c   |  60 +-
 include/drm/drm_vblank.h                    |  34 +
 24 files changed, 1821 insertions(+), 130 deletions(-)
 create mode 100644 drivers/gpu/drm/nouveau/dispnv50/crc.c
 create mode 100644 drivers/gpu/drm/nouveau/dispnv50/crc.h
 create mode 100644 drivers/gpu/drm/nouveau/dispnv50/crc907d.c
 create mode 100644 drivers/gpu/drm/nouveau/dispnv50/crcc37d.c
 create mode 100644 drivers/gpu/drm/nouveau/dispnv50/handles.h

-- 
2.24.1

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 22+ messages in thread

* [PATCH 1/9] drm/vblank: Add vblank works
  2020-03-18  0:40 [PATCH 0/9] drm/nouveau: Introduce CRC support for gf119+ Lyude Paul
@ 2020-03-18  0:40 ` Lyude Paul
  2020-03-18 13:46   ` Daniel Vetter
  2020-03-18  0:40 ` [PATCH 2/9] drm/nouveau/kms/nv50-: Unroll error cleanup in nv50_head_create() Lyude Paul
                   ` (7 subsequent siblings)
  8 siblings, 1 reply; 22+ messages in thread
From: Lyude Paul @ 2020-03-18  0:40 UTC (permalink / raw)
  To: nouveau, dri-devel; +Cc: David Airlie, linux-kernel, Thomas Zimmermann

From: Ville Syrjälä <ville.syrjala@linux.intel.com>

Add some kind of vblank workers. The interface is similar to regular
delayed works, and also allows for re-scheduling.

Whatever hardware programming we do in the work must be fast
(must at least complete during the vblank, sometimes during
the first few scanlines of vblank), so we'll fire up a per-crtc
high priority thread for this.

[based off patches from Ville Syrjälä <ville.syrjala@linux.intel.com>,
change below to signoff later]

Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Lyude Paul <lyude@redhat.com>
---
 drivers/gpu/drm/drm_vblank.c | 322 +++++++++++++++++++++++++++++++++++
 include/drm/drm_vblank.h     |  34 ++++
 2 files changed, 356 insertions(+)

diff --git a/drivers/gpu/drm/drm_vblank.c b/drivers/gpu/drm/drm_vblank.c
index da7b0b0c1090..06c796b6c381 100644
--- a/drivers/gpu/drm/drm_vblank.c
+++ b/drivers/gpu/drm/drm_vblank.c
@@ -25,7 +25,9 @@
  */
 
 #include <linux/export.h>
+#include <linux/kthread.h>
 #include <linux/moduleparam.h>
+#include <uapi/linux/sched/types.h>
 
 #include <drm/drm_crtc.h>
 #include <drm/drm_drv.h>
@@ -91,6 +93,7 @@
 static bool
 drm_get_last_vbltimestamp(struct drm_device *dev, unsigned int pipe,
 			  ktime_t *tvblank, bool in_vblank_irq);
+static int drm_vblank_get(struct drm_device *dev, unsigned int pipe);
 
 static unsigned int drm_timestamp_precision = 20;  /* Default to 20 usecs. */
 
@@ -440,6 +443,9 @@ void drm_vblank_cleanup(struct drm_device *dev)
 			drm_core_check_feature(dev, DRIVER_MODESET));
 
 		del_timer_sync(&vblank->disable_timer);
+
+		wake_up_all(&vblank->vblank_work.work_wait);
+		kthread_stop(vblank->vblank_work.thread);
 	}
 
 	kfree(dev->vblank);
@@ -447,6 +453,108 @@ void drm_vblank_cleanup(struct drm_device *dev)
 	dev->num_crtcs = 0;
 }
 
+static int vblank_work_thread(void *data)
+{
+	struct drm_vblank_crtc *vblank = data;
+
+	while (!kthread_should_stop()) {
+		struct drm_vblank_work *work, *next;
+		LIST_HEAD(list);
+		u64 count;
+		int ret;
+
+		spin_lock_irq(&vblank->dev->event_lock);
+
+		ret = wait_event_interruptible_lock_irq(vblank->queue,
+							kthread_should_stop() ||
+							!list_empty(&vblank->vblank_work.work_list),
+							vblank->dev->event_lock);
+
+		WARN_ON(ret && !kthread_should_stop() &&
+			list_empty(&vblank->vblank_work.irq_list) &&
+			list_empty(&vblank->vblank_work.work_list));
+
+		list_for_each_entry_safe(work, next,
+					 &vblank->vblank_work.work_list,
+					 list) {
+			list_move_tail(&work->list, &list);
+			work->state = DRM_VBL_WORK_RUNNING;
+		}
+
+		spin_unlock_irq(&vblank->dev->event_lock);
+
+		if (list_empty(&list))
+			continue;
+
+		count = atomic64_read(&vblank->count);
+		list_for_each_entry(work, &list, list)
+			work->func(work, count);
+
+		spin_lock_irq(&vblank->dev->event_lock);
+
+		list_for_each_entry_safe(work, next, &list, list) {
+			if (work->reschedule) {
+				list_move_tail(&work->list,
+					       &vblank->vblank_work.irq_list);
+				drm_vblank_get(vblank->dev, vblank->pipe);
+				work->reschedule = false;
+				work->state = DRM_VBL_WORK_WAITING;
+			} else {
+				list_del_init(&work->list);
+				work->cancel = false;
+				work->state = DRM_VBL_WORK_IDLE;
+			}
+		}
+
+		spin_unlock_irq(&vblank->dev->event_lock);
+
+		wake_up_all(&vblank->vblank_work.work_wait);
+	}
+
+	return 0;
+}
+
+static void vblank_work_init(struct drm_vblank_crtc *vblank)
+{
+	struct sched_param param = {
+		.sched_priority = MAX_RT_PRIO - 1,
+	};
+	int ret;
+
+	INIT_LIST_HEAD(&vblank->vblank_work.irq_list);
+	INIT_LIST_HEAD(&vblank->vblank_work.work_list);
+	init_waitqueue_head(&vblank->vblank_work.work_wait);
+
+	vblank->vblank_work.thread =
+		kthread_run(vblank_work_thread, vblank, "card %d crtc %d",
+			    vblank->dev->primary->index, vblank->pipe);
+
+	ret = sched_setscheduler(vblank->vblank_work.thread,
+				 SCHED_FIFO, &param);
+	WARN_ON(ret);
+}
+
+/**
+ * drm_vblank_work_init - initialize a vblank work item
+ * @work: vblank work item
+ * @crtc: CRTC whose vblank will trigger the work execution
+ * @func: work function to be executed
+ *
+ * Initialize a vblank work item for a specific crtc.
+ */
+void drm_vblank_work_init(struct drm_vblank_work *work, struct drm_crtc *crtc,
+			  void (*func)(struct drm_vblank_work *work, u64 count))
+{
+	struct drm_device *dev = crtc->dev;
+	struct drm_vblank_crtc *vblank = &dev->vblank[drm_crtc_index(crtc)];
+
+	work->vblank = vblank;
+	work->state = DRM_VBL_WORK_IDLE;
+	work->func = func;
+	INIT_LIST_HEAD(&work->list);
+}
+EXPORT_SYMBOL(drm_vblank_work_init);
+
 /**
  * drm_vblank_init - initialize vblank support
  * @dev: DRM device
@@ -481,6 +589,8 @@ int drm_vblank_init(struct drm_device *dev, unsigned int num_crtcs)
 		init_waitqueue_head(&vblank->queue);
 		timer_setup(&vblank->disable_timer, vblank_disable_fn, 0);
 		seqlock_init(&vblank->seqlock);
+
+		vblank_work_init(vblank);
 	}
 
 	DRM_INFO("Supports vblank timestamp caching Rev 2 (21.10.2013).\n");
@@ -1825,6 +1935,22 @@ static void drm_handle_vblank_events(struct drm_device *dev, unsigned int pipe)
 	trace_drm_vblank_event(pipe, seq, now, high_prec);
 }
 
+static void drm_handle_vblank_works(struct drm_vblank_crtc *vblank)
+{
+	struct drm_vblank_work *work, *next;
+	u64 count = atomic64_read(&vblank->count);
+
+	list_for_each_entry_safe(work, next, &vblank->vblank_work.irq_list,
+				 list) {
+		if (!vblank_passed(count, work->count))
+			continue;
+
+		drm_vblank_put(vblank->dev, vblank->pipe);
+		list_move_tail(&work->list, &vblank->vblank_work.work_list);
+		work->state = DRM_VBL_WORK_SCHEDULED;
+	}
+}
+
 /**
  * drm_handle_vblank - handle a vblank event
  * @dev: DRM device
@@ -1866,6 +1992,7 @@ bool drm_handle_vblank(struct drm_device *dev, unsigned int pipe)
 
 	spin_unlock(&dev->vblank_time_lock);
 
+	drm_handle_vblank_works(vblank);
 	wake_up(&vblank->queue);
 
 	/* With instant-off, we defer disabling the interrupt until after
@@ -2076,3 +2203,198 @@ int drm_crtc_queue_sequence_ioctl(struct drm_device *dev, void *data,
 	kfree(e);
 	return ret;
 }
+
+/**
+ * drm_vblank_work_schedule - schedule a vblank work
+ * @work: vblank work to schedule
+ * @count: target vblank count
+ * @nextonmiss: defer until the next vblank if target vblank was missed
+ *
+ * Schedule @work for execution once the crtc vblank count reaches @count.
+ *
+ * If the crtc vblank count has already reached @count and @nextonmiss is
+ * %false the work starts to execute immediately.
+ *
+ * If the crtc vblank count has already reached @count and @nextonmiss is
+ * %true the work is deferred until the next vblank (as if @count has been
+ * specified as crtc vblank count + 1).
+ *
+ * If @work is already scheduled, this function will reschedule said work
+ * using the new @count.
+ *
+ * Returns:
+ * 0 on success, error code on failure.
+ */
+int drm_vblank_work_schedule(struct drm_vblank_work *work,
+			     u64 count, bool nextonmiss)
+{
+	struct drm_vblank_crtc *vblank = work->vblank;
+	unsigned long irqflags;
+	u64 cur_vbl;
+	int ret = 0;
+	bool rescheduling = false;
+	bool passed;
+
+	spin_lock_irqsave(&vblank->dev->event_lock, irqflags);
+
+	if (work->cancel)
+		goto out;
+
+	if (work->state == DRM_VBL_WORK_RUNNING) {
+		work->reschedule = true;
+		work->count = count;
+		goto out;
+	} else if (work->state != DRM_VBL_WORK_IDLE) {
+		if (work->count == count)
+			goto out;
+		rescheduling = true;
+	}
+
+	if (work->state != DRM_VBL_WORK_WAITING) {
+		ret = drm_vblank_get(vblank->dev, vblank->pipe);
+		if (ret)
+			goto out;
+	}
+
+	work->count = count;
+
+	cur_vbl = atomic64_read(&vblank->count);
+	passed = vblank_passed(cur_vbl, count);
+	if (passed)
+		DRM_ERROR("crtc %d vblank %llu already passed (current %llu)\n",
+			  vblank->pipe, count, cur_vbl);
+
+	if (!nextonmiss && passed) {
+		drm_vblank_put(vblank->dev, vblank->pipe);
+		if (rescheduling)
+			list_move_tail(&work->list,
+				       &vblank->vblank_work.work_list);
+		else
+			list_add_tail(&work->list,
+				      &vblank->vblank_work.work_list);
+		work->state = DRM_VBL_WORK_SCHEDULED;
+		wake_up_all(&vblank->queue);
+	} else {
+		if (rescheduling)
+			list_move_tail(&work->list,
+				       &vblank->vblank_work.irq_list);
+		else
+			list_add_tail(&work->list,
+				      &vblank->vblank_work.irq_list);
+		work->state = DRM_VBL_WORK_WAITING;
+	}
+
+ out:
+	spin_unlock_irqrestore(&vblank->dev->event_lock, irqflags);
+
+	return ret;
+}
+EXPORT_SYMBOL(drm_vblank_work_schedule);
+
+static bool vblank_work_cancel(struct drm_vblank_work *work)
+{
+	struct drm_vblank_crtc *vblank = work->vblank;
+
+	switch (work->state) {
+	case DRM_VBL_WORK_RUNNING:
+		work->cancel = true;
+		work->reschedule = false;
+		/* fall through */
+	default:
+	case DRM_VBL_WORK_IDLE:
+		return false;
+	case DRM_VBL_WORK_WAITING:
+		drm_vblank_put(vblank->dev, vblank->pipe);
+		/* fall through */
+	case DRM_VBL_WORK_SCHEDULED:
+		list_del_init(&work->list);
+		work->state = DRM_VBL_WORK_IDLE;
+		return true;
+	}
+}
+
+/**
+ * drm_vblank_work_cancel - cancel a vblank work
+ * @work: vblank work to cancel
+ *
+ * Cancel an already scheduled vblank work.
+ *
+ * On return @work may still be executing, unless the return
+ * value is %true.
+ *
+ * Returns:
+ * True if the work was cancelled before it started to excute, false otherwise.
+ */
+bool drm_vblank_work_cancel(struct drm_vblank_work *work)
+{
+	struct drm_vblank_crtc *vblank = work->vblank;
+	bool cancelled;
+
+	spin_lock_irq(&vblank->dev->event_lock);
+
+	cancelled = vblank_work_cancel(work);
+
+	spin_unlock_irq(&vblank->dev->event_lock);
+
+	return cancelled;
+}
+EXPORT_SYMBOL(drm_vblank_work_cancel);
+
+/**
+ * drm_vblank_work_cancel_sync - cancel a vblank work and wait for it to finish executing
+ * @work: vblank work to cancel
+ *
+ * Cancel an already scheduled vblank work and wait for its
+ * execution to finish.
+ *
+ * On return @work is no longer guaraneed to be executing.
+ *
+ * Returns:
+ * True if the work was cancelled before it started to excute, false otherwise.
+ */
+bool drm_vblank_work_cancel_sync(struct drm_vblank_work *work)
+{
+	struct drm_vblank_crtc *vblank = work->vblank;
+	bool cancelled;
+	long ret;
+
+	spin_lock_irq(&vblank->dev->event_lock);
+
+	cancelled = vblank_work_cancel(work);
+
+	ret = wait_event_lock_irq_timeout(vblank->vblank_work.work_wait,
+					  work->state == DRM_VBL_WORK_IDLE,
+					  vblank->dev->event_lock,
+					  10 * HZ);
+
+	spin_unlock_irq(&vblank->dev->event_lock);
+
+	WARN(!ret, "crtc %d vblank work timed out\n", vblank->pipe);
+
+	return cancelled;
+}
+EXPORT_SYMBOL(drm_vblank_work_cancel_sync);
+
+/**
+ * drm_vblank_work_flush - wait for a scheduled vblank work to finish excuting
+ * @work: vblank work to flush
+ *
+ * Wait until @work has finished executing.
+ */
+void drm_vblank_work_flush(struct drm_vblank_work *work)
+{
+	struct drm_vblank_crtc *vblank = work->vblank;
+	long ret;
+
+	spin_lock_irq(&vblank->dev->event_lock);
+
+	ret = wait_event_lock_irq_timeout(vblank->vblank_work.work_wait,
+					  work->state == DRM_VBL_WORK_IDLE,
+					  vblank->dev->event_lock,
+					  10 * HZ);
+
+	spin_unlock_irq(&vblank->dev->event_lock);
+
+	WARN(!ret, "crtc %d vblank work timed out\n", vblank->pipe);
+}
+EXPORT_SYMBOL(drm_vblank_work_flush);
diff --git a/include/drm/drm_vblank.h b/include/drm/drm_vblank.h
index dd9f5b9e56e4..ac9130f419af 100644
--- a/include/drm/drm_vblank.h
+++ b/include/drm/drm_vblank.h
@@ -203,8 +203,42 @@ struct drm_vblank_crtc {
 	 * disabling functions multiple times.
 	 */
 	bool enabled;
+
+	struct {
+		struct task_struct *thread;
+		struct list_head irq_list, work_list;
+		wait_queue_head_t work_wait;
+	} vblank_work;
+};
+
+struct drm_vblank_work {
+	u64 count;
+	struct drm_vblank_crtc *vblank;
+	void (*func)(struct drm_vblank_work *work, u64 count);
+	struct list_head list;
+	enum {
+		DRM_VBL_WORK_IDLE,
+		DRM_VBL_WORK_WAITING,
+		DRM_VBL_WORK_SCHEDULED,
+		DRM_VBL_WORK_RUNNING,
+	} state;
+	bool cancel : 1;
+	bool reschedule : 1;
 };
 
+int drm_vblank_work_schedule(struct drm_vblank_work *work,
+			     u64 count, bool nextonmiss);
+void drm_vblank_work_init(struct drm_vblank_work *work, struct drm_crtc *crtc,
+			  void (*func)(struct drm_vblank_work *work, u64 count));
+bool drm_vblank_work_cancel(struct drm_vblank_work *work);
+bool drm_vblank_work_cancel_sync(struct drm_vblank_work *work);
+void drm_vblank_work_flush(struct drm_vblank_work *work);
+
+static inline bool drm_vblank_work_pending(struct drm_vblank_work *work)
+{
+	return work->state != DRM_VBL_WORK_IDLE;
+}
+
 int drm_vblank_init(struct drm_device *dev, unsigned int num_crtcs);
 bool drm_dev_has_vblank(const struct drm_device *dev);
 u64 drm_crtc_vblank_count(struct drm_crtc *crtc);
-- 
2.24.1

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH 2/9] drm/nouveau/kms/nv50-: Unroll error cleanup in nv50_head_create()
  2020-03-18  0:40 [PATCH 0/9] drm/nouveau: Introduce CRC support for gf119+ Lyude Paul
  2020-03-18  0:40 ` [PATCH 1/9] drm/vblank: Add vblank works Lyude Paul
@ 2020-03-18  0:40 ` Lyude Paul
  2020-03-18  0:41 ` [PATCH 3/9] drm/nouveau/kms/nv140-: Don't modify depth in state during atomic commit Lyude Paul
                   ` (6 subsequent siblings)
  8 siblings, 0 replies; 22+ messages in thread
From: Lyude Paul @ 2020-03-18  0:40 UTC (permalink / raw)
  To: nouveau, dri-devel
  Cc: David Airlie, linux-kernel, Ben Skeggs, Thomas Zimmermann

We'll be rolling back more things in this function, and the way it's
structured is a bit confusing. So, let's clean this up a bit and just
unroll in the event of failure.

Signed-off-by: Lyude Paul <lyude@redhat.com>
---
 drivers/gpu/drm/nouveau/dispnv50/head.c | 33 +++++++++++++++++--------
 1 file changed, 23 insertions(+), 10 deletions(-)

diff --git a/drivers/gpu/drm/nouveau/dispnv50/head.c b/drivers/gpu/drm/nouveau/dispnv50/head.c
index 8f6455697ba7..e29ea40e7c33 100644
--- a/drivers/gpu/drm/nouveau/dispnv50/head.c
+++ b/drivers/gpu/drm/nouveau/dispnv50/head.c
@@ -507,20 +507,28 @@ nv50_head_create(struct drm_device *dev, int index)
 
 	if (disp->disp->object.oclass < GV100_DISP) {
 		ret = nv50_base_new(drm, head->base.index, &base);
+		if (ret)
+			goto fail_free;
+
 		ret = nv50_ovly_new(drm, head->base.index, &ovly);
+		if (ret)
+			goto fail_free;
 	} else {
 		ret = nv50_wndw_new(drm, DRM_PLANE_TYPE_PRIMARY,
 				    head->base.index * 2 + 0, &base);
+		if (ret)
+			goto fail_free;
+
 		ret = nv50_wndw_new(drm, DRM_PLANE_TYPE_OVERLAY,
 				    head->base.index * 2 + 1, &ovly);
-	}
-	if (ret == 0)
-		ret = nv50_curs_new(drm, head->base.index, &curs);
-	if (ret) {
-		kfree(head);
-		return ERR_PTR(ret);
+		if (ret)
+			goto fail_free;
 	}
 
+	ret = nv50_curs_new(drm, head->base.index, &curs);
+	if (ret)
+		goto fail_free;
+
 	crtc = &head->base.base;
 	drm_crtc_init_with_planes(dev, crtc, &base->plane, &curs->plane,
 				  &nv50_head_func, "head-%d", head->base.index);
@@ -533,11 +541,16 @@ nv50_head_create(struct drm_device *dev, int index)
 
 	if (head->func->olut_set) {
 		ret = nv50_lut_init(disp, &drm->client.mmu, &head->olut);
-		if (ret) {
-			nv50_head_destroy(crtc);
-			return ERR_PTR(ret);
-		}
+		if (ret)
+			goto fail_crtc_cleanup;
 	}
 
 	return head;
+
+fail_crtc_cleanup:
+	drm_crtc_cleanup(crtc);
+fail_free:
+	kfree(head);
+
+	return ERR_PTR(ret);
 }
-- 
2.24.1

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH 3/9] drm/nouveau/kms/nv140-: Don't modify depth in state during atomic commit
  2020-03-18  0:40 [PATCH 0/9] drm/nouveau: Introduce CRC support for gf119+ Lyude Paul
  2020-03-18  0:40 ` [PATCH 1/9] drm/vblank: Add vblank works Lyude Paul
  2020-03-18  0:40 ` [PATCH 2/9] drm/nouveau/kms/nv50-: Unroll error cleanup in nv50_head_create() Lyude Paul
@ 2020-03-18  0:41 ` Lyude Paul
  2020-03-18  0:41 ` [PATCH 4/9] drm/nouveau/kms/nv50-: Fix disabling dithering Lyude Paul
                   ` (5 subsequent siblings)
  8 siblings, 0 replies; 22+ messages in thread
From: Lyude Paul @ 2020-03-18  0:41 UTC (permalink / raw)
  To: nouveau, dri-devel; +Cc: David Airlie, Ben Skeggs, linux-kernel

Currently, we modify the depth value stored in the atomic state when
performing a commit in order to workaround the fact we haven't
implemented support for depths higher then 10 yet. This isn't idempotent
though, as it will happen every atomic commit where we modify the OR
state even if the head's depth in the atomic state hasn't been modified.

Normally this wouldn't matter, since we don't modify OR state outside of
modesets, but since the CRC capture region is implemented as part of the
OR state in hardware we'll want to make sure all commits modifying OR
state are idempotent so as to avoid changing the depth unexpectedly.

So, fix this by simply not writing the reduced depth value we come up
with to the atomic state.

Signed-off-by: Lyude Paul <lyude@redhat.com>
---
 drivers/gpu/drm/nouveau/dispnv50/headc37d.c | 11 +++++++----
 drivers/gpu/drm/nouveau/dispnv50/headc57d.c | 11 +++++++----
 2 files changed, 14 insertions(+), 8 deletions(-)

diff --git a/drivers/gpu/drm/nouveau/dispnv50/headc37d.c b/drivers/gpu/drm/nouveau/dispnv50/headc37d.c
index 00011ce109a6..68920f8d9c79 100644
--- a/drivers/gpu/drm/nouveau/dispnv50/headc37d.c
+++ b/drivers/gpu/drm/nouveau/dispnv50/headc37d.c
@@ -27,17 +27,20 @@ static void
 headc37d_or(struct nv50_head *head, struct nv50_head_atom *asyh)
 {
 	struct nv50_dmac *core = &nv50_disp(head->base.base.dev)->core->chan;
+	u8 depth;
 	u32 *push;
+
 	if ((push = evo_wait(core, 2))) {
 		/*XXX: This is a dirty hack until OR depth handling is
 		 *     improved later for deep colour etc.
 		 */
 		switch (asyh->or.depth) {
-		case 6: asyh->or.depth = 5; break;
-		case 5: asyh->or.depth = 4; break;
-		case 2: asyh->or.depth = 1; break;
-		case 0:	asyh->or.depth = 4; break;
+		case 6: depth = 5; break;
+		case 5: depth = 4; break;
+		case 2: depth = 1; break;
+		case 0:	depth = 4; break;
 		default:
+			depth = asyh->or.depth;
 			WARN_ON(1);
 			break;
 		}
diff --git a/drivers/gpu/drm/nouveau/dispnv50/headc57d.c b/drivers/gpu/drm/nouveau/dispnv50/headc57d.c
index 938d910a1b1e..0296cd1d761f 100644
--- a/drivers/gpu/drm/nouveau/dispnv50/headc57d.c
+++ b/drivers/gpu/drm/nouveau/dispnv50/headc57d.c
@@ -27,17 +27,20 @@ static void
 headc57d_or(struct nv50_head *head, struct nv50_head_atom *asyh)
 {
 	struct nv50_dmac *core = &nv50_disp(head->base.base.dev)->core->chan;
+	u8 depth;
 	u32 *push;
+
 	if ((push = evo_wait(core, 2))) {
 		/*XXX: This is a dirty hack until OR depth handling is
 		 *     improved later for deep colour etc.
 		 */
 		switch (asyh->or.depth) {
-		case 6: asyh->or.depth = 5; break;
-		case 5: asyh->or.depth = 4; break;
-		case 2: asyh->or.depth = 1; break;
-		case 0:	asyh->or.depth = 4; break;
+		case 6: depth = 5; break;
+		case 5: depth = 4; break;
+		case 2: depth = 1; break;
+		case 0:	depth = 4; break;
 		default:
+			depth = asyh->or.depth;
 			WARN_ON(1);
 			break;
 		}
-- 
2.24.1

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH 4/9] drm/nouveau/kms/nv50-: Fix disabling dithering
  2020-03-18  0:40 [PATCH 0/9] drm/nouveau: Introduce CRC support for gf119+ Lyude Paul
                   ` (2 preceding siblings ...)
  2020-03-18  0:41 ` [PATCH 3/9] drm/nouveau/kms/nv140-: Don't modify depth in state during atomic commit Lyude Paul
@ 2020-03-18  0:41 ` Lyude Paul
  2020-03-18  0:41 ` [PATCH 5/9] drm/nouveau/kms/nv50-: s/harm/armh/g Lyude Paul
                   ` (4 subsequent siblings)
  8 siblings, 0 replies; 22+ messages in thread
From: Lyude Paul @ 2020-03-18  0:41 UTC (permalink / raw)
  To: nouveau, dri-devel
  Cc: David Airlie, linux-kernel, Ben Skeggs, Thomas Zimmermann

While we expose the ability to turn off hardware dithering for nouveau,
we actually make the mistake of turning it on anyway, due to
dithering_depth containing a non-zero value if our dithering depth isn't
also set to 6 bpc.

So, fix it by never enabling dithering when it's disabled.

Signed-off-by: Lyude Paul <lyude@redhat.com>
---
 drivers/gpu/drm/nouveau/dispnv50/head.c | 24 +++++++++++++-----------
 1 file changed, 13 insertions(+), 11 deletions(-)

diff --git a/drivers/gpu/drm/nouveau/dispnv50/head.c b/drivers/gpu/drm/nouveau/dispnv50/head.c
index e29ea40e7c33..72bc3bce396a 100644
--- a/drivers/gpu/drm/nouveau/dispnv50/head.c
+++ b/drivers/gpu/drm/nouveau/dispnv50/head.c
@@ -84,18 +84,20 @@ nv50_head_atomic_check_dither(struct nv50_head_atom *armh,
 {
 	u32 mode = 0x00;
 
-	if (asyc->dither.mode == DITHERING_MODE_AUTO) {
-		if (asyh->base.depth > asyh->or.bpc * 3)
-			mode = DITHERING_MODE_DYNAMIC2X2;
-	} else {
-		mode = asyc->dither.mode;
-	}
+	if (asyc->dither.mode) {
+		if (asyc->dither.mode == DITHERING_MODE_AUTO) {
+			if (asyh->base.depth > asyh->or.bpc * 3)
+				mode = DITHERING_MODE_DYNAMIC2X2;
+		} else {
+			mode = asyc->dither.mode;
+		}
 
-	if (asyc->dither.depth == DITHERING_DEPTH_AUTO) {
-		if (asyh->or.bpc >= 8)
-			mode |= DITHERING_DEPTH_8BPC;
-	} else {
-		mode |= asyc->dither.depth;
+		if (asyc->dither.depth == DITHERING_DEPTH_AUTO) {
+			if (asyh->or.bpc >= 8)
+				mode |= DITHERING_DEPTH_8BPC;
+		} else {
+			mode |= asyc->dither.depth;
+		}
 	}
 
 	asyh->dither.enable = mode;
-- 
2.24.1

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH 5/9] drm/nouveau/kms/nv50-: s/harm/armh/g
  2020-03-18  0:40 [PATCH 0/9] drm/nouveau: Introduce CRC support for gf119+ Lyude Paul
                   ` (3 preceding siblings ...)
  2020-03-18  0:41 ` [PATCH 4/9] drm/nouveau/kms/nv50-: Fix disabling dithering Lyude Paul
@ 2020-03-18  0:41 ` Lyude Paul
  2020-03-18  0:41 ` [PATCH 6/9] drm/nouveau/kms/nv140-: Track wndw mappings in nv50_head_atom Lyude Paul
                   ` (3 subsequent siblings)
  8 siblings, 0 replies; 22+ messages in thread
From: Lyude Paul @ 2020-03-18  0:41 UTC (permalink / raw)
  To: nouveau, dri-devel
  Cc: David Airlie, linux-kernel, Gerd Hoffmann, Christian König,
	Ben Skeggs

We refer to the armed hardware assembly as armh elsewhere in nouveau, so
fix the naming here to make it consistent.

This patch contains no functional changes.

Signed-off-by: Lyude Paul <lyude@redhat.com>
---
 drivers/gpu/drm/nouveau/dispnv50/wndw.c | 12 ++++++------
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/nouveau/dispnv50/wndw.c b/drivers/gpu/drm/nouveau/dispnv50/wndw.c
index bb737f9281e6..39cca8eaa066 100644
--- a/drivers/gpu/drm/nouveau/dispnv50/wndw.c
+++ b/drivers/gpu/drm/nouveau/dispnv50/wndw.c
@@ -397,7 +397,7 @@ nv50_wndw_atomic_check(struct drm_plane *plane, struct drm_plane_state *state)
 	struct nv50_wndw *wndw = nv50_wndw(plane);
 	struct nv50_wndw_atom *armw = nv50_wndw_atom(wndw->plane.state);
 	struct nv50_wndw_atom *asyw = nv50_wndw_atom(state);
-	struct nv50_head_atom *harm = NULL, *asyh = NULL;
+	struct nv50_head_atom *armh = NULL, *asyh = NULL;
 	bool modeset = false;
 	int ret;
 
@@ -418,9 +418,9 @@ nv50_wndw_atomic_check(struct drm_plane *plane, struct drm_plane_state *state)
 
 	/* Fetch assembly state for the head the window used to belong to. */
 	if (armw->state.crtc) {
-		harm = nv50_head_atom_get(asyw->state.state, armw->state.crtc);
-		if (IS_ERR(harm))
-			return PTR_ERR(harm);
+		armh = nv50_head_atom_get(asyw->state.state, armw->state.crtc);
+		if (IS_ERR(armh))
+			return PTR_ERR(armh);
 	}
 
 	/* LUT configuration can potentially cause the window to be disabled. */
@@ -444,8 +444,8 @@ nv50_wndw_atomic_check(struct drm_plane *plane, struct drm_plane_state *state)
 		asyh->wndw.mask |= BIT(wndw->id);
 	} else
 	if (armw->visible) {
-		nv50_wndw_atomic_check_release(wndw, asyw, harm);
-		harm->wndw.mask &= ~BIT(wndw->id);
+		nv50_wndw_atomic_check_release(wndw, asyw, armh);
+		armh->wndw.mask &= ~BIT(wndw->id);
 	} else {
 		return 0;
 	}
-- 
2.24.1

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH 6/9] drm/nouveau/kms/nv140-: Track wndw mappings in nv50_head_atom
  2020-03-18  0:40 [PATCH 0/9] drm/nouveau: Introduce CRC support for gf119+ Lyude Paul
                   ` (4 preceding siblings ...)
  2020-03-18  0:41 ` [PATCH 5/9] drm/nouveau/kms/nv50-: s/harm/armh/g Lyude Paul
@ 2020-03-18  0:41 ` Lyude Paul
  2020-03-18  0:41 ` [PATCH 7/9] drm/nouveau/kms/nv50-: Expose nv50_outp_atom in disp.h Lyude Paul
                   ` (2 subsequent siblings)
  8 siblings, 0 replies; 22+ messages in thread
From: Lyude Paul @ 2020-03-18  0:41 UTC (permalink / raw)
  To: nouveau, dri-devel
  Cc: David Airlie, Pankaj Bharadiya, linux-kernel, Sean Paul,
	Ben Skeggs, Alex Deucher

While we're not quite ready yet to add support for flexible wndw
mappings, we are going to need to at least keep track of the static wndw
mappings we're currently using in each head's atomic state. We'll likely
use this in the future to implement real flexible window mapping, but
the primary reason we'll need this is for CRC support.

See: on nvidia hardware, each CRC entry in the CRC notifier dma context
has a "tag". This tag corresponds to the nth update on a specific
EVO/NvDisplay channel, which itself is referred to as the "controlling
channel". For gf119+ this can be the core channel, ovly channel, or base
channel. Since we don't expose CRC entry tags to userspace, we simply
ignore this feature and always use the core channel as the controlling
channel. Simple.

Things get a little bit more complicated on gv100+ though. GV100+ only
lets us set the controlling channel to a specific wndw channel, and that
wndw must be owned by the head that we're grabbing CRCs when we enable
CRC generation. Thus, we always need to make sure that each atomic head
state has at least one wndw that is mapped to the head, which will be
used as the controlling channel.

Note that since we don't have flexible wndw mappings yet, we don't
expect to run into any scenarios yet where we'd have a head with no
mapped wndws. When we do add support for flexible wndw mappings however,
we'll need to make sure that we handle reprogramming CRC capture if our
controlling wndw is moved to another head (and potentially reject the
new head state entirely if we can't find another available wndw to
replace it).

With that being said, nouveau currently tracks wndw visibility on heads.
It does not keep track of the actual ownership mappings, which are
(currently) statically programmed. To fix this, we introduce another
bitmask into nv50_head_atom.wndw to keep track of ownership separately
from visibility. We then introduce a nv50_head callback to handle
populating the wndw ownership map, and call it during the atomic check
phase when core->assign_windows is set to true.

Signed-off-by: Lyude Paul <lyude@redhat.com>
---
 drivers/gpu/drm/nouveau/dispnv50/atom.h     |  1 +
 drivers/gpu/drm/nouveau/dispnv50/disp.c     | 16 ++++++++++++++++
 drivers/gpu/drm/nouveau/dispnv50/head.h     |  2 ++
 drivers/gpu/drm/nouveau/dispnv50/headc37d.c | 10 ++++++++++
 drivers/gpu/drm/nouveau/dispnv50/headc57d.c |  2 ++
 5 files changed, 31 insertions(+)

diff --git a/drivers/gpu/drm/nouveau/dispnv50/atom.h b/drivers/gpu/drm/nouveau/dispnv50/atom.h
index 24f7700768da..62faaf60f47a 100644
--- a/drivers/gpu/drm/nouveau/dispnv50/atom.h
+++ b/drivers/gpu/drm/nouveau/dispnv50/atom.h
@@ -18,6 +18,7 @@ struct nv50_head_atom {
 
 	struct {
 		u32 mask;
+		u32 owned;
 		u32 olut;
 	} wndw;
 
diff --git a/drivers/gpu/drm/nouveau/dispnv50/disp.c b/drivers/gpu/drm/nouveau/dispnv50/disp.c
index 4d1c58468dbc..f510eeafca4b 100644
--- a/drivers/gpu/drm/nouveau/dispnv50/disp.c
+++ b/drivers/gpu/drm/nouveau/dispnv50/disp.c
@@ -2253,12 +2253,28 @@ static int
 nv50_disp_atomic_check(struct drm_device *dev, struct drm_atomic_state *state)
 {
 	struct nv50_atom *atom = nv50_atom(state);
+	struct nv50_core *core = nv50_disp(dev)->core;
 	struct drm_connector_state *old_connector_state, *new_connector_state;
 	struct drm_connector *connector;
 	struct drm_crtc_state *new_crtc_state;
 	struct drm_crtc *crtc;
+	struct nv50_head *head;
+	struct nv50_head_atom *asyh;
 	int ret, i;
 
+	if (core->assign_windows && core->func->head->static_wndw_map) {
+		drm_for_each_crtc(crtc, dev) {
+			new_crtc_state = drm_atomic_get_crtc_state(state,
+								   crtc);
+			if (IS_ERR(new_crtc_state))
+				return PTR_ERR(new_crtc_state);
+
+			head = nv50_head(crtc);
+			asyh = nv50_head_atom(new_crtc_state);
+			core->func->head->static_wndw_map(head, asyh);
+		}
+	}
+
 	/* We need to handle colour management on a per-plane basis. */
 	for_each_new_crtc_in_state(state, crtc, new_crtc_state, i) {
 		if (new_crtc_state->color_mgmt_changed) {
diff --git a/drivers/gpu/drm/nouveau/dispnv50/head.h b/drivers/gpu/drm/nouveau/dispnv50/head.h
index c32b27cdaefc..c05bbba9e247 100644
--- a/drivers/gpu/drm/nouveau/dispnv50/head.h
+++ b/drivers/gpu/drm/nouveau/dispnv50/head.h
@@ -40,6 +40,7 @@ struct nv50_head_func {
 	void (*dither)(struct nv50_head *, struct nv50_head_atom *);
 	void (*procamp)(struct nv50_head *, struct nv50_head_atom *);
 	void (*or)(struct nv50_head *, struct nv50_head_atom *);
+	void (*static_wndw_map)(struct nv50_head *, struct nv50_head_atom *);
 };
 
 extern const struct nv50_head_func head507d;
@@ -86,6 +87,7 @@ int headc37d_curs_format(struct nv50_head *, struct nv50_wndw_atom *,
 void headc37d_curs_set(struct nv50_head *, struct nv50_head_atom *);
 void headc37d_curs_clr(struct nv50_head *);
 void headc37d_dither(struct nv50_head *, struct nv50_head_atom *);
+void headc37d_static_wndw_map(struct nv50_head *, struct nv50_head_atom *);
 
 extern const struct nv50_head_func headc57d;
 #endif
diff --git a/drivers/gpu/drm/nouveau/dispnv50/headc37d.c b/drivers/gpu/drm/nouveau/dispnv50/headc37d.c
index 68920f8d9c79..cf5a68f4021a 100644
--- a/drivers/gpu/drm/nouveau/dispnv50/headc37d.c
+++ b/drivers/gpu/drm/nouveau/dispnv50/headc37d.c
@@ -203,6 +203,15 @@ headc37d_view(struct nv50_head *head, struct nv50_head_atom *asyh)
 	}
 }
 
+void
+headc37d_static_wndw_map(struct nv50_head *head, struct nv50_head_atom *asyh)
+{
+	int i, end;
+
+	for (i = head->base.index * 2, end = i + 2; i < end; i++)
+		asyh->wndw.owned |= BIT(i);
+}
+
 const struct nv50_head_func
 headc37d = {
 	.view = headc37d_view,
@@ -218,4 +227,5 @@ headc37d = {
 	.dither = headc37d_dither,
 	.procamp = headc37d_procamp,
 	.or = headc37d_or,
+	.static_wndw_map = headc37d_static_wndw_map,
 };
diff --git a/drivers/gpu/drm/nouveau/dispnv50/headc57d.c b/drivers/gpu/drm/nouveau/dispnv50/headc57d.c
index 0296cd1d761f..65e3b60804c6 100644
--- a/drivers/gpu/drm/nouveau/dispnv50/headc57d.c
+++ b/drivers/gpu/drm/nouveau/dispnv50/headc57d.c
@@ -210,4 +210,6 @@ headc57d = {
 	.dither = headc37d_dither,
 	.procamp = headc57d_procamp,
 	.or = headc57d_or,
+	/* TODO: flexible window mappings */
+	.static_wndw_map = headc37d_static_wndw_map,
 };
-- 
2.24.1

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH 7/9] drm/nouveau/kms/nv50-: Expose nv50_outp_atom in disp.h
  2020-03-18  0:40 [PATCH 0/9] drm/nouveau: Introduce CRC support for gf119+ Lyude Paul
                   ` (5 preceding siblings ...)
  2020-03-18  0:41 ` [PATCH 6/9] drm/nouveau/kms/nv140-: Track wndw mappings in nv50_head_atom Lyude Paul
@ 2020-03-18  0:41 ` Lyude Paul
  2020-03-18  0:41 ` [PATCH 8/9] drm/nouveau/kms/nv50-: Move hard-coded object handles into header Lyude Paul
  2020-03-18  0:41 ` [PATCH 9/9] drm/nouveau/kms/nvd9-: Add CRC support Lyude Paul
  8 siblings, 0 replies; 22+ messages in thread
From: Lyude Paul @ 2020-03-18  0:41 UTC (permalink / raw)
  To: nouveau, dri-devel
  Cc: Jani Nikula, David Airlie, Pankaj Bharadiya, linux-kernel,
	Sean Paul, Ben Skeggs, Alex Deucher

In order to make sure that we flush disable updates at the right time
when disabling CRCs, we'll need to be able to look at the outp state to
see if we're changing it at the same time that we're disabling CRCs.

So, expose the struct in disp.h.

Signed-off-by: Lyude Paul <lyude@redhat.com>
---
 drivers/gpu/drm/nouveau/dispnv50/disp.c | 18 ------------------
 drivers/gpu/drm/nouveau/dispnv50/disp.h | 14 ++++++++++++++
 2 files changed, 14 insertions(+), 18 deletions(-)

diff --git a/drivers/gpu/drm/nouveau/dispnv50/disp.c b/drivers/gpu/drm/nouveau/dispnv50/disp.c
index f510eeafca4b..ef01f2473947 100644
--- a/drivers/gpu/drm/nouveau/dispnv50/disp.c
+++ b/drivers/gpu/drm/nouveau/dispnv50/disp.c
@@ -56,24 +56,6 @@
 
 #include <subdev/bios/dp.h>
 
-/******************************************************************************
- * Atomic state
- *****************************************************************************/
-
-struct nv50_outp_atom {
-	struct list_head head;
-
-	struct drm_encoder *encoder;
-	bool flush_disable;
-
-	union nv50_outp_atom_mask {
-		struct {
-			bool ctrl:1;
-		};
-		u8 mask;
-	} set, clr;
-};
-
 /******************************************************************************
  * EVO channel
  *****************************************************************************/
diff --git a/drivers/gpu/drm/nouveau/dispnv50/disp.h b/drivers/gpu/drm/nouveau/dispnv50/disp.h
index d54fe00ac3a3..8935ebce8ab0 100644
--- a/drivers/gpu/drm/nouveau/dispnv50/disp.h
+++ b/drivers/gpu/drm/nouveau/dispnv50/disp.h
@@ -70,6 +70,20 @@ struct nv50_dmac {
 	struct mutex lock;
 };
 
+struct nv50_outp_atom {
+	struct list_head head;
+
+	struct drm_encoder *encoder;
+	bool flush_disable;
+
+	union nv50_outp_atom_mask {
+		struct {
+			bool ctrl:1;
+		};
+		u8 mask;
+	} set, clr;
+};
+
 int nv50_dmac_create(struct nvif_device *device, struct nvif_object *disp,
 		     const s32 *oclass, u8 head, void *data, u32 size,
 		     u64 syncbuf, struct nv50_dmac *dmac);
-- 
2.24.1

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH 8/9] drm/nouveau/kms/nv50-: Move hard-coded object handles into header
  2020-03-18  0:40 [PATCH 0/9] drm/nouveau: Introduce CRC support for gf119+ Lyude Paul
                   ` (6 preceding siblings ...)
  2020-03-18  0:41 ` [PATCH 7/9] drm/nouveau/kms/nv50-: Expose nv50_outp_atom in disp.h Lyude Paul
@ 2020-03-18  0:41 ` Lyude Paul
  2020-03-18  0:41 ` [PATCH 9/9] drm/nouveau/kms/nvd9-: Add CRC support Lyude Paul
  8 siblings, 0 replies; 22+ messages in thread
From: Lyude Paul @ 2020-03-18  0:41 UTC (permalink / raw)
  To: nouveau, dri-devel
  Cc: David Airlie, Pankaj Bharadiya, linux-kernel, Sean Paul,
	Ben Skeggs, Alex Deucher, Sam Ravnborg, Christian König,
	Gerd Hoffmann

While most of the functionality on Nvidia GPUs doesn't require using an
explicit handle instead of the main VRAM handle + offset, there are a
couple of places that do require explicit handles, such as CRC
functionality. Since this means we're about to add another
nouveau-chosen handle, let's just go ahead and move any hard-coded
handles into a single header. This is just to keep things slightly
organized, and to make it a little bit easier if we need to add more
handles in the future.

This patch should contain no functional changes.

Signed-off-by: Lyude Paul <lyude@redhat.com>
---
 drivers/gpu/drm/nouveau/dispnv50/disp.c    |  7 +++++--
 drivers/gpu/drm/nouveau/dispnv50/handles.h | 15 +++++++++++++++
 drivers/gpu/drm/nouveau/dispnv50/wndw.c    |  3 ++-
 3 files changed, 22 insertions(+), 3 deletions(-)
 create mode 100644 drivers/gpu/drm/nouveau/dispnv50/handles.h

diff --git a/drivers/gpu/drm/nouveau/dispnv50/disp.c b/drivers/gpu/drm/nouveau/dispnv50/disp.c
index ef01f2473947..bfea85782d0e 100644
--- a/drivers/gpu/drm/nouveau/dispnv50/disp.c
+++ b/drivers/gpu/drm/nouveau/dispnv50/disp.c
@@ -26,6 +26,7 @@
 #include "core.h"
 #include "head.h"
 #include "wndw.h"
+#include "handles.h"
 
 #include <linux/dma-mapping.h>
 #include <linux/hdmi.h>
@@ -153,7 +154,8 @@ nv50_dmac_create(struct nvif_device *device, struct nvif_object *disp,
 	if (!syncbuf)
 		return 0;
 
-	ret = nvif_object_init(&dmac->base.user, 0xf0000000, NV_DMA_IN_MEMORY,
+	ret = nvif_object_init(&dmac->base.user, NV50_DISP_HANDLE_SYNCBUF,
+			       NV_DMA_IN_MEMORY,
 			       &(struct nv_dma_v0) {
 					.target = NV_DMA_V0_TARGET_VRAM,
 					.access = NV_DMA_V0_ACCESS_RDWR,
@@ -164,7 +166,8 @@ nv50_dmac_create(struct nvif_device *device, struct nvif_object *disp,
 	if (ret)
 		return ret;
 
-	ret = nvif_object_init(&dmac->base.user, 0xf0000001, NV_DMA_IN_MEMORY,
+	ret = nvif_object_init(&dmac->base.user, NV50_DISP_HANDLE_VRAM,
+			       NV_DMA_IN_MEMORY,
 			       &(struct nv_dma_v0) {
 					.target = NV_DMA_V0_TARGET_VRAM,
 					.access = NV_DMA_V0_ACCESS_RDWR,
diff --git a/drivers/gpu/drm/nouveau/dispnv50/handles.h b/drivers/gpu/drm/nouveau/dispnv50/handles.h
new file mode 100644
index 000000000000..e3a62c7a0d08
--- /dev/null
+++ b/drivers/gpu/drm/nouveau/dispnv50/handles.h
@@ -0,0 +1,15 @@
+/* SPDX-License-Identifier: MIT */
+#ifndef __NV50_KMS_HANDLES_H__
+#define __NV50_KMS_HANDLES_H__
+
+/*
+ * Various hard-coded object handles that nouveau uses. These are made-up by
+ * nouveau developers, not Nvidia. The only significance of the handles chosen
+ * is that they must all be unique.
+ */
+#define NV50_DISP_HANDLE_SYNCBUF                                        0xf0000000
+#define NV50_DISP_HANDLE_VRAM                                           0xf0000001
+
+#define NV50_DISP_HANDLE_WNDW_CTX(kind)                        (0xfb000000 | kind)
+
+#endif /* !__NV50_KMS_HANDLES_H__ */
diff --git a/drivers/gpu/drm/nouveau/dispnv50/wndw.c b/drivers/gpu/drm/nouveau/dispnv50/wndw.c
index 39cca8eaa066..cb67a715bd69 100644
--- a/drivers/gpu/drm/nouveau/dispnv50/wndw.c
+++ b/drivers/gpu/drm/nouveau/dispnv50/wndw.c
@@ -21,6 +21,7 @@
  */
 #include "wndw.h"
 #include "wimm.h"
+#include "handles.h"
 
 #include <nvif/class.h>
 #include <nvif/cl0002.h>
@@ -44,7 +45,7 @@ nv50_wndw_ctxdma_new(struct nv50_wndw *wndw, struct nouveau_framebuffer *fb)
 	struct nouveau_drm *drm = nouveau_drm(fb->base.dev);
 	struct nv50_wndw_ctxdma *ctxdma;
 	const u8    kind = fb->nvbo->kind;
-	const u32 handle = 0xfb000000 | kind;
+	const u32 handle = NV50_DISP_HANDLE_WNDW_CTX(kind);
 	struct {
 		struct nv_dma_v0 base;
 		union {
-- 
2.24.1

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH 9/9] drm/nouveau/kms/nvd9-: Add CRC support
  2020-03-18  0:40 [PATCH 0/9] drm/nouveau: Introduce CRC support for gf119+ Lyude Paul
                   ` (7 preceding siblings ...)
  2020-03-18  0:41 ` [PATCH 8/9] drm/nouveau/kms/nv50-: Move hard-coded object handles into header Lyude Paul
@ 2020-03-18  0:41 ` Lyude Paul
  2020-03-18  1:09   ` [PATCH v2] " Lyude Paul
  2020-03-18  6:02   ` [PATCH 9/9] " Greg Kroah-Hartman
  8 siblings, 2 replies; 22+ messages in thread
From: Lyude Paul @ 2020-03-18  0:41 UTC (permalink / raw)
  To: nouveau, dri-devel
  Cc: Kate Stewart, Jani Nikula, David Airlie, Greg Kroah-Hartman,
	linux-kernel, Sean Paul, Ben Skeggs, Thomas Zimmermann,
	Pankaj Bharadiya, Alex Deucher, Sam Ravnborg

This introduces support for CRC readback on gf119+, using the
documentation generously provided to us by Nvidia:

https://github.com/NVIDIA/open-gpu-doc/blob/master/Display-CRC/display-crc.txt

We expose all available CRC sources. SF, SOR, PIOR, and DAC are exposed
through a single set of "outp" sources: outp-active/auto for a CRC of
the scanout region, outp-complete for a CRC of both the scanout and
blanking/sync region combined, and outp-inactive for a CRC of only the
blanking/sync region. For each source, nouveau selects the appropriate
tap point based on the output path in use. We also expose an "rg"
source, which allows for capturing CRCs of the scanout raster before
it's encoded into a video signal in the output path. This tap point is
referred to as the raster generator.

Note that while there's some other neat features that can be used with
CRC capture on nvidia hardware, like capturing from two CRC sources
simultaneously, I couldn't see any usecase for them and did not
implement them.

Nvidia only allows for accessing CRCs through a shared DMA region that
we program through the core EVO/NvDisplay channel which is referred to
as the notifier context. The notifier context is limited to either 255
(for Fermi-Pascal) or 2047 (Volta+) entries to store CRCs in, and
unfortunately the hardware simply drops CRCs and reports an overflow
once all available entries in the notifier context are filled.

Since the DRM CRC API and igt-gpu-tools don't expect there to be a limit
on how many CRCs can be captured, we work around this in nouveau by
allocating two separate notifier contexts for each head instead of one.
We schedule a vblank worker ahead of time so that once we start getting
close to filling up all of the available entries in the notifier
context, we can swap the currently used notifier context out with
another pre-prepared notifier context in a manner similar to page
flipping.

Unfortunately, the hardware only allows us to this by flushing two
separate updates on the core channel: one to release the current
notifier context handle, and one to program the next notifier context's
handle. When the hardware processes the first update, the CRC for the
current frame is lost. However, the second update can be flushed
immediately without waiting for the first to complete so that CRC
generation resumes on the next frame. According to Nvidia's hardware
engineers, there isn't any cleaner way of flipping notifier contexts
that would avoid this.

Since using vblank workers to swap out the notifier context will ensure
we can usually flush both updates to hardware within the timespan of a
single frame, we can also ensure that there will only be exactly one
frame lost between the first and second update being executed by the
hardware. This gives us the guarantee that we're always correctly
matching each CRC entry with it's respective frame even after a context
flip. And since IGT will retrieve the CRC entry for a frame by waiting
until it receives a CRC for any subsequent frames, this doesn't cause an
issue with any tests and is much simpler than trying to change the
current DRM API to accommodate.

In order to facilitate testing of correct handling of this limitation,
we also expose a debugfs interface to manually control the threshold for
when we start trying to flip the notifier context. We will use this in
igt to trigger a context flip for testing purposes without needing to
wait for the notifier to completely fill up. This threshold is reset
to the default value set by nouveau after each capture, and is exposed
in a separate folder within each CRTC's debugfs directory labelled
"nv_crc".

Signed-off-by: Lyude Paul <lyude@redhat.com>
---
 drivers/gpu/drm/nouveau/dispnv04/crtc.c     |  25 +-
 drivers/gpu/drm/nouveau/dispnv50/Kbuild     |   4 +
 drivers/gpu/drm/nouveau/dispnv50/atom.h     |  20 +
 drivers/gpu/drm/nouveau/dispnv50/core.h     |   4 +
 drivers/gpu/drm/nouveau/dispnv50/core907d.c |   3 +
 drivers/gpu/drm/nouveau/dispnv50/core917d.c |   3 +
 drivers/gpu/drm/nouveau/dispnv50/corec37d.c |   3 +
 drivers/gpu/drm/nouveau/dispnv50/corec57d.c |   3 +
 drivers/gpu/drm/nouveau/dispnv50/crc.c      | 716 ++++++++++++++++++++
 drivers/gpu/drm/nouveau/dispnv50/crc.h      | 125 ++++
 drivers/gpu/drm/nouveau/dispnv50/crc907d.c  | 139 ++++
 drivers/gpu/drm/nouveau/dispnv50/crcc37d.c  | 153 +++++
 drivers/gpu/drm/nouveau/dispnv50/disp.c     |  24 +-
 drivers/gpu/drm/nouveau/dispnv50/disp.h     |  10 +
 drivers/gpu/drm/nouveau/dispnv50/handles.h  |   1 +
 drivers/gpu/drm/nouveau/dispnv50/head.c     |  85 ++-
 drivers/gpu/drm/nouveau/dispnv50/head.h     |  11 +-
 drivers/gpu/drm/nouveau/dispnv50/head907d.c |  14 +-
 drivers/gpu/drm/nouveau/dispnv50/headc37d.c |   6 +-
 drivers/gpu/drm/nouveau/dispnv50/headc57d.c |   7 +-
 drivers/gpu/drm/nouveau/nouveau_display.c   |  60 +-
 21 files changed, 1342 insertions(+), 74 deletions(-)
 create mode 100644 drivers/gpu/drm/nouveau/dispnv50/crc.c
 create mode 100644 drivers/gpu/drm/nouveau/dispnv50/crc.h
 create mode 100644 drivers/gpu/drm/nouveau/dispnv50/crc907d.c
 create mode 100644 drivers/gpu/drm/nouveau/dispnv50/crcc37d.c

diff --git a/drivers/gpu/drm/nouveau/dispnv04/crtc.c b/drivers/gpu/drm/nouveau/dispnv04/crtc.c
index 1f08de4241e0..fc178ffce8cd 100644
--- a/drivers/gpu/drm/nouveau/dispnv04/crtc.c
+++ b/drivers/gpu/drm/nouveau/dispnv04/crtc.c
@@ -44,6 +44,9 @@
 #include <subdev/bios/pll.h>
 #include <subdev/clk.h>
 
+#include <nvif/event.h>
+#include <nvif/cl0046.h>
+
 static int
 nv04_crtc_mode_set_base(struct drm_crtc *crtc, int x, int y,
 			struct drm_framebuffer *old_fb);
@@ -755,6 +758,7 @@ static void nv_crtc_destroy(struct drm_crtc *crtc)
 	nouveau_bo_unmap(nv_crtc->cursor.nvbo);
 	nouveau_bo_unpin(nv_crtc->cursor.nvbo);
 	nouveau_bo_ref(NULL, &nv_crtc->cursor.nvbo);
+	nvif_notify_fini(&nv_crtc->vblank);
 	kfree(nv_crtc);
 }
 
@@ -1296,9 +1300,19 @@ create_primary_plane(struct drm_device *dev)
         return primary;
 }
 
+static int nv04_crtc_vblank_handler(struct nvif_notify *notify)
+{
+	struct nouveau_crtc *nv_crtc =
+		container_of(notify, struct nouveau_crtc, vblank);
+
+	drm_crtc_handle_vblank(&nv_crtc->base);
+	return NVIF_NOTIFY_KEEP;
+}
+
 int
 nv04_crtc_create(struct drm_device *dev, int crtc_num)
 {
+	struct nouveau_display *disp = nouveau_display(dev);
 	struct nouveau_crtc *nv_crtc;
 	int ret;
 
@@ -1336,5 +1350,14 @@ nv04_crtc_create(struct drm_device *dev, int crtc_num)
 
 	nv04_cursor_init(nv_crtc);
 
-	return 0;
+	ret = nvif_notify_init(&disp->disp.object, nv04_crtc_vblank_handler,
+			       false, NV04_DISP_NTFY_VBLANK,
+			       &(struct nvif_notify_head_req_v0) {
+				    .head = nv_crtc->index,
+			       },
+			       sizeof(struct nvif_notify_head_req_v0),
+			       sizeof(struct nvif_notify_head_rep_v0),
+			       &nv_crtc->vblank);
+
+	return ret;
 }
diff --git a/drivers/gpu/drm/nouveau/dispnv50/Kbuild b/drivers/gpu/drm/nouveau/dispnv50/Kbuild
index e0c435eae664..6fdddb266fb1 100644
--- a/drivers/gpu/drm/nouveau/dispnv50/Kbuild
+++ b/drivers/gpu/drm/nouveau/dispnv50/Kbuild
@@ -10,6 +10,10 @@ nouveau-y += dispnv50/core917d.o
 nouveau-y += dispnv50/corec37d.o
 nouveau-y += dispnv50/corec57d.o
 
+nouveau-$(CONFIG_DEBUG_FS) += dispnv50/crc.o
+nouveau-$(CONFIG_DEBUG_FS) += dispnv50/crc907d.o
+nouveau-$(CONFIG_DEBUG_FS) += dispnv50/crcc37d.o
+
 nouveau-y += dispnv50/dac507d.o
 nouveau-y += dispnv50/dac907d.o
 
diff --git a/drivers/gpu/drm/nouveau/dispnv50/atom.h b/drivers/gpu/drm/nouveau/dispnv50/atom.h
index 62faaf60f47a..3d82b3c67dec 100644
--- a/drivers/gpu/drm/nouveau/dispnv50/atom.h
+++ b/drivers/gpu/drm/nouveau/dispnv50/atom.h
@@ -2,6 +2,9 @@
 #define __NV50_KMS_ATOM_H__
 #define nv50_atom(p) container_of((p), struct nv50_atom, state)
 #include <drm/drm_atomic.h>
+#include "crc.h"
+
+struct nouveau_encoder;
 
 struct nv50_atom {
 	struct drm_atomic_state state;
@@ -115,9 +118,12 @@ struct nv50_head_atom {
 		u8 nhsync:1;
 		u8 nvsync:1;
 		u8 depth:4;
+		u8 crc_raster:2;
 		u8 bpc;
 	} or;
 
+	struct nv50_crc_atom crc;
+
 	/* Currently only used for MST */
 	struct {
 		int pbn;
@@ -135,6 +141,7 @@ struct nv50_head_atom {
 			bool ovly:1;
 			bool dither:1;
 			bool procamp:1;
+			bool crc:1;
 			bool or:1;
 		};
 		u16 mask;
@@ -150,6 +157,19 @@ nv50_head_atom_get(struct drm_atomic_state *state, struct drm_crtc *crtc)
 	return nv50_head_atom(statec);
 }
 
+static inline struct drm_encoder *
+nv50_head_atom_get_encoder(struct nv50_head_atom *atom)
+{
+	struct drm_encoder *encoder = NULL;
+
+	/* We only ever have a single encoder */
+	drm_for_each_encoder_mask(encoder, atom->state.crtc->dev,
+				  atom->state.encoder_mask)
+		break;
+
+	return encoder;
+}
+
 #define nv50_wndw_atom(p) container_of((p), struct nv50_wndw_atom, state)
 
 struct nv50_wndw_atom {
diff --git a/drivers/gpu/drm/nouveau/dispnv50/core.h b/drivers/gpu/drm/nouveau/dispnv50/core.h
index ff94f3f6f264..47470db0f154 100644
--- a/drivers/gpu/drm/nouveau/dispnv50/core.h
+++ b/drivers/gpu/drm/nouveau/dispnv50/core.h
@@ -2,6 +2,7 @@
 #define __NV50_KMS_CORE_H__
 #include "disp.h"
 #include "atom.h"
+#include "crc.h"
 
 struct nv50_core {
 	const struct nv50_core_func *func;
@@ -24,6 +25,9 @@ struct nv50_core_func {
 	} wndw;
 
 	const struct nv50_head_func *head;
+#if IS_ENABLED(CONFIG_DEBUG_FS)
+	const struct nv50_crc_func *crc;
+#endif
 	const struct nv50_outp_func {
 		void (*ctrl)(struct nv50_core *, int or, u32 ctrl,
 			     struct nv50_head_atom *);
diff --git a/drivers/gpu/drm/nouveau/dispnv50/core907d.c b/drivers/gpu/drm/nouveau/dispnv50/core907d.c
index ef822f813435..fd6aaf629d02 100644
--- a/drivers/gpu/drm/nouveau/dispnv50/core907d.c
+++ b/drivers/gpu/drm/nouveau/dispnv50/core907d.c
@@ -29,6 +29,9 @@ core907d = {
 	.ntfy_wait_done = core507d_ntfy_wait_done,
 	.update = core507d_update,
 	.head = &head907d,
+#if IS_ENABLED(CONFIG_DEBUG_FS)
+	.crc = &crc907d,
+#endif
 	.dac = &dac907d,
 	.sor = &sor907d,
 };
diff --git a/drivers/gpu/drm/nouveau/dispnv50/core917d.c b/drivers/gpu/drm/nouveau/dispnv50/core917d.c
index 392338df5bfd..46debdce87ff 100644
--- a/drivers/gpu/drm/nouveau/dispnv50/core917d.c
+++ b/drivers/gpu/drm/nouveau/dispnv50/core917d.c
@@ -29,6 +29,9 @@ core917d = {
 	.ntfy_wait_done = core507d_ntfy_wait_done,
 	.update = core507d_update,
 	.head = &head917d,
+#if IS_ENABLED(CONFIG_DEBUG_FS)
+	.crc = &crc907d,
+#endif
 	.dac = &dac907d,
 	.sor = &sor907d,
 };
diff --git a/drivers/gpu/drm/nouveau/dispnv50/corec37d.c b/drivers/gpu/drm/nouveau/dispnv50/corec37d.c
index 3b36dc8d36b2..19761bfdac31 100644
--- a/drivers/gpu/drm/nouveau/dispnv50/corec37d.c
+++ b/drivers/gpu/drm/nouveau/dispnv50/corec37d.c
@@ -114,6 +114,9 @@ corec37d = {
 	.wndw.owner = corec37d_wndw_owner,
 	.head = &headc37d,
 	.sor = &sorc37d,
+#if IS_ENABLED(CONFIG_DEBUG_FS)
+	.crc = &crcc37d,
+#endif
 };
 
 int
diff --git a/drivers/gpu/drm/nouveau/dispnv50/corec57d.c b/drivers/gpu/drm/nouveau/dispnv50/corec57d.c
index 147adcd60937..0ffd6286985c 100644
--- a/drivers/gpu/drm/nouveau/dispnv50/corec57d.c
+++ b/drivers/gpu/drm/nouveau/dispnv50/corec57d.c
@@ -51,6 +51,9 @@ corec57d = {
 	.wndw.owner = corec37d_wndw_owner,
 	.head = &headc57d,
 	.sor = &sorc37d,
+#if IS_ENABLED(CONFIG_DEBUG_FS)
+	.crc = &crcc37d,
+#endif
 };
 
 int
diff --git a/drivers/gpu/drm/nouveau/dispnv50/crc.c b/drivers/gpu/drm/nouveau/dispnv50/crc.c
new file mode 100644
index 000000000000..b3a21e8256d5
--- /dev/null
+++ b/drivers/gpu/drm/nouveau/dispnv50/crc.c
@@ -0,0 +1,716 @@
+/* SPDX-License-Identifier: MIT */
+#include <linux/string.h>
+#include <drm/drm_crtc.h>
+#include <drm/drm_atomic_helper.h>
+#include <drm/drm_vblank.h>
+
+#include <nvif/class.h>
+#include <nvif/cl0002.h>
+
+#include "nouveau_drv.h"
+#include "core.h"
+#include "head.h"
+#include "wndw.h"
+#include "handles.h"
+#include "crc.h"
+
+static const char * const nv50_crc_sources[] = {
+	[NV50_CRC_SOURCE_NONE] = "none",
+	[NV50_CRC_SOURCE_AUTO] = "auto",
+	[NV50_CRC_SOURCE_RG] = "rg",
+	[NV50_CRC_SOURCE_OUTP_ACTIVE] = "outp-active",
+	[NV50_CRC_SOURCE_OUTP_COMPLETE] = "outp-complete",
+	[NV50_CRC_SOURCE_OUTP_INACTIVE] = "outp-inactive",
+};
+
+static int nv50_crc_parse_source(const char *buf, enum nv50_crc_source *s)
+{
+	int i;
+
+	if (!buf) {
+		*s = NV50_CRC_SOURCE_NONE;
+		return 0;
+	}
+
+	i = match_string(nv50_crc_sources, ARRAY_SIZE(nv50_crc_sources), buf);
+	if (i < 0)
+		return i;
+
+	*s = i;
+	return 0;
+}
+
+int
+nv50_crc_verify_source(struct drm_crtc *crtc, const char *source_name,
+		       size_t *values_cnt)
+{
+	struct nouveau_drm *drm = nouveau_drm(crtc->dev);
+	enum nv50_crc_source source;
+
+	if (nv50_crc_parse_source(source_name, &source) < 0) {
+		NV_DEBUG(drm, "unknown source %s\n", source_name);
+		return -EINVAL;
+	}
+
+	*values_cnt = 1;
+	return 0;
+}
+
+const char *const *nv50_crc_get_sources(struct drm_crtc *crtc, size_t *count)
+{
+	*count = ARRAY_SIZE(nv50_crc_sources);
+	return nv50_crc_sources;
+}
+
+static void
+nv50_crc_program_ctx(struct nv50_head *head,
+		     struct nv50_crc_notifier_ctx *ctx)
+{
+	struct nv50_disp *disp = nv50_disp(head->base.base.dev);
+	struct nv50_core *core = disp->core;
+	u32 interlock[NV50_DISP_INTERLOCK__SIZE] = { 0 };
+
+	core->func->crc->set_ctx(head, ctx);
+	core->func->update(core, interlock, false);
+}
+
+static void nv50_crc_ctx_flip_work(struct drm_vblank_work *work, u64 count)
+{
+	struct nv50_crc *crc = container_of(work, struct nv50_crc, flip_work);
+	struct nv50_head *head = container_of(crc, struct nv50_head, crc);
+	struct drm_crtc *crtc = &head->base.base;
+	struct nv50_disp *disp = nv50_disp(crtc->dev);
+	u8 new_idx = crc->ctx_idx ^ 1;
+
+	/*
+	 * We don't want to accidentally wait for longer then the vblank, so
+	 * try again for the next vblank if we don't grab the lock
+	 */
+	if (!mutex_trylock(&disp->mutex)) {
+		DRM_DEV_DEBUG_KMS(crtc->dev->dev,
+				  "Lock contended, delaying CRC ctx flip for head-%d\n",
+				  head->base.index);
+		drm_vblank_work_schedule(work, count + 1, true);
+		return;
+	}
+
+	DRM_DEV_DEBUG_KMS(crtc->dev->dev,
+			  "Flipping notifier ctx for head %d (%d -> %d)\n",
+			  drm_crtc_index(crtc), crc->ctx_idx, new_idx);
+
+	nv50_crc_program_ctx(head, NULL);
+	nv50_crc_program_ctx(head, &crc->ctx[new_idx]);
+	mutex_unlock(&disp->mutex);
+
+	spin_lock_irq(&crc->lock);
+	crc->ctx_changed = true;
+	spin_unlock_irq(&crc->lock);
+}
+
+static inline void nv50_crc_reset_ctx(struct nv50_crc_notifier_ctx *ctx)
+{
+	memset_io(ctx->mem.object.map.ptr, 0, ctx->mem.object.map.size);
+}
+
+static void
+nv50_crc_get_entries(struct nv50_head *head,
+		     const struct nv50_crc_func *func,
+		     enum nv50_crc_source source)
+{
+	struct drm_crtc *crtc = &head->base.base;
+	struct nv50_crc *crc = &head->crc;
+	u32 output_crc;
+
+	while (crc->entry_idx < func->num_entries) {
+		/*
+		 * While Nvidia's documentation says CRCs are written on each
+		 * subsequent vblank after being enabled, in practice they
+		 * aren't written immediately.
+		 */
+		output_crc = func->get_entry(head, &crc->ctx[crc->ctx_idx],
+					     source, crc->entry_idx);
+		if (!output_crc)
+			return;
+
+		drm_crtc_add_crc_entry(crtc, true, crc->frame, &output_crc);
+		crc->frame++;
+		crc->entry_idx++;
+	}
+}
+
+void nv50_crc_handle_vblank(struct nv50_head *head)
+{
+	struct drm_crtc *crtc = &head->base.base;
+	struct nv50_crc *crc = &head->crc;
+	const struct nv50_crc_func *func =
+		nv50_disp(head->base.base.dev)->core->func->crc;
+	struct nv50_crc_notifier_ctx *ctx;
+	bool need_reschedule = false;
+
+	if (!func)
+		return;
+
+	/*
+	 * We don't lose events if we aren't able to report CRCs until the
+	 * next vblank, so only report CRCs if the locks we need aren't
+	 * contended to prevent missing an actual vblank event
+	 */
+	if (!spin_trylock(&crc->lock))
+		return;
+
+	if (!crc->src)
+		goto out;
+
+	ctx = &crc->ctx[crc->ctx_idx];
+	if (crc->ctx_changed && func->ctx_finished(head, ctx)) {
+		nv50_crc_get_entries(head, func, crc->src);
+
+		crc->ctx_idx ^= 1;
+		crc->entry_idx = 0;
+		crc->ctx_changed = false;
+
+		/*
+		 * Unfortunately when notifier contexts are changed during CRC
+		 * capture, we will inevitably lose the CRC entry for the
+		 * frame where the hardware actually latched onto the first
+		 * UPDATE. According to Nvidia's hardware engineers, there's
+		 * no workaround for this.
+		 *
+		 * Now, we could try to be smart here and calculate the number
+		 * of missed CRCs based on audit timestamps, but those were
+		 * removed starting with volta. Since we always flush our
+		 * updates back-to-back without waiting, we'll just be
+		 * optimistic and assume we always miss exactly one frame.
+		 */
+		DRM_DEV_DEBUG_KMS(head->base.base.dev->dev,
+				  "Notifier ctx flip for head-%d finished, lost CRC for frame %llu\n",
+				  head->base.index, crc->frame);
+		crc->frame++;
+
+		nv50_crc_reset_ctx(ctx);
+		need_reschedule = true;
+	}
+
+	nv50_crc_get_entries(head, func, crc->src);
+
+	if (need_reschedule)
+		drm_vblank_work_schedule(&crc->flip_work,
+					 drm_crtc_vblank_count(crtc)
+					 + crc->flip_threshold
+					 - crc->entry_idx,
+					 true);
+
+out:
+	spin_unlock(&crc->lock);
+}
+
+static void nv50_crc_wait_ctx_finished(struct nv50_head *head,
+				       const struct nv50_crc_func *func,
+				       struct nv50_crc_notifier_ctx *ctx)
+{
+	struct drm_device *dev = head->base.base.dev;
+	struct nouveau_drm *drm = nouveau_drm(dev);
+	s64 ret;
+
+	ret = nvif_msec(&drm->client.device, 50,
+			if (func->ctx_finished(head, ctx)) break;);
+	if (ret == -ETIMEDOUT)
+		NV_ERROR(drm,
+			 "CRC notifier ctx for head %d not finished after 50ms\n",
+			 head->base.index);
+	else if (ret)
+		NV_ATOMIC(drm,
+			  "CRC notifier ctx for head-%d finished after %lldns\n",
+			  head->base.index, ret);
+}
+
+void nv50_crc_atomic_stop_reporting(struct drm_atomic_state *state)
+{
+	struct drm_crtc_state *crtc_state;
+	struct drm_crtc *crtc;
+	int i;
+
+	for_each_new_crtc_in_state(state, crtc, crtc_state, i) {
+		struct nv50_head *head = nv50_head(crtc);
+		struct nv50_head_atom *asyh = nv50_head_atom(crtc_state);
+		struct nv50_crc *crc = &head->crc;
+
+		if (!asyh->clr.crc)
+			continue;
+
+		spin_lock_irq(&crc->lock);
+		crc->src = NV50_CRC_SOURCE_NONE;
+		spin_unlock_irq(&crc->lock);
+
+		drm_crtc_vblank_put(crtc);
+		drm_vblank_work_cancel_sync(&crc->flip_work);
+
+		NV_ATOMIC(nouveau_drm(crtc->dev),
+			  "CRC reporting on vblank for head-%d disabled\n",
+			  head->base.index);
+
+		/* CRC generation is still enabled in hw, we'll just report
+		 * any remaining CRC entries ourselves after it gets disabled
+		 * in hardware
+		 */
+	}
+}
+
+void nv50_crc_atomic_prepare_notifier_contexts(struct drm_atomic_state *state)
+{
+	const struct nv50_crc_func *func =
+		nv50_disp(state->dev)->core->func->crc;
+	struct drm_crtc_state *new_crtc_state;
+	struct drm_crtc *crtc;
+	int i;
+
+	for_each_new_crtc_in_state(state, crtc, new_crtc_state, i) {
+		struct nv50_head *head = nv50_head(crtc);
+		struct nv50_head_atom *asyh = nv50_head_atom(new_crtc_state);
+		struct nv50_crc *crc = &head->crc;
+		struct nv50_crc_notifier_ctx *ctx = &crc->ctx[crc->ctx_idx];
+		int i;
+
+		if (asyh->clr.crc && asyh->crc.src) {
+			if (crc->ctx_changed) {
+				nv50_crc_wait_ctx_finished(head, func, ctx);
+				ctx = &crc->ctx[crc->ctx_idx ^ 1];
+			}
+			nv50_crc_wait_ctx_finished(head, func, ctx);
+		}
+
+		if (asyh->set.crc) {
+			crc->entry_idx = 0;
+			crc->ctx_changed = false;
+			for (i = 0; i < ARRAY_SIZE(crc->ctx); i++)
+				nv50_crc_reset_ctx(&crc->ctx[i]);
+		}
+	}
+}
+
+void nv50_crc_atomic_start_reporting(struct drm_atomic_state *state)
+{
+	struct drm_crtc_state *crtc_state;
+	struct drm_crtc *crtc;
+	int i;
+
+	for_each_new_crtc_in_state(state, crtc, crtc_state, i) {
+		struct nv50_head *head = nv50_head(crtc);
+		struct nv50_head_atom *asyh = nv50_head_atom(crtc_state);
+		struct nv50_crc *crc = &head->crc;
+		u64 vbl_count;
+
+		if (!asyh->set.crc)
+			continue;
+
+		drm_crtc_vblank_get(crtc);
+
+		spin_lock_irq(&crc->lock);
+		vbl_count = drm_crtc_vblank_count(crtc);
+		crc->frame = vbl_count;
+		crc->src = asyh->crc.src;
+		drm_vblank_work_schedule(&crc->flip_work,
+					 vbl_count + crc->flip_threshold,
+					 true);
+		spin_unlock_irq(&crc->lock);
+
+		NV_ATOMIC(nouveau_drm(crtc->dev),
+			  "CRC reporting on vblank for head-%d enabled\n",
+			  head->base.index);
+	}
+}
+
+int nv50_crc_atomic_check(struct nv50_head *head,
+			  struct nv50_head_atom *asyh,
+			  struct nv50_head_atom *armh)
+{
+	struct drm_atomic_state *state = asyh->state.state;
+	struct drm_device *dev = head->base.base.dev;
+	struct nv50_atom *atom = nv50_atom(state);
+	struct nv50_disp *disp = nv50_disp(dev);
+	struct drm_encoder *encoder;
+	struct nouveau_encoder *outp;
+	struct nv50_outp_atom *outp_atom;
+	bool changed = armh->crc.src != asyh->crc.src;
+
+	if (!armh->crc.src && !asyh->crc.src) {
+		asyh->set.crc = false;
+		asyh->clr.crc = false;
+		return 0;
+	}
+
+	/* While we don't care about entry tags, Volta+ hw always needs the
+	 * controlling wndw channel programmed to a wndw that's owned by our
+	 * head
+	 */
+	if (asyh->crc.src && disp->disp->object.oclass >= GV100_DISP &&
+	    !(BIT(asyh->crc.wndw) & asyh->wndw.owned)) {
+		if (!asyh->wndw.owned) {
+			/* TODO: once we support flexible channel ownership,
+			 * we should write some code here to handle attempting
+			 * to "steal" a plane: e.g. take a plane that is
+			 * currently not-visible and owned by another head,
+			 * and reassign it to this head. If we fail to do so,
+			 * we shuld reject the mode outright as CRC capture
+			 * then becomes impossible.
+			 */
+			NV_ATOMIC(nouveau_drm(dev),
+				  "No available wndws for CRC readback\n");
+			return -EINVAL;
+		}
+		asyh->crc.wndw = ffs(asyh->wndw.owned) - 1;
+	}
+
+	if (drm_atomic_crtc_needs_modeset(&asyh->state) || changed ||
+	    armh->crc.wndw != asyh->crc.wndw) {
+		asyh->clr.crc = armh->crc.src && armh->state.active;
+		asyh->set.crc = asyh->crc.src && asyh->state.active;
+		if (changed)
+			asyh->set.or |= armh->or.crc_raster !=
+					asyh->or.crc_raster;
+
+		/*
+		 * If we're reprogramming our OR, we need to flush the CRC
+		 * disable first
+		 */
+		if (asyh->clr.crc) {
+			encoder = nv50_head_atom_get_encoder(armh);
+			outp = nv50_real_outp(encoder);
+
+			list_for_each_entry(outp_atom, &atom->outp, head) {
+				if (outp_atom->encoder == encoder) {
+					if (outp_atom->set.mask)
+						atom->flush_disable = true;
+					break;
+				}
+			}
+		}
+	} else {
+		asyh->set.crc = false;
+		asyh->clr.crc = false;
+	}
+
+	return 0;
+}
+
+static enum nv50_crc_source_type
+nv50_crc_source_type(struct nouveau_encoder *outp,
+		     enum nv50_crc_source source)
+{
+	struct dcb_output *dcbe = outp->dcb;
+
+	switch (source) {
+	case NV50_CRC_SOURCE_NONE: return NV50_CRC_SOURCE_TYPE_NONE;
+	case NV50_CRC_SOURCE_RG:   return NV50_CRC_SOURCE_TYPE_RG;
+	default:		   break;
+	}
+
+	if (dcbe->location != DCB_LOC_ON_CHIP)
+		return NV50_CRC_SOURCE_TYPE_PIOR;
+
+	switch (dcbe->type) {
+	case DCB_OUTPUT_DP:	return NV50_CRC_SOURCE_TYPE_SF;
+	case DCB_OUTPUT_ANALOG:	return NV50_CRC_SOURCE_TYPE_DAC;
+	default:		return NV50_CRC_SOURCE_TYPE_SOR;
+	}
+}
+
+void nv50_crc_atomic_set(struct nv50_head *head,
+			 struct nv50_head_atom *asyh)
+{
+	struct drm_crtc *crtc = &head->base.base;
+	struct drm_device *dev = crtc->dev;
+	struct nv50_crc *crc = &head->crc;
+	const struct nv50_crc_func *func = nv50_disp(dev)->core->func->crc;
+	struct nouveau_encoder *outp =
+		nv50_real_outp(nv50_head_atom_get_encoder(asyh));
+
+	func->set_src(head, outp->or,
+		      nv50_crc_source_type(outp, asyh->crc.src),
+		      &crc->ctx[crc->ctx_idx], asyh->crc.wndw);
+}
+
+void nv50_crc_atomic_clr(struct nv50_head *head)
+{
+	const struct nv50_crc_func *func =
+		nv50_disp(head->base.base.dev)->core->func->crc;
+
+	func->set_src(head, 0, NV50_CRC_SOURCE_TYPE_NONE, NULL, 0);
+}
+
+#define NV50_CRC_RASTER_ACTIVE   0
+#define NV50_CRC_RASTER_COMPLETE 1
+#define NV50_CRC_RASTER_INACTIVE 2
+
+static inline int
+nv50_crc_raster_type(enum nv50_crc_source source)
+{
+	switch (source) {
+	case NV50_CRC_SOURCE_NONE:
+	case NV50_CRC_SOURCE_AUTO:
+	case NV50_CRC_SOURCE_RG:
+	case NV50_CRC_SOURCE_OUTP_ACTIVE:
+		return NV50_CRC_RASTER_ACTIVE;
+	case NV50_CRC_SOURCE_OUTP_COMPLETE:
+		return NV50_CRC_RASTER_COMPLETE;
+	case NV50_CRC_SOURCE_OUTP_INACTIVE:
+		return NV50_CRC_RASTER_INACTIVE;
+	}
+
+	return 0;
+}
+
+/* We handle mapping the memory for CRC notifiers ourselves, since each
+ * notifier needs it's own handle
+ */
+static inline int
+nv50_crc_ctx_init(struct nv50_head *head, struct nvif_mmu *mmu,
+		  struct nv50_crc_notifier_ctx *ctx, size_t len, int idx)
+{
+	struct nv50_core *core = nv50_disp(head->base.base.dev)->core;
+	int ret;
+
+	ret = nvif_mem_init_map(mmu, NVIF_MEM_VRAM, len, &ctx->mem);
+	if (ret)
+		return ret;
+
+	ret = nvif_object_init(&core->chan.base.user,
+			       NV50_DISP_HANDLE_CRC_CTX(head, idx),
+			       NV_DMA_IN_MEMORY,
+			       &(struct nv_dma_v0) {
+					.target = NV_DMA_V0_TARGET_VRAM,
+					.access = NV_DMA_V0_ACCESS_RDWR,
+					.start = ctx->mem.addr,
+					.limit =  ctx->mem.addr
+						+ ctx->mem.size - 1,
+			       }, sizeof(struct nv_dma_v0),
+			       &ctx->ntfy);
+	if (ret)
+		goto fail_fini;
+
+	return 0;
+
+fail_fini:
+	nvif_mem_fini(&ctx->mem);
+	return ret;
+}
+
+static inline void
+nv50_crc_ctx_fini(struct nv50_crc_notifier_ctx *ctx)
+{
+	nvif_object_fini(&ctx->ntfy);
+	nvif_mem_fini(&ctx->mem);
+}
+
+int nv50_crc_set_source(struct drm_crtc *crtc, const char *source_str)
+{
+	struct drm_device *dev = crtc->dev;
+	struct drm_atomic_state *state;
+	struct drm_modeset_acquire_ctx ctx;
+	struct nv50_head *head = nv50_head(crtc);
+	struct nv50_crc *crc = &head->crc;
+	const struct nv50_crc_func *func = nv50_disp(dev)->core->func->crc;
+	struct nvif_mmu *mmu = &nouveau_drm(dev)->client.mmu;
+	struct nv50_head_atom *asyh;
+	struct drm_crtc_state *crtc_state;
+	enum nv50_crc_source source;
+	int ret = 0, ctx_flags = 0, i;
+
+	ret = nv50_crc_parse_source(source_str, &source);
+	if (ret)
+		return ret;
+
+	/*
+	 * Since we don't want the user to accidentally interrupt us as we're
+	 * disabling CRCs
+	 */
+	if (source)
+		ctx_flags |= DRM_MODESET_ACQUIRE_INTERRUPTIBLE;
+	drm_modeset_acquire_init(&ctx, ctx_flags);
+
+	state = drm_atomic_state_alloc(dev);
+	if (!state) {
+		ret = -ENOMEM;
+		goto out_acquire_fini;
+	}
+	state->acquire_ctx = &ctx;
+
+	if (source) {
+		for (i = 0; i < ARRAY_SIZE(head->crc.ctx); i++) {
+			ret = nv50_crc_ctx_init(head, mmu, &crc->ctx[i],
+						func->notifier_len, i);
+			if (ret)
+				goto out_ctx_fini;
+		}
+	}
+
+retry:
+	crtc_state = drm_atomic_get_crtc_state(state, &head->base.base);
+	if (IS_ERR(crtc_state)) {
+		ret = PTR_ERR(crtc_state);
+		if (ret == -EDEADLK)
+			goto deadlock;
+		else if (ret)
+			goto out_drop_locks;
+	}
+	asyh = nv50_head_atom(crtc_state);
+	asyh->crc.src = source;
+	asyh->or.crc_raster = nv50_crc_raster_type(source);
+
+	ret = drm_atomic_commit(state);
+	if (ret == -EDEADLK)
+		goto deadlock;
+	else if (ret)
+		goto out_drop_locks;
+
+	if (!source) {
+		/*
+		 * If the user specified a custom flip threshold through
+		 * debugfs, reset it
+		 */
+		crc->flip_threshold = func->flip_threshold;
+	}
+
+out_drop_locks:
+	drm_modeset_drop_locks(&ctx);
+out_ctx_fini:
+	if (!source || ret) {
+		for (i = 0; i < ARRAY_SIZE(crc->ctx); i++)
+			nv50_crc_ctx_fini(&crc->ctx[i]);
+	}
+	drm_atomic_state_put(state);
+out_acquire_fini:
+	drm_modeset_acquire_fini(&ctx);
+	return ret;
+
+deadlock:
+	drm_atomic_state_clear(state);
+	drm_modeset_backoff(&ctx);
+	goto retry;
+}
+
+static int
+nv50_crc_debugfs_flip_threshold_get(struct seq_file *m, void *data)
+{
+	struct nv50_head *head = m->private;
+	struct drm_crtc *crtc = &head->base.base;
+	struct nv50_crc *crc = &head->crc;
+	int ret;
+
+	ret = drm_modeset_lock_single_interruptible(&crtc->mutex);
+	if (ret)
+		return ret;
+
+	seq_printf(m, "%d\n", crc->flip_threshold);
+
+	drm_modeset_unlock(&crtc->mutex);
+	return ret;
+}
+
+static int
+nv50_crc_debugfs_flip_threshold_open(struct inode *inode, struct file *file)
+{
+	return single_open(file, nv50_crc_debugfs_flip_threshold_get,
+			   inode->i_private);
+}
+
+static ssize_t
+nv50_crc_debugfs_flip_threshold_set(struct file *file,
+				    const char __user *ubuf, size_t len,
+				    loff_t *offp)
+{
+	struct seq_file *m = file->private_data;
+	struct nv50_head *head = m->private;
+	struct nv50_head_atom *armh;
+	struct drm_crtc *crtc = &head->base.base;
+	struct nouveau_drm *drm = nouveau_drm(crtc->dev);
+	struct nv50_crc *crc = &head->crc;
+	const struct nv50_crc_func *func =
+		nv50_disp(crtc->dev)->core->func->crc;
+	int value, ret;
+
+	ret = kstrtoint_from_user(ubuf, len, 10, &value);
+	if (ret)
+		return ret;
+
+	if (value > func->flip_threshold)
+		return -EINVAL;
+	else if (value == -1)
+		value = func->flip_threshold;
+	else if (value < -1)
+		return -EINVAL;
+
+	ret = drm_modeset_lock_single_interruptible(&crtc->mutex);
+	if (ret)
+		return ret;
+
+	armh = nv50_head_atom(crtc->state);
+	if (armh->crc.src) {
+		ret = -EBUSY;
+		goto out;
+	}
+
+	NV_DEBUG(drm,
+		 "Changing CRC flip threshold for next capture on head-%d to %d\n",
+		 head->base.index, value);
+	crc->flip_threshold = value;
+	ret = len;
+
+out:
+	drm_modeset_unlock(&crtc->mutex);
+	return ret;
+}
+
+static const struct file_operations nv50_crc_flip_threshold_fops = {
+	.owner = THIS_MODULE,
+	.open = nv50_crc_debugfs_flip_threshold_open,
+	.read = seq_read,
+	.write = nv50_crc_debugfs_flip_threshold_set,
+};
+
+int nv50_head_crc_late_register(struct nv50_head *head)
+{
+	struct drm_crtc *crtc = &head->base.base;
+	const struct nv50_crc_func *func =
+		nv50_disp(crtc->dev)->core->func->crc;
+	struct dentry *root, *dent;
+
+	if (!func || !crtc->debugfs_entry)
+		return 0;
+
+	root = debugfs_create_dir("nv_crc", crtc->debugfs_entry);
+	if (IS_ERR(root))
+		return PTR_ERR(root);
+
+	dent = debugfs_create_file("flip_threshold", 0644, root, head,
+				   &nv50_crc_flip_threshold_fops);
+	if (IS_ERR(dent))
+		return PTR_ERR(dent);
+
+	return 0;
+}
+
+static inline void
+nv50_crc_init_head(struct nv50_disp *disp, const struct nv50_crc_func *func,
+		   struct nv50_head *head)
+{
+	struct nv50_crc *crc = &head->crc;
+
+	crc->flip_threshold = func->flip_threshold;
+	spin_lock_init(&crc->lock);
+	drm_vblank_work_init(&crc->flip_work, &head->base.base,
+			     nv50_crc_ctx_flip_work);
+}
+
+void nv50_crc_init(struct drm_device *dev)
+{
+	struct nv50_disp *disp = nv50_disp(dev);
+	struct drm_crtc *crtc;
+	const struct nv50_crc_func *func = disp->core->func->crc;
+
+	if (!func)
+		return;
+
+	drm_for_each_crtc(crtc, dev)
+		nv50_crc_init_head(disp, func, nv50_head(crtc));
+}
diff --git a/drivers/gpu/drm/nouveau/dispnv50/crc.h b/drivers/gpu/drm/nouveau/dispnv50/crc.h
new file mode 100644
index 000000000000..11ec20b06534
--- /dev/null
+++ b/drivers/gpu/drm/nouveau/dispnv50/crc.h
@@ -0,0 +1,125 @@
+/* SPDX-License-Identifier: MIT */
+#ifndef __NV50_CRC_H__
+#define __NV50_CRC_H__
+
+#include <linux/mutex.h>
+#include <drm/drm_crtc.h>
+#include <drm/drm_vblank.h>
+
+#include <nvif/mem.h>
+#include <nvkm/subdev/bios.h>
+#include "nouveau_encoder.h"
+
+struct nv50_disp;
+struct nv50_head;
+
+#if IS_ENABLED(CONFIG_DEBUG_FS)
+enum nv50_crc_source {
+	NV50_CRC_SOURCE_NONE = 0,
+	NV50_CRC_SOURCE_AUTO,
+	NV50_CRC_SOURCE_RG,
+	NV50_CRC_SOURCE_OUTP_ACTIVE,
+	NV50_CRC_SOURCE_OUTP_COMPLETE,
+	NV50_CRC_SOURCE_OUTP_INACTIVE,
+};
+
+/* RG -> SF (DP only)
+ *    -> SOR
+ *    -> PIOR
+ *    -> DAC
+ */
+enum nv50_crc_source_type {
+	NV50_CRC_SOURCE_TYPE_NONE = 0,
+	NV50_CRC_SOURCE_TYPE_SOR,
+	NV50_CRC_SOURCE_TYPE_PIOR,
+	NV50_CRC_SOURCE_TYPE_DAC,
+	NV50_CRC_SOURCE_TYPE_RG,
+	NV50_CRC_SOURCE_TYPE_SF,
+};
+
+struct nv50_crc_notifier_ctx {
+	struct nvif_mem mem;
+	struct nvif_object ntfy;
+};
+
+struct nv50_crc_atom {
+	enum nv50_crc_source src;
+	/* Only used for gv100+ */
+	u8 wndw : 4;
+};
+
+struct nv50_crc_func {
+	void (*set_src)(struct nv50_head *, int or, enum nv50_crc_source_type,
+			struct nv50_crc_notifier_ctx *, u32 wndw);
+	void (*set_ctx)(struct nv50_head *, struct nv50_crc_notifier_ctx *);
+	u32 (*get_entry)(struct nv50_head *, struct nv50_crc_notifier_ctx *,
+			 enum nv50_crc_source, int idx);
+	bool (*ctx_finished)(struct nv50_head *,
+			     struct nv50_crc_notifier_ctx *);
+	short flip_threshold;
+	short num_entries;
+	size_t notifier_len;
+};
+
+struct nv50_crc {
+	spinlock_t lock;
+	struct nv50_crc_notifier_ctx ctx[2];
+	struct drm_vblank_work flip_work;
+	enum nv50_crc_source src;
+
+	u64 frame;
+	short entry_idx;
+	short flip_threshold;
+	u8 ctx_idx : 1;
+	bool ctx_changed : 1;
+};
+
+void nv50_crc_init(struct drm_device *dev);
+int nv50_head_crc_late_register(struct nv50_head *);
+void nv50_crc_handle_vblank(struct nv50_head *head);
+
+int nv50_crc_verify_source(struct drm_crtc *, const char *, size_t *);
+const char *const *nv50_crc_get_sources(struct drm_crtc *, size_t *);
+int nv50_crc_set_source(struct drm_crtc *, const char *);
+
+int nv50_crc_atomic_check(struct nv50_head *, struct nv50_head_atom *,
+			  struct nv50_head_atom *);
+void nv50_crc_atomic_stop_reporting(struct drm_atomic_state *);
+void nv50_crc_atomic_prepare_notifier_contexts(struct drm_atomic_state *);
+void nv50_crc_atomic_start_reporting(struct drm_atomic_state *);
+void nv50_crc_atomic_set(struct nv50_head *, struct nv50_head_atom *);
+void nv50_crc_atomic_clr(struct nv50_head *);
+
+extern const struct nv50_crc_func crc907d;
+extern const struct nv50_crc_func crcc37d;
+
+#else /* IS_ENABLED(CONFIG_DEBUG_FS) */
+struct nv50_crc {};
+struct nv50_crc_func {};
+struct nv50_crc_atom {};
+
+#define nv50_crc_verify_source NULL
+#define nv50_crc_get_sources NULL
+#define nv50_crc_set_source NULL
+
+static inline void nv50_crc_init(struct nv50_crc *) {}
+static inline void nv50_crc_fini(struct nv50_crc *) {}
+static inline void nv50_crc_get_entries(struct nv50_head *) {}
+static inline int nv50_crc_late_register(nv50_head *) { return 0; }
+
+static inline int
+nv50_crc_atomic_check(struct nv50_head *, struct nv50_head_atom *,
+		      struct nv50_head_atom *) {}
+static inline void
+nv50_crc_atomic_stop_reporting(struct drm_atomic_state *) {}
+static inline void
+nv50_crc_atomic_prepare_notifier_contexts(struct drm_atomic_state *) {}
+static inline void
+nv50_crc_atomic_start_reporting(struct drm_atomic_state *) {}
+static inline void
+nv50_crc_atomic_set(struct nv50_head *, struct nv50_head_atom *) {}
+static inline void
+nv50_crc_atomic_clr(struct nv50_head *) {}
+
+#endif /* IS_ENABLED(CONFIG_DEBUG_FS) */
+#endif /* !__NV50_CRC_H__ */
diff --git a/drivers/gpu/drm/nouveau/dispnv50/crc907d.c b/drivers/gpu/drm/nouveau/dispnv50/crc907d.c
new file mode 100644
index 000000000000..d2e5f4b8c49f
--- /dev/null
+++ b/drivers/gpu/drm/nouveau/dispnv50/crc907d.c
@@ -0,0 +1,139 @@
+/* SPDX-License-Identifier: MIT */
+#include <drm/drm_crtc.h>
+
+#include "crc.h"
+#include "core.h"
+#include "disp.h"
+#include "head.h"
+
+#define CRC907D_MAX_ENTRIES 255
+
+struct crc907d_notifier {
+	uint32_t status;
+	uint32_t :32; /* reserved */
+	struct crc907d_entry {
+		uint32_t status;
+		uint32_t compositor_crc;
+		u32 output_crc[2];
+	} entries[CRC907D_MAX_ENTRIES];
+} __packed;
+
+static void
+crc907d_set_src(struct nv50_head *head, int or,
+		enum nv50_crc_source_type source,
+		struct nv50_crc_notifier_ctx *ctx, u32 wndw)
+{
+	struct drm_crtc *crtc = &head->base.base;
+	struct nv50_dmac *core = &nv50_disp(head->base.base.dev)->core->chan;
+	const u32 hoff = head->base.index * 0x300;
+	u32 *push;
+	u32 crc_args = 0xfff00000;
+
+	switch (source) {
+	case NV50_CRC_SOURCE_TYPE_SOR:
+		crc_args |= (0x00000f0f + or * 16) << 8;
+		break;
+	case NV50_CRC_SOURCE_TYPE_PIOR:
+		crc_args |= (0x000000ff + or * 256) << 8;
+		break;
+	case NV50_CRC_SOURCE_TYPE_DAC:
+		crc_args |= (0x00000ff0 + or) << 8;
+		break;
+	case NV50_CRC_SOURCE_TYPE_RG:
+		crc_args |= (0x00000ff8 + drm_crtc_index(crtc)) << 8;
+		break;
+	case NV50_CRC_SOURCE_TYPE_SF:
+		crc_args |= (0x00000f8f + drm_crtc_index(crtc) * 16) << 8;
+		break;
+	case NV50_CRC_SOURCE_NONE:
+		crc_args |= 0x000fff00;
+		break;
+	}
+
+	push = evo_wait(core, 4);
+	if (!push)
+		return;
+
+	if (source) {
+		evo_mthd(push, 0x0438 + hoff, 1);
+		evo_data(push, ctx->ntfy.handle);
+		evo_mthd(push, 0x0430 + hoff, 1);
+		evo_data(push, crc_args);
+	} else {
+		evo_mthd(push, 0x0430 + hoff, 1);
+		evo_data(push, crc_args);
+		evo_mthd(push, 0x0438 + hoff, 1);
+		evo_data(push, 0);
+	}
+	evo_kick(push, core);
+}
+
+static void crc907d_set_ctx(struct nv50_head *head,
+			    struct nv50_crc_notifier_ctx *ctx)
+{
+	struct nv50_dmac *core = &nv50_disp(head->base.base.dev)->core->chan;
+	u32 *push = evo_wait(core, 2);
+
+	if (!push)
+		return;
+
+	evo_mthd(push, 0x0438 + (head->base.index * 0x300), 1);
+	evo_data(push, ctx ? ctx->ntfy.handle : 0);
+	evo_kick(push, core);
+}
+
+static u32 crc907d_get_entry(struct nv50_head *head,
+			     struct nv50_crc_notifier_ctx *ctx,
+			     enum nv50_crc_source source, int idx)
+{
+	struct crc907d_notifier __iomem *notifier = ctx->mem.object.map.ptr;
+
+	return ioread32_native(&notifier->entries[idx].output_crc[0]);
+}
+
+static bool crc907d_ctx_finished(struct nv50_head *head,
+				 struct nv50_crc_notifier_ctx *ctx)
+{
+	struct nouveau_drm *drm = nouveau_drm(head->base.base.dev);
+	struct crc907d_notifier __iomem *notifier = ctx->mem.object.map.ptr;
+	const u32 status = ioread32_native(&notifier->status);
+	const u32 overflow = status & 0x0000003e;
+
+	if (!(status & 0x00000001))
+		return false;
+
+	if (overflow) {
+		const char *engine = NULL;
+
+		switch (overflow) {
+			case 0x00000004: engine = "DSI"; break;
+			case 0x00000008: engine = "Compositor"; break;
+			case 0x00000010: engine = "CRC output 1"; break;
+			case 0x00000020: engine = "CRC output 2"; break;
+		}
+
+		if (engine)
+			NV_ERROR(drm,
+				 "CRC notifier context for head %d overflowed on %s: %x\n",
+				 head->base.index, engine, status);
+		else
+			NV_ERROR(drm,
+				 "CRC notifier context for head %d overflowed: %x\n",
+				 head->base.index, status);
+	}
+
+	NV_DEBUG(drm, "Head %d CRC context status: %x\n",
+		 head->base.index, status);
+
+	return true;
+}
+
+const struct nv50_crc_func crc907d = {
+	.set_src = crc907d_set_src,
+	.set_ctx = crc907d_set_ctx,
+	.get_entry = crc907d_get_entry,
+	.ctx_finished = crc907d_ctx_finished,
+	.flip_threshold = CRC907D_MAX_ENTRIES - 10,
+	.num_entries = CRC907D_MAX_ENTRIES,
+	.notifier_len = sizeof(struct crc907d_notifier),
+};
diff --git a/drivers/gpu/drm/nouveau/dispnv50/crcc37d.c b/drivers/gpu/drm/nouveau/dispnv50/crcc37d.c
new file mode 100644
index 000000000000..19b50948f486
--- /dev/null
+++ b/drivers/gpu/drm/nouveau/dispnv50/crcc37d.c
@@ -0,0 +1,153 @@
+/* SPDX-License-Identifier: MIT */
+#include <drm/drm_crtc.h>
+
+#include "crc.h"
+#include "core.h"
+#include "disp.h"
+#include "head.h"
+
+#define CRCC37D_MAX_ENTRIES 2047
+
+struct crcc37d_notifier {
+	u32 status;
+
+	/* reserved */
+	u32 :32;
+	u32 :32;
+	u32 :32;
+	u32 :32;
+	u32 :32;
+	u32 :32;
+	u32 :32;
+
+	struct crcc37d_entry {
+		u32 status[2];
+		u32 :32; /* reserved */
+		u32 compositor_crc;
+		u32 rg_crc;
+		u32 output_crc[2];
+		u32 :32; /* reserved */
+	} entries[CRCC37D_MAX_ENTRIES];
+} __packed;
+
+static void
+crcc37d_set_src(struct nv50_head *head, int or,
+		enum nv50_crc_source_type source,
+		struct nv50_crc_notifier_ctx *ctx, u32 wndw)
+{
+	struct nv50_dmac *core = &nv50_disp(head->base.base.dev)->core->chan;
+	const u32 hoff = head->base.index * 0x400;
+	u32 *push;
+	u32 crc_args;
+
+	switch (source) {
+	case NV50_CRC_SOURCE_TYPE_SOR:
+		crc_args = (0x00000050 + or) << 12;
+		break;
+	case NV50_CRC_SOURCE_TYPE_PIOR:
+		crc_args = (0x00000060 + or) << 12;
+		break;
+	case NV50_CRC_SOURCE_TYPE_SF:
+		crc_args = 0x00000030 << 12;
+		break;
+	default:
+		crc_args = 0;
+		break;
+	}
+
+	push = evo_wait(core, 4);
+	if (!push)
+		return;
+
+	if (source) {
+		evo_mthd(push, 0x2180 + hoff, 1);
+		evo_data(push, ctx->ntfy.handle);
+		evo_mthd(push, 0x2184 + hoff, 1);
+		evo_data(push, crc_args | wndw);
+	} else {
+		evo_mthd(push, 0x2184 + hoff, 1);
+		evo_data(push, 0);
+		evo_mthd(push, 0x2180 + hoff, 1);
+		evo_data(push, 0);
+	}
+
+	evo_kick(push, core);
+}
+
+static void crcc37d_set_ctx(struct nv50_head *head,
+			    struct nv50_crc_notifier_ctx *ctx)
+{
+	struct nv50_dmac *core = &nv50_disp(head->base.base.dev)->core->chan;
+	u32 *push = evo_wait(core, 2);
+
+	if (!push)
+		return;
+
+	evo_mthd(push, 0x2180 + (head->base.index * 0x400), 1);
+	evo_data(push, ctx ? ctx->ntfy.handle : 0);
+	evo_kick(push, core);
+}
+
+static u32 crcc37d_get_entry(struct nv50_head *head,
+			     struct nv50_crc_notifier_ctx *ctx,
+			     enum nv50_crc_source source, int idx)
+{
+	struct crcc37d_notifier __iomem *notifier = ctx->mem.object.map.ptr;
+	struct crcc37d_entry __iomem *entry = &notifier->entries[idx];
+	u32 __iomem *crc_addr;
+
+	if (source == NV50_CRC_SOURCE_RG)
+		crc_addr = &entry->rg_crc;
+	else
+		crc_addr = &entry->output_crc[0];
+
+	return ioread32_native(crc_addr);
+}
+
+static bool crcc37d_ctx_finished(struct nv50_head *head,
+				 struct nv50_crc_notifier_ctx *ctx)
+{
+	struct nouveau_drm *drm = nouveau_drm(head->base.base.dev);
+	struct crcc37d_notifier __iomem *notifier = ctx->mem.object.map.ptr;
+	const u32 status = ioread32_native(&notifier->status);
+	const u32 overflow = status & 0x0000007e;
+
+	if (!(status & 0x00000001))
+		return false;
+
+	if (overflow) {
+		const char *engine = NULL;
+
+		switch (overflow) {
+		case 0x00000004: engine = "Front End"; break;
+		case 0x00000008: engine = "Compositor"; break;
+		case 0x00000010: engine = "RG"; break;
+		case 0x00000020: engine = "CRC output 1"; break;
+		case 0x00000040: engine = "CRC output 2"; break;
+		}
+
+		if (engine)
+			NV_ERROR(drm,
+				 "CRC notifier context for head %d overflowed on %s: %x\n",
+				 head->base.index, engine, status);
+		else
+			NV_ERROR(drm,
+				 "CRC notifier context for head %d overflowed: %x\n",
+				 head->base.index, status);
+	}
+
+	NV_DEBUG(drm, "Head %d CRC context status: %x\n",
+		 head->base.index, status);
+
+	return true;
+}
+
+const struct nv50_crc_func crcc37d = {
+	.set_src = crcc37d_set_src,
+	.set_ctx = crcc37d_set_ctx,
+	.get_entry = crcc37d_get_entry,
+	.ctx_finished = crcc37d_ctx_finished,
+	.flip_threshold = CRCC37D_MAX_ENTRIES - 30,
+	.num_entries = CRCC37D_MAX_ENTRIES,
+	.notifier_len = sizeof(struct crcc37d_notifier),
+};
diff --git a/drivers/gpu/drm/nouveau/dispnv50/disp.c b/drivers/gpu/drm/nouveau/dispnv50/disp.c
index bfea85782d0e..47015cea8063 100644
--- a/drivers/gpu/drm/nouveau/dispnv50/disp.c
+++ b/drivers/gpu/drm/nouveau/dispnv50/disp.c
@@ -775,6 +775,19 @@ struct nv50_msto {
 	bool disabled;
 };
 
+struct nouveau_encoder *nv50_real_outp(struct drm_encoder *encoder)
+{
+	struct nv50_msto *msto;
+
+	if (encoder->encoder_type != DRM_MODE_ENCODER_DPMST)
+		return nouveau_encoder(encoder);
+
+	msto = nv50_msto(encoder);
+	if (!msto->mstc)
+		return NULL;
+	return msto->mstc->mstm->outp;
+}
+
 static struct drm_dp_payload *
 nv50_msto_payload(struct nv50_msto *msto)
 {
@@ -1898,6 +1911,7 @@ nv50_disp_atomic_commit_tail(struct drm_atomic_state *state)
 	int i;
 
 	NV_ATOMIC(drm, "commit %d %d\n", atom->lock_core, atom->flush_disable);
+	nv50_crc_atomic_stop_reporting(state);
 	drm_atomic_helper_wait_for_fences(dev, state, false);
 	drm_atomic_helper_wait_for_dependencies(state);
 	drm_atomic_helper_update_legacy_modeset_state(dev, state);
@@ -1907,6 +1921,7 @@ nv50_disp_atomic_commit_tail(struct drm_atomic_state *state)
 
 	/* Disable head(s). */
 	for_each_oldnew_crtc_in_state(state, crtc, old_crtc_state, new_crtc_state, i) {
+		struct nv50_head_atom *armh = nv50_head_atom(old_crtc_state);
 		struct nv50_head_atom *asyh = nv50_head_atom(new_crtc_state);
 		struct nv50_head *head = nv50_head(crtc);
 
@@ -1919,7 +1934,8 @@ nv50_disp_atomic_commit_tail(struct drm_atomic_state *state)
 		}
 
 		if (asyh->clr.mask) {
-			nv50_head_flush_clr(head, asyh, atom->flush_disable);
+			nv50_head_flush_clr(head, armh, asyh,
+					    atom->flush_disable);
 			interlock[NV50_DISP_INTERLOCK_CORE] |= 1;
 		}
 	}
@@ -1968,6 +1984,8 @@ nv50_disp_atomic_commit_tail(struct drm_atomic_state *state)
 		}
 	}
 
+	nv50_crc_atomic_prepare_notifier_contexts(state);
+
 	/* Update output path(s). */
 	list_for_each_entry_safe(outp, outt, &atom->outp, head) {
 		const struct drm_encoder_helper_funcs *help;
@@ -1990,6 +2008,7 @@ nv50_disp_atomic_commit_tail(struct drm_atomic_state *state)
 
 	/* Update head(s). */
 	for_each_oldnew_crtc_in_state(state, crtc, old_crtc_state, new_crtc_state, i) {
+		struct nv50_head_atom *armh = nv50_head_atom(old_crtc_state);
 		struct nv50_head_atom *asyh = nv50_head_atom(new_crtc_state);
 		struct nv50_head *head = nv50_head(crtc);
 
@@ -1997,7 +2016,7 @@ nv50_disp_atomic_commit_tail(struct drm_atomic_state *state)
 			  asyh->set.mask, asyh->clr.mask);
 
 		if (asyh->set.mask) {
-			nv50_head_flush_set(head, asyh);
+			nv50_head_flush_set(head, armh, asyh);
 			interlock[NV50_DISP_INTERLOCK_CORE] = 1;
 		}
 
@@ -2081,6 +2100,7 @@ nv50_disp_atomic_commit_tail(struct drm_atomic_state *state)
 		}
 	}
 
+	nv50_crc_atomic_start_reporting(state);
 	drm_atomic_helper_commit_hw_done(state);
 	drm_atomic_helper_cleanup_planes(dev, state);
 	drm_atomic_helper_commit_cleanup_done(state);
diff --git a/drivers/gpu/drm/nouveau/dispnv50/disp.h b/drivers/gpu/drm/nouveau/dispnv50/disp.h
index 8935ebce8ab0..da72223277fe 100644
--- a/drivers/gpu/drm/nouveau/dispnv50/disp.h
+++ b/drivers/gpu/drm/nouveau/dispnv50/disp.h
@@ -1,10 +1,12 @@
 #ifndef __NV50_KMS_H__
 #define __NV50_KMS_H__
+#include <linux/workqueue.h>
 #include <nvif/mem.h>
 
 #include "nouveau_display.h"
 
 struct nv50_msto;
+struct nouveau_encoder;
 
 struct nv50_disp {
 	struct nvif_disp *disp;
@@ -89,6 +91,14 @@ int nv50_dmac_create(struct nvif_device *device, struct nvif_object *disp,
 		     u64 syncbuf, struct nv50_dmac *dmac);
 void nv50_dmac_destroy(struct nv50_dmac *);
 
+/*
+ * For normal encoders this just returns the encoder. For active MST encoders,
+ * this returns the real outp that's driving displays on the topology.
+ * Inactive MST encoders return NULL, since they would have no real outp to
+ * return anyway.
+ */
+struct nouveau_encoder *nv50_real_outp(struct drm_encoder *encoder);
+
 u32 *evo_wait(struct nv50_dmac *, int nr);
 void evo_kick(u32 *, struct nv50_dmac *);
 
diff --git a/drivers/gpu/drm/nouveau/dispnv50/handles.h b/drivers/gpu/drm/nouveau/dispnv50/handles.h
index e3a62c7a0d08..a97a7bd29243 100644
--- a/drivers/gpu/drm/nouveau/dispnv50/handles.h
+++ b/drivers/gpu/drm/nouveau/dispnv50/handles.h
@@ -11,5 +11,6 @@
 #define NV50_DISP_HANDLE_VRAM                                           0xf0000001
 
 #define NV50_DISP_HANDLE_WNDW_CTX(kind)                        (0xfb000000 | kind)
+#define NV50_DISP_HANDLE_CRC_CTX(head, i) (0xfc000000 | head->base.index << 1 | i)
 
 #endif /* !__NV50_KMS_HANDLES_H__ */
diff --git a/drivers/gpu/drm/nouveau/dispnv50/head.c b/drivers/gpu/drm/nouveau/dispnv50/head.c
index 72bc3bce396a..4668bcfb596c 100644
--- a/drivers/gpu/drm/nouveau/dispnv50/head.c
+++ b/drivers/gpu/drm/nouveau/dispnv50/head.c
@@ -24,27 +24,36 @@
 #include "core.h"
 #include "curs.h"
 #include "ovly.h"
+#include "crc.h"
 
 #include <nvif/class.h>
+#include <nvif/event.h>
+#include <nvif/cl0046.h>
 
 #include <drm/drm_atomic_helper.h>
 #include <drm/drm_crtc_helper.h>
 #include <drm/drm_vblank.h>
 #include "nouveau_connector.h"
+
 void
 nv50_head_flush_clr(struct nv50_head *head,
-		    struct nv50_head_atom *asyh, bool flush)
+		    struct nv50_head_atom *armh,
+		    struct nv50_head_atom *asyh,
+		    bool flush)
 {
 	union nv50_head_atom_mask clr = {
 		.mask = asyh->clr.mask & ~(flush ? 0 : asyh->set.mask),
 	};
+	if (clr.crc)  nv50_crc_atomic_clr(head);
 	if (clr.olut) head->func->olut_clr(head);
 	if (clr.core) head->func->core_clr(head);
 	if (clr.curs) head->func->curs_clr(head);
 }
 
 void
-nv50_head_flush_set(struct nv50_head *head, struct nv50_head_atom *asyh)
+nv50_head_flush_set(struct nv50_head *head,
+		    struct nv50_head_atom *armh,
+		    struct nv50_head_atom *asyh)
 {
 	if (asyh->set.view   ) head->func->view    (head, asyh);
 	if (asyh->set.mode   ) head->func->mode    (head, asyh);
@@ -61,6 +70,7 @@ nv50_head_flush_set(struct nv50_head *head, struct nv50_head_atom *asyh)
 	if (asyh->set.ovly   ) head->func->ovly    (head, asyh);
 	if (asyh->set.dither ) head->func->dither  (head, asyh);
 	if (asyh->set.procamp) head->func->procamp (head, asyh);
+	if (asyh->set.crc    ) nv50_crc_atomic_set (head, asyh);
 	if (asyh->set.or     ) head->func->or      (head, asyh);
 }
 
@@ -313,7 +323,7 @@ nv50_head_atomic_check(struct drm_crtc *crtc, struct drm_crtc_state *state)
 	struct nouveau_conn_atom *asyc = NULL;
 	struct drm_connector_state *conns;
 	struct drm_connector *conn;
-	int i;
+	int i, ret;
 
 	NV_ATOMIC(drm, "%s atomic_check %d\n", crtc->name, asyh->state.active);
 	if (asyh->state.active) {
@@ -408,6 +418,10 @@ nv50_head_atomic_check(struct drm_crtc *crtc, struct drm_crtc_state *state)
 		asyh->set.curs = asyh->curs.visible;
 	}
 
+	ret = nv50_crc_atomic_check(head, asyh, armh);
+	if (ret)
+		return ret;
+
 	if (asyh->clr.mask || asyh->set.mask)
 		nv50_atom(asyh->state.state)->lock_core = true;
 	return 0;
@@ -446,6 +460,7 @@ nv50_head_atomic_duplicate_state(struct drm_crtc *crtc)
 	asyh->ovly = armh->ovly;
 	asyh->dither = armh->dither;
 	asyh->procamp = armh->procamp;
+	asyh->crc = armh->crc;
 	asyh->or = armh->or;
 	asyh->dp = armh->dp;
 	asyh->clr.mask = 0;
@@ -467,10 +482,18 @@ nv50_head_reset(struct drm_crtc *crtc)
 	__drm_atomic_helper_crtc_reset(crtc, &asyh->state);
 }
 
+static int
+nv50_head_late_register(struct drm_crtc *crtc)
+{
+	return nv50_head_crc_late_register(nv50_head(crtc));
+}
+
 static void
 nv50_head_destroy(struct drm_crtc *crtc)
 {
 	struct nv50_head *head = nv50_head(crtc);
+
+	nvif_notify_fini(&head->base.vblank);
 	nv50_lut_fini(&head->olut);
 	drm_crtc_cleanup(crtc);
 	kfree(head);
@@ -488,8 +511,38 @@ nv50_head_func = {
 	.enable_vblank = nouveau_display_vblank_enable,
 	.disable_vblank = nouveau_display_vblank_disable,
 	.get_vblank_timestamp = drm_crtc_vblank_helper_get_vblank_timestamp,
+	.late_register = nv50_head_late_register,
+};
+
+static const struct drm_crtc_funcs
+nvd9_head_func = {
+	.reset = nv50_head_reset,
+	.gamma_set = drm_atomic_helper_legacy_gamma_set,
+	.destroy = nv50_head_destroy,
+	.set_config = drm_atomic_helper_set_config,
+	.page_flip = drm_atomic_helper_page_flip,
+	.atomic_duplicate_state = nv50_head_atomic_duplicate_state,
+	.atomic_destroy_state = nv50_head_atomic_destroy_state,
+	.enable_vblank = nouveau_display_vblank_enable,
+	.disable_vblank = nouveau_display_vblank_disable,
+	.get_vblank_timestamp = drm_crtc_vblank_helper_get_vblank_timestamp,
+	.verify_crc_source = nv50_crc_verify_source,
+	.get_crc_sources = nv50_crc_get_sources,
+	.set_crc_source = nv50_crc_set_source,
+	.late_register = nv50_head_late_register,
 };
 
+static int nv50_head_vblank_handler(struct nvif_notify *notify)
+{
+	struct nouveau_crtc *nv_crtc =
+		container_of(notify, struct nouveau_crtc, vblank);
+
+	if (drm_crtc_handle_vblank(&nv_crtc->base))
+		nv50_crc_handle_vblank(nv50_head(&nv_crtc->base));
+
+	return NVIF_NOTIFY_KEEP;
+}
+
 struct nv50_head *
 nv50_head_create(struct drm_device *dev, int index)
 {
@@ -497,7 +550,9 @@ nv50_head_create(struct drm_device *dev, int index)
 	struct nv50_disp *disp = nv50_disp(dev);
 	struct nv50_head *head;
 	struct nv50_wndw *base, *ovly, *curs;
+	struct nouveau_crtc *nv_crtc;
 	struct drm_crtc *crtc;
+	const struct drm_crtc_funcs *funcs;
 	int ret;
 
 	head = kzalloc(sizeof(*head), GFP_KERNEL);
@@ -507,6 +562,11 @@ nv50_head_create(struct drm_device *dev, int index)
 	head->func = disp->core->func->head;
 	head->base.index = index;
 
+	if (disp->disp->object.oclass < GF110_DISP)
+		funcs = &nv50_head_func;
+	else
+		funcs = &nvd9_head_func;
+
 	if (disp->disp->object.oclass < GV100_DISP) {
 		ret = nv50_base_new(drm, head->base.index, &base);
 		if (ret)
@@ -531,9 +591,10 @@ nv50_head_create(struct drm_device *dev, int index)
 	if (ret)
 		goto fail_free;
 
-	crtc = &head->base.base;
+	nv_crtc = &head->base;
+	crtc = &nv_crtc->base;
 	drm_crtc_init_with_planes(dev, crtc, &base->plane, &curs->plane,
-				  &nv50_head_func, "head-%d", head->base.index);
+				  funcs, "head-%d", head->base.index);
 	drm_crtc_helper_add(crtc, &nv50_head_help);
 	/* Keep the legacy gamma size at 256 to avoid compatibility issues */
 	drm_mode_crtc_set_gamma_size(crtc, 256);
@@ -547,8 +608,22 @@ nv50_head_create(struct drm_device *dev, int index)
 			goto fail_crtc_cleanup;
 	}
 
+	ret = nvif_notify_init(&disp->disp->object, nv50_head_vblank_handler,
+			       false, NV04_DISP_NTFY_VBLANK,
+			       &(struct nvif_notify_head_req_v0) {
+				    .head = nv_crtc->index,
+			       },
+			       sizeof(struct nvif_notify_head_req_v0),
+			       sizeof(struct nvif_notify_head_rep_v0),
+			       &nv_crtc->vblank);
+	if (ret)
+		goto fail_lut_fini;
+
 	return head;
 
+fail_lut_fini:
+	if (head->func->olut_set)
+		nv50_lut_fini(&head->olut);
 fail_crtc_cleanup:
 	drm_crtc_cleanup(crtc);
 fail_free:
diff --git a/drivers/gpu/drm/nouveau/dispnv50/head.h b/drivers/gpu/drm/nouveau/dispnv50/head.h
index c05bbba9e247..2e57910ba450 100644
--- a/drivers/gpu/drm/nouveau/dispnv50/head.h
+++ b/drivers/gpu/drm/nouveau/dispnv50/head.h
@@ -1,22 +1,29 @@
 #ifndef __NV50_KMS_HEAD_H__
 #define __NV50_KMS_HEAD_H__
 #define nv50_head(c) container_of((c), struct nv50_head, base.base)
+#include <linux/workqueue.h>
+
 #include "disp.h"
 #include "atom.h"
+#include "crc.h"
 #include "lut.h"
 
 #include "nouveau_crtc.h"
+#include "nouveau_encoder.h"
 
 struct nv50_head {
 	const struct nv50_head_func *func;
 	struct nouveau_crtc base;
+	struct nv50_crc crc;
 	struct nv50_lut olut;
 	struct nv50_msto *msto;
 };
 
 struct nv50_head *nv50_head_create(struct drm_device *, int index);
-void nv50_head_flush_set(struct nv50_head *, struct nv50_head_atom *);
-void nv50_head_flush_clr(struct nv50_head *, struct nv50_head_atom *, bool y);
+void nv50_head_flush_set(struct nv50_head *, struct nv50_head_atom *armh,
+			 struct nv50_head_atom *asyh);
+void nv50_head_flush_clr(struct nv50_head *, struct nv50_head_atom *armh,
+			 struct nv50_head_atom *asyh, bool y);
 
 struct nv50_head_func {
 	void (*view)(struct nv50_head *, struct nv50_head_atom *);
diff --git a/drivers/gpu/drm/nouveau/dispnv50/head907d.c b/drivers/gpu/drm/nouveau/dispnv50/head907d.c
index 3002ec23d7a6..63a0b45d96d6 100644
--- a/drivers/gpu/drm/nouveau/dispnv50/head907d.c
+++ b/drivers/gpu/drm/nouveau/dispnv50/head907d.c
@@ -19,8 +19,15 @@
  * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
  * OTHER DEALINGS IN THE SOFTWARE.
  */
+#include <drm/drm_connector.h>
+#include <drm/drm_mode_config.h>
+#include <drm/drm_vblank.h>
+#include "nouveau_drv.h"
+#include "nouveau_bios.h"
+#include "nouveau_connector.h"
 #include "head.h"
 #include "core.h"
+#include "crc.h"
 
 void
 head907d_or(struct nv50_head *head, struct nv50_head_atom *asyh)
@@ -29,9 +36,10 @@ head907d_or(struct nv50_head *head, struct nv50_head_atom *asyh)
 	u32 *push;
 	if ((push = evo_wait(core, 3))) {
 		evo_mthd(push, 0x0404 + (head->base.index * 0x300), 2);
-		evo_data(push, 0x00000001 | asyh->or.depth  << 6 |
-					    asyh->or.nvsync << 4 |
-					    asyh->or.nhsync << 3);
+		evo_data(push, asyh->or.depth  << 6 |
+			       asyh->or.nvsync << 4 |
+			       asyh->or.nhsync << 3 |
+			       asyh->or.crc_raster);
 		evo_data(push, 0x31ec6000 | head->base.index << 25 |
 					    asyh->mode.interlace);
 		evo_kick(push, core);
diff --git a/drivers/gpu/drm/nouveau/dispnv50/headc37d.c b/drivers/gpu/drm/nouveau/dispnv50/headc37d.c
index cf5a68f4021a..87dd11b2fe96 100644
--- a/drivers/gpu/drm/nouveau/dispnv50/headc37d.c
+++ b/drivers/gpu/drm/nouveau/dispnv50/headc37d.c
@@ -46,10 +46,10 @@ headc37d_or(struct nv50_head *head, struct nv50_head_atom *asyh)
 		}
 
 		evo_mthd(push, 0x2004 + (head->base.index * 0x400), 1);
-		evo_data(push, 0x00000001 |
-			       asyh->or.depth << 4 |
+		evo_data(push, depth << 4 |
 			       asyh->or.nvsync << 3 |
-			       asyh->or.nhsync << 2);
+			       asyh->or.nhsync << 2 |
+			       asyh->or.crc_raster);
 		evo_kick(push, core);
 	}
 }
diff --git a/drivers/gpu/drm/nouveau/dispnv50/headc57d.c b/drivers/gpu/drm/nouveau/dispnv50/headc57d.c
index 65e3b60804c6..e6cef3593aec 100644
--- a/drivers/gpu/drm/nouveau/dispnv50/headc57d.c
+++ b/drivers/gpu/drm/nouveau/dispnv50/headc57d.c
@@ -46,10 +46,11 @@ headc57d_or(struct nv50_head *head, struct nv50_head_atom *asyh)
 		}
 
 		evo_mthd(push, 0x2004 + (head->base.index * 0x400), 1);
-		evo_data(push, 0xfc000001 |
-			       asyh->or.depth << 4 |
+		evo_data(push, 0xfc000000 |
+			       depth << 4 |
 			       asyh->or.nvsync << 3 |
-			       asyh->or.nhsync << 2);
+			       asyh->or.nhsync << 2 |
+			       asyh->or.crc_raster);
 		evo_kick(push, core);
 	}
 }
diff --git a/drivers/gpu/drm/nouveau/nouveau_display.c b/drivers/gpu/drm/nouveau/nouveau_display.c
index 700817dc4fa0..7a0bd9a720a0 100644
--- a/drivers/gpu/drm/nouveau/nouveau_display.c
+++ b/drivers/gpu/drm/nouveau/nouveau_display.c
@@ -43,15 +43,7 @@
 #include <nvif/class.h>
 #include <nvif/cl0046.h>
 #include <nvif/event.h>
-
-static int
-nouveau_display_vblank_handler(struct nvif_notify *notify)
-{
-	struct nouveau_crtc *nv_crtc =
-		container_of(notify, typeof(*nv_crtc), vblank);
-	drm_crtc_handle_vblank(&nv_crtc->base);
-	return NVIF_NOTIFY_KEEP;
-}
+#include <dispnv50/crc.h>
 
 int
 nouveau_display_vblank_enable(struct drm_crtc *crtc)
@@ -135,50 +127,6 @@ nouveau_display_scanoutpos(struct drm_crtc *crtc,
 					       stime, etime);
 }
 
-static void
-nouveau_display_vblank_fini(struct drm_device *dev)
-{
-	struct drm_crtc *crtc;
-
-	list_for_each_entry(crtc, &dev->mode_config.crtc_list, head) {
-		struct nouveau_crtc *nv_crtc = nouveau_crtc(crtc);
-		nvif_notify_fini(&nv_crtc->vblank);
-	}
-}
-
-static int
-nouveau_display_vblank_init(struct drm_device *dev)
-{
-	struct nouveau_display *disp = nouveau_display(dev);
-	struct drm_crtc *crtc;
-	int ret;
-
-	list_for_each_entry(crtc, &dev->mode_config.crtc_list, head) {
-		struct nouveau_crtc *nv_crtc = nouveau_crtc(crtc);
-		ret = nvif_notify_init(&disp->disp.object,
-				       nouveau_display_vblank_handler, false,
-				       NV04_DISP_NTFY_VBLANK,
-				       &(struct nvif_notify_head_req_v0) {
-					.head = nv_crtc->index,
-				       },
-				       sizeof(struct nvif_notify_head_req_v0),
-				       sizeof(struct nvif_notify_head_rep_v0),
-				       &nv_crtc->vblank);
-		if (ret) {
-			nouveau_display_vblank_fini(dev);
-			return ret;
-		}
-	}
-
-	ret = drm_vblank_init(dev, dev->mode_config.num_crtc);
-	if (ret) {
-		nouveau_display_vblank_fini(dev);
-		return ret;
-	}
-
-	return 0;
-}
-
 static void
 nouveau_user_framebuffer_destroy(struct drm_framebuffer *drm_fb)
 {
@@ -545,9 +493,12 @@ nouveau_display_create(struct drm_device *dev)
 	drm_mode_config_reset(dev);
 
 	if (dev->mode_config.num_crtc) {
-		ret = nouveau_display_vblank_init(dev);
+		ret = drm_vblank_init(dev, dev->mode_config.num_crtc);
 		if (ret)
 			goto vblank_err;
+
+		if (disp->disp.object.oclass >= NV50_DISP)
+			nv50_crc_init(dev);
 	}
 
 	INIT_WORK(&drm->hpd_work, nouveau_display_hpd_work);
@@ -574,7 +525,6 @@ nouveau_display_destroy(struct drm_device *dev)
 #ifdef CONFIG_ACPI
 	unregister_acpi_notifier(&nouveau_drm(dev)->acpi_nb);
 #endif
-	nouveau_display_vblank_fini(dev);
 
 	drm_kms_helper_poll_fini(dev);
 	drm_mode_config_cleanup(dev);
-- 
2.24.1

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH v2] drm/nouveau/kms/nvd9-: Add CRC support
  2020-03-18  0:41 ` [PATCH 9/9] drm/nouveau/kms/nvd9-: Add CRC support Lyude Paul
@ 2020-03-18  1:09   ` Lyude Paul
  2020-03-18  6:02   ` [PATCH 9/9] " Greg Kroah-Hartman
  1 sibling, 0 replies; 22+ messages in thread
From: Lyude Paul @ 2020-03-18  1:09 UTC (permalink / raw)
  To: nouveau
  Cc: Kate Stewart, Jani Nikula, David Airlie, Greg Kroah-Hartman,
	dri-devel, Peteris Rudzusiks, linux-kernel, Sean Paul,
	Ben Skeggs, Thomas Zimmermann, Pankaj Bharadiya, Alex Deucher,
	Sam Ravnborg

This introduces support for CRC readback on gf119+, using the
documentation generously provided to us by Nvidia:

https://github.com/NVIDIA/open-gpu-doc/blob/master/Display-CRC/display-crc.txt

We expose all available CRC sources. SF, SOR, PIOR, and DAC are exposed
through a single set of "outp" sources: outp-active/auto for a CRC of
the scanout region, outp-complete for a CRC of both the scanout and
blanking/sync region combined, and outp-inactive for a CRC of only the
blanking/sync region. For each source, nouveau selects the appropriate
tap point based on the output path in use. We also expose an "rg"
source, which allows for capturing CRCs of the scanout raster before
it's encoded into a video signal in the output path. This tap point is
referred to as the raster generator.

Note that while there's some other neat features that can be used with
CRC capture on nvidia hardware, like capturing from two CRC sources
simultaneously, I couldn't see any usecase for them and did not
implement them.

Nvidia only allows for accessing CRCs through a shared DMA region that
we program through the core EVO/NvDisplay channel which is referred to
as the notifier context. The notifier context is limited to either 255
(for Fermi-Pascal) or 2047 (Volta+) entries to store CRCs in, and
unfortunately the hardware simply drops CRCs and reports an overflow
once all available entries in the notifier context are filled.

Since the DRM CRC API and igt-gpu-tools don't expect there to be a limit
on how many CRCs can be captured, we work around this in nouveau by
allocating two separate notifier contexts for each head instead of one.
We schedule a vblank worker ahead of time so that once we start getting
close to filling up all of the available entries in the notifier
context, we can swap the currently used notifier context out with
another pre-prepared notifier context in a manner similar to page
flipping.

Unfortunately, the hardware only allows us to this by flushing two
separate updates on the core channel: one to release the current
notifier context handle, and one to program the next notifier context's
handle. When the hardware processes the first update, the CRC for the
current frame is lost. However, the second update can be flushed
immediately without waiting for the first to complete so that CRC
generation resumes on the next frame. According to Nvidia's hardware
engineers, there isn't any cleaner way of flipping notifier contexts
that would avoid this.

Since using vblank workers to swap out the notifier context will ensure
we can usually flush both updates to hardware within the timespan of a
single frame, we can also ensure that there will only be exactly one
frame lost between the first and second update being executed by the
hardware. This gives us the guarantee that we're always correctly
matching each CRC entry with it's respective frame even after a context
flip. And since IGT will retrieve the CRC entry for a frame by waiting
until it receives a CRC for any subsequent frames, this doesn't cause an
issue with any tests and is much simpler than trying to change the
current DRM API to accommodate.

In order to facilitate testing of correct handling of this limitation,
we also expose a debugfs interface to manually control the threshold for
when we start trying to flip the notifier context. We will use this in
igt to trigger a context flip for testing purposes without needing to
wait for the notifier to completely fill up. This threshold is reset
to the default value set by nouveau after each capture, and is exposed
in a separate folder within each CRTC's debugfs directory labelled
"nv_crc".

Changes since v1:
* Forgot to finish saving crc.h before saving, whoops. This just adds
  some corrections to the empty function declarations that we use if
  CONFIG_DEBUG_FS isn't enabled.

Signed-off-by: Lyude Paul <lyude@redhat.com>
---
 drivers/gpu/drm/nouveau/dispnv04/crtc.c     |  25 +-
 drivers/gpu/drm/nouveau/dispnv50/Kbuild     |   4 +
 drivers/gpu/drm/nouveau/dispnv50/atom.h     |  20 +
 drivers/gpu/drm/nouveau/dispnv50/core.h     |   4 +
 drivers/gpu/drm/nouveau/dispnv50/core907d.c |   3 +
 drivers/gpu/drm/nouveau/dispnv50/core917d.c |   3 +
 drivers/gpu/drm/nouveau/dispnv50/corec37d.c |   3 +
 drivers/gpu/drm/nouveau/dispnv50/corec57d.c |   3 +
 drivers/gpu/drm/nouveau/dispnv50/crc.c      | 716 ++++++++++++++++++++
 drivers/gpu/drm/nouveau/dispnv50/crc.h      | 125 ++++
 drivers/gpu/drm/nouveau/dispnv50/crc907d.c  | 139 ++++
 drivers/gpu/drm/nouveau/dispnv50/crcc37d.c  | 153 +++++
 drivers/gpu/drm/nouveau/dispnv50/disp.c     |  24 +-
 drivers/gpu/drm/nouveau/dispnv50/disp.h     |  10 +
 drivers/gpu/drm/nouveau/dispnv50/handles.h  |   1 +
 drivers/gpu/drm/nouveau/dispnv50/head.c     |  85 ++-
 drivers/gpu/drm/nouveau/dispnv50/head.h     |  11 +-
 drivers/gpu/drm/nouveau/dispnv50/head907d.c |  14 +-
 drivers/gpu/drm/nouveau/dispnv50/headc37d.c |   6 +-
 drivers/gpu/drm/nouveau/dispnv50/headc57d.c |   7 +-
 drivers/gpu/drm/nouveau/nouveau_display.c   |  60 +-
 21 files changed, 1342 insertions(+), 74 deletions(-)
 create mode 100644 drivers/gpu/drm/nouveau/dispnv50/crc.c
 create mode 100644 drivers/gpu/drm/nouveau/dispnv50/crc.h
 create mode 100644 drivers/gpu/drm/nouveau/dispnv50/crc907d.c
 create mode 100644 drivers/gpu/drm/nouveau/dispnv50/crcc37d.c

diff --git a/drivers/gpu/drm/nouveau/dispnv04/crtc.c b/drivers/gpu/drm/nouveau/dispnv04/crtc.c
index 1f08de4241e0..fc178ffce8cd 100644
--- a/drivers/gpu/drm/nouveau/dispnv04/crtc.c
+++ b/drivers/gpu/drm/nouveau/dispnv04/crtc.c
@@ -44,6 +44,9 @@
 #include <subdev/bios/pll.h>
 #include <subdev/clk.h>
 
+#include <nvif/event.h>
+#include <nvif/cl0046.h>
+
 static int
 nv04_crtc_mode_set_base(struct drm_crtc *crtc, int x, int y,
 			struct drm_framebuffer *old_fb);
@@ -755,6 +758,7 @@ static void nv_crtc_destroy(struct drm_crtc *crtc)
 	nouveau_bo_unmap(nv_crtc->cursor.nvbo);
 	nouveau_bo_unpin(nv_crtc->cursor.nvbo);
 	nouveau_bo_ref(NULL, &nv_crtc->cursor.nvbo);
+	nvif_notify_fini(&nv_crtc->vblank);
 	kfree(nv_crtc);
 }
 
@@ -1296,9 +1300,19 @@ create_primary_plane(struct drm_device *dev)
         return primary;
 }
 
+static int nv04_crtc_vblank_handler(struct nvif_notify *notify)
+{
+	struct nouveau_crtc *nv_crtc =
+		container_of(notify, struct nouveau_crtc, vblank);
+
+	drm_crtc_handle_vblank(&nv_crtc->base);
+	return NVIF_NOTIFY_KEEP;
+}
+
 int
 nv04_crtc_create(struct drm_device *dev, int crtc_num)
 {
+	struct nouveau_display *disp = nouveau_display(dev);
 	struct nouveau_crtc *nv_crtc;
 	int ret;
 
@@ -1336,5 +1350,14 @@ nv04_crtc_create(struct drm_device *dev, int crtc_num)
 
 	nv04_cursor_init(nv_crtc);
 
-	return 0;
+	ret = nvif_notify_init(&disp->disp.object, nv04_crtc_vblank_handler,
+			       false, NV04_DISP_NTFY_VBLANK,
+			       &(struct nvif_notify_head_req_v0) {
+				    .head = nv_crtc->index,
+			       },
+			       sizeof(struct nvif_notify_head_req_v0),
+			       sizeof(struct nvif_notify_head_rep_v0),
+			       &nv_crtc->vblank);
+
+	return ret;
 }
diff --git a/drivers/gpu/drm/nouveau/dispnv50/Kbuild b/drivers/gpu/drm/nouveau/dispnv50/Kbuild
index e0c435eae664..6fdddb266fb1 100644
--- a/drivers/gpu/drm/nouveau/dispnv50/Kbuild
+++ b/drivers/gpu/drm/nouveau/dispnv50/Kbuild
@@ -10,6 +10,10 @@ nouveau-y += dispnv50/core917d.o
 nouveau-y += dispnv50/corec37d.o
 nouveau-y += dispnv50/corec57d.o
 
+nouveau-$(CONFIG_DEBUG_FS) += dispnv50/crc.o
+nouveau-$(CONFIG_DEBUG_FS) += dispnv50/crc907d.o
+nouveau-$(CONFIG_DEBUG_FS) += dispnv50/crcc37d.o
+
 nouveau-y += dispnv50/dac507d.o
 nouveau-y += dispnv50/dac907d.o
 
diff --git a/drivers/gpu/drm/nouveau/dispnv50/atom.h b/drivers/gpu/drm/nouveau/dispnv50/atom.h
index 62faaf60f47a..3d82b3c67dec 100644
--- a/drivers/gpu/drm/nouveau/dispnv50/atom.h
+++ b/drivers/gpu/drm/nouveau/dispnv50/atom.h
@@ -2,6 +2,9 @@
 #define __NV50_KMS_ATOM_H__
 #define nv50_atom(p) container_of((p), struct nv50_atom, state)
 #include <drm/drm_atomic.h>
+#include "crc.h"
+
+struct nouveau_encoder;
 
 struct nv50_atom {
 	struct drm_atomic_state state;
@@ -115,9 +118,12 @@ struct nv50_head_atom {
 		u8 nhsync:1;
 		u8 nvsync:1;
 		u8 depth:4;
+		u8 crc_raster:2;
 		u8 bpc;
 	} or;
 
+	struct nv50_crc_atom crc;
+
 	/* Currently only used for MST */
 	struct {
 		int pbn;
@@ -135,6 +141,7 @@ struct nv50_head_atom {
 			bool ovly:1;
 			bool dither:1;
 			bool procamp:1;
+			bool crc:1;
 			bool or:1;
 		};
 		u16 mask;
@@ -150,6 +157,19 @@ nv50_head_atom_get(struct drm_atomic_state *state, struct drm_crtc *crtc)
 	return nv50_head_atom(statec);
 }
 
+static inline struct drm_encoder *
+nv50_head_atom_get_encoder(struct nv50_head_atom *atom)
+{
+	struct drm_encoder *encoder = NULL;
+
+	/* We only ever have a single encoder */
+	drm_for_each_encoder_mask(encoder, atom->state.crtc->dev,
+				  atom->state.encoder_mask)
+		break;
+
+	return encoder;
+}
+
 #define nv50_wndw_atom(p) container_of((p), struct nv50_wndw_atom, state)
 
 struct nv50_wndw_atom {
diff --git a/drivers/gpu/drm/nouveau/dispnv50/core.h b/drivers/gpu/drm/nouveau/dispnv50/core.h
index ff94f3f6f264..47470db0f154 100644
--- a/drivers/gpu/drm/nouveau/dispnv50/core.h
+++ b/drivers/gpu/drm/nouveau/dispnv50/core.h
@@ -2,6 +2,7 @@
 #define __NV50_KMS_CORE_H__
 #include "disp.h"
 #include "atom.h"
+#include "crc.h"
 
 struct nv50_core {
 	const struct nv50_core_func *func;
@@ -24,6 +25,9 @@ struct nv50_core_func {
 	} wndw;
 
 	const struct nv50_head_func *head;
+#if IS_ENABLED(CONFIG_DEBUG_FS)
+	const struct nv50_crc_func *crc;
+#endif
 	const struct nv50_outp_func {
 		void (*ctrl)(struct nv50_core *, int or, u32 ctrl,
 			     struct nv50_head_atom *);
diff --git a/drivers/gpu/drm/nouveau/dispnv50/core907d.c b/drivers/gpu/drm/nouveau/dispnv50/core907d.c
index ef822f813435..fd6aaf629d02 100644
--- a/drivers/gpu/drm/nouveau/dispnv50/core907d.c
+++ b/drivers/gpu/drm/nouveau/dispnv50/core907d.c
@@ -29,6 +29,9 @@ core907d = {
 	.ntfy_wait_done = core507d_ntfy_wait_done,
 	.update = core507d_update,
 	.head = &head907d,
+#if IS_ENABLED(CONFIG_DEBUG_FS)
+	.crc = &crc907d,
+#endif
 	.dac = &dac907d,
 	.sor = &sor907d,
 };
diff --git a/drivers/gpu/drm/nouveau/dispnv50/core917d.c b/drivers/gpu/drm/nouveau/dispnv50/core917d.c
index 392338df5bfd..46debdce87ff 100644
--- a/drivers/gpu/drm/nouveau/dispnv50/core917d.c
+++ b/drivers/gpu/drm/nouveau/dispnv50/core917d.c
@@ -29,6 +29,9 @@ core917d = {
 	.ntfy_wait_done = core507d_ntfy_wait_done,
 	.update = core507d_update,
 	.head = &head917d,
+#if IS_ENABLED(CONFIG_DEBUG_FS)
+	.crc = &crc907d,
+#endif
 	.dac = &dac907d,
 	.sor = &sor907d,
 };
diff --git a/drivers/gpu/drm/nouveau/dispnv50/corec37d.c b/drivers/gpu/drm/nouveau/dispnv50/corec37d.c
index 3b36dc8d36b2..19761bfdac31 100644
--- a/drivers/gpu/drm/nouveau/dispnv50/corec37d.c
+++ b/drivers/gpu/drm/nouveau/dispnv50/corec37d.c
@@ -114,6 +114,9 @@ corec37d = {
 	.wndw.owner = corec37d_wndw_owner,
 	.head = &headc37d,
 	.sor = &sorc37d,
+#if IS_ENABLED(CONFIG_DEBUG_FS)
+	.crc = &crcc37d,
+#endif
 };
 
 int
diff --git a/drivers/gpu/drm/nouveau/dispnv50/corec57d.c b/drivers/gpu/drm/nouveau/dispnv50/corec57d.c
index 147adcd60937..0ffd6286985c 100644
--- a/drivers/gpu/drm/nouveau/dispnv50/corec57d.c
+++ b/drivers/gpu/drm/nouveau/dispnv50/corec57d.c
@@ -51,6 +51,9 @@ corec57d = {
 	.wndw.owner = corec37d_wndw_owner,
 	.head = &headc57d,
 	.sor = &sorc37d,
+#if IS_ENABLED(CONFIG_DEBUG_FS)
+	.crc = &crcc37d,
+#endif
 };
 
 int
diff --git a/drivers/gpu/drm/nouveau/dispnv50/crc.c b/drivers/gpu/drm/nouveau/dispnv50/crc.c
new file mode 100644
index 000000000000..b3a21e8256d5
--- /dev/null
+++ b/drivers/gpu/drm/nouveau/dispnv50/crc.c
@@ -0,0 +1,716 @@
+/* SPDX-License-Identifier: MIT */
+#include <linux/string.h>
+#include <drm/drm_crtc.h>
+#include <drm/drm_atomic_helper.h>
+#include <drm/drm_vblank.h>
+
+#include <nvif/class.h>
+#include <nvif/cl0002.h>
+
+#include "nouveau_drv.h"
+#include "core.h"
+#include "head.h"
+#include "wndw.h"
+#include "handles.h"
+#include "crc.h"
+
+static const char * const nv50_crc_sources[] = {
+	[NV50_CRC_SOURCE_NONE] = "none",
+	[NV50_CRC_SOURCE_AUTO] = "auto",
+	[NV50_CRC_SOURCE_RG] = "rg",
+	[NV50_CRC_SOURCE_OUTP_ACTIVE] = "outp-active",
+	[NV50_CRC_SOURCE_OUTP_COMPLETE] = "outp-complete",
+	[NV50_CRC_SOURCE_OUTP_INACTIVE] = "outp-inactive",
+};
+
+static int nv50_crc_parse_source(const char *buf, enum nv50_crc_source *s)
+{
+	int i;
+
+	if (!buf) {
+		*s = NV50_CRC_SOURCE_NONE;
+		return 0;
+	}
+
+	i = match_string(nv50_crc_sources, ARRAY_SIZE(nv50_crc_sources), buf);
+	if (i < 0)
+		return i;
+
+	*s = i;
+	return 0;
+}
+
+int
+nv50_crc_verify_source(struct drm_crtc *crtc, const char *source_name,
+		       size_t *values_cnt)
+{
+	struct nouveau_drm *drm = nouveau_drm(crtc->dev);
+	enum nv50_crc_source source;
+
+	if (nv50_crc_parse_source(source_name, &source) < 0) {
+		NV_DEBUG(drm, "unknown source %s\n", source_name);
+		return -EINVAL;
+	}
+
+	*values_cnt = 1;
+	return 0;
+}
+
+const char *const *nv50_crc_get_sources(struct drm_crtc *crtc, size_t *count)
+{
+	*count = ARRAY_SIZE(nv50_crc_sources);
+	return nv50_crc_sources;
+}
+
+static void
+nv50_crc_program_ctx(struct nv50_head *head,
+		     struct nv50_crc_notifier_ctx *ctx)
+{
+	struct nv50_disp *disp = nv50_disp(head->base.base.dev);
+	struct nv50_core *core = disp->core;
+	u32 interlock[NV50_DISP_INTERLOCK__SIZE] = { 0 };
+
+	core->func->crc->set_ctx(head, ctx);
+	core->func->update(core, interlock, false);
+}
+
+static void nv50_crc_ctx_flip_work(struct drm_vblank_work *work, u64 count)
+{
+	struct nv50_crc *crc = container_of(work, struct nv50_crc, flip_work);
+	struct nv50_head *head = container_of(crc, struct nv50_head, crc);
+	struct drm_crtc *crtc = &head->base.base;
+	struct nv50_disp *disp = nv50_disp(crtc->dev);
+	u8 new_idx = crc->ctx_idx ^ 1;
+
+	/*
+	 * We don't want to accidentally wait for longer then the vblank, so
+	 * try again for the next vblank if we don't grab the lock
+	 */
+	if (!mutex_trylock(&disp->mutex)) {
+		DRM_DEV_DEBUG_KMS(crtc->dev->dev,
+				  "Lock contended, delaying CRC ctx flip for head-%d\n",
+				  head->base.index);
+		drm_vblank_work_schedule(work, count + 1, true);
+		return;
+	}
+
+	DRM_DEV_DEBUG_KMS(crtc->dev->dev,
+			  "Flipping notifier ctx for head %d (%d -> %d)\n",
+			  drm_crtc_index(crtc), crc->ctx_idx, new_idx);
+
+	nv50_crc_program_ctx(head, NULL);
+	nv50_crc_program_ctx(head, &crc->ctx[new_idx]);
+	mutex_unlock(&disp->mutex);
+
+	spin_lock_irq(&crc->lock);
+	crc->ctx_changed = true;
+	spin_unlock_irq(&crc->lock);
+}
+
+static inline void nv50_crc_reset_ctx(struct nv50_crc_notifier_ctx *ctx)
+{
+	memset_io(ctx->mem.object.map.ptr, 0, ctx->mem.object.map.size);
+}
+
+static void
+nv50_crc_get_entries(struct nv50_head *head,
+		     const struct nv50_crc_func *func,
+		     enum nv50_crc_source source)
+{
+	struct drm_crtc *crtc = &head->base.base;
+	struct nv50_crc *crc = &head->crc;
+	u32 output_crc;
+
+	while (crc->entry_idx < func->num_entries) {
+		/*
+		 * While Nvidia's documentation says CRCs are written on each
+		 * subsequent vblank after being enabled, in practice they
+		 * aren't written immediately.
+		 */
+		output_crc = func->get_entry(head, &crc->ctx[crc->ctx_idx],
+					     source, crc->entry_idx);
+		if (!output_crc)
+			return;
+
+		drm_crtc_add_crc_entry(crtc, true, crc->frame, &output_crc);
+		crc->frame++;
+		crc->entry_idx++;
+	}
+}
+
+void nv50_crc_handle_vblank(struct nv50_head *head)
+{
+	struct drm_crtc *crtc = &head->base.base;
+	struct nv50_crc *crc = &head->crc;
+	const struct nv50_crc_func *func =
+		nv50_disp(head->base.base.dev)->core->func->crc;
+	struct nv50_crc_notifier_ctx *ctx;
+	bool need_reschedule = false;
+
+	if (!func)
+		return;
+
+	/*
+	 * We don't lose events if we aren't able to report CRCs until the
+	 * next vblank, so only report CRCs if the locks we need aren't
+	 * contended to prevent missing an actual vblank event
+	 */
+	if (!spin_trylock(&crc->lock))
+		return;
+
+	if (!crc->src)
+		goto out;
+
+	ctx = &crc->ctx[crc->ctx_idx];
+	if (crc->ctx_changed && func->ctx_finished(head, ctx)) {
+		nv50_crc_get_entries(head, func, crc->src);
+
+		crc->ctx_idx ^= 1;
+		crc->entry_idx = 0;
+		crc->ctx_changed = false;
+
+		/*
+		 * Unfortunately when notifier contexts are changed during CRC
+		 * capture, we will inevitably lose the CRC entry for the
+		 * frame where the hardware actually latched onto the first
+		 * UPDATE. According to Nvidia's hardware engineers, there's
+		 * no workaround for this.
+		 *
+		 * Now, we could try to be smart here and calculate the number
+		 * of missed CRCs based on audit timestamps, but those were
+		 * removed starting with volta. Since we always flush our
+		 * updates back-to-back without waiting, we'll just be
+		 * optimistic and assume we always miss exactly one frame.
+		 */
+		DRM_DEV_DEBUG_KMS(head->base.base.dev->dev,
+				  "Notifier ctx flip for head-%d finished, lost CRC for frame %llu\n",
+				  head->base.index, crc->frame);
+		crc->frame++;
+
+		nv50_crc_reset_ctx(ctx);
+		need_reschedule = true;
+	}
+
+	nv50_crc_get_entries(head, func, crc->src);
+
+	if (need_reschedule)
+		drm_vblank_work_schedule(&crc->flip_work,
+					 drm_crtc_vblank_count(crtc)
+					 + crc->flip_threshold
+					 - crc->entry_idx,
+					 true);
+
+out:
+	spin_unlock(&crc->lock);
+}
+
+static void nv50_crc_wait_ctx_finished(struct nv50_head *head,
+				       const struct nv50_crc_func *func,
+				       struct nv50_crc_notifier_ctx *ctx)
+{
+	struct drm_device *dev = head->base.base.dev;
+	struct nouveau_drm *drm = nouveau_drm(dev);
+	s64 ret;
+
+	ret = nvif_msec(&drm->client.device, 50,
+			if (func->ctx_finished(head, ctx)) break;);
+	if (ret == -ETIMEDOUT)
+		NV_ERROR(drm,
+			 "CRC notifier ctx for head %d not finished after 50ms\n",
+			 head->base.index);
+	else if (ret)
+		NV_ATOMIC(drm,
+			  "CRC notifier ctx for head-%d finished after %lldns\n",
+			  head->base.index, ret);
+}
+
+void nv50_crc_atomic_stop_reporting(struct drm_atomic_state *state)
+{
+	struct drm_crtc_state *crtc_state;
+	struct drm_crtc *crtc;
+	int i;
+
+	for_each_new_crtc_in_state(state, crtc, crtc_state, i) {
+		struct nv50_head *head = nv50_head(crtc);
+		struct nv50_head_atom *asyh = nv50_head_atom(crtc_state);
+		struct nv50_crc *crc = &head->crc;
+
+		if (!asyh->clr.crc)
+			continue;
+
+		spin_lock_irq(&crc->lock);
+		crc->src = NV50_CRC_SOURCE_NONE;
+		spin_unlock_irq(&crc->lock);
+
+		drm_crtc_vblank_put(crtc);
+		drm_vblank_work_cancel_sync(&crc->flip_work);
+
+		NV_ATOMIC(nouveau_drm(crtc->dev),
+			  "CRC reporting on vblank for head-%d disabled\n",
+			  head->base.index);
+
+		/* CRC generation is still enabled in hw, we'll just report
+		 * any remaining CRC entries ourselves after it gets disabled
+		 * in hardware
+		 */
+	}
+}
+
+void nv50_crc_atomic_prepare_notifier_contexts(struct drm_atomic_state *state)
+{
+	const struct nv50_crc_func *func =
+		nv50_disp(state->dev)->core->func->crc;
+	struct drm_crtc_state *new_crtc_state;
+	struct drm_crtc *crtc;
+	int i;
+
+	for_each_new_crtc_in_state(state, crtc, new_crtc_state, i) {
+		struct nv50_head *head = nv50_head(crtc);
+		struct nv50_head_atom *asyh = nv50_head_atom(new_crtc_state);
+		struct nv50_crc *crc = &head->crc;
+		struct nv50_crc_notifier_ctx *ctx = &crc->ctx[crc->ctx_idx];
+		int i;
+
+		if (asyh->clr.crc && asyh->crc.src) {
+			if (crc->ctx_changed) {
+				nv50_crc_wait_ctx_finished(head, func, ctx);
+				ctx = &crc->ctx[crc->ctx_idx ^ 1];
+			}
+			nv50_crc_wait_ctx_finished(head, func, ctx);
+		}
+
+		if (asyh->set.crc) {
+			crc->entry_idx = 0;
+			crc->ctx_changed = false;
+			for (i = 0; i < ARRAY_SIZE(crc->ctx); i++)
+				nv50_crc_reset_ctx(&crc->ctx[i]);
+		}
+	}
+}
+
+void nv50_crc_atomic_start_reporting(struct drm_atomic_state *state)
+{
+	struct drm_crtc_state *crtc_state;
+	struct drm_crtc *crtc;
+	int i;
+
+	for_each_new_crtc_in_state(state, crtc, crtc_state, i) {
+		struct nv50_head *head = nv50_head(crtc);
+		struct nv50_head_atom *asyh = nv50_head_atom(crtc_state);
+		struct nv50_crc *crc = &head->crc;
+		u64 vbl_count;
+
+		if (!asyh->set.crc)
+			continue;
+
+		drm_crtc_vblank_get(crtc);
+
+		spin_lock_irq(&crc->lock);
+		vbl_count = drm_crtc_vblank_count(crtc);
+		crc->frame = vbl_count;
+		crc->src = asyh->crc.src;
+		drm_vblank_work_schedule(&crc->flip_work,
+					 vbl_count + crc->flip_threshold,
+					 true);
+		spin_unlock_irq(&crc->lock);
+
+		NV_ATOMIC(nouveau_drm(crtc->dev),
+			  "CRC reporting on vblank for head-%d enabled\n",
+			  head->base.index);
+	}
+}
+
+int nv50_crc_atomic_check(struct nv50_head *head,
+			  struct nv50_head_atom *asyh,
+			  struct nv50_head_atom *armh)
+{
+	struct drm_atomic_state *state = asyh->state.state;
+	struct drm_device *dev = head->base.base.dev;
+	struct nv50_atom *atom = nv50_atom(state);
+	struct nv50_disp *disp = nv50_disp(dev);
+	struct drm_encoder *encoder;
+	struct nouveau_encoder *outp;
+	struct nv50_outp_atom *outp_atom;
+	bool changed = armh->crc.src != asyh->crc.src;
+
+	if (!armh->crc.src && !asyh->crc.src) {
+		asyh->set.crc = false;
+		asyh->clr.crc = false;
+		return 0;
+	}
+
+	/* While we don't care about entry tags, Volta+ hw always needs the
+	 * controlling wndw channel programmed to a wndw that's owned by our
+	 * head
+	 */
+	if (asyh->crc.src && disp->disp->object.oclass >= GV100_DISP &&
+	    !(BIT(asyh->crc.wndw) & asyh->wndw.owned)) {
+		if (!asyh->wndw.owned) {
+			/* TODO: once we support flexible channel ownership,
+			 * we should write some code here to handle attempting
+			 * to "steal" a plane: e.g. take a plane that is
+			 * currently not-visible and owned by another head,
+			 * and reassign it to this head. If we fail to do so,
+			 * we shuld reject the mode outright as CRC capture
+			 * then becomes impossible.
+			 */
+			NV_ATOMIC(nouveau_drm(dev),
+				  "No available wndws for CRC readback\n");
+			return -EINVAL;
+		}
+		asyh->crc.wndw = ffs(asyh->wndw.owned) - 1;
+	}
+
+	if (drm_atomic_crtc_needs_modeset(&asyh->state) || changed ||
+	    armh->crc.wndw != asyh->crc.wndw) {
+		asyh->clr.crc = armh->crc.src && armh->state.active;
+		asyh->set.crc = asyh->crc.src && asyh->state.active;
+		if (changed)
+			asyh->set.or |= armh->or.crc_raster !=
+					asyh->or.crc_raster;
+
+		/*
+		 * If we're reprogramming our OR, we need to flush the CRC
+		 * disable first
+		 */
+		if (asyh->clr.crc) {
+			encoder = nv50_head_atom_get_encoder(armh);
+			outp = nv50_real_outp(encoder);
+
+			list_for_each_entry(outp_atom, &atom->outp, head) {
+				if (outp_atom->encoder == encoder) {
+					if (outp_atom->set.mask)
+						atom->flush_disable = true;
+					break;
+				}
+			}
+		}
+	} else {
+		asyh->set.crc = false;
+		asyh->clr.crc = false;
+	}
+
+	return 0;
+}
+
+static enum nv50_crc_source_type
+nv50_crc_source_type(struct nouveau_encoder *outp,
+		     enum nv50_crc_source source)
+{
+	struct dcb_output *dcbe = outp->dcb;
+
+	switch (source) {
+	case NV50_CRC_SOURCE_NONE: return NV50_CRC_SOURCE_TYPE_NONE;
+	case NV50_CRC_SOURCE_RG:   return NV50_CRC_SOURCE_TYPE_RG;
+	default:		   break;
+	}
+
+	if (dcbe->location != DCB_LOC_ON_CHIP)
+		return NV50_CRC_SOURCE_TYPE_PIOR;
+
+	switch (dcbe->type) {
+	case DCB_OUTPUT_DP:	return NV50_CRC_SOURCE_TYPE_SF;
+	case DCB_OUTPUT_ANALOG:	return NV50_CRC_SOURCE_TYPE_DAC;
+	default:		return NV50_CRC_SOURCE_TYPE_SOR;
+	}
+}
+
+void nv50_crc_atomic_set(struct nv50_head *head,
+			 struct nv50_head_atom *asyh)
+{
+	struct drm_crtc *crtc = &head->base.base;
+	struct drm_device *dev = crtc->dev;
+	struct nv50_crc *crc = &head->crc;
+	const struct nv50_crc_func *func = nv50_disp(dev)->core->func->crc;
+	struct nouveau_encoder *outp =
+		nv50_real_outp(nv50_head_atom_get_encoder(asyh));
+
+	func->set_src(head, outp->or,
+		      nv50_crc_source_type(outp, asyh->crc.src),
+		      &crc->ctx[crc->ctx_idx], asyh->crc.wndw);
+}
+
+void nv50_crc_atomic_clr(struct nv50_head *head)
+{
+	const struct nv50_crc_func *func =
+		nv50_disp(head->base.base.dev)->core->func->crc;
+
+	func->set_src(head, 0, NV50_CRC_SOURCE_TYPE_NONE, NULL, 0);
+}
+
+#define NV50_CRC_RASTER_ACTIVE   0
+#define NV50_CRC_RASTER_COMPLETE 1
+#define NV50_CRC_RASTER_INACTIVE 2
+
+static inline int
+nv50_crc_raster_type(enum nv50_crc_source source)
+{
+	switch (source) {
+	case NV50_CRC_SOURCE_NONE:
+	case NV50_CRC_SOURCE_AUTO:
+	case NV50_CRC_SOURCE_RG:
+	case NV50_CRC_SOURCE_OUTP_ACTIVE:
+		return NV50_CRC_RASTER_ACTIVE;
+	case NV50_CRC_SOURCE_OUTP_COMPLETE:
+		return NV50_CRC_RASTER_COMPLETE;
+	case NV50_CRC_SOURCE_OUTP_INACTIVE:
+		return NV50_CRC_RASTER_INACTIVE;
+	}
+
+	return 0;
+}
+
+/* We handle mapping the memory for CRC notifiers ourselves, since each
+ * notifier needs it's own handle
+ */
+static inline int
+nv50_crc_ctx_init(struct nv50_head *head, struct nvif_mmu *mmu,
+		  struct nv50_crc_notifier_ctx *ctx, size_t len, int idx)
+{
+	struct nv50_core *core = nv50_disp(head->base.base.dev)->core;
+	int ret;
+
+	ret = nvif_mem_init_map(mmu, NVIF_MEM_VRAM, len, &ctx->mem);
+	if (ret)
+		return ret;
+
+	ret = nvif_object_init(&core->chan.base.user,
+			       NV50_DISP_HANDLE_CRC_CTX(head, idx),
+			       NV_DMA_IN_MEMORY,
+			       &(struct nv_dma_v0) {
+					.target = NV_DMA_V0_TARGET_VRAM,
+					.access = NV_DMA_V0_ACCESS_RDWR,
+					.start = ctx->mem.addr,
+					.limit =  ctx->mem.addr
+						+ ctx->mem.size - 1,
+			       }, sizeof(struct nv_dma_v0),
+			       &ctx->ntfy);
+	if (ret)
+		goto fail_fini;
+
+	return 0;
+
+fail_fini:
+	nvif_mem_fini(&ctx->mem);
+	return ret;
+}
+
+static inline void
+nv50_crc_ctx_fini(struct nv50_crc_notifier_ctx *ctx)
+{
+	nvif_object_fini(&ctx->ntfy);
+	nvif_mem_fini(&ctx->mem);
+}
+
+int nv50_crc_set_source(struct drm_crtc *crtc, const char *source_str)
+{
+	struct drm_device *dev = crtc->dev;
+	struct drm_atomic_state *state;
+	struct drm_modeset_acquire_ctx ctx;
+	struct nv50_head *head = nv50_head(crtc);
+	struct nv50_crc *crc = &head->crc;
+	const struct nv50_crc_func *func = nv50_disp(dev)->core->func->crc;
+	struct nvif_mmu *mmu = &nouveau_drm(dev)->client.mmu;
+	struct nv50_head_atom *asyh;
+	struct drm_crtc_state *crtc_state;
+	enum nv50_crc_source source;
+	int ret = 0, ctx_flags = 0, i;
+
+	ret = nv50_crc_parse_source(source_str, &source);
+	if (ret)
+		return ret;
+
+	/*
+	 * Since we don't want the user to accidentally interrupt us as we're
+	 * disabling CRCs
+	 */
+	if (source)
+		ctx_flags |= DRM_MODESET_ACQUIRE_INTERRUPTIBLE;
+	drm_modeset_acquire_init(&ctx, ctx_flags);
+
+	state = drm_atomic_state_alloc(dev);
+	if (!state) {
+		ret = -ENOMEM;
+		goto out_acquire_fini;
+	}
+	state->acquire_ctx = &ctx;
+
+	if (source) {
+		for (i = 0; i < ARRAY_SIZE(head->crc.ctx); i++) {
+			ret = nv50_crc_ctx_init(head, mmu, &crc->ctx[i],
+						func->notifier_len, i);
+			if (ret)
+				goto out_ctx_fini;
+		}
+	}
+
+retry:
+	crtc_state = drm_atomic_get_crtc_state(state, &head->base.base);
+	if (IS_ERR(crtc_state)) {
+		ret = PTR_ERR(crtc_state);
+		if (ret == -EDEADLK)
+			goto deadlock;
+		else if (ret)
+			goto out_drop_locks;
+	}
+	asyh = nv50_head_atom(crtc_state);
+	asyh->crc.src = source;
+	asyh->or.crc_raster = nv50_crc_raster_type(source);
+
+	ret = drm_atomic_commit(state);
+	if (ret == -EDEADLK)
+		goto deadlock;
+	else if (ret)
+		goto out_drop_locks;
+
+	if (!source) {
+		/*
+		 * If the user specified a custom flip threshold through
+		 * debugfs, reset it
+		 */
+		crc->flip_threshold = func->flip_threshold;
+	}
+
+out_drop_locks:
+	drm_modeset_drop_locks(&ctx);
+out_ctx_fini:
+	if (!source || ret) {
+		for (i = 0; i < ARRAY_SIZE(crc->ctx); i++)
+			nv50_crc_ctx_fini(&crc->ctx[i]);
+	}
+	drm_atomic_state_put(state);
+out_acquire_fini:
+	drm_modeset_acquire_fini(&ctx);
+	return ret;
+
+deadlock:
+	drm_atomic_state_clear(state);
+	drm_modeset_backoff(&ctx);
+	goto retry;
+}
+
+static int
+nv50_crc_debugfs_flip_threshold_get(struct seq_file *m, void *data)
+{
+	struct nv50_head *head = m->private;
+	struct drm_crtc *crtc = &head->base.base;
+	struct nv50_crc *crc = &head->crc;
+	int ret;
+
+	ret = drm_modeset_lock_single_interruptible(&crtc->mutex);
+	if (ret)
+		return ret;
+
+	seq_printf(m, "%d\n", crc->flip_threshold);
+
+	drm_modeset_unlock(&crtc->mutex);
+	return ret;
+}
+
+static int
+nv50_crc_debugfs_flip_threshold_open(struct inode *inode, struct file *file)
+{
+	return single_open(file, nv50_crc_debugfs_flip_threshold_get,
+			   inode->i_private);
+}
+
+static ssize_t
+nv50_crc_debugfs_flip_threshold_set(struct file *file,
+				    const char __user *ubuf, size_t len,
+				    loff_t *offp)
+{
+	struct seq_file *m = file->private_data;
+	struct nv50_head *head = m->private;
+	struct nv50_head_atom *armh;
+	struct drm_crtc *crtc = &head->base.base;
+	struct nouveau_drm *drm = nouveau_drm(crtc->dev);
+	struct nv50_crc *crc = &head->crc;
+	const struct nv50_crc_func *func =
+		nv50_disp(crtc->dev)->core->func->crc;
+	int value, ret;
+
+	ret = kstrtoint_from_user(ubuf, len, 10, &value);
+	if (ret)
+		return ret;
+
+	if (value > func->flip_threshold)
+		return -EINVAL;
+	else if (value == -1)
+		value = func->flip_threshold;
+	else if (value < -1)
+		return -EINVAL;
+
+	ret = drm_modeset_lock_single_interruptible(&crtc->mutex);
+	if (ret)
+		return ret;
+
+	armh = nv50_head_atom(crtc->state);
+	if (armh->crc.src) {
+		ret = -EBUSY;
+		goto out;
+	}
+
+	NV_DEBUG(drm,
+		 "Changing CRC flip threshold for next capture on head-%d to %d\n",
+		 head->base.index, value);
+	crc->flip_threshold = value;
+	ret = len;
+
+out:
+	drm_modeset_unlock(&crtc->mutex);
+	return ret;
+}
+
+static const struct file_operations nv50_crc_flip_threshold_fops = {
+	.owner = THIS_MODULE,
+	.open = nv50_crc_debugfs_flip_threshold_open,
+	.read = seq_read,
+	.write = nv50_crc_debugfs_flip_threshold_set,
+};
+
+int nv50_head_crc_late_register(struct nv50_head *head)
+{
+	struct drm_crtc *crtc = &head->base.base;
+	const struct nv50_crc_func *func =
+		nv50_disp(crtc->dev)->core->func->crc;
+	struct dentry *root, *dent;
+
+	if (!func || !crtc->debugfs_entry)
+		return 0;
+
+	root = debugfs_create_dir("nv_crc", crtc->debugfs_entry);
+	if (IS_ERR(root))
+		return PTR_ERR(root);
+
+	dent = debugfs_create_file("flip_threshold", 0644, root, head,
+				   &nv50_crc_flip_threshold_fops);
+	if (IS_ERR(dent))
+		return PTR_ERR(dent);
+
+	return 0;
+}
+
+static inline void
+nv50_crc_init_head(struct nv50_disp *disp, const struct nv50_crc_func *func,
+		   struct nv50_head *head)
+{
+	struct nv50_crc *crc = &head->crc;
+
+	crc->flip_threshold = func->flip_threshold;
+	spin_lock_init(&crc->lock);
+	drm_vblank_work_init(&crc->flip_work, &head->base.base,
+			     nv50_crc_ctx_flip_work);
+}
+
+void nv50_crc_init(struct drm_device *dev)
+{
+	struct nv50_disp *disp = nv50_disp(dev);
+	struct drm_crtc *crtc;
+	const struct nv50_crc_func *func = disp->core->func->crc;
+
+	if (!func)
+		return;
+
+	drm_for_each_crtc(crtc, dev)
+		nv50_crc_init_head(disp, func, nv50_head(crtc));
+}
diff --git a/drivers/gpu/drm/nouveau/dispnv50/crc.h b/drivers/gpu/drm/nouveau/dispnv50/crc.h
new file mode 100644
index 000000000000..e79e55a94578
--- /dev/null
+++ b/drivers/gpu/drm/nouveau/dispnv50/crc.h
@@ -0,0 +1,125 @@
+/* SPDX-License-Identifier: MIT */
+#ifndef __NV50_CRC_H__
+#define __NV50_CRC_H__
+
+#include <linux/mutex.h>
+#include <drm/drm_crtc.h>
+#include <drm/drm_vblank.h>
+
+#include <nvif/mem.h>
+#include <nvkm/subdev/bios.h>
+#include "nouveau_encoder.h"
+
+struct nv50_disp;
+struct nv50_head;
+
+#if IS_ENABLED(CONFIG_DEBUG_FS)
+enum nv50_crc_source {
+	NV50_CRC_SOURCE_NONE = 0,
+	NV50_CRC_SOURCE_AUTO,
+	NV50_CRC_SOURCE_RG,
+	NV50_CRC_SOURCE_OUTP_ACTIVE,
+	NV50_CRC_SOURCE_OUTP_COMPLETE,
+	NV50_CRC_SOURCE_OUTP_INACTIVE,
+};
+
+/* RG -> SF (DP only)
+ *    -> SOR
+ *    -> PIOR
+ *    -> DAC
+ */
+enum nv50_crc_source_type {
+	NV50_CRC_SOURCE_TYPE_NONE = 0,
+	NV50_CRC_SOURCE_TYPE_SOR,
+	NV50_CRC_SOURCE_TYPE_PIOR,
+	NV50_CRC_SOURCE_TYPE_DAC,
+	NV50_CRC_SOURCE_TYPE_RG,
+	NV50_CRC_SOURCE_TYPE_SF,
+};
+
+struct nv50_crc_notifier_ctx {
+	struct nvif_mem mem;
+	struct nvif_object ntfy;
+};
+
+struct nv50_crc_atom {
+	enum nv50_crc_source src;
+	/* Only used for gv100+ */
+	u8 wndw : 4;
+};
+
+struct nv50_crc_func {
+	void (*set_src)(struct nv50_head *, int or, enum nv50_crc_source_type,
+			struct nv50_crc_notifier_ctx *, u32 wndw);
+	void (*set_ctx)(struct nv50_head *, struct nv50_crc_notifier_ctx *);
+	u32 (*get_entry)(struct nv50_head *, struct nv50_crc_notifier_ctx *,
+			 enum nv50_crc_source, int idx);
+	bool (*ctx_finished)(struct nv50_head *,
+			     struct nv50_crc_notifier_ctx *);
+	short flip_threshold;
+	short num_entries;
+	size_t notifier_len;
+};
+
+struct nv50_crc {
+	spinlock_t lock;
+	struct nv50_crc_notifier_ctx ctx[2];
+	struct drm_vblank_work flip_work;
+	enum nv50_crc_source src;
+
+	u64 frame;
+	short entry_idx;
+	short flip_threshold;
+	u8 ctx_idx : 1;
+	bool ctx_changed : 1;
+};
+
+void nv50_crc_init(struct drm_device *dev);
+int nv50_head_crc_late_register(struct nv50_head *);
+void nv50_crc_handle_vblank(struct nv50_head *head);
+
+int nv50_crc_verify_source(struct drm_crtc *, const char *, size_t *);
+const char *const *nv50_crc_get_sources(struct drm_crtc *, size_t *);
+int nv50_crc_set_source(struct drm_crtc *, const char *);
+
+int nv50_crc_atomic_check(struct nv50_head *, struct nv50_head_atom *,
+			  struct nv50_head_atom *);
+void nv50_crc_atomic_stop_reporting(struct drm_atomic_state *);
+void nv50_crc_atomic_prepare_notifier_contexts(struct drm_atomic_state *);
+void nv50_crc_atomic_start_reporting(struct drm_atomic_state *);
+void nv50_crc_atomic_set(struct nv50_head *, struct nv50_head_atom *);
+void nv50_crc_atomic_clr(struct nv50_head *);
+
+extern const struct nv50_crc_func crc907d;
+extern const struct nv50_crc_func crcc37d;
+
+#else /* IS_ENABLED(CONFIG_DEBUG_FS) */
+struct nv50_crc {};
+struct nv50_crc_func {};
+struct nv50_crc_atom {};
+
+#define nv50_crc_verify_source NULL
+#define nv50_crc_get_sources NULL
+#define nv50_crc_set_source NULL
+
+static inline void nv50_crc_init(struct drm_device *dev) {}
+static inline int nv50_head_crc_late_register(struct nv50_head *) {}
+static inline void
+nv50_crc_handle_vblank(struct nv50_head *head) { return 0; }
+
+static inline int
+nv50_crc_atomic_check(struct nv50_head *, struct nv50_head_atom *,
+		      struct nv50_head_atom *) {}
+static inline void
+nv50_crc_atomic_stop_reporting(struct drm_atomic_state *) {}
+static inline void
+nv50_crc_atomic_prepare_notifier_contexts(struct drm_atomic_state *) {}
+static inline void
+nv50_crc_atomic_start_reporting(struct drm_atomic_state *) {}
+static inline void
+nv50_crc_atomic_set(struct nv50_head *, struct nv50_head_atom *) {}
+static inline void
+nv50_crc_atomic_clr(struct nv50_head *) {}
+
+#endif /* IS_ENABLED(CONFIG_DEBUG_FS) */
+#endif /* !__NV50_CRC_H__ */
diff --git a/drivers/gpu/drm/nouveau/dispnv50/crc907d.c b/drivers/gpu/drm/nouveau/dispnv50/crc907d.c
new file mode 100644
index 000000000000..d2e5f4b8c49f
--- /dev/null
+++ b/drivers/gpu/drm/nouveau/dispnv50/crc907d.c
@@ -0,0 +1,139 @@
+/* SPDX-License-Identifier: MIT */
+#include <drm/drm_crtc.h>
+
+#include "crc.h"
+#include "core.h"
+#include "disp.h"
+#include "head.h"
+
+#define CRC907D_MAX_ENTRIES 255
+
+struct crc907d_notifier {
+	uint32_t status;
+	uint32_t :32; /* reserved */
+	struct crc907d_entry {
+		uint32_t status;
+		uint32_t compositor_crc;
+		u32 output_crc[2];
+	} entries[CRC907D_MAX_ENTRIES];
+} __packed;
+
+static void
+crc907d_set_src(struct nv50_head *head, int or,
+		enum nv50_crc_source_type source,
+		struct nv50_crc_notifier_ctx *ctx, u32 wndw)
+{
+	struct drm_crtc *crtc = &head->base.base;
+	struct nv50_dmac *core = &nv50_disp(head->base.base.dev)->core->chan;
+	const u32 hoff = head->base.index * 0x300;
+	u32 *push;
+	u32 crc_args = 0xfff00000;
+
+	switch (source) {
+	case NV50_CRC_SOURCE_TYPE_SOR:
+		crc_args |= (0x00000f0f + or * 16) << 8;
+		break;
+	case NV50_CRC_SOURCE_TYPE_PIOR:
+		crc_args |= (0x000000ff + or * 256) << 8;
+		break;
+	case NV50_CRC_SOURCE_TYPE_DAC:
+		crc_args |= (0x00000ff0 + or) << 8;
+		break;
+	case NV50_CRC_SOURCE_TYPE_RG:
+		crc_args |= (0x00000ff8 + drm_crtc_index(crtc)) << 8;
+		break;
+	case NV50_CRC_SOURCE_TYPE_SF:
+		crc_args |= (0x00000f8f + drm_crtc_index(crtc) * 16) << 8;
+		break;
+	case NV50_CRC_SOURCE_NONE:
+		crc_args |= 0x000fff00;
+		break;
+	}
+
+	push = evo_wait(core, 4);
+	if (!push)
+		return;
+
+	if (source) {
+		evo_mthd(push, 0x0438 + hoff, 1);
+		evo_data(push, ctx->ntfy.handle);
+		evo_mthd(push, 0x0430 + hoff, 1);
+		evo_data(push, crc_args);
+	} else {
+		evo_mthd(push, 0x0430 + hoff, 1);
+		evo_data(push, crc_args);
+		evo_mthd(push, 0x0438 + hoff, 1);
+		evo_data(push, 0);
+	}
+	evo_kick(push, core);
+}
+
+static void crc907d_set_ctx(struct nv50_head *head,
+			    struct nv50_crc_notifier_ctx *ctx)
+{
+	struct nv50_dmac *core = &nv50_disp(head->base.base.dev)->core->chan;
+	u32 *push = evo_wait(core, 2);
+
+	if (!push)
+		return;
+
+	evo_mthd(push, 0x0438 + (head->base.index * 0x300), 1);
+	evo_data(push, ctx ? ctx->ntfy.handle : 0);
+	evo_kick(push, core);
+}
+
+static u32 crc907d_get_entry(struct nv50_head *head,
+			     struct nv50_crc_notifier_ctx *ctx,
+			     enum nv50_crc_source source, int idx)
+{
+	struct crc907d_notifier __iomem *notifier = ctx->mem.object.map.ptr;
+
+	return ioread32_native(&notifier->entries[idx].output_crc[0]);
+}
+
+static bool crc907d_ctx_finished(struct nv50_head *head,
+				 struct nv50_crc_notifier_ctx *ctx)
+{
+	struct nouveau_drm *drm = nouveau_drm(head->base.base.dev);
+	struct crc907d_notifier __iomem *notifier = ctx->mem.object.map.ptr;
+	const u32 status = ioread32_native(&notifier->status);
+	const u32 overflow = status & 0x0000003e;
+
+	if (!(status & 0x00000001))
+		return false;
+
+	if (overflow) {
+		const char *engine = NULL;
+
+		switch (overflow) {
+			case 0x00000004: engine = "DSI"; break;
+			case 0x00000008: engine = "Compositor"; break;
+			case 0x00000010: engine = "CRC output 1"; break;
+			case 0x00000020: engine = "CRC output 2"; break;
+		}
+
+		if (engine)
+			NV_ERROR(drm,
+				 "CRC notifier context for head %d overflowed on %s: %x\n",
+				 head->base.index, engine, status);
+		else
+			NV_ERROR(drm,
+				 "CRC notifier context for head %d overflowed: %x\n",
+				 head->base.index, status);
+	}
+
+	NV_DEBUG(drm, "Head %d CRC context status: %x\n",
+		 head->base.index, status);
+
+	return true;
+}
+
+const struct nv50_crc_func crc907d = {
+	.set_src = crc907d_set_src,
+	.set_ctx = crc907d_set_ctx,
+	.get_entry = crc907d_get_entry,
+	.ctx_finished = crc907d_ctx_finished,
+	.flip_threshold = CRC907D_MAX_ENTRIES - 10,
+	.num_entries = CRC907D_MAX_ENTRIES,
+	.notifier_len = sizeof(struct crc907d_notifier),
+};
diff --git a/drivers/gpu/drm/nouveau/dispnv50/crcc37d.c b/drivers/gpu/drm/nouveau/dispnv50/crcc37d.c
new file mode 100644
index 000000000000..19b50948f486
--- /dev/null
+++ b/drivers/gpu/drm/nouveau/dispnv50/crcc37d.c
@@ -0,0 +1,153 @@
+/* SPDX-License-Identifier: MIT */
+#include <drm/drm_crtc.h>
+
+#include "crc.h"
+#include "core.h"
+#include "disp.h"
+#include "head.h"
+
+#define CRCC37D_MAX_ENTRIES 2047
+
+struct crcc37d_notifier {
+	u32 status;
+
+	/* reserved */
+	u32 :32;
+	u32 :32;
+	u32 :32;
+	u32 :32;
+	u32 :32;
+	u32 :32;
+	u32 :32;
+
+	struct crcc37d_entry {
+		u32 status[2];
+		u32 :32; /* reserved */
+		u32 compositor_crc;
+		u32 rg_crc;
+		u32 output_crc[2];
+		u32 :32; /* reserved */
+	} entries[CRCC37D_MAX_ENTRIES];
+} __packed;
+
+static void
+crcc37d_set_src(struct nv50_head *head, int or,
+		enum nv50_crc_source_type source,
+		struct nv50_crc_notifier_ctx *ctx, u32 wndw)
+{
+	struct nv50_dmac *core = &nv50_disp(head->base.base.dev)->core->chan;
+	const u32 hoff = head->base.index * 0x400;
+	u32 *push;
+	u32 crc_args;
+
+	switch (source) {
+	case NV50_CRC_SOURCE_TYPE_SOR:
+		crc_args = (0x00000050 + or) << 12;
+		break;
+	case NV50_CRC_SOURCE_TYPE_PIOR:
+		crc_args = (0x00000060 + or) << 12;
+		break;
+	case NV50_CRC_SOURCE_TYPE_SF:
+		crc_args = 0x00000030 << 12;
+		break;
+	default:
+		crc_args = 0;
+		break;
+	}
+
+	push = evo_wait(core, 4);
+	if (!push)
+		return;
+
+	if (source) {
+		evo_mthd(push, 0x2180 + hoff, 1);
+		evo_data(push, ctx->ntfy.handle);
+		evo_mthd(push, 0x2184 + hoff, 1);
+		evo_data(push, crc_args | wndw);
+	} else {
+		evo_mthd(push, 0x2184 + hoff, 1);
+		evo_data(push, 0);
+		evo_mthd(push, 0x2180 + hoff, 1);
+		evo_data(push, 0);
+	}
+
+	evo_kick(push, core);
+}
+
+static void crcc37d_set_ctx(struct nv50_head *head,
+			    struct nv50_crc_notifier_ctx *ctx)
+{
+	struct nv50_dmac *core = &nv50_disp(head->base.base.dev)->core->chan;
+	u32 *push = evo_wait(core, 2);
+
+	if (!push)
+		return;
+
+	evo_mthd(push, 0x2180 + (head->base.index * 0x400), 1);
+	evo_data(push, ctx ? ctx->ntfy.handle : 0);
+	evo_kick(push, core);
+}
+
+static u32 crcc37d_get_entry(struct nv50_head *head,
+			     struct nv50_crc_notifier_ctx *ctx,
+			     enum nv50_crc_source source, int idx)
+{
+	struct crcc37d_notifier __iomem *notifier = ctx->mem.object.map.ptr;
+	struct crcc37d_entry __iomem *entry = &notifier->entries[idx];
+	u32 __iomem *crc_addr;
+
+	if (source == NV50_CRC_SOURCE_RG)
+		crc_addr = &entry->rg_crc;
+	else
+		crc_addr = &entry->output_crc[0];
+
+	return ioread32_native(crc_addr);
+}
+
+static bool crcc37d_ctx_finished(struct nv50_head *head,
+				 struct nv50_crc_notifier_ctx *ctx)
+{
+	struct nouveau_drm *drm = nouveau_drm(head->base.base.dev);
+	struct crcc37d_notifier __iomem *notifier = ctx->mem.object.map.ptr;
+	const u32 status = ioread32_native(&notifier->status);
+	const u32 overflow = status & 0x0000007e;
+
+	if (!(status & 0x00000001))
+		return false;
+
+	if (overflow) {
+		const char *engine = NULL;
+
+		switch (overflow) {
+		case 0x00000004: engine = "Front End"; break;
+		case 0x00000008: engine = "Compositor"; break;
+		case 0x00000010: engine = "RG"; break;
+		case 0x00000020: engine = "CRC output 1"; break;
+		case 0x00000040: engine = "CRC output 2"; break;
+		}
+
+		if (engine)
+			NV_ERROR(drm,
+				 "CRC notifier context for head %d overflowed on %s: %x\n",
+				 head->base.index, engine, status);
+		else
+			NV_ERROR(drm,
+				 "CRC notifier context for head %d overflowed: %x\n",
+				 head->base.index, status);
+	}
+
+	NV_DEBUG(drm, "Head %d CRC context status: %x\n",
+		 head->base.index, status);
+
+	return true;
+}
+
+const struct nv50_crc_func crcc37d = {
+	.set_src = crcc37d_set_src,
+	.set_ctx = crcc37d_set_ctx,
+	.get_entry = crcc37d_get_entry,
+	.ctx_finished = crcc37d_ctx_finished,
+	.flip_threshold = CRCC37D_MAX_ENTRIES - 30,
+	.num_entries = CRCC37D_MAX_ENTRIES,
+	.notifier_len = sizeof(struct crcc37d_notifier),
+};
diff --git a/drivers/gpu/drm/nouveau/dispnv50/disp.c b/drivers/gpu/drm/nouveau/dispnv50/disp.c
index bfea85782d0e..47015cea8063 100644
--- a/drivers/gpu/drm/nouveau/dispnv50/disp.c
+++ b/drivers/gpu/drm/nouveau/dispnv50/disp.c
@@ -775,6 +775,19 @@ struct nv50_msto {
 	bool disabled;
 };
 
+struct nouveau_encoder *nv50_real_outp(struct drm_encoder *encoder)
+{
+	struct nv50_msto *msto;
+
+	if (encoder->encoder_type != DRM_MODE_ENCODER_DPMST)
+		return nouveau_encoder(encoder);
+
+	msto = nv50_msto(encoder);
+	if (!msto->mstc)
+		return NULL;
+	return msto->mstc->mstm->outp;
+}
+
 static struct drm_dp_payload *
 nv50_msto_payload(struct nv50_msto *msto)
 {
@@ -1898,6 +1911,7 @@ nv50_disp_atomic_commit_tail(struct drm_atomic_state *state)
 	int i;
 
 	NV_ATOMIC(drm, "commit %d %d\n", atom->lock_core, atom->flush_disable);
+	nv50_crc_atomic_stop_reporting(state);
 	drm_atomic_helper_wait_for_fences(dev, state, false);
 	drm_atomic_helper_wait_for_dependencies(state);
 	drm_atomic_helper_update_legacy_modeset_state(dev, state);
@@ -1907,6 +1921,7 @@ nv50_disp_atomic_commit_tail(struct drm_atomic_state *state)
 
 	/* Disable head(s). */
 	for_each_oldnew_crtc_in_state(state, crtc, old_crtc_state, new_crtc_state, i) {
+		struct nv50_head_atom *armh = nv50_head_atom(old_crtc_state);
 		struct nv50_head_atom *asyh = nv50_head_atom(new_crtc_state);
 		struct nv50_head *head = nv50_head(crtc);
 
@@ -1919,7 +1934,8 @@ nv50_disp_atomic_commit_tail(struct drm_atomic_state *state)
 		}
 
 		if (asyh->clr.mask) {
-			nv50_head_flush_clr(head, asyh, atom->flush_disable);
+			nv50_head_flush_clr(head, armh, asyh,
+					    atom->flush_disable);
 			interlock[NV50_DISP_INTERLOCK_CORE] |= 1;
 		}
 	}
@@ -1968,6 +1984,8 @@ nv50_disp_atomic_commit_tail(struct drm_atomic_state *state)
 		}
 	}
 
+	nv50_crc_atomic_prepare_notifier_contexts(state);
+
 	/* Update output path(s). */
 	list_for_each_entry_safe(outp, outt, &atom->outp, head) {
 		const struct drm_encoder_helper_funcs *help;
@@ -1990,6 +2008,7 @@ nv50_disp_atomic_commit_tail(struct drm_atomic_state *state)
 
 	/* Update head(s). */
 	for_each_oldnew_crtc_in_state(state, crtc, old_crtc_state, new_crtc_state, i) {
+		struct nv50_head_atom *armh = nv50_head_atom(old_crtc_state);
 		struct nv50_head_atom *asyh = nv50_head_atom(new_crtc_state);
 		struct nv50_head *head = nv50_head(crtc);
 
@@ -1997,7 +2016,7 @@ nv50_disp_atomic_commit_tail(struct drm_atomic_state *state)
 			  asyh->set.mask, asyh->clr.mask);
 
 		if (asyh->set.mask) {
-			nv50_head_flush_set(head, asyh);
+			nv50_head_flush_set(head, armh, asyh);
 			interlock[NV50_DISP_INTERLOCK_CORE] = 1;
 		}
 
@@ -2081,6 +2100,7 @@ nv50_disp_atomic_commit_tail(struct drm_atomic_state *state)
 		}
 	}
 
+	nv50_crc_atomic_start_reporting(state);
 	drm_atomic_helper_commit_hw_done(state);
 	drm_atomic_helper_cleanup_planes(dev, state);
 	drm_atomic_helper_commit_cleanup_done(state);
diff --git a/drivers/gpu/drm/nouveau/dispnv50/disp.h b/drivers/gpu/drm/nouveau/dispnv50/disp.h
index 8935ebce8ab0..da72223277fe 100644
--- a/drivers/gpu/drm/nouveau/dispnv50/disp.h
+++ b/drivers/gpu/drm/nouveau/dispnv50/disp.h
@@ -1,10 +1,12 @@
 #ifndef __NV50_KMS_H__
 #define __NV50_KMS_H__
+#include <linux/workqueue.h>
 #include <nvif/mem.h>
 
 #include "nouveau_display.h"
 
 struct nv50_msto;
+struct nouveau_encoder;
 
 struct nv50_disp {
 	struct nvif_disp *disp;
@@ -89,6 +91,14 @@ int nv50_dmac_create(struct nvif_device *device, struct nvif_object *disp,
 		     u64 syncbuf, struct nv50_dmac *dmac);
 void nv50_dmac_destroy(struct nv50_dmac *);
 
+/*
+ * For normal encoders this just returns the encoder. For active MST encoders,
+ * this returns the real outp that's driving displays on the topology.
+ * Inactive MST encoders return NULL, since they would have no real outp to
+ * return anyway.
+ */
+struct nouveau_encoder *nv50_real_outp(struct drm_encoder *encoder);
+
 u32 *evo_wait(struct nv50_dmac *, int nr);
 void evo_kick(u32 *, struct nv50_dmac *);
 
diff --git a/drivers/gpu/drm/nouveau/dispnv50/handles.h b/drivers/gpu/drm/nouveau/dispnv50/handles.h
index e3a62c7a0d08..a97a7bd29243 100644
--- a/drivers/gpu/drm/nouveau/dispnv50/handles.h
+++ b/drivers/gpu/drm/nouveau/dispnv50/handles.h
@@ -11,5 +11,6 @@
 #define NV50_DISP_HANDLE_VRAM                                           0xf0000001
 
 #define NV50_DISP_HANDLE_WNDW_CTX(kind)                        (0xfb000000 | kind)
+#define NV50_DISP_HANDLE_CRC_CTX(head, i) (0xfc000000 | head->base.index << 1 | i)
 
 #endif /* !__NV50_KMS_HANDLES_H__ */
diff --git a/drivers/gpu/drm/nouveau/dispnv50/head.c b/drivers/gpu/drm/nouveau/dispnv50/head.c
index 72bc3bce396a..4668bcfb596c 100644
--- a/drivers/gpu/drm/nouveau/dispnv50/head.c
+++ b/drivers/gpu/drm/nouveau/dispnv50/head.c
@@ -24,27 +24,36 @@
 #include "core.h"
 #include "curs.h"
 #include "ovly.h"
+#include "crc.h"
 
 #include <nvif/class.h>
+#include <nvif/event.h>
+#include <nvif/cl0046.h>
 
 #include <drm/drm_atomic_helper.h>
 #include <drm/drm_crtc_helper.h>
 #include <drm/drm_vblank.h>
 #include "nouveau_connector.h"
+
 void
 nv50_head_flush_clr(struct nv50_head *head,
-		    struct nv50_head_atom *asyh, bool flush)
+		    struct nv50_head_atom *armh,
+		    struct nv50_head_atom *asyh,
+		    bool flush)
 {
 	union nv50_head_atom_mask clr = {
 		.mask = asyh->clr.mask & ~(flush ? 0 : asyh->set.mask),
 	};
+	if (clr.crc)  nv50_crc_atomic_clr(head);
 	if (clr.olut) head->func->olut_clr(head);
 	if (clr.core) head->func->core_clr(head);
 	if (clr.curs) head->func->curs_clr(head);
 }
 
 void
-nv50_head_flush_set(struct nv50_head *head, struct nv50_head_atom *asyh)
+nv50_head_flush_set(struct nv50_head *head,
+		    struct nv50_head_atom *armh,
+		    struct nv50_head_atom *asyh)
 {
 	if (asyh->set.view   ) head->func->view    (head, asyh);
 	if (asyh->set.mode   ) head->func->mode    (head, asyh);
@@ -61,6 +70,7 @@ nv50_head_flush_set(struct nv50_head *head, struct nv50_head_atom *asyh)
 	if (asyh->set.ovly   ) head->func->ovly    (head, asyh);
 	if (asyh->set.dither ) head->func->dither  (head, asyh);
 	if (asyh->set.procamp) head->func->procamp (head, asyh);
+	if (asyh->set.crc    ) nv50_crc_atomic_set (head, asyh);
 	if (asyh->set.or     ) head->func->or      (head, asyh);
 }
 
@@ -313,7 +323,7 @@ nv50_head_atomic_check(struct drm_crtc *crtc, struct drm_crtc_state *state)
 	struct nouveau_conn_atom *asyc = NULL;
 	struct drm_connector_state *conns;
 	struct drm_connector *conn;
-	int i;
+	int i, ret;
 
 	NV_ATOMIC(drm, "%s atomic_check %d\n", crtc->name, asyh->state.active);
 	if (asyh->state.active) {
@@ -408,6 +418,10 @@ nv50_head_atomic_check(struct drm_crtc *crtc, struct drm_crtc_state *state)
 		asyh->set.curs = asyh->curs.visible;
 	}
 
+	ret = nv50_crc_atomic_check(head, asyh, armh);
+	if (ret)
+		return ret;
+
 	if (asyh->clr.mask || asyh->set.mask)
 		nv50_atom(asyh->state.state)->lock_core = true;
 	return 0;
@@ -446,6 +460,7 @@ nv50_head_atomic_duplicate_state(struct drm_crtc *crtc)
 	asyh->ovly = armh->ovly;
 	asyh->dither = armh->dither;
 	asyh->procamp = armh->procamp;
+	asyh->crc = armh->crc;
 	asyh->or = armh->or;
 	asyh->dp = armh->dp;
 	asyh->clr.mask = 0;
@@ -467,10 +482,18 @@ nv50_head_reset(struct drm_crtc *crtc)
 	__drm_atomic_helper_crtc_reset(crtc, &asyh->state);
 }
 
+static int
+nv50_head_late_register(struct drm_crtc *crtc)
+{
+	return nv50_head_crc_late_register(nv50_head(crtc));
+}
+
 static void
 nv50_head_destroy(struct drm_crtc *crtc)
 {
 	struct nv50_head *head = nv50_head(crtc);
+
+	nvif_notify_fini(&head->base.vblank);
 	nv50_lut_fini(&head->olut);
 	drm_crtc_cleanup(crtc);
 	kfree(head);
@@ -488,8 +511,38 @@ nv50_head_func = {
 	.enable_vblank = nouveau_display_vblank_enable,
 	.disable_vblank = nouveau_display_vblank_disable,
 	.get_vblank_timestamp = drm_crtc_vblank_helper_get_vblank_timestamp,
+	.late_register = nv50_head_late_register,
+};
+
+static const struct drm_crtc_funcs
+nvd9_head_func = {
+	.reset = nv50_head_reset,
+	.gamma_set = drm_atomic_helper_legacy_gamma_set,
+	.destroy = nv50_head_destroy,
+	.set_config = drm_atomic_helper_set_config,
+	.page_flip = drm_atomic_helper_page_flip,
+	.atomic_duplicate_state = nv50_head_atomic_duplicate_state,
+	.atomic_destroy_state = nv50_head_atomic_destroy_state,
+	.enable_vblank = nouveau_display_vblank_enable,
+	.disable_vblank = nouveau_display_vblank_disable,
+	.get_vblank_timestamp = drm_crtc_vblank_helper_get_vblank_timestamp,
+	.verify_crc_source = nv50_crc_verify_source,
+	.get_crc_sources = nv50_crc_get_sources,
+	.set_crc_source = nv50_crc_set_source,
+	.late_register = nv50_head_late_register,
 };
 
+static int nv50_head_vblank_handler(struct nvif_notify *notify)
+{
+	struct nouveau_crtc *nv_crtc =
+		container_of(notify, struct nouveau_crtc, vblank);
+
+	if (drm_crtc_handle_vblank(&nv_crtc->base))
+		nv50_crc_handle_vblank(nv50_head(&nv_crtc->base));
+
+	return NVIF_NOTIFY_KEEP;
+}
+
 struct nv50_head *
 nv50_head_create(struct drm_device *dev, int index)
 {
@@ -497,7 +550,9 @@ nv50_head_create(struct drm_device *dev, int index)
 	struct nv50_disp *disp = nv50_disp(dev);
 	struct nv50_head *head;
 	struct nv50_wndw *base, *ovly, *curs;
+	struct nouveau_crtc *nv_crtc;
 	struct drm_crtc *crtc;
+	const struct drm_crtc_funcs *funcs;
 	int ret;
 
 	head = kzalloc(sizeof(*head), GFP_KERNEL);
@@ -507,6 +562,11 @@ nv50_head_create(struct drm_device *dev, int index)
 	head->func = disp->core->func->head;
 	head->base.index = index;
 
+	if (disp->disp->object.oclass < GF110_DISP)
+		funcs = &nv50_head_func;
+	else
+		funcs = &nvd9_head_func;
+
 	if (disp->disp->object.oclass < GV100_DISP) {
 		ret = nv50_base_new(drm, head->base.index, &base);
 		if (ret)
@@ -531,9 +591,10 @@ nv50_head_create(struct drm_device *dev, int index)
 	if (ret)
 		goto fail_free;
 
-	crtc = &head->base.base;
+	nv_crtc = &head->base;
+	crtc = &nv_crtc->base;
 	drm_crtc_init_with_planes(dev, crtc, &base->plane, &curs->plane,
-				  &nv50_head_func, "head-%d", head->base.index);
+				  funcs, "head-%d", head->base.index);
 	drm_crtc_helper_add(crtc, &nv50_head_help);
 	/* Keep the legacy gamma size at 256 to avoid compatibility issues */
 	drm_mode_crtc_set_gamma_size(crtc, 256);
@@ -547,8 +608,22 @@ nv50_head_create(struct drm_device *dev, int index)
 			goto fail_crtc_cleanup;
 	}
 
+	ret = nvif_notify_init(&disp->disp->object, nv50_head_vblank_handler,
+			       false, NV04_DISP_NTFY_VBLANK,
+			       &(struct nvif_notify_head_req_v0) {
+				    .head = nv_crtc->index,
+			       },
+			       sizeof(struct nvif_notify_head_req_v0),
+			       sizeof(struct nvif_notify_head_rep_v0),
+			       &nv_crtc->vblank);
+	if (ret)
+		goto fail_lut_fini;
+
 	return head;
 
+fail_lut_fini:
+	if (head->func->olut_set)
+		nv50_lut_fini(&head->olut);
 fail_crtc_cleanup:
 	drm_crtc_cleanup(crtc);
 fail_free:
diff --git a/drivers/gpu/drm/nouveau/dispnv50/head.h b/drivers/gpu/drm/nouveau/dispnv50/head.h
index c05bbba9e247..2e57910ba450 100644
--- a/drivers/gpu/drm/nouveau/dispnv50/head.h
+++ b/drivers/gpu/drm/nouveau/dispnv50/head.h
@@ -1,22 +1,29 @@
 #ifndef __NV50_KMS_HEAD_H__
 #define __NV50_KMS_HEAD_H__
 #define nv50_head(c) container_of((c), struct nv50_head, base.base)
+#include <linux/workqueue.h>
+
 #include "disp.h"
 #include "atom.h"
+#include "crc.h"
 #include "lut.h"
 
 #include "nouveau_crtc.h"
+#include "nouveau_encoder.h"
 
 struct nv50_head {
 	const struct nv50_head_func *func;
 	struct nouveau_crtc base;
+	struct nv50_crc crc;
 	struct nv50_lut olut;
 	struct nv50_msto *msto;
 };
 
 struct nv50_head *nv50_head_create(struct drm_device *, int index);
-void nv50_head_flush_set(struct nv50_head *, struct nv50_head_atom *);
-void nv50_head_flush_clr(struct nv50_head *, struct nv50_head_atom *, bool y);
+void nv50_head_flush_set(struct nv50_head *, struct nv50_head_atom *armh,
+			 struct nv50_head_atom *asyh);
+void nv50_head_flush_clr(struct nv50_head *, struct nv50_head_atom *armh,
+			 struct nv50_head_atom *asyh, bool y);
 
 struct nv50_head_func {
 	void (*view)(struct nv50_head *, struct nv50_head_atom *);
diff --git a/drivers/gpu/drm/nouveau/dispnv50/head907d.c b/drivers/gpu/drm/nouveau/dispnv50/head907d.c
index 3002ec23d7a6..63a0b45d96d6 100644
--- a/drivers/gpu/drm/nouveau/dispnv50/head907d.c
+++ b/drivers/gpu/drm/nouveau/dispnv50/head907d.c
@@ -19,8 +19,15 @@
  * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
  * OTHER DEALINGS IN THE SOFTWARE.
  */
+#include <drm/drm_connector.h>
+#include <drm/drm_mode_config.h>
+#include <drm/drm_vblank.h>
+#include "nouveau_drv.h"
+#include "nouveau_bios.h"
+#include "nouveau_connector.h"
 #include "head.h"
 #include "core.h"
+#include "crc.h"
 
 void
 head907d_or(struct nv50_head *head, struct nv50_head_atom *asyh)
@@ -29,9 +36,10 @@ head907d_or(struct nv50_head *head, struct nv50_head_atom *asyh)
 	u32 *push;
 	if ((push = evo_wait(core, 3))) {
 		evo_mthd(push, 0x0404 + (head->base.index * 0x300), 2);
-		evo_data(push, 0x00000001 | asyh->or.depth  << 6 |
-					    asyh->or.nvsync << 4 |
-					    asyh->or.nhsync << 3);
+		evo_data(push, asyh->or.depth  << 6 |
+			       asyh->or.nvsync << 4 |
+			       asyh->or.nhsync << 3 |
+			       asyh->or.crc_raster);
 		evo_data(push, 0x31ec6000 | head->base.index << 25 |
 					    asyh->mode.interlace);
 		evo_kick(push, core);
diff --git a/drivers/gpu/drm/nouveau/dispnv50/headc37d.c b/drivers/gpu/drm/nouveau/dispnv50/headc37d.c
index cf5a68f4021a..87dd11b2fe96 100644
--- a/drivers/gpu/drm/nouveau/dispnv50/headc37d.c
+++ b/drivers/gpu/drm/nouveau/dispnv50/headc37d.c
@@ -46,10 +46,10 @@ headc37d_or(struct nv50_head *head, struct nv50_head_atom *asyh)
 		}
 
 		evo_mthd(push, 0x2004 + (head->base.index * 0x400), 1);
-		evo_data(push, 0x00000001 |
-			       asyh->or.depth << 4 |
+		evo_data(push, depth << 4 |
 			       asyh->or.nvsync << 3 |
-			       asyh->or.nhsync << 2);
+			       asyh->or.nhsync << 2 |
+			       asyh->or.crc_raster);
 		evo_kick(push, core);
 	}
 }
diff --git a/drivers/gpu/drm/nouveau/dispnv50/headc57d.c b/drivers/gpu/drm/nouveau/dispnv50/headc57d.c
index 65e3b60804c6..e6cef3593aec 100644
--- a/drivers/gpu/drm/nouveau/dispnv50/headc57d.c
+++ b/drivers/gpu/drm/nouveau/dispnv50/headc57d.c
@@ -46,10 +46,11 @@ headc57d_or(struct nv50_head *head, struct nv50_head_atom *asyh)
 		}
 
 		evo_mthd(push, 0x2004 + (head->base.index * 0x400), 1);
-		evo_data(push, 0xfc000001 |
-			       asyh->or.depth << 4 |
+		evo_data(push, 0xfc000000 |
+			       depth << 4 |
 			       asyh->or.nvsync << 3 |
-			       asyh->or.nhsync << 2);
+			       asyh->or.nhsync << 2 |
+			       asyh->or.crc_raster);
 		evo_kick(push, core);
 	}
 }
diff --git a/drivers/gpu/drm/nouveau/nouveau_display.c b/drivers/gpu/drm/nouveau/nouveau_display.c
index 700817dc4fa0..7a0bd9a720a0 100644
--- a/drivers/gpu/drm/nouveau/nouveau_display.c
+++ b/drivers/gpu/drm/nouveau/nouveau_display.c
@@ -43,15 +43,7 @@
 #include <nvif/class.h>
 #include <nvif/cl0046.h>
 #include <nvif/event.h>
-
-static int
-nouveau_display_vblank_handler(struct nvif_notify *notify)
-{
-	struct nouveau_crtc *nv_crtc =
-		container_of(notify, typeof(*nv_crtc), vblank);
-	drm_crtc_handle_vblank(&nv_crtc->base);
-	return NVIF_NOTIFY_KEEP;
-}
+#include <dispnv50/crc.h>
 
 int
 nouveau_display_vblank_enable(struct drm_crtc *crtc)
@@ -135,50 +127,6 @@ nouveau_display_scanoutpos(struct drm_crtc *crtc,
 					       stime, etime);
 }
 
-static void
-nouveau_display_vblank_fini(struct drm_device *dev)
-{
-	struct drm_crtc *crtc;
-
-	list_for_each_entry(crtc, &dev->mode_config.crtc_list, head) {
-		struct nouveau_crtc *nv_crtc = nouveau_crtc(crtc);
-		nvif_notify_fini(&nv_crtc->vblank);
-	}
-}
-
-static int
-nouveau_display_vblank_init(struct drm_device *dev)
-{
-	struct nouveau_display *disp = nouveau_display(dev);
-	struct drm_crtc *crtc;
-	int ret;
-
-	list_for_each_entry(crtc, &dev->mode_config.crtc_list, head) {
-		struct nouveau_crtc *nv_crtc = nouveau_crtc(crtc);
-		ret = nvif_notify_init(&disp->disp.object,
-				       nouveau_display_vblank_handler, false,
-				       NV04_DISP_NTFY_VBLANK,
-				       &(struct nvif_notify_head_req_v0) {
-					.head = nv_crtc->index,
-				       },
-				       sizeof(struct nvif_notify_head_req_v0),
-				       sizeof(struct nvif_notify_head_rep_v0),
-				       &nv_crtc->vblank);
-		if (ret) {
-			nouveau_display_vblank_fini(dev);
-			return ret;
-		}
-	}
-
-	ret = drm_vblank_init(dev, dev->mode_config.num_crtc);
-	if (ret) {
-		nouveau_display_vblank_fini(dev);
-		return ret;
-	}
-
-	return 0;
-}
-
 static void
 nouveau_user_framebuffer_destroy(struct drm_framebuffer *drm_fb)
 {
@@ -545,9 +493,12 @@ nouveau_display_create(struct drm_device *dev)
 	drm_mode_config_reset(dev);
 
 	if (dev->mode_config.num_crtc) {
-		ret = nouveau_display_vblank_init(dev);
+		ret = drm_vblank_init(dev, dev->mode_config.num_crtc);
 		if (ret)
 			goto vblank_err;
+
+		if (disp->disp.object.oclass >= NV50_DISP)
+			nv50_crc_init(dev);
 	}
 
 	INIT_WORK(&drm->hpd_work, nouveau_display_hpd_work);
@@ -574,7 +525,6 @@ nouveau_display_destroy(struct drm_device *dev)
 #ifdef CONFIG_ACPI
 	unregister_acpi_notifier(&nouveau_drm(dev)->acpi_nb);
 #endif
-	nouveau_display_vblank_fini(dev);
 
 	drm_kms_helper_poll_fini(dev);
 	drm_mode_config_cleanup(dev);
-- 
2.24.1

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply related	[flat|nested] 22+ messages in thread

* Re: [PATCH 9/9] drm/nouveau/kms/nvd9-: Add CRC support
  2020-03-18  0:41 ` [PATCH 9/9] drm/nouveau/kms/nvd9-: Add CRC support Lyude Paul
  2020-03-18  1:09   ` [PATCH v2] " Lyude Paul
@ 2020-03-18  6:02   ` Greg Kroah-Hartman
  1 sibling, 0 replies; 22+ messages in thread
From: Greg Kroah-Hartman @ 2020-03-18  6:02 UTC (permalink / raw)
  To: Lyude Paul
  Cc: Kate Stewart, Jani Nikula, Thomas Zimmermann, David Airlie,
	nouveau, linux-kernel, dri-devel, Sean Paul, Ben Skeggs,
	Pankaj Bharadiya, Alex Deucher, Sam Ravnborg

On Tue, Mar 17, 2020 at 08:41:06PM -0400, Lyude Paul wrote:
> +	root = debugfs_create_dir("nv_crc", crtc->debugfs_entry);
> +	if (IS_ERR(root))
> +		return PTR_ERR(root);

No need to check this, just take the return value and move on.

> +
> +	dent = debugfs_create_file("flip_threshold", 0644, root, head,
> +				   &nv50_crc_flip_threshold_fops);
> +	if (IS_ERR(dent))
> +		return PTR_ERR(dent);

No need to check this either, in fact this test is incorrect :(

Just make the call, and move on.  See the loads of debugfs cleanups I
have been doing for examples.

thanks,

greg k-h
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH 1/9] drm/vblank: Add vblank works
  2020-03-18  0:40 ` [PATCH 1/9] drm/vblank: Add vblank works Lyude Paul
@ 2020-03-18 13:46   ` Daniel Vetter
  2020-03-19 20:12     ` Lyude Paul
                       ` (2 more replies)
  0 siblings, 3 replies; 22+ messages in thread
From: Daniel Vetter @ 2020-03-18 13:46 UTC (permalink / raw)
  To: Lyude Paul
  Cc: David Airlie, nouveau, linux-kernel, dri-devel, Thomas Zimmermann

On Tue, Mar 17, 2020 at 08:40:58PM -0400, Lyude Paul wrote:
> From: Ville Syrjälä <ville.syrjala@linux.intel.com>
> 
> Add some kind of vblank workers. The interface is similar to regular
> delayed works, and also allows for re-scheduling.
> 
> Whatever hardware programming we do in the work must be fast
> (must at least complete during the vblank, sometimes during
> the first few scanlines of vblank), so we'll fire up a per-crtc
> high priority thread for this.
> 
> [based off patches from Ville Syrjälä <ville.syrjala@linux.intel.com>,
> change below to signoff later]
> 
> Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
> Signed-off-by: Lyude Paul <lyude@redhat.com>

Hm not really sold on the idea that we have should reinvent our own worker
infrastructure here. Imo a vblank_work should look like a delayed work,
i.e. using struct work_struct as the base class, and wrapping the vblank
thing around it (instead of the timer). That alos would allow drivers to
schedule works on their own work queues, allowing for easier flushing and
all that stuff.

Also if we do this I think we should try to follow the delayed work abi as
closely as possible (e.g. INIT_VBLANK_WORK, queue_vblank_work,
mod_vblank_work, ...). Delayed workers (whether timer or vblank) have a
bunch of edges cases where consistently would be really great to avoid
surprises and bugs.
-Daniel

> ---
>  drivers/gpu/drm/drm_vblank.c | 322 +++++++++++++++++++++++++++++++++++
>  include/drm/drm_vblank.h     |  34 ++++
>  2 files changed, 356 insertions(+)
> 
> diff --git a/drivers/gpu/drm/drm_vblank.c b/drivers/gpu/drm/drm_vblank.c
> index da7b0b0c1090..06c796b6c381 100644
> --- a/drivers/gpu/drm/drm_vblank.c
> +++ b/drivers/gpu/drm/drm_vblank.c
> @@ -25,7 +25,9 @@
>   */
>  
>  #include <linux/export.h>
> +#include <linux/kthread.h>
>  #include <linux/moduleparam.h>
> +#include <uapi/linux/sched/types.h>
>  
>  #include <drm/drm_crtc.h>
>  #include <drm/drm_drv.h>
> @@ -91,6 +93,7 @@
>  static bool
>  drm_get_last_vbltimestamp(struct drm_device *dev, unsigned int pipe,
>  			  ktime_t *tvblank, bool in_vblank_irq);
> +static int drm_vblank_get(struct drm_device *dev, unsigned int pipe);
>  
>  static unsigned int drm_timestamp_precision = 20;  /* Default to 20 usecs. */
>  
> @@ -440,6 +443,9 @@ void drm_vblank_cleanup(struct drm_device *dev)
>  			drm_core_check_feature(dev, DRIVER_MODESET));
>  
>  		del_timer_sync(&vblank->disable_timer);
> +
> +		wake_up_all(&vblank->vblank_work.work_wait);
> +		kthread_stop(vblank->vblank_work.thread);
>  	}
>  
>  	kfree(dev->vblank);
> @@ -447,6 +453,108 @@ void drm_vblank_cleanup(struct drm_device *dev)
>  	dev->num_crtcs = 0;
>  }
>  
> +static int vblank_work_thread(void *data)
> +{
> +	struct drm_vblank_crtc *vblank = data;
> +
> +	while (!kthread_should_stop()) {
> +		struct drm_vblank_work *work, *next;
> +		LIST_HEAD(list);
> +		u64 count;
> +		int ret;
> +
> +		spin_lock_irq(&vblank->dev->event_lock);
> +
> +		ret = wait_event_interruptible_lock_irq(vblank->queue,
> +							kthread_should_stop() ||
> +							!list_empty(&vblank->vblank_work.work_list),
> +							vblank->dev->event_lock);
> +
> +		WARN_ON(ret && !kthread_should_stop() &&
> +			list_empty(&vblank->vblank_work.irq_list) &&
> +			list_empty(&vblank->vblank_work.work_list));
> +
> +		list_for_each_entry_safe(work, next,
> +					 &vblank->vblank_work.work_list,
> +					 list) {
> +			list_move_tail(&work->list, &list);
> +			work->state = DRM_VBL_WORK_RUNNING;
> +		}
> +
> +		spin_unlock_irq(&vblank->dev->event_lock);
> +
> +		if (list_empty(&list))
> +			continue;
> +
> +		count = atomic64_read(&vblank->count);
> +		list_for_each_entry(work, &list, list)
> +			work->func(work, count);
> +
> +		spin_lock_irq(&vblank->dev->event_lock);
> +
> +		list_for_each_entry_safe(work, next, &list, list) {
> +			if (work->reschedule) {
> +				list_move_tail(&work->list,
> +					       &vblank->vblank_work.irq_list);
> +				drm_vblank_get(vblank->dev, vblank->pipe);
> +				work->reschedule = false;
> +				work->state = DRM_VBL_WORK_WAITING;
> +			} else {
> +				list_del_init(&work->list);
> +				work->cancel = false;
> +				work->state = DRM_VBL_WORK_IDLE;
> +			}
> +		}
> +
> +		spin_unlock_irq(&vblank->dev->event_lock);
> +
> +		wake_up_all(&vblank->vblank_work.work_wait);
> +	}
> +
> +	return 0;
> +}
> +
> +static void vblank_work_init(struct drm_vblank_crtc *vblank)
> +{
> +	struct sched_param param = {
> +		.sched_priority = MAX_RT_PRIO - 1,
> +	};
> +	int ret;
> +
> +	INIT_LIST_HEAD(&vblank->vblank_work.irq_list);
> +	INIT_LIST_HEAD(&vblank->vblank_work.work_list);
> +	init_waitqueue_head(&vblank->vblank_work.work_wait);
> +
> +	vblank->vblank_work.thread =
> +		kthread_run(vblank_work_thread, vblank, "card %d crtc %d",
> +			    vblank->dev->primary->index, vblank->pipe);
> +
> +	ret = sched_setscheduler(vblank->vblank_work.thread,
> +				 SCHED_FIFO, &param);
> +	WARN_ON(ret);
> +}
> +
> +/**
> + * drm_vblank_work_init - initialize a vblank work item
> + * @work: vblank work item
> + * @crtc: CRTC whose vblank will trigger the work execution
> + * @func: work function to be executed
> + *
> + * Initialize a vblank work item for a specific crtc.
> + */
> +void drm_vblank_work_init(struct drm_vblank_work *work, struct drm_crtc *crtc,
> +			  void (*func)(struct drm_vblank_work *work, u64 count))
> +{
> +	struct drm_device *dev = crtc->dev;
> +	struct drm_vblank_crtc *vblank = &dev->vblank[drm_crtc_index(crtc)];
> +
> +	work->vblank = vblank;
> +	work->state = DRM_VBL_WORK_IDLE;
> +	work->func = func;
> +	INIT_LIST_HEAD(&work->list);
> +}
> +EXPORT_SYMBOL(drm_vblank_work_init);
> +
>  /**
>   * drm_vblank_init - initialize vblank support
>   * @dev: DRM device
> @@ -481,6 +589,8 @@ int drm_vblank_init(struct drm_device *dev, unsigned int num_crtcs)
>  		init_waitqueue_head(&vblank->queue);
>  		timer_setup(&vblank->disable_timer, vblank_disable_fn, 0);
>  		seqlock_init(&vblank->seqlock);
> +
> +		vblank_work_init(vblank);
>  	}
>  
>  	DRM_INFO("Supports vblank timestamp caching Rev 2 (21.10.2013).\n");
> @@ -1825,6 +1935,22 @@ static void drm_handle_vblank_events(struct drm_device *dev, unsigned int pipe)
>  	trace_drm_vblank_event(pipe, seq, now, high_prec);
>  }
>  
> +static void drm_handle_vblank_works(struct drm_vblank_crtc *vblank)
> +{
> +	struct drm_vblank_work *work, *next;
> +	u64 count = atomic64_read(&vblank->count);
> +
> +	list_for_each_entry_safe(work, next, &vblank->vblank_work.irq_list,
> +				 list) {
> +		if (!vblank_passed(count, work->count))
> +			continue;
> +
> +		drm_vblank_put(vblank->dev, vblank->pipe);
> +		list_move_tail(&work->list, &vblank->vblank_work.work_list);
> +		work->state = DRM_VBL_WORK_SCHEDULED;
> +	}
> +}
> +
>  /**
>   * drm_handle_vblank - handle a vblank event
>   * @dev: DRM device
> @@ -1866,6 +1992,7 @@ bool drm_handle_vblank(struct drm_device *dev, unsigned int pipe)
>  
>  	spin_unlock(&dev->vblank_time_lock);
>  
> +	drm_handle_vblank_works(vblank);
>  	wake_up(&vblank->queue);
>  
>  	/* With instant-off, we defer disabling the interrupt until after
> @@ -2076,3 +2203,198 @@ int drm_crtc_queue_sequence_ioctl(struct drm_device *dev, void *data,
>  	kfree(e);
>  	return ret;
>  }
> +
> +/**
> + * drm_vblank_work_schedule - schedule a vblank work
> + * @work: vblank work to schedule
> + * @count: target vblank count
> + * @nextonmiss: defer until the next vblank if target vblank was missed
> + *
> + * Schedule @work for execution once the crtc vblank count reaches @count.
> + *
> + * If the crtc vblank count has already reached @count and @nextonmiss is
> + * %false the work starts to execute immediately.
> + *
> + * If the crtc vblank count has already reached @count and @nextonmiss is
> + * %true the work is deferred until the next vblank (as if @count has been
> + * specified as crtc vblank count + 1).
> + *
> + * If @work is already scheduled, this function will reschedule said work
> + * using the new @count.
> + *
> + * Returns:
> + * 0 on success, error code on failure.
> + */
> +int drm_vblank_work_schedule(struct drm_vblank_work *work,
> +			     u64 count, bool nextonmiss)
> +{
> +	struct drm_vblank_crtc *vblank = work->vblank;
> +	unsigned long irqflags;
> +	u64 cur_vbl;
> +	int ret = 0;
> +	bool rescheduling = false;
> +	bool passed;
> +
> +	spin_lock_irqsave(&vblank->dev->event_lock, irqflags);
> +
> +	if (work->cancel)
> +		goto out;
> +
> +	if (work->state == DRM_VBL_WORK_RUNNING) {
> +		work->reschedule = true;
> +		work->count = count;
> +		goto out;
> +	} else if (work->state != DRM_VBL_WORK_IDLE) {
> +		if (work->count == count)
> +			goto out;
> +		rescheduling = true;
> +	}
> +
> +	if (work->state != DRM_VBL_WORK_WAITING) {
> +		ret = drm_vblank_get(vblank->dev, vblank->pipe);
> +		if (ret)
> +			goto out;
> +	}
> +
> +	work->count = count;
> +
> +	cur_vbl = atomic64_read(&vblank->count);
> +	passed = vblank_passed(cur_vbl, count);
> +	if (passed)
> +		DRM_ERROR("crtc %d vblank %llu already passed (current %llu)\n",
> +			  vblank->pipe, count, cur_vbl);
> +
> +	if (!nextonmiss && passed) {
> +		drm_vblank_put(vblank->dev, vblank->pipe);
> +		if (rescheduling)
> +			list_move_tail(&work->list,
> +				       &vblank->vblank_work.work_list);
> +		else
> +			list_add_tail(&work->list,
> +				      &vblank->vblank_work.work_list);
> +		work->state = DRM_VBL_WORK_SCHEDULED;
> +		wake_up_all(&vblank->queue);
> +	} else {
> +		if (rescheduling)
> +			list_move_tail(&work->list,
> +				       &vblank->vblank_work.irq_list);
> +		else
> +			list_add_tail(&work->list,
> +				      &vblank->vblank_work.irq_list);
> +		work->state = DRM_VBL_WORK_WAITING;
> +	}
> +
> + out:
> +	spin_unlock_irqrestore(&vblank->dev->event_lock, irqflags);
> +
> +	return ret;
> +}
> +EXPORT_SYMBOL(drm_vblank_work_schedule);
> +
> +static bool vblank_work_cancel(struct drm_vblank_work *work)
> +{
> +	struct drm_vblank_crtc *vblank = work->vblank;
> +
> +	switch (work->state) {
> +	case DRM_VBL_WORK_RUNNING:
> +		work->cancel = true;
> +		work->reschedule = false;
> +		/* fall through */
> +	default:
> +	case DRM_VBL_WORK_IDLE:
> +		return false;
> +	case DRM_VBL_WORK_WAITING:
> +		drm_vblank_put(vblank->dev, vblank->pipe);
> +		/* fall through */
> +	case DRM_VBL_WORK_SCHEDULED:
> +		list_del_init(&work->list);
> +		work->state = DRM_VBL_WORK_IDLE;
> +		return true;
> +	}
> +}
> +
> +/**
> + * drm_vblank_work_cancel - cancel a vblank work
> + * @work: vblank work to cancel
> + *
> + * Cancel an already scheduled vblank work.
> + *
> + * On return @work may still be executing, unless the return
> + * value is %true.
> + *
> + * Returns:
> + * True if the work was cancelled before it started to excute, false otherwise.
> + */
> +bool drm_vblank_work_cancel(struct drm_vblank_work *work)
> +{
> +	struct drm_vblank_crtc *vblank = work->vblank;
> +	bool cancelled;
> +
> +	spin_lock_irq(&vblank->dev->event_lock);
> +
> +	cancelled = vblank_work_cancel(work);
> +
> +	spin_unlock_irq(&vblank->dev->event_lock);
> +
> +	return cancelled;
> +}
> +EXPORT_SYMBOL(drm_vblank_work_cancel);
> +
> +/**
> + * drm_vblank_work_cancel_sync - cancel a vblank work and wait for it to finish executing
> + * @work: vblank work to cancel
> + *
> + * Cancel an already scheduled vblank work and wait for its
> + * execution to finish.
> + *
> + * On return @work is no longer guaraneed to be executing.
> + *
> + * Returns:
> + * True if the work was cancelled before it started to excute, false otherwise.
> + */
> +bool drm_vblank_work_cancel_sync(struct drm_vblank_work *work)
> +{
> +	struct drm_vblank_crtc *vblank = work->vblank;
> +	bool cancelled;
> +	long ret;
> +
> +	spin_lock_irq(&vblank->dev->event_lock);
> +
> +	cancelled = vblank_work_cancel(work);
> +
> +	ret = wait_event_lock_irq_timeout(vblank->vblank_work.work_wait,
> +					  work->state == DRM_VBL_WORK_IDLE,
> +					  vblank->dev->event_lock,
> +					  10 * HZ);
> +
> +	spin_unlock_irq(&vblank->dev->event_lock);
> +
> +	WARN(!ret, "crtc %d vblank work timed out\n", vblank->pipe);
> +
> +	return cancelled;
> +}
> +EXPORT_SYMBOL(drm_vblank_work_cancel_sync);
> +
> +/**
> + * drm_vblank_work_flush - wait for a scheduled vblank work to finish excuting
> + * @work: vblank work to flush
> + *
> + * Wait until @work has finished executing.
> + */
> +void drm_vblank_work_flush(struct drm_vblank_work *work)
> +{
> +	struct drm_vblank_crtc *vblank = work->vblank;
> +	long ret;
> +
> +	spin_lock_irq(&vblank->dev->event_lock);
> +
> +	ret = wait_event_lock_irq_timeout(vblank->vblank_work.work_wait,
> +					  work->state == DRM_VBL_WORK_IDLE,
> +					  vblank->dev->event_lock,
> +					  10 * HZ);
> +
> +	spin_unlock_irq(&vblank->dev->event_lock);
> +
> +	WARN(!ret, "crtc %d vblank work timed out\n", vblank->pipe);
> +}
> +EXPORT_SYMBOL(drm_vblank_work_flush);
> diff --git a/include/drm/drm_vblank.h b/include/drm/drm_vblank.h
> index dd9f5b9e56e4..ac9130f419af 100644
> --- a/include/drm/drm_vblank.h
> +++ b/include/drm/drm_vblank.h
> @@ -203,8 +203,42 @@ struct drm_vblank_crtc {
>  	 * disabling functions multiple times.
>  	 */
>  	bool enabled;
> +
> +	struct {
> +		struct task_struct *thread;
> +		struct list_head irq_list, work_list;
> +		wait_queue_head_t work_wait;
> +	} vblank_work;
> +};
> +
> +struct drm_vblank_work {
> +	u64 count;
> +	struct drm_vblank_crtc *vblank;
> +	void (*func)(struct drm_vblank_work *work, u64 count);
> +	struct list_head list;
> +	enum {
> +		DRM_VBL_WORK_IDLE,
> +		DRM_VBL_WORK_WAITING,
> +		DRM_VBL_WORK_SCHEDULED,
> +		DRM_VBL_WORK_RUNNING,
> +	} state;
> +	bool cancel : 1;
> +	bool reschedule : 1;
>  };
>  
> +int drm_vblank_work_schedule(struct drm_vblank_work *work,
> +			     u64 count, bool nextonmiss);
> +void drm_vblank_work_init(struct drm_vblank_work *work, struct drm_crtc *crtc,
> +			  void (*func)(struct drm_vblank_work *work, u64 count));
> +bool drm_vblank_work_cancel(struct drm_vblank_work *work);
> +bool drm_vblank_work_cancel_sync(struct drm_vblank_work *work);
> +void drm_vblank_work_flush(struct drm_vblank_work *work);
> +
> +static inline bool drm_vblank_work_pending(struct drm_vblank_work *work)
> +{
> +	return work->state != DRM_VBL_WORK_IDLE;
> +}
> +
>  int drm_vblank_init(struct drm_device *dev, unsigned int num_crtcs);
>  bool drm_dev_has_vblank(const struct drm_device *dev);
>  u64 drm_crtc_vblank_count(struct drm_crtc *crtc);
> -- 
> 2.24.1
> 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH 1/9] drm/vblank: Add vblank works
  2020-03-18 13:46   ` Daniel Vetter
@ 2020-03-19 20:12     ` Lyude Paul
  2020-03-27 20:29     ` Lyude Paul
  2020-05-07 18:57     ` Lyude Paul
  2 siblings, 0 replies; 22+ messages in thread
From: Lyude Paul @ 2020-03-19 20:12 UTC (permalink / raw)
  To: Daniel Vetter
  Cc: David Airlie, nouveau, linux-kernel, dri-devel, Thomas Zimmermann

On Wed, 2020-03-18 at 14:46 +0100, Daniel Vetter wrote:
> On Tue, Mar 17, 2020 at 08:40:58PM -0400, Lyude Paul wrote:
> > From: Ville Syrjälä <ville.syrjala@linux.intel.com>
> > 
> > Add some kind of vblank workers. The interface is similar to regular
> > delayed works, and also allows for re-scheduling.
> > 
> > Whatever hardware programming we do in the work must be fast
> > (must at least complete during the vblank, sometimes during
> > the first few scanlines of vblank), so we'll fire up a per-crtc
> > high priority thread for this.
> > 
> > [based off patches from Ville Syrjälä <ville.syrjala@linux.intel.com>,
> > change below to signoff later]
> > 
> > Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
> > Signed-off-by: Lyude Paul <lyude@redhat.com>
> 
> Hm not really sold on the idea that we have should reinvent our own worker
> infrastructure here. Imo a vblank_work should look like a delayed work,
> i.e. using struct work_struct as the base class, and wrapping the vblank
> thing around it (instead of the timer). That alos would allow drivers to
> schedule works on their own work queues, allowing for easier flushing and
> all that stuff.
> 
> Also if we do this I think we should try to follow the delayed work abi as
> closely as possible (e.g. INIT_VBLANK_WORK, queue_vblank_work,
> mod_vblank_work, ...). Delayed workers (whether timer or vblank) have a
> bunch of edges cases where consistently would be really great to avoid
> surprises and bugs.
> -Daniel

Good point! I'll fix this up for the next respin

> 
> > ---
> >  drivers/gpu/drm/drm_vblank.c | 322 +++++++++++++++++++++++++++++++++++
> >  include/drm/drm_vblank.h     |  34 ++++
> >  2 files changed, 356 insertions(+)
> > 
> > diff --git a/drivers/gpu/drm/drm_vblank.c b/drivers/gpu/drm/drm_vblank.c
> > index da7b0b0c1090..06c796b6c381 100644
> > --- a/drivers/gpu/drm/drm_vblank.c
> > +++ b/drivers/gpu/drm/drm_vblank.c
> > @@ -25,7 +25,9 @@
> >   */
> >  
> >  #include <linux/export.h>
> > +#include <linux/kthread.h>
> >  #include <linux/moduleparam.h>
> > +#include <uapi/linux/sched/types.h>
> >  
> >  #include <drm/drm_crtc.h>
> >  #include <drm/drm_drv.h>
> > @@ -91,6 +93,7 @@
> >  static bool
> >  drm_get_last_vbltimestamp(struct drm_device *dev, unsigned int pipe,
> >  			  ktime_t *tvblank, bool in_vblank_irq);
> > +static int drm_vblank_get(struct drm_device *dev, unsigned int pipe);
> >  
> >  static unsigned int drm_timestamp_precision = 20;  /* Default to 20
> > usecs. */
> >  
> > @@ -440,6 +443,9 @@ void drm_vblank_cleanup(struct drm_device *dev)
> >  			drm_core_check_feature(dev, DRIVER_MODESET));
> >  
> >  		del_timer_sync(&vblank->disable_timer);
> > +
> > +		wake_up_all(&vblank->vblank_work.work_wait);
> > +		kthread_stop(vblank->vblank_work.thread);
> >  	}
> >  
> >  	kfree(dev->vblank);
> > @@ -447,6 +453,108 @@ void drm_vblank_cleanup(struct drm_device *dev)
> >  	dev->num_crtcs = 0;
> >  }
> >  
> > +static int vblank_work_thread(void *data)
> > +{
> > +	struct drm_vblank_crtc *vblank = data;
> > +
> > +	while (!kthread_should_stop()) {
> > +		struct drm_vblank_work *work, *next;
> > +		LIST_HEAD(list);
> > +		u64 count;
> > +		int ret;
> > +
> > +		spin_lock_irq(&vblank->dev->event_lock);
> > +
> > +		ret = wait_event_interruptible_lock_irq(vblank->queue,
> > +							kthread_should_stop()
> > ||
> > +							!list_empty(&vblank-
> > >vblank_work.work_list),
> > +							vblank->dev-
> > >event_lock);
> > +
> > +		WARN_ON(ret && !kthread_should_stop() &&
> > +			list_empty(&vblank->vblank_work.irq_list) &&
> > +			list_empty(&vblank->vblank_work.work_list));
> > +
> > +		list_for_each_entry_safe(work, next,
> > +					 &vblank->vblank_work.work_list,
> > +					 list) {
> > +			list_move_tail(&work->list, &list);
> > +			work->state = DRM_VBL_WORK_RUNNING;
> > +		}
> > +
> > +		spin_unlock_irq(&vblank->dev->event_lock);
> > +
> > +		if (list_empty(&list))
> > +			continue;
> > +
> > +		count = atomic64_read(&vblank->count);
> > +		list_for_each_entry(work, &list, list)
> > +			work->func(work, count);
> > +
> > +		spin_lock_irq(&vblank->dev->event_lock);
> > +
> > +		list_for_each_entry_safe(work, next, &list, list) {
> > +			if (work->reschedule) {
> > +				list_move_tail(&work->list,
> > +					       &vblank->vblank_work.irq_list);
> > +				drm_vblank_get(vblank->dev, vblank->pipe);
> > +				work->reschedule = false;
> > +				work->state = DRM_VBL_WORK_WAITING;
> > +			} else {
> > +				list_del_init(&work->list);
> > +				work->cancel = false;
> > +				work->state = DRM_VBL_WORK_IDLE;
> > +			}
> > +		}
> > +
> > +		spin_unlock_irq(&vblank->dev->event_lock);
> > +
> > +		wake_up_all(&vblank->vblank_work.work_wait);
> > +	}
> > +
> > +	return 0;
> > +}
> > +
> > +static void vblank_work_init(struct drm_vblank_crtc *vblank)
> > +{
> > +	struct sched_param param = {
> > +		.sched_priority = MAX_RT_PRIO - 1,
> > +	};
> > +	int ret;
> > +
> > +	INIT_LIST_HEAD(&vblank->vblank_work.irq_list);
> > +	INIT_LIST_HEAD(&vblank->vblank_work.work_list);
> > +	init_waitqueue_head(&vblank->vblank_work.work_wait);
> > +
> > +	vblank->vblank_work.thread =
> > +		kthread_run(vblank_work_thread, vblank, "card %d crtc %d",
> > +			    vblank->dev->primary->index, vblank->pipe);
> > +
> > +	ret = sched_setscheduler(vblank->vblank_work.thread,
> > +				 SCHED_FIFO, &param);
> > +	WARN_ON(ret);
> > +}
> > +
> > +/**
> > + * drm_vblank_work_init - initialize a vblank work item
> > + * @work: vblank work item
> > + * @crtc: CRTC whose vblank will trigger the work execution
> > + * @func: work function to be executed
> > + *
> > + * Initialize a vblank work item for a specific crtc.
> > + */
> > +void drm_vblank_work_init(struct drm_vblank_work *work, struct drm_crtc
> > *crtc,
> > +			  void (*func)(struct drm_vblank_work *work, u64
> > count))
> > +{
> > +	struct drm_device *dev = crtc->dev;
> > +	struct drm_vblank_crtc *vblank = &dev->vblank[drm_crtc_index(crtc)];
> > +
> > +	work->vblank = vblank;
> > +	work->state = DRM_VBL_WORK_IDLE;
> > +	work->func = func;
> > +	INIT_LIST_HEAD(&work->list);
> > +}
> > +EXPORT_SYMBOL(drm_vblank_work_init);
> > +
> >  /**
> >   * drm_vblank_init - initialize vblank support
> >   * @dev: DRM device
> > @@ -481,6 +589,8 @@ int drm_vblank_init(struct drm_device *dev, unsigned
> > int num_crtcs)
> >  		init_waitqueue_head(&vblank->queue);
> >  		timer_setup(&vblank->disable_timer, vblank_disable_fn, 0);
> >  		seqlock_init(&vblank->seqlock);
> > +
> > +		vblank_work_init(vblank);
> >  	}
> >  
> >  	DRM_INFO("Supports vblank timestamp caching Rev 2 (21.10.2013).\n");
> > @@ -1825,6 +1935,22 @@ static void drm_handle_vblank_events(struct
> > drm_device *dev, unsigned int pipe)
> >  	trace_drm_vblank_event(pipe, seq, now, high_prec);
> >  }
> >  
> > +static void drm_handle_vblank_works(struct drm_vblank_crtc *vblank)
> > +{
> > +	struct drm_vblank_work *work, *next;
> > +	u64 count = atomic64_read(&vblank->count);
> > +
> > +	list_for_each_entry_safe(work, next, &vblank->vblank_work.irq_list,
> > +				 list) {
> > +		if (!vblank_passed(count, work->count))
> > +			continue;
> > +
> > +		drm_vblank_put(vblank->dev, vblank->pipe);
> > +		list_move_tail(&work->list, &vblank->vblank_work.work_list);
> > +		work->state = DRM_VBL_WORK_SCHEDULED;
> > +	}
> > +}
> > +
> >  /**
> >   * drm_handle_vblank - handle a vblank event
> >   * @dev: DRM device
> > @@ -1866,6 +1992,7 @@ bool drm_handle_vblank(struct drm_device *dev,
> > unsigned int pipe)
> >  
> >  	spin_unlock(&dev->vblank_time_lock);
> >  
> > +	drm_handle_vblank_works(vblank);
> >  	wake_up(&vblank->queue);
> >  
> >  	/* With instant-off, we defer disabling the interrupt until after
> > @@ -2076,3 +2203,198 @@ int drm_crtc_queue_sequence_ioctl(struct
> > drm_device *dev, void *data,
> >  	kfree(e);
> >  	return ret;
> >  }
> > +
> > +/**
> > + * drm_vblank_work_schedule - schedule a vblank work
> > + * @work: vblank work to schedule
> > + * @count: target vblank count
> > + * @nextonmiss: defer until the next vblank if target vblank was missed
> > + *
> > + * Schedule @work for execution once the crtc vblank count reaches
> > @count.
> > + *
> > + * If the crtc vblank count has already reached @count and @nextonmiss is
> > + * %false the work starts to execute immediately.
> > + *
> > + * If the crtc vblank count has already reached @count and @nextonmiss is
> > + * %true the work is deferred until the next vblank (as if @count has
> > been
> > + * specified as crtc vblank count + 1).
> > + *
> > + * If @work is already scheduled, this function will reschedule said work
> > + * using the new @count.
> > + *
> > + * Returns:
> > + * 0 on success, error code on failure.
> > + */
> > +int drm_vblank_work_schedule(struct drm_vblank_work *work,
> > +			     u64 count, bool nextonmiss)
> > +{
> > +	struct drm_vblank_crtc *vblank = work->vblank;
> > +	unsigned long irqflags;
> > +	u64 cur_vbl;
> > +	int ret = 0;
> > +	bool rescheduling = false;
> > +	bool passed;
> > +
> > +	spin_lock_irqsave(&vblank->dev->event_lock, irqflags);
> > +
> > +	if (work->cancel)
> > +		goto out;
> > +
> > +	if (work->state == DRM_VBL_WORK_RUNNING) {
> > +		work->reschedule = true;
> > +		work->count = count;
> > +		goto out;
> > +	} else if (work->state != DRM_VBL_WORK_IDLE) {
> > +		if (work->count == count)
> > +			goto out;
> > +		rescheduling = true;
> > +	}
> > +
> > +	if (work->state != DRM_VBL_WORK_WAITING) {
> > +		ret = drm_vblank_get(vblank->dev, vblank->pipe);
> > +		if (ret)
> > +			goto out;
> > +	}
> > +
> > +	work->count = count;
> > +
> > +	cur_vbl = atomic64_read(&vblank->count);
> > +	passed = vblank_passed(cur_vbl, count);
> > +	if (passed)
> > +		DRM_ERROR("crtc %d vblank %llu already passed (current
> > %llu)\n",
> > +			  vblank->pipe, count, cur_vbl);
> > +
> > +	if (!nextonmiss && passed) {
> > +		drm_vblank_put(vblank->dev, vblank->pipe);
> > +		if (rescheduling)
> > +			list_move_tail(&work->list,
> > +				       &vblank->vblank_work.work_list);
> > +		else
> > +			list_add_tail(&work->list,
> > +				      &vblank->vblank_work.work_list);
> > +		work->state = DRM_VBL_WORK_SCHEDULED;
> > +		wake_up_all(&vblank->queue);
> > +	} else {
> > +		if (rescheduling)
> > +			list_move_tail(&work->list,
> > +				       &vblank->vblank_work.irq_list);
> > +		else
> > +			list_add_tail(&work->list,
> > +				      &vblank->vblank_work.irq_list);
> > +		work->state = DRM_VBL_WORK_WAITING;
> > +	}
> > +
> > + out:
> > +	spin_unlock_irqrestore(&vblank->dev->event_lock, irqflags);
> > +
> > +	return ret;
> > +}
> > +EXPORT_SYMBOL(drm_vblank_work_schedule);
> > +
> > +static bool vblank_work_cancel(struct drm_vblank_work *work)
> > +{
> > +	struct drm_vblank_crtc *vblank = work->vblank;
> > +
> > +	switch (work->state) {
> > +	case DRM_VBL_WORK_RUNNING:
> > +		work->cancel = true;
> > +		work->reschedule = false;
> > +		/* fall through */
> > +	default:
> > +	case DRM_VBL_WORK_IDLE:
> > +		return false;
> > +	case DRM_VBL_WORK_WAITING:
> > +		drm_vblank_put(vblank->dev, vblank->pipe);
> > +		/* fall through */
> > +	case DRM_VBL_WORK_SCHEDULED:
> > +		list_del_init(&work->list);
> > +		work->state = DRM_VBL_WORK_IDLE;
> > +		return true;
> > +	}
> > +}
> > +
> > +/**
> > + * drm_vblank_work_cancel - cancel a vblank work
> > + * @work: vblank work to cancel
> > + *
> > + * Cancel an already scheduled vblank work.
> > + *
> > + * On return @work may still be executing, unless the return
> > + * value is %true.
> > + *
> > + * Returns:
> > + * True if the work was cancelled before it started to excute, false
> > otherwise.
> > + */
> > +bool drm_vblank_work_cancel(struct drm_vblank_work *work)
> > +{
> > +	struct drm_vblank_crtc *vblank = work->vblank;
> > +	bool cancelled;
> > +
> > +	spin_lock_irq(&vblank->dev->event_lock);
> > +
> > +	cancelled = vblank_work_cancel(work);
> > +
> > +	spin_unlock_irq(&vblank->dev->event_lock);
> > +
> > +	return cancelled;
> > +}
> > +EXPORT_SYMBOL(drm_vblank_work_cancel);
> > +
> > +/**
> > + * drm_vblank_work_cancel_sync - cancel a vblank work and wait for it to
> > finish executing
> > + * @work: vblank work to cancel
> > + *
> > + * Cancel an already scheduled vblank work and wait for its
> > + * execution to finish.
> > + *
> > + * On return @work is no longer guaraneed to be executing.
> > + *
> > + * Returns:
> > + * True if the work was cancelled before it started to excute, false
> > otherwise.
> > + */
> > +bool drm_vblank_work_cancel_sync(struct drm_vblank_work *work)
> > +{
> > +	struct drm_vblank_crtc *vblank = work->vblank;
> > +	bool cancelled;
> > +	long ret;
> > +
> > +	spin_lock_irq(&vblank->dev->event_lock);
> > +
> > +	cancelled = vblank_work_cancel(work);
> > +
> > +	ret = wait_event_lock_irq_timeout(vblank->vblank_work.work_wait,
> > +					  work->state == DRM_VBL_WORK_IDLE,
> > +					  vblank->dev->event_lock,
> > +					  10 * HZ);
> > +
> > +	spin_unlock_irq(&vblank->dev->event_lock);
> > +
> > +	WARN(!ret, "crtc %d vblank work timed out\n", vblank->pipe);
> > +
> > +	return cancelled;
> > +}
> > +EXPORT_SYMBOL(drm_vblank_work_cancel_sync);
> > +
> > +/**
> > + * drm_vblank_work_flush - wait for a scheduled vblank work to finish
> > excuting
> > + * @work: vblank work to flush
> > + *
> > + * Wait until @work has finished executing.
> > + */
> > +void drm_vblank_work_flush(struct drm_vblank_work *work)
> > +{
> > +	struct drm_vblank_crtc *vblank = work->vblank;
> > +	long ret;
> > +
> > +	spin_lock_irq(&vblank->dev->event_lock);
> > +
> > +	ret = wait_event_lock_irq_timeout(vblank->vblank_work.work_wait,
> > +					  work->state == DRM_VBL_WORK_IDLE,
> > +					  vblank->dev->event_lock,
> > +					  10 * HZ);
> > +
> > +	spin_unlock_irq(&vblank->dev->event_lock);
> > +
> > +	WARN(!ret, "crtc %d vblank work timed out\n", vblank->pipe);
> > +}
> > +EXPORT_SYMBOL(drm_vblank_work_flush);
> > diff --git a/include/drm/drm_vblank.h b/include/drm/drm_vblank.h
> > index dd9f5b9e56e4..ac9130f419af 100644
> > --- a/include/drm/drm_vblank.h
> > +++ b/include/drm/drm_vblank.h
> > @@ -203,8 +203,42 @@ struct drm_vblank_crtc {
> >  	 * disabling functions multiple times.
> >  	 */
> >  	bool enabled;
> > +
> > +	struct {
> > +		struct task_struct *thread;
> > +		struct list_head irq_list, work_list;
> > +		wait_queue_head_t work_wait;
> > +	} vblank_work;
> > +};
> > +
> > +struct drm_vblank_work {
> > +	u64 count;
> > +	struct drm_vblank_crtc *vblank;
> > +	void (*func)(struct drm_vblank_work *work, u64 count);
> > +	struct list_head list;
> > +	enum {
> > +		DRM_VBL_WORK_IDLE,
> > +		DRM_VBL_WORK_WAITING,
> > +		DRM_VBL_WORK_SCHEDULED,
> > +		DRM_VBL_WORK_RUNNING,
> > +	} state;
> > +	bool cancel : 1;
> > +	bool reschedule : 1;
> >  };
> >  
> > +int drm_vblank_work_schedule(struct drm_vblank_work *work,
> > +			     u64 count, bool nextonmiss);
> > +void drm_vblank_work_init(struct drm_vblank_work *work, struct drm_crtc
> > *crtc,
> > +			  void (*func)(struct drm_vblank_work *work, u64
> > count));
> > +bool drm_vblank_work_cancel(struct drm_vblank_work *work);
> > +bool drm_vblank_work_cancel_sync(struct drm_vblank_work *work);
> > +void drm_vblank_work_flush(struct drm_vblank_work *work);
> > +
> > +static inline bool drm_vblank_work_pending(struct drm_vblank_work *work)
> > +{
> > +	return work->state != DRM_VBL_WORK_IDLE;
> > +}
> > +
> >  int drm_vblank_init(struct drm_device *dev, unsigned int num_crtcs);
> >  bool drm_dev_has_vblank(const struct drm_device *dev);
> >  u64 drm_crtc_vblank_count(struct drm_crtc *crtc);
> > -- 
> > 2.24.1
> > 
-- 
Cheers,
	Lyude Paul (she/her)
	Associate Software Engineer at Red Hat

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH 1/9] drm/vblank: Add vblank works
  2020-03-18 13:46   ` Daniel Vetter
  2020-03-19 20:12     ` Lyude Paul
@ 2020-03-27 20:29     ` Lyude Paul
  2020-03-27 20:38       ` Lyude Paul
  2020-05-07 18:57     ` Lyude Paul
  2 siblings, 1 reply; 22+ messages in thread
From: Lyude Paul @ 2020-03-27 20:29 UTC (permalink / raw)
  To: Daniel Vetter, Tejun Heo
  Cc: David Airlie, nouveau, linux-kernel, dri-devel, Thomas Zimmermann

Adding Tejun to this thread per-Daniel's suggestion on IRC.

Hi Tejun! So, I don't know what your experience with modesetting related stuff
is so I'll quickly explain some important concepts here regarding scanlines,
vblanks, etc. If you already understand this, feel free to skip this next
paragraph.

From the perspective of a computer, every time a computer monitor displays a
new frame it's done by "scanning out" the display image from top to bottom,
one row of pixels at a time. which row of pixels we're on is referred to as
the scanline. Additionally, there's usually a couple of extra scanlines which
we scan out, but aren't actually displayed on the screen (these sometimes get
used by HDMI audio and friends, but that's another story). The period where
we're on these scanlines is referred to as the vblank, this is important.

On a lot of display hardware, programming needs to take effect during the
vertical blanking period so that settings like gamma, what frame we're
scanning out, etc. can be safely changed without showing visual tearing on the
screen. In some unforgiving hardware, some of this programming has to both
start and end in the same vblank. This is apparently the case on the majority
of Intel GPUs in the wild right now, most notably with gamma updates which
involve mashing over 2KiB of registers during the vblank. Other drivers have
very similar needs to this (nouveau in particular is why I'm here, we need it
for CRC related stuff) so we figured it'd be a good idea to add a set of
helpers for performing realtime hardware programming that's synchronized to
vblank intervals. In particular, this is aimed at hardware programming that
would be a bit too awkward to try to pull off entirely in interrupt context.
We call these vblank workers.

We first tried doing this using plain old kthreads, as you can see in the
patch below, since we could schedule them as realtime. Additionally, the plan
was to use this in i915 combined with pm_qos to get rid of cstate latency when
handling the original vblank interrupt. Note, our time constraints are a bit
more forgiving in nouveau so you won't see any pm_qos mentions in this series.

In an effort to try to avoid reinventing parts of the kernel's worker
infrastructure though, we tried to see if we could implement these with simple
work_structs and workqueues with HIGH_PRI | UNBOUNDED,[1]. But, it would seem
that this work_struct approach is quite unreliable and we still usually fail
to start the register programming in time for Intel's vblank worker usecase.

So, so far using plain old RT kthreads seems to make things about as reliable
as I think we're going to get. Note - even using kthreads we _still_ sometimes
miss the vblank period and end up tearing a bit on screen, but it happens
significantly less often then with work_structs and is basically going to be
as fast as we can get (in the future, Intel wants to try fixing this by doing
this hardware programming outside of the CPU, but for now we're stuck with
this).

With all of this being said - we'd still like to avoid having to reinvent
workers if possible, so we were wondering if there was any kind of realtime
worker that could be used for this instead? I haven't been able to find any
way of scheduling workers to be realtime, and I'm not sure if implementing
this in Linux's workqueues would be realistic either. It'd be nice to avoid
having to use plain old kthreads if possible :). What do you think Tejun?

On Wed, 2020-03-18 at 14:46 +0100, Daniel Vetter wrote:
> On Tue, Mar 17, 2020 at 08:40:58PM -0400, Lyude Paul wrote:
> > From: Ville Syrjälä <ville.syrjala@linux.intel.com>
> > 
> > Add some kind of vblank workers. The interface is similar to regular
> > delayed works, and also allows for re-scheduling.
> > 
> > Whatever hardware programming we do in the work must be fast
> > (must at least complete during the vblank, sometimes during
> > the first few scanlines of vblank), so we'll fire up a per-crtc
> > high priority thread for this.
> > 
> > [based off patches from Ville Syrjälä <ville.syrjala@linux.intel.com>,
> > change below to signoff later]
> > 
> > Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
> > Signed-off-by: Lyude Paul <lyude@redhat.com>
> 
> Hm not really sold on the idea that we have should reinvent our own worker
> infrastructure here. Imo a vblank_work should look like a delayed work,
> i.e. using struct work_struct as the base class, and wrapping the vblank
> thing around it (instead of the timer). That alos would allow drivers to
> schedule works on their own work queues, allowing for easier flushing and
> all that stuff.
> 
> Also if we do this I think we should try to follow the delayed work abi as
> closely as possible (e.g. INIT_VBLANK_WORK, queue_vblank_work,
> mod_vblank_work, ...). Delayed workers (whether timer or vblank) have a
> bunch of edges cases where consistently would be really great to avoid
> surprises and bugs.
> -Daniel
> 
> > ---
> >  drivers/gpu/drm/drm_vblank.c | 322 +++++++++++++++++++++++++++++++++++
> >  include/drm/drm_vblank.h     |  34 ++++
> >  2 files changed, 356 insertions(+)
> > 
> > diff --git a/drivers/gpu/drm/drm_vblank.c b/drivers/gpu/drm/drm_vblank.c
> > index da7b0b0c1090..06c796b6c381 100644
> > --- a/drivers/gpu/drm/drm_vblank.c
> > +++ b/drivers/gpu/drm/drm_vblank.c
> > @@ -25,7 +25,9 @@
> >   */
> >  
> >  #include <linux/export.h>
> > +#include <linux/kthread.h>
> >  #include <linux/moduleparam.h>
> > +#include <uapi/linux/sched/types.h>
> >  
> >  #include <drm/drm_crtc.h>
> >  #include <drm/drm_drv.h>
> > @@ -91,6 +93,7 @@
> >  static bool
> >  drm_get_last_vbltimestamp(struct drm_device *dev, unsigned int pipe,
> >  			  ktime_t *tvblank, bool in_vblank_irq);
> > +static int drm_vblank_get(struct drm_device *dev, unsigned int pipe);
> >  
> >  static unsigned int drm_timestamp_precision = 20;  /* Default to 20
> > usecs. */
> >  
> > @@ -440,6 +443,9 @@ void drm_vblank_cleanup(struct drm_device *dev)
> >  			drm_core_check_feature(dev, DRIVER_MODESET));
> >  
> >  		del_timer_sync(&vblank->disable_timer);
> > +
> > +		wake_up_all(&vblank->vblank_work.work_wait);
> > +		kthread_stop(vblank->vblank_work.thread);
> >  	}
> >  
> >  	kfree(dev->vblank);
> > @@ -447,6 +453,108 @@ void drm_vblank_cleanup(struct drm_device *dev)
> >  	dev->num_crtcs = 0;
> >  }
> >  
> > +static int vblank_work_thread(void *data)
> > +{
> > +	struct drm_vblank_crtc *vblank = data;
> > +
> > +	while (!kthread_should_stop()) {
> > +		struct drm_vblank_work *work, *next;
> > +		LIST_HEAD(list);
> > +		u64 count;
> > +		int ret;
> > +
> > +		spin_lock_irq(&vblank->dev->event_lock);
> > +
> > +		ret = wait_event_interruptible_lock_irq(vblank->queue,
> > +							kthread_should_stop()
> > ||
> > +							!list_empty(&vblank-
> > >vblank_work.work_list),
> > +							vblank->dev-
> > >event_lock);
> > +
> > +		WARN_ON(ret && !kthread_should_stop() &&
> > +			list_empty(&vblank->vblank_work.irq_list) &&
> > +			list_empty(&vblank->vblank_work.work_list));
> > +
> > +		list_for_each_entry_safe(work, next,
> > +					 &vblank->vblank_work.work_list,
> > +					 list) {
> > +			list_move_tail(&work->list, &list);
> > +			work->state = DRM_VBL_WORK_RUNNING;
> > +		}
> > +
> > +		spin_unlock_irq(&vblank->dev->event_lock);
> > +
> > +		if (list_empty(&list))
> > +			continue;
> > +
> > +		count = atomic64_read(&vblank->count);
> > +		list_for_each_entry(work, &list, list)
> > +			work->func(work, count);
> > +
> > +		spin_lock_irq(&vblank->dev->event_lock);
> > +
> > +		list_for_each_entry_safe(work, next, &list, list) {
> > +			if (work->reschedule) {
> > +				list_move_tail(&work->list,
> > +					       &vblank->vblank_work.irq_list);
> > +				drm_vblank_get(vblank->dev, vblank->pipe);
> > +				work->reschedule = false;
> > +				work->state = DRM_VBL_WORK_WAITING;
> > +			} else {
> > +				list_del_init(&work->list);
> > +				work->cancel = false;
> > +				work->state = DRM_VBL_WORK_IDLE;
> > +			}
> > +		}
> > +
> > +		spin_unlock_irq(&vblank->dev->event_lock);
> > +
> > +		wake_up_all(&vblank->vblank_work.work_wait);
> > +	}
> > +
> > +	return 0;
> > +}
> > +
> > +static void vblank_work_init(struct drm_vblank_crtc *vblank)
> > +{
> > +	struct sched_param param = {
> > +		.sched_priority = MAX_RT_PRIO - 1,
> > +	};
> > +	int ret;
> > +
> > +	INIT_LIST_HEAD(&vblank->vblank_work.irq_list);
> > +	INIT_LIST_HEAD(&vblank->vblank_work.work_list);
> > +	init_waitqueue_head(&vblank->vblank_work.work_wait);
> > +
> > +	vblank->vblank_work.thread =
> > +		kthread_run(vblank_work_thread, vblank, "card %d crtc %d",
> > +			    vblank->dev->primary->index, vblank->pipe);
> > +
> > +	ret = sched_setscheduler(vblank->vblank_work.thread,
> > +				 SCHED_FIFO, &param);
> > +	WARN_ON(ret);
> > +}
> > +
> > +/**
> > + * drm_vblank_work_init - initialize a vblank work item
> > + * @work: vblank work item
> > + * @crtc: CRTC whose vblank will trigger the work execution
> > + * @func: work function to be executed
> > + *
> > + * Initialize a vblank work item for a specific crtc.
> > + */
> > +void drm_vblank_work_init(struct drm_vblank_work *work, struct drm_crtc
> > *crtc,
> > +			  void (*func)(struct drm_vblank_work *work, u64
> > count))
> > +{
> > +	struct drm_device *dev = crtc->dev;
> > +	struct drm_vblank_crtc *vblank = &dev->vblank[drm_crtc_index(crtc)];
> > +
> > +	work->vblank = vblank;
> > +	work->state = DRM_VBL_WORK_IDLE;
> > +	work->func = func;
> > +	INIT_LIST_HEAD(&work->list);
> > +}
> > +EXPORT_SYMBOL(drm_vblank_work_init);
> > +
> >  /**
> >   * drm_vblank_init - initialize vblank support
> >   * @dev: DRM device
> > @@ -481,6 +589,8 @@ int drm_vblank_init(struct drm_device *dev, unsigned
> > int num_crtcs)
> >  		init_waitqueue_head(&vblank->queue);
> >  		timer_setup(&vblank->disable_timer, vblank_disable_fn, 0);
> >  		seqlock_init(&vblank->seqlock);
> > +
> > +		vblank_work_init(vblank);
> >  	}
> >  
> >  	DRM_INFO("Supports vblank timestamp caching Rev 2 (21.10.2013).\n");
> > @@ -1825,6 +1935,22 @@ static void drm_handle_vblank_events(struct
> > drm_device *dev, unsigned int pipe)
> >  	trace_drm_vblank_event(pipe, seq, now, high_prec);
> >  }
> >  
> > +static void drm_handle_vblank_works(struct drm_vblank_crtc *vblank)
> > +{
> > +	struct drm_vblank_work *work, *next;
> > +	u64 count = atomic64_read(&vblank->count);
> > +
> > +	list_for_each_entry_safe(work, next, &vblank->vblank_work.irq_list,
> > +				 list) {
> > +		if (!vblank_passed(count, work->count))
> > +			continue;
> > +
> > +		drm_vblank_put(vblank->dev, vblank->pipe);
> > +		list_move_tail(&work->list, &vblank->vblank_work.work_list);
> > +		work->state = DRM_VBL_WORK_SCHEDULED;
> > +	}
> > +}
> > +
> >  /**
> >   * drm_handle_vblank - handle a vblank event
> >   * @dev: DRM device
> > @@ -1866,6 +1992,7 @@ bool drm_handle_vblank(struct drm_device *dev,
> > unsigned int pipe)
> >  
> >  	spin_unlock(&dev->vblank_time_lock);
> >  
> > +	drm_handle_vblank_works(vblank);
> >  	wake_up(&vblank->queue);
> >  
> >  	/* With instant-off, we defer disabling the interrupt until after
> > @@ -2076,3 +2203,198 @@ int drm_crtc_queue_sequence_ioctl(struct
> > drm_device *dev, void *data,
> >  	kfree(e);
> >  	return ret;
> >  }
> > +
> > +/**
> > + * drm_vblank_work_schedule - schedule a vblank work
> > + * @work: vblank work to schedule
> > + * @count: target vblank count
> > + * @nextonmiss: defer until the next vblank if target vblank was missed
> > + *
> > + * Schedule @work for execution once the crtc vblank count reaches
> > @count.
> > + *
> > + * If the crtc vblank count has already reached @count and @nextonmiss is
> > + * %false the work starts to execute immediately.
> > + *
> > + * If the crtc vblank count has already reached @count and @nextonmiss is
> > + * %true the work is deferred until the next vblank (as if @count has
> > been
> > + * specified as crtc vblank count + 1).
> > + *
> > + * If @work is already scheduled, this function will reschedule said work
> > + * using the new @count.
> > + *
> > + * Returns:
> > + * 0 on success, error code on failure.
> > + */
> > +int drm_vblank_work_schedule(struct drm_vblank_work *work,
> > +			     u64 count, bool nextonmiss)
> > +{
> > +	struct drm_vblank_crtc *vblank = work->vblank;
> > +	unsigned long irqflags;
> > +	u64 cur_vbl;
> > +	int ret = 0;
> > +	bool rescheduling = false;
> > +	bool passed;
> > +
> > +	spin_lock_irqsave(&vblank->dev->event_lock, irqflags);
> > +
> > +	if (work->cancel)
> > +		goto out;
> > +
> > +	if (work->state == DRM_VBL_WORK_RUNNING) {
> > +		work->reschedule = true;
> > +		work->count = count;
> > +		goto out;
> > +	} else if (work->state != DRM_VBL_WORK_IDLE) {
> > +		if (work->count == count)
> > +			goto out;
> > +		rescheduling = true;
> > +	}
> > +
> > +	if (work->state != DRM_VBL_WORK_WAITING) {
> > +		ret = drm_vblank_get(vblank->dev, vblank->pipe);
> > +		if (ret)
> > +			goto out;
> > +	}
> > +
> > +	work->count = count;
> > +
> > +	cur_vbl = atomic64_read(&vblank->count);
> > +	passed = vblank_passed(cur_vbl, count);
> > +	if (passed)
> > +		DRM_ERROR("crtc %d vblank %llu already passed (current
> > %llu)\n",
> > +			  vblank->pipe, count, cur_vbl);
> > +
> > +	if (!nextonmiss && passed) {
> > +		drm_vblank_put(vblank->dev, vblank->pipe);
> > +		if (rescheduling)
> > +			list_move_tail(&work->list,
> > +				       &vblank->vblank_work.work_list);
> > +		else
> > +			list_add_tail(&work->list,
> > +				      &vblank->vblank_work.work_list);
> > +		work->state = DRM_VBL_WORK_SCHEDULED;
> > +		wake_up_all(&vblank->queue);
> > +	} else {
> > +		if (rescheduling)
> > +			list_move_tail(&work->list,
> > +				       &vblank->vblank_work.irq_list);
> > +		else
> > +			list_add_tail(&work->list,
> > +				      &vblank->vblank_work.irq_list);
> > +		work->state = DRM_VBL_WORK_WAITING;
> > +	}
> > +
> > + out:
> > +	spin_unlock_irqrestore(&vblank->dev->event_lock, irqflags);
> > +
> > +	return ret;
> > +}
> > +EXPORT_SYMBOL(drm_vblank_work_schedule);
> > +
> > +static bool vblank_work_cancel(struct drm_vblank_work *work)
> > +{
> > +	struct drm_vblank_crtc *vblank = work->vblank;
> > +
> > +	switch (work->state) {
> > +	case DRM_VBL_WORK_RUNNING:
> > +		work->cancel = true;
> > +		work->reschedule = false;
> > +		/* fall through */
> > +	default:
> > +	case DRM_VBL_WORK_IDLE:
> > +		return false;
> > +	case DRM_VBL_WORK_WAITING:
> > +		drm_vblank_put(vblank->dev, vblank->pipe);
> > +		/* fall through */
> > +	case DRM_VBL_WORK_SCHEDULED:
> > +		list_del_init(&work->list);
> > +		work->state = DRM_VBL_WORK_IDLE;
> > +		return true;
> > +	}
> > +}
> > +
> > +/**
> > + * drm_vblank_work_cancel - cancel a vblank work
> > + * @work: vblank work to cancel
> > + *
> > + * Cancel an already scheduled vblank work.
> > + *
> > + * On return @work may still be executing, unless the return
> > + * value is %true.
> > + *
> > + * Returns:
> > + * True if the work was cancelled before it started to excute, false
> > otherwise.
> > + */
> > +bool drm_vblank_work_cancel(struct drm_vblank_work *work)
> > +{
> > +	struct drm_vblank_crtc *vblank = work->vblank;
> > +	bool cancelled;
> > +
> > +	spin_lock_irq(&vblank->dev->event_lock);
> > +
> > +	cancelled = vblank_work_cancel(work);
> > +
> > +	spin_unlock_irq(&vblank->dev->event_lock);
> > +
> > +	return cancelled;
> > +}
> > +EXPORT_SYMBOL(drm_vblank_work_cancel);
> > +
> > +/**
> > + * drm_vblank_work_cancel_sync - cancel a vblank work and wait for it to
> > finish executing
> > + * @work: vblank work to cancel
> > + *
> > + * Cancel an already scheduled vblank work and wait for its
> > + * execution to finish.
> > + *
> > + * On return @work is no longer guaraneed to be executing.
> > + *
> > + * Returns:
> > + * True if the work was cancelled before it started to excute, false
> > otherwise.
> > + */
> > +bool drm_vblank_work_cancel_sync(struct drm_vblank_work *work)
> > +{
> > +	struct drm_vblank_crtc *vblank = work->vblank;
> > +	bool cancelled;
> > +	long ret;
> > +
> > +	spin_lock_irq(&vblank->dev->event_lock);
> > +
> > +	cancelled = vblank_work_cancel(work);
> > +
> > +	ret = wait_event_lock_irq_timeout(vblank->vblank_work.work_wait,
> > +					  work->state == DRM_VBL_WORK_IDLE,
> > +					  vblank->dev->event_lock,
> > +					  10 * HZ);
> > +
> > +	spin_unlock_irq(&vblank->dev->event_lock);
> > +
> > +	WARN(!ret, "crtc %d vblank work timed out\n", vblank->pipe);
> > +
> > +	return cancelled;
> > +}
> > +EXPORT_SYMBOL(drm_vblank_work_cancel_sync);
> > +
> > +/**
> > + * drm_vblank_work_flush - wait for a scheduled vblank work to finish
> > excuting
> > + * @work: vblank work to flush
> > + *
> > + * Wait until @work has finished executing.
> > + */
> > +void drm_vblank_work_flush(struct drm_vblank_work *work)
> > +{
> > +	struct drm_vblank_crtc *vblank = work->vblank;
> > +	long ret;
> > +
> > +	spin_lock_irq(&vblank->dev->event_lock);
> > +
> > +	ret = wait_event_lock_irq_timeout(vblank->vblank_work.work_wait,
> > +					  work->state == DRM_VBL_WORK_IDLE,
> > +					  vblank->dev->event_lock,
> > +					  10 * HZ);
> > +
> > +	spin_unlock_irq(&vblank->dev->event_lock);
> > +
> > +	WARN(!ret, "crtc %d vblank work timed out\n", vblank->pipe);
> > +}
> > +EXPORT_SYMBOL(drm_vblank_work_flush);
> > diff --git a/include/drm/drm_vblank.h b/include/drm/drm_vblank.h
> > index dd9f5b9e56e4..ac9130f419af 100644
> > --- a/include/drm/drm_vblank.h
> > +++ b/include/drm/drm_vblank.h
> > @@ -203,8 +203,42 @@ struct drm_vblank_crtc {
> >  	 * disabling functions multiple times.
> >  	 */
> >  	bool enabled;
> > +
> > +	struct {
> > +		struct task_struct *thread;
> > +		struct list_head irq_list, work_list;
> > +		wait_queue_head_t work_wait;
> > +	} vblank_work;
> > +};
> > +
> > +struct drm_vblank_work {
> > +	u64 count;
> > +	struct drm_vblank_crtc *vblank;
> > +	void (*func)(struct drm_vblank_work *work, u64 count);
> > +	struct list_head list;
> > +	enum {
> > +		DRM_VBL_WORK_IDLE,
> > +		DRM_VBL_WORK_WAITING,
> > +		DRM_VBL_WORK_SCHEDULED,
> > +		DRM_VBL_WORK_RUNNING,
> > +	} state;
> > +	bool cancel : 1;
> > +	bool reschedule : 1;
> >  };
> >  
> > +int drm_vblank_work_schedule(struct drm_vblank_work *work,
> > +			     u64 count, bool nextonmiss);
> > +void drm_vblank_work_init(struct drm_vblank_work *work, struct drm_crtc
> > *crtc,
> > +			  void (*func)(struct drm_vblank_work *work, u64
> > count));
> > +bool drm_vblank_work_cancel(struct drm_vblank_work *work);
> > +bool drm_vblank_work_cancel_sync(struct drm_vblank_work *work);
> > +void drm_vblank_work_flush(struct drm_vblank_work *work);
> > +
> > +static inline bool drm_vblank_work_pending(struct drm_vblank_work *work)
> > +{
> > +	return work->state != DRM_VBL_WORK_IDLE;
> > +}
> > +
> >  int drm_vblank_init(struct drm_device *dev, unsigned int num_crtcs);
> >  bool drm_dev_has_vblank(const struct drm_device *dev);
> >  u64 drm_crtc_vblank_count(struct drm_crtc *crtc);
> > -- 
> > 2.24.1
> > 
-- 
Cheers,
	Lyude Paul (she/her)
	Associate Software Engineer at Red Hat

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH 1/9] drm/vblank: Add vblank works
  2020-03-27 20:29     ` Lyude Paul
@ 2020-03-27 20:38       ` Lyude Paul
  2020-04-13 20:18         ` Lyude Paul
  0 siblings, 1 reply; 22+ messages in thread
From: Lyude Paul @ 2020-03-27 20:38 UTC (permalink / raw)
  To: Daniel Vetter, Tejun Heo
  Cc: David Airlie, nouveau, linux-kernel, dri-devel, Thomas Zimmermann

Oops, completely forgot to link to the work_struct version of this patch:

[1] https://gitlab.freedesktop.org/lyudess/linux/-/commit/f57886aebbd9631f1ee6e61b516bf73afbfe74f4

On Fri, 2020-03-27 at 16:29 -0400, Lyude Paul wrote:
> Adding Tejun to this thread per-Daniel's suggestion on IRC.
> 
> Hi Tejun! So, I don't know what your experience with modesetting related
> stuff
> is so I'll quickly explain some important concepts here regarding scanlines,
> vblanks, etc. If you already understand this, feel free to skip this next
> paragraph.
> 
> From the perspective of a computer, every time a computer monitor displays a
> new frame it's done by "scanning out" the display image from top to bottom,
> one row of pixels at a time. which row of pixels we're on is referred to as
> the scanline. Additionally, there's usually a couple of extra scanlines
> which
> we scan out, but aren't actually displayed on the screen (these sometimes
> get
> used by HDMI audio and friends, but that's another story). The period where
> we're on these scanlines is referred to as the vblank, this is important.
> 
> On a lot of display hardware, programming needs to take effect during the
> vertical blanking period so that settings like gamma, what frame we're
> scanning out, etc. can be safely changed without showing visual tearing on
> the
> screen. In some unforgiving hardware, some of this programming has to both
> start and end in the same vblank. This is apparently the case on the
> majority
> of Intel GPUs in the wild right now, most notably with gamma updates which
> involve mashing over 2KiB of registers during the vblank. Other drivers have
> very similar needs to this (nouveau in particular is why I'm here, we need
> it
> for CRC related stuff) so we figured it'd be a good idea to add a set of
> helpers for performing realtime hardware programming that's synchronized to
> vblank intervals. In particular, this is aimed at hardware programming that
> would be a bit too awkward to try to pull off entirely in interrupt context.
> We call these vblank workers.
> 
> We first tried doing this using plain old kthreads, as you can see in the
> patch below, since we could schedule them as realtime. Additionally, the
> plan
> was to use this in i915 combined with pm_qos to get rid of cstate latency
> when
> handling the original vblank interrupt. Note, our time constraints are a bit
> more forgiving in nouveau so you won't see any pm_qos mentions in this
> series.
> 
> In an effort to try to avoid reinventing parts of the kernel's worker
> infrastructure though, we tried to see if we could implement these with
> simple
> work_structs and workqueues with HIGH_PRI | UNBOUNDED,[1]. But, it would
> seem
> that this work_struct approach is quite unreliable and we still usually fail
> to start the register programming in time for Intel's vblank worker usecase.
> 
> So, so far using plain old RT kthreads seems to make things about as
> reliable
> as I think we're going to get. Note - even using kthreads we _still_
> sometimes
> miss the vblank period and end up tearing a bit on screen, but it happens
> significantly less often then with work_structs and is basically going to be
> as fast as we can get (in the future, Intel wants to try fixing this by
> doing
> this hardware programming outside of the CPU, but for now we're stuck with
> this).
> 
> With all of this being said - we'd still like to avoid having to reinvent
> workers if possible, so we were wondering if there was any kind of realtime
> worker that could be used for this instead? I haven't been able to find any
> way of scheduling workers to be realtime, and I'm not sure if implementing
> this in Linux's workqueues would be realistic either. It'd be nice to avoid
> having to use plain old kthreads if possible :). What do you think Tejun?
> 
> On Wed, 2020-03-18 at 14:46 +0100, Daniel Vetter wrote:
> > On Tue, Mar 17, 2020 at 08:40:58PM -0400, Lyude Paul wrote:
> > > From: Ville Syrjälä <ville.syrjala@linux.intel.com>
> > > 
> > > Add some kind of vblank workers. The interface is similar to regular
> > > delayed works, and also allows for re-scheduling.
> > > 
> > > Whatever hardware programming we do in the work must be fast
> > > (must at least complete during the vblank, sometimes during
> > > the first few scanlines of vblank), so we'll fire up a per-crtc
> > > high priority thread for this.
> > > 
> > > [based off patches from Ville Syrjälä <ville.syrjala@linux.intel.com>,
> > > change below to signoff later]
> > > 
> > > Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
> > > Signed-off-by: Lyude Paul <lyude@redhat.com>
> > 
> > Hm not really sold on the idea that we have should reinvent our own worker
> > infrastructure here. Imo a vblank_work should look like a delayed work,
> > i.e. using struct work_struct as the base class, and wrapping the vblank
> > thing around it (instead of the timer). That alos would allow drivers to
> > schedule works on their own work queues, allowing for easier flushing and
> > all that stuff.
> > 
> > Also if we do this I think we should try to follow the delayed work abi as
> > closely as possible (e.g. INIT_VBLANK_WORK, queue_vblank_work,
> > mod_vblank_work, ...). Delayed workers (whether timer or vblank) have a
> > bunch of edges cases where consistently would be really great to avoid
> > surprises and bugs.
> > -Daniel
> > 
> > > ---
> > >  drivers/gpu/drm/drm_vblank.c | 322 +++++++++++++++++++++++++++++++++++
> > >  include/drm/drm_vblank.h     |  34 ++++
> > >  2 files changed, 356 insertions(+)
> > > 
> > > diff --git a/drivers/gpu/drm/drm_vblank.c b/drivers/gpu/drm/drm_vblank.c
> > > index da7b0b0c1090..06c796b6c381 100644
> > > --- a/drivers/gpu/drm/drm_vblank.c
> > > +++ b/drivers/gpu/drm/drm_vblank.c
> > > @@ -25,7 +25,9 @@
> > >   */
> > >  
> > >  #include <linux/export.h>
> > > +#include <linux/kthread.h>
> > >  #include <linux/moduleparam.h>
> > > +#include <uapi/linux/sched/types.h>
> > >  
> > >  #include <drm/drm_crtc.h>
> > >  #include <drm/drm_drv.h>
> > > @@ -91,6 +93,7 @@
> > >  static bool
> > >  drm_get_last_vbltimestamp(struct drm_device *dev, unsigned int pipe,
> > >  			  ktime_t *tvblank, bool in_vblank_irq);
> > > +static int drm_vblank_get(struct drm_device *dev, unsigned int pipe);
> > >  
> > >  static unsigned int drm_timestamp_precision = 20;  /* Default to 20
> > > usecs. */
> > >  
> > > @@ -440,6 +443,9 @@ void drm_vblank_cleanup(struct drm_device *dev)
> > >  			drm_core_check_feature(dev, DRIVER_MODESET));
> > >  
> > >  		del_timer_sync(&vblank->disable_timer);
> > > +
> > > +		wake_up_all(&vblank->vblank_work.work_wait);
> > > +		kthread_stop(vblank->vblank_work.thread);
> > >  	}
> > >  
> > >  	kfree(dev->vblank);
> > > @@ -447,6 +453,108 @@ void drm_vblank_cleanup(struct drm_device *dev)
> > >  	dev->num_crtcs = 0;
> > >  }
> > >  
> > > +static int vblank_work_thread(void *data)
> > > +{
> > > +	struct drm_vblank_crtc *vblank = data;
> > > +
> > > +	while (!kthread_should_stop()) {
> > > +		struct drm_vblank_work *work, *next;
> > > +		LIST_HEAD(list);
> > > +		u64 count;
> > > +		int ret;
> > > +
> > > +		spin_lock_irq(&vblank->dev->event_lock);
> > > +
> > > +		ret = wait_event_interruptible_lock_irq(vblank->queue,
> > > +							kthread_should_stop()
> > > +							!list_empty(&vblank-
> > > > vblank_work.work_list),
> > > +							vblank->dev-
> > > > event_lock);
> > > +
> > > +		WARN_ON(ret && !kthread_should_stop() &&
> > > +			list_empty(&vblank->vblank_work.irq_list) &&
> > > +			list_empty(&vblank->vblank_work.work_list));
> > > +
> > > +		list_for_each_entry_safe(work, next,
> > > +					 &vblank->vblank_work.work_list,
> > > +					 list) {
> > > +			list_move_tail(&work->list, &list);
> > > +			work->state = DRM_VBL_WORK_RUNNING;
> > > +		}
> > > +
> > > +		spin_unlock_irq(&vblank->dev->event_lock);
> > > +
> > > +		if (list_empty(&list))
> > > +			continue;
> > > +
> > > +		count = atomic64_read(&vblank->count);
> > > +		list_for_each_entry(work, &list, list)
> > > +			work->func(work, count);
> > > +
> > > +		spin_lock_irq(&vblank->dev->event_lock);
> > > +
> > > +		list_for_each_entry_safe(work, next, &list, list) {
> > > +			if (work->reschedule) {
> > > +				list_move_tail(&work->list,
> > > +					       &vblank->vblank_work.irq_list);
> > > +				drm_vblank_get(vblank->dev, vblank->pipe);
> > > +				work->reschedule = false;
> > > +				work->state = DRM_VBL_WORK_WAITING;
> > > +			} else {
> > > +				list_del_init(&work->list);
> > > +				work->cancel = false;
> > > +				work->state = DRM_VBL_WORK_IDLE;
> > > +			}
> > > +		}
> > > +
> > > +		spin_unlock_irq(&vblank->dev->event_lock);
> > > +
> > > +		wake_up_all(&vblank->vblank_work.work_wait);
> > > +	}
> > > +
> > > +	return 0;
> > > +}
> > > +
> > > +static void vblank_work_init(struct drm_vblank_crtc *vblank)
> > > +{
> > > +	struct sched_param param = {
> > > +		.sched_priority = MAX_RT_PRIO - 1,
> > > +	};
> > > +	int ret;
> > > +
> > > +	INIT_LIST_HEAD(&vblank->vblank_work.irq_list);
> > > +	INIT_LIST_HEAD(&vblank->vblank_work.work_list);
> > > +	init_waitqueue_head(&vblank->vblank_work.work_wait);
> > > +
> > > +	vblank->vblank_work.thread =
> > > +		kthread_run(vblank_work_thread, vblank, "card %d crtc %d",
> > > +			    vblank->dev->primary->index, vblank->pipe);
> > > +
> > > +	ret = sched_setscheduler(vblank->vblank_work.thread,
> > > +				 SCHED_FIFO, &param);
> > > +	WARN_ON(ret);
> > > +}
> > > +
> > > +/**
> > > + * drm_vblank_work_init - initialize a vblank work item
> > > + * @work: vblank work item
> > > + * @crtc: CRTC whose vblank will trigger the work execution
> > > + * @func: work function to be executed
> > > + *
> > > + * Initialize a vblank work item for a specific crtc.
> > > + */
> > > +void drm_vblank_work_init(struct drm_vblank_work *work, struct drm_crtc
> > > *crtc,
> > > +			  void (*func)(struct drm_vblank_work *work, u64
> > > count))
> > > +{
> > > +	struct drm_device *dev = crtc->dev;
> > > +	struct drm_vblank_crtc *vblank = &dev->vblank[drm_crtc_index(crtc)];
> > > +
> > > +	work->vblank = vblank;
> > > +	work->state = DRM_VBL_WORK_IDLE;
> > > +	work->func = func;
> > > +	INIT_LIST_HEAD(&work->list);
> > > +}
> > > +EXPORT_SYMBOL(drm_vblank_work_init);
> > > +
> > >  /**
> > >   * drm_vblank_init - initialize vblank support
> > >   * @dev: DRM device
> > > @@ -481,6 +589,8 @@ int drm_vblank_init(struct drm_device *dev, unsigned
> > > int num_crtcs)
> > >  		init_waitqueue_head(&vblank->queue);
> > >  		timer_setup(&vblank->disable_timer, vblank_disable_fn, 0);
> > >  		seqlock_init(&vblank->seqlock);
> > > +
> > > +		vblank_work_init(vblank);
> > >  	}
> > >  
> > >  	DRM_INFO("Supports vblank timestamp caching Rev 2 (21.10.2013).\n");
> > > @@ -1825,6 +1935,22 @@ static void drm_handle_vblank_events(struct
> > > drm_device *dev, unsigned int pipe)
> > >  	trace_drm_vblank_event(pipe, seq, now, high_prec);
> > >  }
> > >  
> > > +static void drm_handle_vblank_works(struct drm_vblank_crtc *vblank)
> > > +{
> > > +	struct drm_vblank_work *work, *next;
> > > +	u64 count = atomic64_read(&vblank->count);
> > > +
> > > +	list_for_each_entry_safe(work, next, &vblank->vblank_work.irq_list,
> > > +				 list) {
> > > +		if (!vblank_passed(count, work->count))
> > > +			continue;
> > > +
> > > +		drm_vblank_put(vblank->dev, vblank->pipe);
> > > +		list_move_tail(&work->list, &vblank->vblank_work.work_list);
> > > +		work->state = DRM_VBL_WORK_SCHEDULED;
> > > +	}
> > > +}
> > > +
> > >  /**
> > >   * drm_handle_vblank - handle a vblank event
> > >   * @dev: DRM device
> > > @@ -1866,6 +1992,7 @@ bool drm_handle_vblank(struct drm_device *dev,
> > > unsigned int pipe)
> > >  
> > >  	spin_unlock(&dev->vblank_time_lock);
> > >  
> > > +	drm_handle_vblank_works(vblank);
> > >  	wake_up(&vblank->queue);
> > >  
> > >  	/* With instant-off, we defer disabling the interrupt until after
> > > @@ -2076,3 +2203,198 @@ int drm_crtc_queue_sequence_ioctl(struct
> > > drm_device *dev, void *data,
> > >  	kfree(e);
> > >  	return ret;
> > >  }
> > > +
> > > +/**
> > > + * drm_vblank_work_schedule - schedule a vblank work
> > > + * @work: vblank work to schedule
> > > + * @count: target vblank count
> > > + * @nextonmiss: defer until the next vblank if target vblank was missed
> > > + *
> > > + * Schedule @work for execution once the crtc vblank count reaches
> > > @count.
> > > + *
> > > + * If the crtc vblank count has already reached @count and @nextonmiss
> > > is
> > > + * %false the work starts to execute immediately.
> > > + *
> > > + * If the crtc vblank count has already reached @count and @nextonmiss
> > > is
> > > + * %true the work is deferred until the next vblank (as if @count has
> > > been
> > > + * specified as crtc vblank count + 1).
> > > + *
> > > + * If @work is already scheduled, this function will reschedule said
> > > work
> > > + * using the new @count.
> > > + *
> > > + * Returns:
> > > + * 0 on success, error code on failure.
> > > + */
> > > +int drm_vblank_work_schedule(struct drm_vblank_work *work,
> > > +			     u64 count, bool nextonmiss)
> > > +{
> > > +	struct drm_vblank_crtc *vblank = work->vblank;
> > > +	unsigned long irqflags;
> > > +	u64 cur_vbl;
> > > +	int ret = 0;
> > > +	bool rescheduling = false;
> > > +	bool passed;
> > > +
> > > +	spin_lock_irqsave(&vblank->dev->event_lock, irqflags);
> > > +
> > > +	if (work->cancel)
> > > +		goto out;
> > > +
> > > +	if (work->state == DRM_VBL_WORK_RUNNING) {
> > > +		work->reschedule = true;
> > > +		work->count = count;
> > > +		goto out;
> > > +	} else if (work->state != DRM_VBL_WORK_IDLE) {
> > > +		if (work->count == count)
> > > +			goto out;
> > > +		rescheduling = true;
> > > +	}
> > > +
> > > +	if (work->state != DRM_VBL_WORK_WAITING) {
> > > +		ret = drm_vblank_get(vblank->dev, vblank->pipe);
> > > +		if (ret)
> > > +			goto out;
> > > +	}
> > > +
> > > +	work->count = count;
> > > +
> > > +	cur_vbl = atomic64_read(&vblank->count);
> > > +	passed = vblank_passed(cur_vbl, count);
> > > +	if (passed)
> > > +		DRM_ERROR("crtc %d vblank %llu already passed (current
> > > %llu)\n",
> > > +			  vblank->pipe, count, cur_vbl);
> > > +
> > > +	if (!nextonmiss && passed) {
> > > +		drm_vblank_put(vblank->dev, vblank->pipe);
> > > +		if (rescheduling)
> > > +			list_move_tail(&work->list,
> > > +				       &vblank->vblank_work.work_list);
> > > +		else
> > > +			list_add_tail(&work->list,
> > > +				      &vblank->vblank_work.work_list);
> > > +		work->state = DRM_VBL_WORK_SCHEDULED;
> > > +		wake_up_all(&vblank->queue);
> > > +	} else {
> > > +		if (rescheduling)
> > > +			list_move_tail(&work->list,
> > > +				       &vblank->vblank_work.irq_list);
> > > +		else
> > > +			list_add_tail(&work->list,
> > > +				      &vblank->vblank_work.irq_list);
> > > +		work->state = DRM_VBL_WORK_WAITING;
> > > +	}
> > > +
> > > + out:
> > > +	spin_unlock_irqrestore(&vblank->dev->event_lock, irqflags);
> > > +
> > > +	return ret;
> > > +}
> > > +EXPORT_SYMBOL(drm_vblank_work_schedule);
> > > +
> > > +static bool vblank_work_cancel(struct drm_vblank_work *work)
> > > +{
> > > +	struct drm_vblank_crtc *vblank = work->vblank;
> > > +
> > > +	switch (work->state) {
> > > +	case DRM_VBL_WORK_RUNNING:
> > > +		work->cancel = true;
> > > +		work->reschedule = false;
> > > +		/* fall through */
> > > +	default:
> > > +	case DRM_VBL_WORK_IDLE:
> > > +		return false;
> > > +	case DRM_VBL_WORK_WAITING:
> > > +		drm_vblank_put(vblank->dev, vblank->pipe);
> > > +		/* fall through */
> > > +	case DRM_VBL_WORK_SCHEDULED:
> > > +		list_del_init(&work->list);
> > > +		work->state = DRM_VBL_WORK_IDLE;
> > > +		return true;
> > > +	}
> > > +}
> > > +
> > > +/**
> > > + * drm_vblank_work_cancel - cancel a vblank work
> > > + * @work: vblank work to cancel
> > > + *
> > > + * Cancel an already scheduled vblank work.
> > > + *
> > > + * On return @work may still be executing, unless the return
> > > + * value is %true.
> > > + *
> > > + * Returns:
> > > + * True if the work was cancelled before it started to excute, false
> > > otherwise.
> > > + */
> > > +bool drm_vblank_work_cancel(struct drm_vblank_work *work)
> > > +{
> > > +	struct drm_vblank_crtc *vblank = work->vblank;
> > > +	bool cancelled;
> > > +
> > > +	spin_lock_irq(&vblank->dev->event_lock);
> > > +
> > > +	cancelled = vblank_work_cancel(work);
> > > +
> > > +	spin_unlock_irq(&vblank->dev->event_lock);
> > > +
> > > +	return cancelled;
> > > +}
> > > +EXPORT_SYMBOL(drm_vblank_work_cancel);
> > > +
> > > +/**
> > > + * drm_vblank_work_cancel_sync - cancel a vblank work and wait for it
> > > to
> > > finish executing
> > > + * @work: vblank work to cancel
> > > + *
> > > + * Cancel an already scheduled vblank work and wait for its
> > > + * execution to finish.
> > > + *
> > > + * On return @work is no longer guaraneed to be executing.
> > > + *
> > > + * Returns:
> > > + * True if the work was cancelled before it started to excute, false
> > > otherwise.
> > > + */
> > > +bool drm_vblank_work_cancel_sync(struct drm_vblank_work *work)
> > > +{
> > > +	struct drm_vblank_crtc *vblank = work->vblank;
> > > +	bool cancelled;
> > > +	long ret;
> > > +
> > > +	spin_lock_irq(&vblank->dev->event_lock);
> > > +
> > > +	cancelled = vblank_work_cancel(work);
> > > +
> > > +	ret = wait_event_lock_irq_timeout(vblank->vblank_work.work_wait,
> > > +					  work->state == DRM_VBL_WORK_IDLE,
> > > +					  vblank->dev->event_lock,
> > > +					  10 * HZ);
> > > +
> > > +	spin_unlock_irq(&vblank->dev->event_lock);
> > > +
> > > +	WARN(!ret, "crtc %d vblank work timed out\n", vblank->pipe);
> > > +
> > > +	return cancelled;
> > > +}
> > > +EXPORT_SYMBOL(drm_vblank_work_cancel_sync);
> > > +
> > > +/**
> > > + * drm_vblank_work_flush - wait for a scheduled vblank work to finish
> > > excuting
> > > + * @work: vblank work to flush
> > > + *
> > > + * Wait until @work has finished executing.
> > > + */
> > > +void drm_vblank_work_flush(struct drm_vblank_work *work)
> > > +{
> > > +	struct drm_vblank_crtc *vblank = work->vblank;
> > > +	long ret;
> > > +
> > > +	spin_lock_irq(&vblank->dev->event_lock);
> > > +
> > > +	ret = wait_event_lock_irq_timeout(vblank->vblank_work.work_wait,
> > > +					  work->state == DRM_VBL_WORK_IDLE,
> > > +					  vblank->dev->event_lock,
> > > +					  10 * HZ);
> > > +
> > > +	spin_unlock_irq(&vblank->dev->event_lock);
> > > +
> > > +	WARN(!ret, "crtc %d vblank work timed out\n", vblank->pipe);
> > > +}
> > > +EXPORT_SYMBOL(drm_vblank_work_flush);
> > > diff --git a/include/drm/drm_vblank.h b/include/drm/drm_vblank.h
> > > index dd9f5b9e56e4..ac9130f419af 100644
> > > --- a/include/drm/drm_vblank.h
> > > +++ b/include/drm/drm_vblank.h
> > > @@ -203,8 +203,42 @@ struct drm_vblank_crtc {
> > >  	 * disabling functions multiple times.
> > >  	 */
> > >  	bool enabled;
> > > +
> > > +	struct {
> > > +		struct task_struct *thread;
> > > +		struct list_head irq_list, work_list;
> > > +		wait_queue_head_t work_wait;
> > > +	} vblank_work;
> > > +};
> > > +
> > > +struct drm_vblank_work {
> > > +	u64 count;
> > > +	struct drm_vblank_crtc *vblank;
> > > +	void (*func)(struct drm_vblank_work *work, u64 count);
> > > +	struct list_head list;
> > > +	enum {
> > > +		DRM_VBL_WORK_IDLE,
> > > +		DRM_VBL_WORK_WAITING,
> > > +		DRM_VBL_WORK_SCHEDULED,
> > > +		DRM_VBL_WORK_RUNNING,
> > > +	} state;
> > > +	bool cancel : 1;
> > > +	bool reschedule : 1;
> > >  };
> > >  
> > > +int drm_vblank_work_schedule(struct drm_vblank_work *work,
> > > +			     u64 count, bool nextonmiss);
> > > +void drm_vblank_work_init(struct drm_vblank_work *work, struct drm_crtc
> > > *crtc,
> > > +			  void (*func)(struct drm_vblank_work *work, u64
> > > count));
> > > +bool drm_vblank_work_cancel(struct drm_vblank_work *work);
> > > +bool drm_vblank_work_cancel_sync(struct drm_vblank_work *work);
> > > +void drm_vblank_work_flush(struct drm_vblank_work *work);
> > > +
> > > +static inline bool drm_vblank_work_pending(struct drm_vblank_work
> > > *work)
> > > +{
> > > +	return work->state != DRM_VBL_WORK_IDLE;
> > > +}
> > > +
> > >  int drm_vblank_init(struct drm_device *dev, unsigned int num_crtcs);
> > >  bool drm_dev_has_vblank(const struct drm_device *dev);
> > >  u64 drm_crtc_vblank_count(struct drm_crtc *crtc);
> > > -- 
> > > 2.24.1
> > > 
-- 
Cheers,
	Lyude Paul (she/her)
	Associate Software Engineer at Red Hat

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH 1/9] drm/vblank: Add vblank works
  2020-03-27 20:38       ` Lyude Paul
@ 2020-04-13 20:18         ` Lyude Paul
  2020-04-13 20:42           ` Tejun Heo
  0 siblings, 1 reply; 22+ messages in thread
From: Lyude Paul @ 2020-04-13 20:18 UTC (permalink / raw)
  To: Daniel Vetter, Tejun Heo
  Cc: David Airlie, nouveau, linux-kernel, dri-devel, Thomas Zimmermann

Hi Tejun! Sorry to bother you, but have you had a chance to look at any of
this yet? Would like to continue moving this forward

On Fri, 2020-03-27 at 16:38 -0400, Lyude Paul wrote:
> Oops, completely forgot to link to the work_struct version of this patch:
> 
> [1] 
> https://gitlab.freedesktop.org/lyudess/linux/-/commit/f57886aebbd9631f1ee6e61b516bf73afbfe74f4
> 
> On Fri, 2020-03-27 at 16:29 -0400, Lyude Paul wrote:
> > Adding Tejun to this thread per-Daniel's suggestion on IRC.
> > 
> > Hi Tejun! So, I don't know what your experience with modesetting related
> > stuff
> > is so I'll quickly explain some important concepts here regarding
> > scanlines,
> > vblanks, etc. If you already understand this, feel free to skip this next
> > paragraph.
> > 
> > From the perspective of a computer, every time a computer monitor displays
> > a
> > new frame it's done by "scanning out" the display image from top to
> > bottom,
> > one row of pixels at a time. which row of pixels we're on is referred to
> > as
> > the scanline. Additionally, there's usually a couple of extra scanlines
> > which
> > we scan out, but aren't actually displayed on the screen (these sometimes
> > get
> > used by HDMI audio and friends, but that's another story). The period
> > where
> > we're on these scanlines is referred to as the vblank, this is important.
> > 
> > On a lot of display hardware, programming needs to take effect during the
> > vertical blanking period so that settings like gamma, what frame we're
> > scanning out, etc. can be safely changed without showing visual tearing on
> > the
> > screen. In some unforgiving hardware, some of this programming has to both
> > start and end in the same vblank. This is apparently the case on the
> > majority
> > of Intel GPUs in the wild right now, most notably with gamma updates which
> > involve mashing over 2KiB of registers during the vblank. Other drivers
> > have
> > very similar needs to this (nouveau in particular is why I'm here, we need
> > it
> > for CRC related stuff) so we figured it'd be a good idea to add a set of
> > helpers for performing realtime hardware programming that's synchronized
> > to
> > vblank intervals. In particular, this is aimed at hardware programming
> > that
> > would be a bit too awkward to try to pull off entirely in interrupt
> > context.
> > We call these vblank workers.
> > 
> > We first tried doing this using plain old kthreads, as you can see in the
> > patch below, since we could schedule them as realtime. Additionally, the
> > plan
> > was to use this in i915 combined with pm_qos to get rid of cstate latency
> > when
> > handling the original vblank interrupt. Note, our time constraints are a
> > bit
> > more forgiving in nouveau so you won't see any pm_qos mentions in this
> > series.
> > 
> > In an effort to try to avoid reinventing parts of the kernel's worker
> > infrastructure though, we tried to see if we could implement these with
> > simple
> > work_structs and workqueues with HIGH_PRI | UNBOUNDED,[1]. But, it would
> > seem
> > that this work_struct approach is quite unreliable and we still usually
> > fail
> > to start the register programming in time for Intel's vblank worker
> > usecase.
> > 
> > So, so far using plain old RT kthreads seems to make things about as
> > reliable
> > as I think we're going to get. Note - even using kthreads we _still_
> > sometimes
> > miss the vblank period and end up tearing a bit on screen, but it happens
> > significantly less often then with work_structs and is basically going to
> > be
> > as fast as we can get (in the future, Intel wants to try fixing this by
> > doing
> > this hardware programming outside of the CPU, but for now we're stuck with
> > this).
> > 
> > With all of this being said - we'd still like to avoid having to reinvent
> > workers if possible, so we were wondering if there was any kind of
> > realtime
> > worker that could be used for this instead? I haven't been able to find
> > any
> > way of scheduling workers to be realtime, and I'm not sure if implementing
> > this in Linux's workqueues would be realistic either. It'd be nice to
> > avoid
> > having to use plain old kthreads if possible :). What do you think Tejun?
> > 
> > On Wed, 2020-03-18 at 14:46 +0100, Daniel Vetter wrote:
> > > On Tue, Mar 17, 2020 at 08:40:58PM -0400, Lyude Paul wrote:
> > > > From: Ville Syrjälä <ville.syrjala@linux.intel.com>
> > > > 
> > > > Add some kind of vblank workers. The interface is similar to regular
> > > > delayed works, and also allows for re-scheduling.
> > > > 
> > > > Whatever hardware programming we do in the work must be fast
> > > > (must at least complete during the vblank, sometimes during
> > > > the first few scanlines of vblank), so we'll fire up a per-crtc
> > > > high priority thread for this.
> > > > 
> > > > [based off patches from Ville Syrjälä <ville.syrjala@linux.intel.com>,
> > > > change below to signoff later]
> > > > 
> > > > Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
> > > > Signed-off-by: Lyude Paul <lyude@redhat.com>
> > > 
> > > Hm not really sold on the idea that we have should reinvent our own
> > > worker
> > > infrastructure here. Imo a vblank_work should look like a delayed work,
> > > i.e. using struct work_struct as the base class, and wrapping the vblank
> > > thing around it (instead of the timer). That alos would allow drivers to
> > > schedule works on their own work queues, allowing for easier flushing
> > > and
> > > all that stuff.
> > > 
> > > Also if we do this I think we should try to follow the delayed work abi
> > > as
> > > closely as possible (e.g. INIT_VBLANK_WORK, queue_vblank_work,
> > > mod_vblank_work, ...). Delayed workers (whether timer or vblank) have a
> > > bunch of edges cases where consistently would be really great to avoid
> > > surprises and bugs.
> > > -Daniel
> > > 
> > > > ---
> > > >  drivers/gpu/drm/drm_vblank.c | 322
> > > > +++++++++++++++++++++++++++++++++++
> > > >  include/drm/drm_vblank.h     |  34 ++++
> > > >  2 files changed, 356 insertions(+)
> > > > 
> > > > diff --git a/drivers/gpu/drm/drm_vblank.c
> > > > b/drivers/gpu/drm/drm_vblank.c
> > > > index da7b0b0c1090..06c796b6c381 100644
> > > > --- a/drivers/gpu/drm/drm_vblank.c
> > > > +++ b/drivers/gpu/drm/drm_vblank.c
> > > > @@ -25,7 +25,9 @@
> > > >   */
> > > >  
> > > >  #include <linux/export.h>
> > > > +#include <linux/kthread.h>
> > > >  #include <linux/moduleparam.h>
> > > > +#include <uapi/linux/sched/types.h>
> > > >  
> > > >  #include <drm/drm_crtc.h>
> > > >  #include <drm/drm_drv.h>
> > > > @@ -91,6 +93,7 @@
> > > >  static bool
> > > >  drm_get_last_vbltimestamp(struct drm_device *dev, unsigned int pipe,
> > > >  			  ktime_t *tvblank, bool in_vblank_irq);
> > > > +static int drm_vblank_get(struct drm_device *dev, unsigned int pipe);
> > > >  
> > > >  static unsigned int drm_timestamp_precision = 20;  /* Default to 20
> > > > usecs. */
> > > >  
> > > > @@ -440,6 +443,9 @@ void drm_vblank_cleanup(struct drm_device *dev)
> > > >  			drm_core_check_feature(dev, DRIVER_MODESET));
> > > >  
> > > >  		del_timer_sync(&vblank->disable_timer);
> > > > +
> > > > +		wake_up_all(&vblank->vblank_work.work_wait);
> > > > +		kthread_stop(vblank->vblank_work.thread);
> > > >  	}
> > > >  
> > > >  	kfree(dev->vblank);
> > > > @@ -447,6 +453,108 @@ void drm_vblank_cleanup(struct drm_device *dev)
> > > >  	dev->num_crtcs = 0;
> > > >  }
> > > >  
> > > > +static int vblank_work_thread(void *data)
> > > > +{
> > > > +	struct drm_vblank_crtc *vblank = data;
> > > > +
> > > > +	while (!kthread_should_stop()) {
> > > > +		struct drm_vblank_work *work, *next;
> > > > +		LIST_HEAD(list);
> > > > +		u64 count;
> > > > +		int ret;
> > > > +
> > > > +		spin_lock_irq(&vblank->dev->event_lock);
> > > > +
> > > > +		ret = wait_event_interruptible_lock_irq(vblank->queue,
> > > > +							kthread_should
> > > > _stop()
> > > > +							!list_empty(&v
> > > > blank-
> > > > > vblank_work.work_list),
> > > > +							vblank->dev-
> > > > > event_lock);
> > > > +
> > > > +		WARN_ON(ret && !kthread_should_stop() &&
> > > > +			list_empty(&vblank->vblank_work.irq_list) &&
> > > > +			list_empty(&vblank->vblank_work.work_list));
> > > > +
> > > > +		list_for_each_entry_safe(work, next,
> > > > +					 &vblank-
> > > > >vblank_work.work_list,
> > > > +					 list) {
> > > > +			list_move_tail(&work->list, &list);
> > > > +			work->state = DRM_VBL_WORK_RUNNING;
> > > > +		}
> > > > +
> > > > +		spin_unlock_irq(&vblank->dev->event_lock);
> > > > +
> > > > +		if (list_empty(&list))
> > > > +			continue;
> > > > +
> > > > +		count = atomic64_read(&vblank->count);
> > > > +		list_for_each_entry(work, &list, list)
> > > > +			work->func(work, count);
> > > > +
> > > > +		spin_lock_irq(&vblank->dev->event_lock);
> > > > +
> > > > +		list_for_each_entry_safe(work, next, &list, list) {
> > > > +			if (work->reschedule) {
> > > > +				list_move_tail(&work->list,
> > > > +					       &vblank-
> > > > >vblank_work.irq_list);
> > > > +				drm_vblank_get(vblank->dev, vblank-
> > > > >pipe);
> > > > +				work->reschedule = false;
> > > > +				work->state = DRM_VBL_WORK_WAITING;
> > > > +			} else {
> > > > +				list_del_init(&work->list);
> > > > +				work->cancel = false;
> > > > +				work->state = DRM_VBL_WORK_IDLE;
> > > > +			}
> > > > +		}
> > > > +
> > > > +		spin_unlock_irq(&vblank->dev->event_lock);
> > > > +
> > > > +		wake_up_all(&vblank->vblank_work.work_wait);
> > > > +	}
> > > > +
> > > > +	return 0;
> > > > +}
> > > > +
> > > > +static void vblank_work_init(struct drm_vblank_crtc *vblank)
> > > > +{
> > > > +	struct sched_param param = {
> > > > +		.sched_priority = MAX_RT_PRIO - 1,
> > > > +	};
> > > > +	int ret;
> > > > +
> > > > +	INIT_LIST_HEAD(&vblank->vblank_work.irq_list);
> > > > +	INIT_LIST_HEAD(&vblank->vblank_work.work_list);
> > > > +	init_waitqueue_head(&vblank->vblank_work.work_wait);
> > > > +
> > > > +	vblank->vblank_work.thread =
> > > > +		kthread_run(vblank_work_thread, vblank, "card %d crtc
> > > > %d",
> > > > +			    vblank->dev->primary->index, vblank-
> > > > >pipe);
> > > > +
> > > > +	ret = sched_setscheduler(vblank->vblank_work.thread,
> > > > +				 SCHED_FIFO, &param);
> > > > +	WARN_ON(ret);
> > > > +}
> > > > +
> > > > +/**
> > > > + * drm_vblank_work_init - initialize a vblank work item
> > > > + * @work: vblank work item
> > > > + * @crtc: CRTC whose vblank will trigger the work execution
> > > > + * @func: work function to be executed
> > > > + *
> > > > + * Initialize a vblank work item for a specific crtc.
> > > > + */
> > > > +void drm_vblank_work_init(struct drm_vblank_work *work, struct
> > > > drm_crtc
> > > > *crtc,
> > > > +			  void (*func)(struct drm_vblank_work *work,
> > > > u64
> > > > count))
> > > > +{
> > > > +	struct drm_device *dev = crtc->dev;
> > > > +	struct drm_vblank_crtc *vblank = &dev-
> > > > >vblank[drm_crtc_index(crtc)];
> > > > +
> > > > +	work->vblank = vblank;
> > > > +	work->state = DRM_VBL_WORK_IDLE;
> > > > +	work->func = func;
> > > > +	INIT_LIST_HEAD(&work->list);
> > > > +}
> > > > +EXPORT_SYMBOL(drm_vblank_work_init);
> > > > +
> > > >  /**
> > > >   * drm_vblank_init - initialize vblank support
> > > >   * @dev: DRM device
> > > > @@ -481,6 +589,8 @@ int drm_vblank_init(struct drm_device *dev,
> > > > unsigned
> > > > int num_crtcs)
> > > >  		init_waitqueue_head(&vblank->queue);
> > > >  		timer_setup(&vblank->disable_timer, vblank_disable_fn,
> > > > 0);
> > > >  		seqlock_init(&vblank->seqlock);
> > > > +
> > > > +		vblank_work_init(vblank);
> > > >  	}
> > > >  
> > > >  	DRM_INFO("Supports vblank timestamp caching Rev 2
> > > > (21.10.2013).\n");
> > > > @@ -1825,6 +1935,22 @@ static void drm_handle_vblank_events(struct
> > > > drm_device *dev, unsigned int pipe)
> > > >  	trace_drm_vblank_event(pipe, seq, now, high_prec);
> > > >  }
> > > >  
> > > > +static void drm_handle_vblank_works(struct drm_vblank_crtc *vblank)
> > > > +{
> > > > +	struct drm_vblank_work *work, *next;
> > > > +	u64 count = atomic64_read(&vblank->count);
> > > > +
> > > > +	list_for_each_entry_safe(work, next, &vblank-
> > > > >vblank_work.irq_list,
> > > > +				 list) {
> > > > +		if (!vblank_passed(count, work->count))
> > > > +			continue;
> > > > +
> > > > +		drm_vblank_put(vblank->dev, vblank->pipe);
> > > > +		list_move_tail(&work->list, &vblank-
> > > > >vblank_work.work_list);
> > > > +		work->state = DRM_VBL_WORK_SCHEDULED;
> > > > +	}
> > > > +}
> > > > +
> > > >  /**
> > > >   * drm_handle_vblank - handle a vblank event
> > > >   * @dev: DRM device
> > > > @@ -1866,6 +1992,7 @@ bool drm_handle_vblank(struct drm_device *dev,
> > > > unsigned int pipe)
> > > >  
> > > >  	spin_unlock(&dev->vblank_time_lock);
> > > >  
> > > > +	drm_handle_vblank_works(vblank);
> > > >  	wake_up(&vblank->queue);
> > > >  
> > > >  	/* With instant-off, we defer disabling the interrupt until
> > > > after
> > > > @@ -2076,3 +2203,198 @@ int drm_crtc_queue_sequence_ioctl(struct
> > > > drm_device *dev, void *data,
> > > >  	kfree(e);
> > > >  	return ret;
> > > >  }
> > > > +
> > > > +/**
> > > > + * drm_vblank_work_schedule - schedule a vblank work
> > > > + * @work: vblank work to schedule
> > > > + * @count: target vblank count
> > > > + * @nextonmiss: defer until the next vblank if target vblank was
> > > > missed
> > > > + *
> > > > + * Schedule @work for execution once the crtc vblank count reaches
> > > > @count.
> > > > + *
> > > > + * If the crtc vblank count has already reached @count and
> > > > @nextonmiss
> > > > is
> > > > + * %false the work starts to execute immediately.
> > > > + *
> > > > + * If the crtc vblank count has already reached @count and
> > > > @nextonmiss
> > > > is
> > > > + * %true the work is deferred until the next vblank (as if @count has
> > > > been
> > > > + * specified as crtc vblank count + 1).
> > > > + *
> > > > + * If @work is already scheduled, this function will reschedule said
> > > > work
> > > > + * using the new @count.
> > > > + *
> > > > + * Returns:
> > > > + * 0 on success, error code on failure.
> > > > + */
> > > > +int drm_vblank_work_schedule(struct drm_vblank_work *work,
> > > > +			     u64 count, bool nextonmiss)
> > > > +{
> > > > +	struct drm_vblank_crtc *vblank = work->vblank;
> > > > +	unsigned long irqflags;
> > > > +	u64 cur_vbl;
> > > > +	int ret = 0;
> > > > +	bool rescheduling = false;
> > > > +	bool passed;
> > > > +
> > > > +	spin_lock_irqsave(&vblank->dev->event_lock, irqflags);
> > > > +
> > > > +	if (work->cancel)
> > > > +		goto out;
> > > > +
> > > > +	if (work->state == DRM_VBL_WORK_RUNNING) {
> > > > +		work->reschedule = true;
> > > > +		work->count = count;
> > > > +		goto out;
> > > > +	} else if (work->state != DRM_VBL_WORK_IDLE) {
> > > > +		if (work->count == count)
> > > > +			goto out;
> > > > +		rescheduling = true;
> > > > +	}
> > > > +
> > > > +	if (work->state != DRM_VBL_WORK_WAITING) {
> > > > +		ret = drm_vblank_get(vblank->dev, vblank->pipe);
> > > > +		if (ret)
> > > > +			goto out;
> > > > +	}
> > > > +
> > > > +	work->count = count;
> > > > +
> > > > +	cur_vbl = atomic64_read(&vblank->count);
> > > > +	passed = vblank_passed(cur_vbl, count);
> > > > +	if (passed)
> > > > +		DRM_ERROR("crtc %d vblank %llu already passed (current
> > > > %llu)\n",
> > > > +			  vblank->pipe, count, cur_vbl);
> > > > +
> > > > +	if (!nextonmiss && passed) {
> > > > +		drm_vblank_put(vblank->dev, vblank->pipe);
> > > > +		if (rescheduling)
> > > > +			list_move_tail(&work->list,
> > > > +				       &vblank-
> > > > >vblank_work.work_list);
> > > > +		else
> > > > +			list_add_tail(&work->list,
> > > > +				      &vblank->vblank_work.work_list);
> > > > +		work->state = DRM_VBL_WORK_SCHEDULED;
> > > > +		wake_up_all(&vblank->queue);
> > > > +	} else {
> > > > +		if (rescheduling)
> > > > +			list_move_tail(&work->list,
> > > > +				       &vblank->vblank_work.irq_list);
> > > > +		else
> > > > +			list_add_tail(&work->list,
> > > > +				      &vblank->vblank_work.irq_list);
> > > > +		work->state = DRM_VBL_WORK_WAITING;
> > > > +	}
> > > > +
> > > > + out:
> > > > +	spin_unlock_irqrestore(&vblank->dev->event_lock, irqflags);
> > > > +
> > > > +	return ret;
> > > > +}
> > > > +EXPORT_SYMBOL(drm_vblank_work_schedule);
> > > > +
> > > > +static bool vblank_work_cancel(struct drm_vblank_work *work)
> > > > +{
> > > > +	struct drm_vblank_crtc *vblank = work->vblank;
> > > > +
> > > > +	switch (work->state) {
> > > > +	case DRM_VBL_WORK_RUNNING:
> > > > +		work->cancel = true;
> > > > +		work->reschedule = false;
> > > > +		/* fall through */
> > > > +	default:
> > > > +	case DRM_VBL_WORK_IDLE:
> > > > +		return false;
> > > > +	case DRM_VBL_WORK_WAITING:
> > > > +		drm_vblank_put(vblank->dev, vblank->pipe);
> > > > +		/* fall through */
> > > > +	case DRM_VBL_WORK_SCHEDULED:
> > > > +		list_del_init(&work->list);
> > > > +		work->state = DRM_VBL_WORK_IDLE;
> > > > +		return true;
> > > > +	}
> > > > +}
> > > > +
> > > > +/**
> > > > + * drm_vblank_work_cancel - cancel a vblank work
> > > > + * @work: vblank work to cancel
> > > > + *
> > > > + * Cancel an already scheduled vblank work.
> > > > + *
> > > > + * On return @work may still be executing, unless the return
> > > > + * value is %true.
> > > > + *
> > > > + * Returns:
> > > > + * True if the work was cancelled before it started to excute, false
> > > > otherwise.
> > > > + */
> > > > +bool drm_vblank_work_cancel(struct drm_vblank_work *work)
> > > > +{
> > > > +	struct drm_vblank_crtc *vblank = work->vblank;
> > > > +	bool cancelled;
> > > > +
> > > > +	spin_lock_irq(&vblank->dev->event_lock);
> > > > +
> > > > +	cancelled = vblank_work_cancel(work);
> > > > +
> > > > +	spin_unlock_irq(&vblank->dev->event_lock);
> > > > +
> > > > +	return cancelled;
> > > > +}
> > > > +EXPORT_SYMBOL(drm_vblank_work_cancel);
> > > > +
> > > > +/**
> > > > + * drm_vblank_work_cancel_sync - cancel a vblank work and wait for it
> > > > to
> > > > finish executing
> > > > + * @work: vblank work to cancel
> > > > + *
> > > > + * Cancel an already scheduled vblank work and wait for its
> > > > + * execution to finish.
> > > > + *
> > > > + * On return @work is no longer guaraneed to be executing.
> > > > + *
> > > > + * Returns:
> > > > + * True if the work was cancelled before it started to excute, false
> > > > otherwise.
> > > > + */
> > > > +bool drm_vblank_work_cancel_sync(struct drm_vblank_work *work)
> > > > +{
> > > > +	struct drm_vblank_crtc *vblank = work->vblank;
> > > > +	bool cancelled;
> > > > +	long ret;
> > > > +
> > > > +	spin_lock_irq(&vblank->dev->event_lock);
> > > > +
> > > > +	cancelled = vblank_work_cancel(work);
> > > > +
> > > > +	ret = wait_event_lock_irq_timeout(vblank-
> > > > >vblank_work.work_wait,
> > > > +					  work->state ==
> > > > DRM_VBL_WORK_IDLE,
> > > > +					  vblank->dev->event_lock,
> > > > +					  10 * HZ);
> > > > +
> > > > +	spin_unlock_irq(&vblank->dev->event_lock);
> > > > +
> > > > +	WARN(!ret, "crtc %d vblank work timed out\n", vblank->pipe);
> > > > +
> > > > +	return cancelled;
> > > > +}
> > > > +EXPORT_SYMBOL(drm_vblank_work_cancel_sync);
> > > > +
> > > > +/**
> > > > + * drm_vblank_work_flush - wait for a scheduled vblank work to finish
> > > > excuting
> > > > + * @work: vblank work to flush
> > > > + *
> > > > + * Wait until @work has finished executing.
> > > > + */
> > > > +void drm_vblank_work_flush(struct drm_vblank_work *work)
> > > > +{
> > > > +	struct drm_vblank_crtc *vblank = work->vblank;
> > > > +	long ret;
> > > > +
> > > > +	spin_lock_irq(&vblank->dev->event_lock);
> > > > +
> > > > +	ret = wait_event_lock_irq_timeout(vblank-
> > > > >vblank_work.work_wait,
> > > > +					  work->state ==
> > > > DRM_VBL_WORK_IDLE,
> > > > +					  vblank->dev->event_lock,
> > > > +					  10 * HZ);
> > > > +
> > > > +	spin_unlock_irq(&vblank->dev->event_lock);
> > > > +
> > > > +	WARN(!ret, "crtc %d vblank work timed out\n", vblank->pipe);
> > > > +}
> > > > +EXPORT_SYMBOL(drm_vblank_work_flush);
> > > > diff --git a/include/drm/drm_vblank.h b/include/drm/drm_vblank.h
> > > > index dd9f5b9e56e4..ac9130f419af 100644
> > > > --- a/include/drm/drm_vblank.h
> > > > +++ b/include/drm/drm_vblank.h
> > > > @@ -203,8 +203,42 @@ struct drm_vblank_crtc {
> > > >  	 * disabling functions multiple times.
> > > >  	 */
> > > >  	bool enabled;
> > > > +
> > > > +	struct {
> > > > +		struct task_struct *thread;
> > > > +		struct list_head irq_list, work_list;
> > > > +		wait_queue_head_t work_wait;
> > > > +	} vblank_work;
> > > > +};
> > > > +
> > > > +struct drm_vblank_work {
> > > > +	u64 count;
> > > > +	struct drm_vblank_crtc *vblank;
> > > > +	void (*func)(struct drm_vblank_work *work, u64 count);
> > > > +	struct list_head list;
> > > > +	enum {
> > > > +		DRM_VBL_WORK_IDLE,
> > > > +		DRM_VBL_WORK_WAITING,
> > > > +		DRM_VBL_WORK_SCHEDULED,
> > > > +		DRM_VBL_WORK_RUNNING,
> > > > +	} state;
> > > > +	bool cancel : 1;
> > > > +	bool reschedule : 1;
> > > >  };
> > > >  
> > > > +int drm_vblank_work_schedule(struct drm_vblank_work *work,
> > > > +			     u64 count, bool nextonmiss);
> > > > +void drm_vblank_work_init(struct drm_vblank_work *work, struct
> > > > drm_crtc
> > > > *crtc,
> > > > +			  void (*func)(struct drm_vblank_work *work,
> > > > u64
> > > > count));
> > > > +bool drm_vblank_work_cancel(struct drm_vblank_work *work);
> > > > +bool drm_vblank_work_cancel_sync(struct drm_vblank_work *work);
> > > > +void drm_vblank_work_flush(struct drm_vblank_work *work);
> > > > +
> > > > +static inline bool drm_vblank_work_pending(struct drm_vblank_work
> > > > *work)
> > > > +{
> > > > +	return work->state != DRM_VBL_WORK_IDLE;
> > > > +}
> > > > +
> > > >  int drm_vblank_init(struct drm_device *dev, unsigned int num_crtcs);
> > > >  bool drm_dev_has_vblank(const struct drm_device *dev);
> > > >  u64 drm_crtc_vblank_count(struct drm_crtc *crtc);
> > > > -- 
> > > > 2.24.1
> > > > 
-- 
Cheers,
	Lyude Paul (she/her)
	Associate Software Engineer at Red Hat

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH 1/9] drm/vblank: Add vblank works
  2020-04-13 20:18         ` Lyude Paul
@ 2020-04-13 20:42           ` Tejun Heo
  2020-04-13 21:07             ` Sam Ravnborg
  2020-04-14 16:52             ` Lyude Paul
  0 siblings, 2 replies; 22+ messages in thread
From: Tejun Heo @ 2020-04-13 20:42 UTC (permalink / raw)
  To: Lyude Paul
  Cc: Thomas Zimmermann, David Airlie, nouveau, linux-kernel, dri-devel

Hello,

On Mon, Apr 13, 2020 at 04:18:57PM -0400, Lyude Paul wrote:
> Hi Tejun! Sorry to bother you, but have you had a chance to look at any of
> this yet? Would like to continue moving this forward

Sorry, wasn't following this thread. Have you looked at kthread_worker?

 https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/include/linux/kthread.h#n71

And, thanks a lot for the vblank explanation. I really enjoyed readin it. :)

-- 
tejun
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH 1/9] drm/vblank: Add vblank works
  2020-04-13 20:42           ` Tejun Heo
@ 2020-04-13 21:07             ` Sam Ravnborg
  2020-04-14 16:52             ` Lyude Paul
  1 sibling, 0 replies; 22+ messages in thread
From: Sam Ravnborg @ 2020-04-13 21:07 UTC (permalink / raw)
  To: Tejun Heo
  Cc: David Airlie, nouveau, linux-kernel, dri-devel, Thomas Zimmermann

Hi Tejun

> 
> And, thanks a lot for the vblank explanation. I really enjoyed readin it. :)

You were not the only one who liked it :-)

We ended up with an updated explanation in drm_vblank.c:
https://cgit.freedesktop.org/drm/drm-misc/tree/drivers/gpu/drm/drm_vblank.c

Including some nice ascii art in the final version.

It will hit upstream in next merge window.

	Sam
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH 1/9] drm/vblank: Add vblank works
  2020-04-13 20:42           ` Tejun Heo
  2020-04-13 21:07             ` Sam Ravnborg
@ 2020-04-14 16:52             ` Lyude Paul
  2020-04-14 18:17               ` Tejun Heo
  1 sibling, 1 reply; 22+ messages in thread
From: Lyude Paul @ 2020-04-14 16:52 UTC (permalink / raw)
  To: Tejun Heo
  Cc: Thomas Zimmermann, David Airlie, nouveau, linux-kernel, dri-devel

On Mon, 2020-04-13 at 16:42 -0400, Tejun Heo wrote:
> Hello,
> 
> On Mon, Apr 13, 2020 at 04:18:57PM -0400, Lyude Paul wrote:
> > Hi Tejun! Sorry to bother you, but have you had a chance to look at any of
> > this yet? Would like to continue moving this forward
> 
> Sorry, wasn't following this thread. Have you looked at kthread_worker?
> 

Hi, thanks for the response! And yes-I think this would actually be perfect
for what we need, I guess one question I might as well ask since I've got you
here: would patches to expose an unlocked version of kthread_queue_worker() be
accepted? With something like that I should be able to just reuse the
delayed_work_list and spinlocks that come with kthread_worker which would make
the vblank works implementation a bit easier
>  
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/include/linux/kthread.h#n71
> 
> And, thanks a lot for the vblank explanation. I really enjoyed readin it. :)
> 
-- 
Cheers,
	Lyude Paul (she/her)
	Associate Software Engineer at Red Hat

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH 1/9] drm/vblank: Add vblank works
  2020-04-14 16:52             ` Lyude Paul
@ 2020-04-14 18:17               ` Tejun Heo
  0 siblings, 0 replies; 22+ messages in thread
From: Tejun Heo @ 2020-04-14 18:17 UTC (permalink / raw)
  To: Lyude Paul
  Cc: Thomas Zimmermann, David Airlie, nouveau, linux-kernel, dri-devel

Hello,

On Tue, Apr 14, 2020 at 12:52:51PM -0400, Lyude Paul wrote:
> Hi, thanks for the response! And yes-I think this would actually be perfect
> for what we need, I guess one question I might as well ask since I've got you
> here: would patches to expose an unlocked version of kthread_queue_worker() be
> accepted? With something like that I should be able to just reuse the
> delayed_work_list and spinlocks that come with kthread_worker which would make
> the vblank works implementation a bit easier

Difficult to tell w/o looking at the code but if technically reasonable and
justified, I don't see a reason why something like that couldn't be accepted.
That said, whatever gain coming from sharing an inner lock like that usually
is miniscule and API and logic simplicity often easily outweighs.

Thanks.

-- 
tejun
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH 1/9] drm/vblank: Add vblank works
  2020-03-18 13:46   ` Daniel Vetter
  2020-03-19 20:12     ` Lyude Paul
  2020-03-27 20:29     ` Lyude Paul
@ 2020-05-07 18:57     ` Lyude Paul
  2 siblings, 0 replies; 22+ messages in thread
From: Lyude Paul @ 2020-05-07 18:57 UTC (permalink / raw)
  To: Daniel Vetter, Tejun Heo
  Cc: David Airlie, nouveau, linux-kernel, dri-devel, Thomas Zimmermann

Hey guys! Sorry this took me a little while to get to, but I was finally able
to sit down for a bit and do a thorough investigation on the latency issues
here to figure out if it's noise or not. I did this investigation with the
plain work_struct implementation, the original kthread implementation, and a
version of the kthread_worker based implementation (rewritten since the last
series I sent out, so it isn't as tightly integrated with kthread_worker as
before). Also note, this was a pretty big refresher on my stats intro back
from when I was in college, so hopefully I didn't make any mistakes :).

First, I should probably point out that I discovered the igt-gpu-tools tests
we were using for testing this were actually giving us false positives before.
Basically, we made the mistake of exiting the test early if we exceeded the
specified duration (by default, 2 seconds) even if we never actually tried any
LUT updates, which was causing this test to pass on my machine when it really
shouldn't have. I managed to fix this, and finally managed to actually
reproduce the test failures Ville was seeing by running the test in parallel
with `stress -c 1 -i 1 -b 1` (starts 1 CPU hog, 1 sync() hog, 1
write()/unlink() hog). I also made sure to stop any systemd timers and run igt
with ionice -c realtime nice -n -20, just to make sure igt's userspace process
was affected by the load as little as possible to reduce noise in the results.

But I quickly discovered the test still was far too noisy, as even after 500
test runs with each implementation I couldn't see any obvious difference in
performance between any of them. So I added a --benchmark option to the test,
which makes the test run for the entirity of the --duration value (regardless
of how many test failures we hit) and report the total number of frames
tested, along with how many of those frames had LUT tearing. I then used this
to benchmark each implementation and started putting all of the failure
percentages in a spreadsheet. I did this with a duration of 20 seconds, which
would amount to testing ~1200-1210 frames each time. Then, I kept increasing
my sample size by rerunning the benchmarks until it looked like the standard
deviation of my data set was starting to stabilize, e.g. it was no longer
fluctuating by more then ~.10% when I added more data.

Finally-this actually managed to give me some data that looked somewhat
sensible:

https://people.freedesktop.org/~lyudess/archive/05-07-2020/vbl-work-benchmarks.pdf

(remember-each value is the percentage of frames where LUT tearing was
detected)

The verdict appears to be that the new kthread_worker implementation I have
looks like it's ever so slightly faster on average then the original kthread
based implementation when the system is under heavy load. 

Also, you can find the branches I used here:

https://gitlab.freedesktop.org/lyudess/igt-gpu-tools/-/commits/wip/lut-tearing-2
https://gitlab.freedesktop.org/lyudess/linux/-/commits/wip/vbl-workers-vlv-v3 (kthread_worker v2)
https://gitlab.freedesktop.org/lyudess/linux/-/commits/wip/vbl-workers-vlv-kthread (kthread)
https://gitlab.freedesktop.org/lyudess/linux/-/commits/wip/vbl-workers-vlv-v2 (workqueues)

All tests were done on my single core valleyview machine, with one 1080p HDMI
display and one 1080p VGA display.

IMHO: Unless anyone has objections, this seems like enough evidence for me, so
I'll send out a respin of the nouveau-crc series with the new respin of the
kthread_worker based vblank works in a bit.

On Wed, 2020-03-18 at 14:46 +0100, Daniel Vetter wrote:
> On Tue, Mar 17, 2020 at 08:40:58PM -0400, Lyude Paul wrote:
> > From: Ville Syrjälä <ville.syrjala@linux.intel.com>
> > 
> > Add some kind of vblank workers. The interface is similar to regular
> > delayed works, and also allows for re-scheduling.
> > 
> > Whatever hardware programming we do in the work must be fast
> > (must at least complete during the vblank, sometimes during
> > the first few scanlines of vblank), so we'll fire up a per-crtc
> > high priority thread for this.
> > 
> > [based off patches from Ville Syrjälä <ville.syrjala@linux.intel.com>,
> > change below to signoff later]
> > 
> > Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
> > Signed-off-by: Lyude Paul <lyude@redhat.com>
> 
> Hm not really sold on the idea that we have should reinvent our own worker
> infrastructure here. Imo a vblank_work should look like a delayed work,
> i.e. using struct work_struct as the base class, and wrapping the vblank
> thing around it (instead of the timer). That alos would allow drivers to
> schedule works on their own work queues, allowing for easier flushing and
> all that stuff.
> 
> Also if we do this I think we should try to follow the delayed work abi as
> closely as possible (e.g. INIT_VBLANK_WORK, queue_vblank_work,
> mod_vblank_work, ...). Delayed workers (whether timer or vblank) have a
> bunch of edges cases where consistently would be really great to avoid
> surprises and bugs.
> -Daniel
> 
> > ---
> >  drivers/gpu/drm/drm_vblank.c | 322 +++++++++++++++++++++++++++++++++++
> >  include/drm/drm_vblank.h     |  34 ++++
> >  2 files changed, 356 insertions(+)
> > 
> > diff --git a/drivers/gpu/drm/drm_vblank.c b/drivers/gpu/drm/drm_vblank.c
> > index da7b0b0c1090..06c796b6c381 100644
> > --- a/drivers/gpu/drm/drm_vblank.c
> > +++ b/drivers/gpu/drm/drm_vblank.c
> > @@ -25,7 +25,9 @@
> >   */
> >  
> >  #include <linux/export.h>
> > +#include <linux/kthread.h>
> >  #include <linux/moduleparam.h>
> > +#include <uapi/linux/sched/types.h>
> >  
> >  #include <drm/drm_crtc.h>
> >  #include <drm/drm_drv.h>
> > @@ -91,6 +93,7 @@
> >  static bool
> >  drm_get_last_vbltimestamp(struct drm_device *dev, unsigned int pipe,
> >  			  ktime_t *tvblank, bool in_vblank_irq);
> > +static int drm_vblank_get(struct drm_device *dev, unsigned int pipe);
> >  
> >  static unsigned int drm_timestamp_precision = 20;  /* Default to 20
> > usecs. */
> >  
> > @@ -440,6 +443,9 @@ void drm_vblank_cleanup(struct drm_device *dev)
> >  			drm_core_check_feature(dev, DRIVER_MODESET));
> >  
> >  		del_timer_sync(&vblank->disable_timer);
> > +
> > +		wake_up_all(&vblank->vblank_work.work_wait);
> > +		kthread_stop(vblank->vblank_work.thread);
> >  	}
> >  
> >  	kfree(dev->vblank);
> > @@ -447,6 +453,108 @@ void drm_vblank_cleanup(struct drm_device *dev)
> >  	dev->num_crtcs = 0;
> >  }
> >  
> > +static int vblank_work_thread(void *data)
> > +{
> > +	struct drm_vblank_crtc *vblank = data;
> > +
> > +	while (!kthread_should_stop()) {
> > +		struct drm_vblank_work *work, *next;
> > +		LIST_HEAD(list);
> > +		u64 count;
> > +		int ret;
> > +
> > +		spin_lock_irq(&vblank->dev->event_lock);
> > +
> > +		ret = wait_event_interruptible_lock_irq(vblank->queue,
> > +							kthread_should_stop()
> > ||
> > +							!list_empty(&vblank-
> > >vblank_work.work_list),
> > +							vblank->dev-
> > >event_lock);
> > +
> > +		WARN_ON(ret && !kthread_should_stop() &&
> > +			list_empty(&vblank->vblank_work.irq_list) &&
> > +			list_empty(&vblank->vblank_work.work_list));
> > +
> > +		list_for_each_entry_safe(work, next,
> > +					 &vblank->vblank_work.work_list,
> > +					 list) {
> > +			list_move_tail(&work->list, &list);
> > +			work->state = DRM_VBL_WORK_RUNNING;
> > +		}
> > +
> > +		spin_unlock_irq(&vblank->dev->event_lock);
> > +
> > +		if (list_empty(&list))
> > +			continue;
> > +
> > +		count = atomic64_read(&vblank->count);
> > +		list_for_each_entry(work, &list, list)
> > +			work->func(work, count);
> > +
> > +		spin_lock_irq(&vblank->dev->event_lock);
> > +
> > +		list_for_each_entry_safe(work, next, &list, list) {
> > +			if (work->reschedule) {
> > +				list_move_tail(&work->list,
> > +					       &vblank->vblank_work.irq_list);
> > +				drm_vblank_get(vblank->dev, vblank->pipe);
> > +				work->reschedule = false;
> > +				work->state = DRM_VBL_WORK_WAITING;
> > +			} else {
> > +				list_del_init(&work->list);
> > +				work->cancel = false;
> > +				work->state = DRM_VBL_WORK_IDLE;
> > +			}
> > +		}
> > +
> > +		spin_unlock_irq(&vblank->dev->event_lock);
> > +
> > +		wake_up_all(&vblank->vblank_work.work_wait);
> > +	}
> > +
> > +	return 0;
> > +}
> > +
> > +static void vblank_work_init(struct drm_vblank_crtc *vblank)
> > +{
> > +	struct sched_param param = {
> > +		.sched_priority = MAX_RT_PRIO - 1,
> > +	};
> > +	int ret;
> > +
> > +	INIT_LIST_HEAD(&vblank->vblank_work.irq_list);
> > +	INIT_LIST_HEAD(&vblank->vblank_work.work_list);
> > +	init_waitqueue_head(&vblank->vblank_work.work_wait);
> > +
> > +	vblank->vblank_work.thread =
> > +		kthread_run(vblank_work_thread, vblank, "card %d crtc %d",
> > +			    vblank->dev->primary->index, vblank->pipe);
> > +
> > +	ret = sched_setscheduler(vblank->vblank_work.thread,
> > +				 SCHED_FIFO, &param);
> > +	WARN_ON(ret);
> > +}
> > +
> > +/**
> > + * drm_vblank_work_init - initialize a vblank work item
> > + * @work: vblank work item
> > + * @crtc: CRTC whose vblank will trigger the work execution
> > + * @func: work function to be executed
> > + *
> > + * Initialize a vblank work item for a specific crtc.
> > + */
> > +void drm_vblank_work_init(struct drm_vblank_work *work, struct drm_crtc
> > *crtc,
> > +			  void (*func)(struct drm_vblank_work *work, u64
> > count))
> > +{
> > +	struct drm_device *dev = crtc->dev;
> > +	struct drm_vblank_crtc *vblank = &dev->vblank[drm_crtc_index(crtc)];
> > +
> > +	work->vblank = vblank;
> > +	work->state = DRM_VBL_WORK_IDLE;
> > +	work->func = func;
> > +	INIT_LIST_HEAD(&work->list);
> > +}
> > +EXPORT_SYMBOL(drm_vblank_work_init);
> > +
> >  /**
> >   * drm_vblank_init - initialize vblank support
> >   * @dev: DRM device
> > @@ -481,6 +589,8 @@ int drm_vblank_init(struct drm_device *dev, unsigned
> > int num_crtcs)
> >  		init_waitqueue_head(&vblank->queue);
> >  		timer_setup(&vblank->disable_timer, vblank_disable_fn, 0);
> >  		seqlock_init(&vblank->seqlock);
> > +
> > +		vblank_work_init(vblank);
> >  	}
> >  
> >  	DRM_INFO("Supports vblank timestamp caching Rev 2 (21.10.2013).\n");
> > @@ -1825,6 +1935,22 @@ static void drm_handle_vblank_events(struct
> > drm_device *dev, unsigned int pipe)
> >  	trace_drm_vblank_event(pipe, seq, now, high_prec);
> >  }
> >  
> > +static void drm_handle_vblank_works(struct drm_vblank_crtc *vblank)
> > +{
> > +	struct drm_vblank_work *work, *next;
> > +	u64 count = atomic64_read(&vblank->count);
> > +
> > +	list_for_each_entry_safe(work, next, &vblank->vblank_work.irq_list,
> > +				 list) {
> > +		if (!vblank_passed(count, work->count))
> > +			continue;
> > +
> > +		drm_vblank_put(vblank->dev, vblank->pipe);
> > +		list_move_tail(&work->list, &vblank->vblank_work.work_list);
> > +		work->state = DRM_VBL_WORK_SCHEDULED;
> > +	}
> > +}
> > +
> >  /**
> >   * drm_handle_vblank - handle a vblank event
> >   * @dev: DRM device
> > @@ -1866,6 +1992,7 @@ bool drm_handle_vblank(struct drm_device *dev,
> > unsigned int pipe)
> >  
> >  	spin_unlock(&dev->vblank_time_lock);
> >  
> > +	drm_handle_vblank_works(vblank);
> >  	wake_up(&vblank->queue);
> >  
> >  	/* With instant-off, we defer disabling the interrupt until after
> > @@ -2076,3 +2203,198 @@ int drm_crtc_queue_sequence_ioctl(struct
> > drm_device *dev, void *data,
> >  	kfree(e);
> >  	return ret;
> >  }
> > +
> > +/**
> > + * drm_vblank_work_schedule - schedule a vblank work
> > + * @work: vblank work to schedule
> > + * @count: target vblank count
> > + * @nextonmiss: defer until the next vblank if target vblank was missed
> > + *
> > + * Schedule @work for execution once the crtc vblank count reaches
> > @count.
> > + *
> > + * If the crtc vblank count has already reached @count and @nextonmiss is
> > + * %false the work starts to execute immediately.
> > + *
> > + * If the crtc vblank count has already reached @count and @nextonmiss is
> > + * %true the work is deferred until the next vblank (as if @count has
> > been
> > + * specified as crtc vblank count + 1).
> > + *
> > + * If @work is already scheduled, this function will reschedule said work
> > + * using the new @count.
> > + *
> > + * Returns:
> > + * 0 on success, error code on failure.
> > + */
> > +int drm_vblank_work_schedule(struct drm_vblank_work *work,
> > +			     u64 count, bool nextonmiss)
> > +{
> > +	struct drm_vblank_crtc *vblank = work->vblank;
> > +	unsigned long irqflags;
> > +	u64 cur_vbl;
> > +	int ret = 0;
> > +	bool rescheduling = false;
> > +	bool passed;
> > +
> > +	spin_lock_irqsave(&vblank->dev->event_lock, irqflags);
> > +
> > +	if (work->cancel)
> > +		goto out;
> > +
> > +	if (work->state == DRM_VBL_WORK_RUNNING) {
> > +		work->reschedule = true;
> > +		work->count = count;
> > +		goto out;
> > +	} else if (work->state != DRM_VBL_WORK_IDLE) {
> > +		if (work->count == count)
> > +			goto out;
> > +		rescheduling = true;
> > +	}
> > +
> > +	if (work->state != DRM_VBL_WORK_WAITING) {
> > +		ret = drm_vblank_get(vblank->dev, vblank->pipe);
> > +		if (ret)
> > +			goto out;
> > +	}
> > +
> > +	work->count = count;
> > +
> > +	cur_vbl = atomic64_read(&vblank->count);
> > +	passed = vblank_passed(cur_vbl, count);
> > +	if (passed)
> > +		DRM_ERROR("crtc %d vblank %llu already passed (current
> > %llu)\n",
> > +			  vblank->pipe, count, cur_vbl);
> > +
> > +	if (!nextonmiss && passed) {
> > +		drm_vblank_put(vblank->dev, vblank->pipe);
> > +		if (rescheduling)
> > +			list_move_tail(&work->list,
> > +				       &vblank->vblank_work.work_list);
> > +		else
> > +			list_add_tail(&work->list,
> > +				      &vblank->vblank_work.work_list);
> > +		work->state = DRM_VBL_WORK_SCHEDULED;
> > +		wake_up_all(&vblank->queue);
> > +	} else {
> > +		if (rescheduling)
> > +			list_move_tail(&work->list,
> > +				       &vblank->vblank_work.irq_list);
> > +		else
> > +			list_add_tail(&work->list,
> > +				      &vblank->vblank_work.irq_list);
> > +		work->state = DRM_VBL_WORK_WAITING;
> > +	}
> > +
> > + out:
> > +	spin_unlock_irqrestore(&vblank->dev->event_lock, irqflags);
> > +
> > +	return ret;
> > +}
> > +EXPORT_SYMBOL(drm_vblank_work_schedule);
> > +
> > +static bool vblank_work_cancel(struct drm_vblank_work *work)
> > +{
> > +	struct drm_vblank_crtc *vblank = work->vblank;
> > +
> > +	switch (work->state) {
> > +	case DRM_VBL_WORK_RUNNING:
> > +		work->cancel = true;
> > +		work->reschedule = false;
> > +		/* fall through */
> > +	default:
> > +	case DRM_VBL_WORK_IDLE:
> > +		return false;
> > +	case DRM_VBL_WORK_WAITING:
> > +		drm_vblank_put(vblank->dev, vblank->pipe);
> > +		/* fall through */
> > +	case DRM_VBL_WORK_SCHEDULED:
> > +		list_del_init(&work->list);
> > +		work->state = DRM_VBL_WORK_IDLE;
> > +		return true;
> > +	}
> > +}
> > +
> > +/**
> > + * drm_vblank_work_cancel - cancel a vblank work
> > + * @work: vblank work to cancel
> > + *
> > + * Cancel an already scheduled vblank work.
> > + *
> > + * On return @work may still be executing, unless the return
> > + * value is %true.
> > + *
> > + * Returns:
> > + * True if the work was cancelled before it started to excute, false
> > otherwise.
> > + */
> > +bool drm_vblank_work_cancel(struct drm_vblank_work *work)
> > +{
> > +	struct drm_vblank_crtc *vblank = work->vblank;
> > +	bool cancelled;
> > +
> > +	spin_lock_irq(&vblank->dev->event_lock);
> > +
> > +	cancelled = vblank_work_cancel(work);
> > +
> > +	spin_unlock_irq(&vblank->dev->event_lock);
> > +
> > +	return cancelled;
> > +}
> > +EXPORT_SYMBOL(drm_vblank_work_cancel);
> > +
> > +/**
> > + * drm_vblank_work_cancel_sync - cancel a vblank work and wait for it to
> > finish executing
> > + * @work: vblank work to cancel
> > + *
> > + * Cancel an already scheduled vblank work and wait for its
> > + * execution to finish.
> > + *
> > + * On return @work is no longer guaraneed to be executing.
> > + *
> > + * Returns:
> > + * True if the work was cancelled before it started to excute, false
> > otherwise.
> > + */
> > +bool drm_vblank_work_cancel_sync(struct drm_vblank_work *work)
> > +{
> > +	struct drm_vblank_crtc *vblank = work->vblank;
> > +	bool cancelled;
> > +	long ret;
> > +
> > +	spin_lock_irq(&vblank->dev->event_lock);
> > +
> > +	cancelled = vblank_work_cancel(work);
> > +
> > +	ret = wait_event_lock_irq_timeout(vblank->vblank_work.work_wait,
> > +					  work->state == DRM_VBL_WORK_IDLE,
> > +					  vblank->dev->event_lock,
> > +					  10 * HZ);
> > +
> > +	spin_unlock_irq(&vblank->dev->event_lock);
> > +
> > +	WARN(!ret, "crtc %d vblank work timed out\n", vblank->pipe);
> > +
> > +	return cancelled;
> > +}
> > +EXPORT_SYMBOL(drm_vblank_work_cancel_sync);
> > +
> > +/**
> > + * drm_vblank_work_flush - wait for a scheduled vblank work to finish
> > excuting
> > + * @work: vblank work to flush
> > + *
> > + * Wait until @work has finished executing.
> > + */
> > +void drm_vblank_work_flush(struct drm_vblank_work *work)
> > +{
> > +	struct drm_vblank_crtc *vblank = work->vblank;
> > +	long ret;
> > +
> > +	spin_lock_irq(&vblank->dev->event_lock);
> > +
> > +	ret = wait_event_lock_irq_timeout(vblank->vblank_work.work_wait,
> > +					  work->state == DRM_VBL_WORK_IDLE,
> > +					  vblank->dev->event_lock,
> > +					  10 * HZ);
> > +
> > +	spin_unlock_irq(&vblank->dev->event_lock);
> > +
> > +	WARN(!ret, "crtc %d vblank work timed out\n", vblank->pipe);
> > +}
> > +EXPORT_SYMBOL(drm_vblank_work_flush);
> > diff --git a/include/drm/drm_vblank.h b/include/drm/drm_vblank.h
> > index dd9f5b9e56e4..ac9130f419af 100644
> > --- a/include/drm/drm_vblank.h
> > +++ b/include/drm/drm_vblank.h
> > @@ -203,8 +203,42 @@ struct drm_vblank_crtc {
> >  	 * disabling functions multiple times.
> >  	 */
> >  	bool enabled;
> > +
> > +	struct {
> > +		struct task_struct *thread;
> > +		struct list_head irq_list, work_list;
> > +		wait_queue_head_t work_wait;
> > +	} vblank_work;
> > +};
> > +
> > +struct drm_vblank_work {
> > +	u64 count;
> > +	struct drm_vblank_crtc *vblank;
> > +	void (*func)(struct drm_vblank_work *work, u64 count);
> > +	struct list_head list;
> > +	enum {
> > +		DRM_VBL_WORK_IDLE,
> > +		DRM_VBL_WORK_WAITING,
> > +		DRM_VBL_WORK_SCHEDULED,
> > +		DRM_VBL_WORK_RUNNING,
> > +	} state;
> > +	bool cancel : 1;
> > +	bool reschedule : 1;
> >  };
> >  
> > +int drm_vblank_work_schedule(struct drm_vblank_work *work,
> > +			     u64 count, bool nextonmiss);
> > +void drm_vblank_work_init(struct drm_vblank_work *work, struct drm_crtc
> > *crtc,
> > +			  void (*func)(struct drm_vblank_work *work, u64
> > count));
> > +bool drm_vblank_work_cancel(struct drm_vblank_work *work);
> > +bool drm_vblank_work_cancel_sync(struct drm_vblank_work *work);
> > +void drm_vblank_work_flush(struct drm_vblank_work *work);
> > +
> > +static inline bool drm_vblank_work_pending(struct drm_vblank_work *work)
> > +{
> > +	return work->state != DRM_VBL_WORK_IDLE;
> > +}
> > +
> >  int drm_vblank_init(struct drm_device *dev, unsigned int num_crtcs);
> >  bool drm_dev_has_vblank(const struct drm_device *dev);
> >  u64 drm_crtc_vblank_count(struct drm_crtc *crtc);
> > -- 
> > 2.24.1
> > 
-- 
Cheers,
	Lyude Paul (she/her)
	Associate Software Engineer at Red Hat

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 22+ messages in thread

end of thread, other threads:[~2020-05-07 18:57 UTC | newest]

Thread overview: 22+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-03-18  0:40 [PATCH 0/9] drm/nouveau: Introduce CRC support for gf119+ Lyude Paul
2020-03-18  0:40 ` [PATCH 1/9] drm/vblank: Add vblank works Lyude Paul
2020-03-18 13:46   ` Daniel Vetter
2020-03-19 20:12     ` Lyude Paul
2020-03-27 20:29     ` Lyude Paul
2020-03-27 20:38       ` Lyude Paul
2020-04-13 20:18         ` Lyude Paul
2020-04-13 20:42           ` Tejun Heo
2020-04-13 21:07             ` Sam Ravnborg
2020-04-14 16:52             ` Lyude Paul
2020-04-14 18:17               ` Tejun Heo
2020-05-07 18:57     ` Lyude Paul
2020-03-18  0:40 ` [PATCH 2/9] drm/nouveau/kms/nv50-: Unroll error cleanup in nv50_head_create() Lyude Paul
2020-03-18  0:41 ` [PATCH 3/9] drm/nouveau/kms/nv140-: Don't modify depth in state during atomic commit Lyude Paul
2020-03-18  0:41 ` [PATCH 4/9] drm/nouveau/kms/nv50-: Fix disabling dithering Lyude Paul
2020-03-18  0:41 ` [PATCH 5/9] drm/nouveau/kms/nv50-: s/harm/armh/g Lyude Paul
2020-03-18  0:41 ` [PATCH 6/9] drm/nouveau/kms/nv140-: Track wndw mappings in nv50_head_atom Lyude Paul
2020-03-18  0:41 ` [PATCH 7/9] drm/nouveau/kms/nv50-: Expose nv50_outp_atom in disp.h Lyude Paul
2020-03-18  0:41 ` [PATCH 8/9] drm/nouveau/kms/nv50-: Move hard-coded object handles into header Lyude Paul
2020-03-18  0:41 ` [PATCH 9/9] drm/nouveau/kms/nvd9-: Add CRC support Lyude Paul
2020-03-18  1:09   ` [PATCH v2] " Lyude Paul
2020-03-18  6:02   ` [PATCH 9/9] " Greg Kroah-Hartman

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).