* [PATCH 00/10] drm/vkms: rework crc worker
@ 2019-06-06 22:27 Daniel Vetter
  2019-06-06 22:27 ` [PATCH 01/10] drm/vkms: Fix crc worker races Daniel Vetter
                   ` (10 more replies)
  0 siblings, 11 replies; 44+ messages in thread
From: Daniel Vetter @ 2019-06-06 22:27 UTC (permalink / raw)
  To: DRI Development; +Cc: Daniel Vetter

Hi all,

This here is the first part of a rework for the vkms crc worker. I think
this should fix all the locking/races/use-after-free issues I spotted in
the code. There's more work we can do in the future as a follow-up:

- directly access vkms_plane_state->base in the crc worker; with the
  approach in this series that should be safe now. Follow-up patches
  could switch over and remove all the separate crc_data infrastructure.

- I think some kerneldoc for the vkms structures would be nice. Documenting
  the various functions is probably overkill.

- Implementing a more generic blending engine, as prep for adding more
  pixel formats, more planes, and more features in general.

Tested with kms_pipe_crc, kms_flip and kms_cursor_crc. Seems to not make
things worse, but I didn't do a full run.

Cheers, Daniel

Daniel Vetter (10):
  drm/vkms: Fix crc worker races
  drm/vkms: Use spin_lock_irq in process context
  drm/vkms: Rename vkms_output.state_lock to crc_lock
  drm/vkms: Move format arrays to vkms_plane.c
  drm/vkms: Add our own commit_tail
  drm/vkms: flush crc workers earlier in commit flow
  drm/vkms: Dont flush crc worker when we change crc status
  drm/vkms: No _irqsave within spin_lock_irq needed
  drm/vkms: totally reworked crc data tracking
  drm/vkms: No need for ->pages_lock in crc work anymore

 drivers/gpu/drm/vkms/vkms_crc.c   | 74 +++++++++-------------------
 drivers/gpu/drm/vkms/vkms_crtc.c  | 80 ++++++++++++++++++++++++++-----
 drivers/gpu/drm/vkms/vkms_drv.c   | 35 ++++++++++++++
 drivers/gpu/drm/vkms/vkms_drv.h   | 24 +++++-----
 drivers/gpu/drm/vkms/vkms_plane.c |  8 ++++
 5 files changed, 146 insertions(+), 75 deletions(-)

-- 
2.20.1


* [PATCH 01/10] drm/vkms: Fix crc worker races
  2019-06-06 22:27 [PATCH 00/10] drm/vkms: rework crc worker Daniel Vetter
@ 2019-06-06 22:27 ` Daniel Vetter
  2019-06-12 13:33   ` Rodrigo Siqueira
  2019-06-06 22:27 ` [PATCH 02/10] drm/vkms: Use spin_lock_irq in process context Daniel Vetter
                   ` (9 subsequent siblings)
  10 siblings, 1 reply; 44+ messages in thread
From: Daniel Vetter @ 2019-06-06 22:27 UTC (permalink / raw)
  To: DRI Development
  Cc: Haneen Mohammed, Rodrigo Siqueira, Daniel Vetter, Shayenne Moura,
	Daniel Vetter

The issue we have is that the crc worker might fall behind. We've
tried to handle this by tracking both the earliest frame for which it
still needs to compute a crc, and the last one. Plus when the
crtc_state changes, we get a new work item; the work items are all run
in order due to the ordered workqueue we allocate for each vkms crtc.
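
For context, the ordering guarantee comes from the per-crtc ordered
workqueue, which runs at most one work item at a time, in queueing
order. Roughly, with the names used in the driver:

	/* at crtc init: one ordered workqueue per crtc */
	vkms_out->crc_workq = alloc_ordered_workqueue("vkms_crc_workq", 0);

	/* in the vblank hrtimer: queue_work() returns false if the
	 * work item is already queued, so at most one instance is
	 * pending at any time
	 */
	ret = queue_work(output->crc_workq, &state->crc_work);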

Trouble is, there have been a few small issues in the current code:
- we need to capture frame_end in the vblank hrtimer, not in the
  worker. The worker might run much later, and then we generate a lot
  of crc entries for which there's already a different worker queued up.
- frame number might be 0, so create a new crc_pending boolean to
  track this without confusion.
- we need to atomically grab frame_start/end and clear it, so do that
  all in one go. This is not going to create a new race, because if we
  race with the hrtimer then our work will be re-run.
- only race that can happen is the following:
  1. worker starts
  2. hrtimer runs and updates frame_end
  3. worker grabs frame_start/end, already reading the new frame_end,
  and clears crc_pending
  4. hrtimer calls queue_work()
  5. worker completes
  6. worker gets re-run, crc_pending is false
  Explain this case a bit better by rewording the comment.

v2: Demote warning level output to debug when we fail to requeue; this
is expected under high load when the crc worker can't quite keep up.

Cc: Shayenne Moura <shayenneluzmoura@gmail.com>
Cc: Rodrigo Siqueira <rodrigosiqueiramelo@gmail.com>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Cc: Haneen Mohammed <hamohammed.sa@gmail.com>
Cc: Daniel Vetter <daniel@ffwll.ch>
---
 drivers/gpu/drm/vkms/vkms_crc.c  | 27 +++++++++++++--------------
 drivers/gpu/drm/vkms/vkms_crtc.c |  9 +++++++--
 drivers/gpu/drm/vkms/vkms_drv.h  |  2 ++
 3 files changed, 22 insertions(+), 16 deletions(-)

diff --git a/drivers/gpu/drm/vkms/vkms_crc.c b/drivers/gpu/drm/vkms/vkms_crc.c
index d7b409a3c0f8..66603da634fe 100644
--- a/drivers/gpu/drm/vkms/vkms_crc.c
+++ b/drivers/gpu/drm/vkms/vkms_crc.c
@@ -166,16 +166,24 @@ void vkms_crc_work_handle(struct work_struct *work)
 	struct drm_plane *plane;
 	u32 crc32 = 0;
 	u64 frame_start, frame_end;
+	bool crc_pending;
 	unsigned long flags;
 
 	spin_lock_irqsave(&out->state_lock, flags);
 	frame_start = crtc_state->frame_start;
 	frame_end = crtc_state->frame_end;
+	crc_pending = crtc_state->crc_pending;
+	crtc_state->frame_start = 0;
+	crtc_state->frame_end = 0;
+	crtc_state->crc_pending = false;
 	spin_unlock_irqrestore(&out->state_lock, flags);
 
-	/* _vblank_handle() hasn't updated frame_start yet */
-	if (!frame_start || frame_start == frame_end)
-		goto out;
+	/*
+	 * We raced with the vblank hrtimer and previous work already computed
+	 * the crc, nothing to do.
+	 */
+	if (!crc_pending)
+		return;
 
 	drm_for_each_plane(plane, &vdev->drm) {
 		struct vkms_plane_state *vplane_state;
@@ -196,20 +204,11 @@ void vkms_crc_work_handle(struct work_struct *work)
 	if (primary_crc)
 		crc32 = _vkms_get_crc(primary_crc, cursor_crc);
 
-	frame_end = drm_crtc_accurate_vblank_count(crtc);
-
-	/* queue_work can fail to schedule crc_work; add crc for
-	 * missing frames
+	/*
+	 * The worker can fall behind the vblank hrtimer, make sure we catch up.
 	 */
 	while (frame_start <= frame_end)
 		drm_crtc_add_crc_entry(crtc, true, frame_start++, &crc32);
-
-out:
-	/* to avoid using the same value for frame number again */
-	spin_lock_irqsave(&out->state_lock, flags);
-	crtc_state->frame_end = frame_end;
-	crtc_state->frame_start = 0;
-	spin_unlock_irqrestore(&out->state_lock, flags);
 }
 
 static int vkms_crc_parse_source(const char *src_name, bool *enabled)
diff --git a/drivers/gpu/drm/vkms/vkms_crtc.c b/drivers/gpu/drm/vkms/vkms_crtc.c
index 1bbe099b7db8..c727d8486e97 100644
--- a/drivers/gpu/drm/vkms/vkms_crtc.c
+++ b/drivers/gpu/drm/vkms/vkms_crtc.c
@@ -30,13 +30,18 @@ static enum hrtimer_restart vkms_vblank_simulate(struct hrtimer *timer)
 		 * has read the data
 		 */
 		spin_lock(&output->state_lock);
-		if (!state->frame_start)
+		if (!state->crc_pending)
 			state->frame_start = frame;
+		else
+			DRM_DEBUG_DRIVER("crc worker falling behind, frame_start: %llu, frame_end: %llu\n",
+					 state->frame_start, frame);
+		state->frame_end = frame;
+		state->crc_pending = true;
 		spin_unlock(&output->state_lock);
 
 		ret = queue_work(output->crc_workq, &state->crc_work);
 		if (!ret)
-			DRM_WARN("failed to queue vkms_crc_work_handle");
+			DRM_DEBUG_DRIVER("vkms_crc_work_handle already queued\n");
 	}
 
 	spin_unlock(&output->lock);
diff --git a/drivers/gpu/drm/vkms/vkms_drv.h b/drivers/gpu/drm/vkms/vkms_drv.h
index 81f1cfbeb936..3c7e06b19efd 100644
--- a/drivers/gpu/drm/vkms/vkms_drv.h
+++ b/drivers/gpu/drm/vkms/vkms_drv.h
@@ -56,6 +56,8 @@ struct vkms_plane_state {
 struct vkms_crtc_state {
 	struct drm_crtc_state base;
 	struct work_struct crc_work;
+
+	bool crc_pending;
 	u64 frame_start;
 	u64 frame_end;
 };
-- 
2.20.1


* [PATCH 02/10] drm/vkms: Use spin_lock_irq in process context
  2019-06-06 22:27 [PATCH 00/10] drm/vkms: rework crc worker Daniel Vetter
  2019-06-06 22:27 ` [PATCH 01/10] drm/vkms: Fix crc worker races Daniel Vetter
@ 2019-06-06 22:27 ` Daniel Vetter
  2019-06-12 13:34   ` Rodrigo Siqueira
  2019-06-06 22:27 ` [PATCH 03/10] drm/vkms: Rename vkms_output.state_lock to crc_lock Daniel Vetter
                   ` (8 subsequent siblings)
  10 siblings, 1 reply; 44+ messages in thread
From: Daniel Vetter @ 2019-06-06 22:27 UTC (permalink / raw)
  To: DRI Development
  Cc: Haneen Mohammed, Rodrigo Siqueira, Daniel Vetter, Shayenne Moura,
	Daniel Vetter

The worker always runs in process context, so there's no need for the
_irqsave version. Same for the set_source callback, which is only
called from the debugfs handler in a syscall.
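
For illustration, the difference between the two lock flavours (just a
sketch, not part of the patch):

	unsigned long flags;

	/* safe in any context: saves the current irq state and
	 * restores it on unlock
	 */
	spin_lock_irqsave(&lock, flags);
	spin_unlock_irqrestore(&lock, flags);

	/* only safe when irqs are known to be enabled, e.g. in plain
	 * process context: disables irqs, and unconditionally
	 * re-enables them on unlock
	 */
	spin_lock_irq(&lock);
	spin_unlock_irq(&lock);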

Cc: Shayenne Moura <shayenneluzmoura@gmail.com>
Cc: Rodrigo Siqueira <rodrigosiqueiramelo@gmail.com>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Cc: Haneen Mohammed <hamohammed.sa@gmail.com>
Cc: Daniel Vetter <daniel@ffwll.ch>
---
 drivers/gpu/drm/vkms/vkms_crc.c | 10 ++++------
 1 file changed, 4 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/vkms/vkms_crc.c b/drivers/gpu/drm/vkms/vkms_crc.c
index 66603da634fe..883e36fe7b6e 100644
--- a/drivers/gpu/drm/vkms/vkms_crc.c
+++ b/drivers/gpu/drm/vkms/vkms_crc.c
@@ -167,16 +167,15 @@ void vkms_crc_work_handle(struct work_struct *work)
 	u32 crc32 = 0;
 	u64 frame_start, frame_end;
 	bool crc_pending;
-	unsigned long flags;
 
-	spin_lock_irqsave(&out->state_lock, flags);
+	spin_lock_irq(&out->state_lock);
 	frame_start = crtc_state->frame_start;
 	frame_end = crtc_state->frame_end;
 	crc_pending = crtc_state->crc_pending;
 	crtc_state->frame_start = 0;
 	crtc_state->frame_end = 0;
 	crtc_state->crc_pending = false;
-	spin_unlock_irqrestore(&out->state_lock, flags);
+	spin_unlock_irq(&out->state_lock);
 
 	/*
 	 * We raced with the vblank hrtimer and previous work already computed
@@ -246,7 +245,6 @@ int vkms_set_crc_source(struct drm_crtc *crtc, const char *src_name)
 {
 	struct vkms_output *out = drm_crtc_to_vkms_output(crtc);
 	bool enabled = false;
-	unsigned long flags;
 	int ret = 0;
 
 	ret = vkms_crc_parse_source(src_name, &enabled);
@@ -254,9 +252,9 @@ int vkms_set_crc_source(struct drm_crtc *crtc, const char *src_name)
 	/* make sure nothing is scheduled on crtc workq */
 	flush_workqueue(out->crc_workq);
 
-	spin_lock_irqsave(&out->lock, flags);
+	spin_lock_irq(&out->lock);
 	out->crc_enabled = enabled;
-	spin_unlock_irqrestore(&out->lock, flags);
+	spin_unlock_irq(&out->lock);
 
 	return ret;
 }
-- 
2.20.1


* [PATCH 03/10] drm/vkms: Rename vkms_output.state_lock to crc_lock
  2019-06-06 22:27 [PATCH 00/10] drm/vkms: rework crc worker Daniel Vetter
  2019-06-06 22:27 ` [PATCH 01/10] drm/vkms: Fix crc worker races Daniel Vetter
  2019-06-06 22:27 ` [PATCH 02/10] drm/vkms: Use spin_lock_irq in process context Daniel Vetter
@ 2019-06-06 22:27 ` Daniel Vetter
  2019-06-12 13:38   ` Rodrigo Siqueira
  2019-06-06 22:27 ` [PATCH 04/10] drm/vkms: Move format arrays to vkms_plane.c Daniel Vetter
                   ` (7 subsequent siblings)
  10 siblings, 1 reply; 44+ messages in thread
From: Daniel Vetter @ 2019-06-06 22:27 UTC (permalink / raw)
  To: DRI Development
  Cc: Daniel Vetter, Haneen Mohammed, Rodrigo Siqueira, Daniel Vetter

Plus add a comment about what it actually protects. It's very little.

Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Cc: Rodrigo Siqueira <rodrigosiqueiramelo@gmail.com>
Cc: Haneen Mohammed <hamohammed.sa@gmail.com>
Cc: Daniel Vetter <daniel@ffwll.ch>
---
 drivers/gpu/drm/vkms/vkms_crc.c  | 4 ++--
 drivers/gpu/drm/vkms/vkms_crtc.c | 6 +++---
 drivers/gpu/drm/vkms/vkms_drv.h  | 5 +++--
 3 files changed, 8 insertions(+), 7 deletions(-)

diff --git a/drivers/gpu/drm/vkms/vkms_crc.c b/drivers/gpu/drm/vkms/vkms_crc.c
index 883e36fe7b6e..96806cd35ad4 100644
--- a/drivers/gpu/drm/vkms/vkms_crc.c
+++ b/drivers/gpu/drm/vkms/vkms_crc.c
@@ -168,14 +168,14 @@ void vkms_crc_work_handle(struct work_struct *work)
 	u64 frame_start, frame_end;
 	bool crc_pending;
 
-	spin_lock_irq(&out->state_lock);
+	spin_lock_irq(&out->crc_lock);
 	frame_start = crtc_state->frame_start;
 	frame_end = crtc_state->frame_end;
 	crc_pending = crtc_state->crc_pending;
 	crtc_state->frame_start = 0;
 	crtc_state->frame_end = 0;
 	crtc_state->crc_pending = false;
-	spin_unlock_irq(&out->state_lock);
+	spin_unlock_irq(&out->crc_lock);
 
 	/*
 	 * We raced with the vblank hrtimer and previous work already computed
diff --git a/drivers/gpu/drm/vkms/vkms_crtc.c b/drivers/gpu/drm/vkms/vkms_crtc.c
index c727d8486e97..55b16d545fe7 100644
--- a/drivers/gpu/drm/vkms/vkms_crtc.c
+++ b/drivers/gpu/drm/vkms/vkms_crtc.c
@@ -29,7 +29,7 @@ static enum hrtimer_restart vkms_vblank_simulate(struct hrtimer *timer)
 		/* update frame_start only if a queued vkms_crc_work_handle()
 		 * has read the data
 		 */
-		spin_lock(&output->state_lock);
+		spin_lock(&output->crc_lock);
 		if (!state->crc_pending)
 			state->frame_start = frame;
 		else
@@ -37,7 +37,7 @@ static enum hrtimer_restart vkms_vblank_simulate(struct hrtimer *timer)
 					 state->frame_start, frame);
 		state->frame_end = frame;
 		state->crc_pending = true;
-		spin_unlock(&output->state_lock);
+		spin_unlock(&output->crc_lock);
 
 		ret = queue_work(output->crc_workq, &state->crc_work);
 		if (!ret)
@@ -224,7 +224,7 @@ int vkms_crtc_init(struct drm_device *dev, struct drm_crtc *crtc,
 	drm_crtc_helper_add(crtc, &vkms_crtc_helper_funcs);
 
 	spin_lock_init(&vkms_out->lock);
-	spin_lock_init(&vkms_out->state_lock);
+	spin_lock_init(&vkms_out->crc_lock);
 
 	vkms_out->crc_workq = alloc_ordered_workqueue("vkms_crc_workq", 0);
 	if (!vkms_out->crc_workq)
diff --git a/drivers/gpu/drm/vkms/vkms_drv.h b/drivers/gpu/drm/vkms/vkms_drv.h
index 3c7e06b19efd..43d3f98289fe 100644
--- a/drivers/gpu/drm/vkms/vkms_drv.h
+++ b/drivers/gpu/drm/vkms/vkms_drv.h
@@ -57,6 +57,7 @@ struct vkms_crtc_state {
 	struct drm_crtc_state base;
 	struct work_struct crc_work;
 
+	/* below three are protected by vkms_output.crc_lock */
 	bool crc_pending;
 	u64 frame_start;
 	u64 frame_end;
@@ -74,8 +75,8 @@ struct vkms_output {
 	struct workqueue_struct *crc_workq;
 	/* protects concurrent access to crc_data */
 	spinlock_t lock;
-	/* protects concurrent access to crtc_state */
-	spinlock_t state_lock;
+
+	spinlock_t crc_lock;
 };
 
 struct vkms_device {
-- 
2.20.1


* [PATCH 04/10] drm/vkms: Move format arrays to vkms_plane.c
  2019-06-06 22:27 [PATCH 00/10] drm/vkms: rework crc worker Daniel Vetter
                   ` (2 preceding siblings ...)
  2019-06-06 22:27 ` [PATCH 03/10] drm/vkms: Rename vkms_output.state_lock to crc_lock Daniel Vetter
@ 2019-06-06 22:27 ` Daniel Vetter
  2019-06-12 13:39   ` Rodrigo Siqueira
  2019-06-19  2:12   ` Rodrigo Siqueira
  2019-06-06 22:27 ` [PATCH 05/10] drm/vkms: Add our own commit_tail Daniel Vetter
                   ` (6 subsequent siblings)
  10 siblings, 2 replies; 44+ messages in thread
From: Daniel Vetter @ 2019-06-06 22:27 UTC (permalink / raw)
  To: DRI Development
  Cc: Daniel Vetter, Haneen Mohammed, Rodrigo Siqueira, Daniel Vetter

No need to have them multiple times.

Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Cc: Rodrigo Siqueira <rodrigosiqueiramelo@gmail.com>
Cc: Haneen Mohammed <hamohammed.sa@gmail.com>
Cc: Daniel Vetter <daniel@ffwll.ch>
---
 drivers/gpu/drm/vkms/vkms_drv.h   | 8 --------
 drivers/gpu/drm/vkms/vkms_plane.c | 8 ++++++++
 2 files changed, 8 insertions(+), 8 deletions(-)

diff --git a/drivers/gpu/drm/vkms/vkms_drv.h b/drivers/gpu/drm/vkms/vkms_drv.h
index 43d3f98289fe..2a35299bfb89 100644
--- a/drivers/gpu/drm/vkms/vkms_drv.h
+++ b/drivers/gpu/drm/vkms/vkms_drv.h
@@ -20,14 +20,6 @@
 
 extern bool enable_cursor;
 
-static const u32 vkms_formats[] = {
-	DRM_FORMAT_XRGB8888,
-};
-
-static const u32 vkms_cursor_formats[] = {
-	DRM_FORMAT_ARGB8888,
-};
-
 struct vkms_crc_data {
 	struct drm_framebuffer fb;
 	struct drm_rect src, dst;
diff --git a/drivers/gpu/drm/vkms/vkms_plane.c b/drivers/gpu/drm/vkms/vkms_plane.c
index 0e67d2d42f0c..0fceb6258422 100644
--- a/drivers/gpu/drm/vkms/vkms_plane.c
+++ b/drivers/gpu/drm/vkms/vkms_plane.c
@@ -6,6 +6,14 @@
 #include <drm/drm_atomic_helper.h>
 #include <drm/drm_gem_framebuffer_helper.h>
 
+static const u32 vkms_formats[] = {
+	DRM_FORMAT_XRGB8888,
+};
+
+static const u32 vkms_cursor_formats[] = {
+	DRM_FORMAT_ARGB8888,
+};
+
 static struct drm_plane_state *
 vkms_plane_duplicate_state(struct drm_plane *plane)
 {
-- 
2.20.1


* [PATCH 05/10] drm/vkms: Add our own commit_tail
  2019-06-06 22:27 [PATCH 00/10] drm/vkms: rework crc worker Daniel Vetter
                   ` (3 preceding siblings ...)
  2019-06-06 22:27 ` [PATCH 04/10] drm/vkms: Move format arrays to vkms_plane.c Daniel Vetter
@ 2019-06-06 22:27 ` Daniel Vetter
  2019-06-06 22:27 ` [PATCH 06/10] drm/vkms: flush crc workers earlier in commit flow Daniel Vetter
                   ` (5 subsequent siblings)
  10 siblings, 0 replies; 44+ messages in thread
From: Daniel Vetter @ 2019-06-06 22:27 UTC (permalink / raw)
  To: DRI Development
  Cc: Daniel Vetter, Haneen Mohammed, Rodrigo Siqueira, Daniel Vetter

Just prep work; this is essentially a copy of the default
drm_atomic_helper_commit_tail(), and more will be done here in
following patches.

Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Cc: Rodrigo Siqueira <rodrigosiqueiramelo@gmail.com>
Cc: Haneen Mohammed <hamohammed.sa@gmail.com>
Cc: Daniel Vetter <daniel@ffwll.ch>
---
 drivers/gpu/drm/vkms/vkms_drv.c | 25 +++++++++++++++++++++++++
 1 file changed, 25 insertions(+)

diff --git a/drivers/gpu/drm/vkms/vkms_drv.c b/drivers/gpu/drm/vkms/vkms_drv.c
index 738dd6206d85..f677ab1d0094 100644
--- a/drivers/gpu/drm/vkms/vkms_drv.c
+++ b/drivers/gpu/drm/vkms/vkms_drv.c
@@ -11,6 +11,7 @@
 
 #include <linux/module.h>
 #include <drm/drm_gem.h>
+#include <drm/drm_atomic.h>
 #include <drm/drm_atomic_helper.h>
 #include <drm/drm_gem_framebuffer_helper.h>
 #include <drm/drm_fb_helper.h>
@@ -58,6 +59,25 @@ static void vkms_release(struct drm_device *dev)
 	destroy_workqueue(vkms->output.crc_workq);
 }
 
+static void vkms_atomic_commit_tail(struct drm_atomic_state *old_state)
+{
+	struct drm_device *dev = old_state->dev;
+
+	drm_atomic_helper_commit_modeset_disables(dev, old_state);
+
+	drm_atomic_helper_commit_planes(dev, old_state, 0);
+
+	drm_atomic_helper_commit_modeset_enables(dev, old_state);
+
+	drm_atomic_helper_fake_vblank(old_state);
+
+	drm_atomic_helper_commit_hw_done(old_state);
+
+	drm_atomic_helper_wait_for_vblanks(dev, old_state);
+
+	drm_atomic_helper_cleanup_planes(dev, old_state);
+}
+
 static struct drm_driver vkms_driver = {
 	.driver_features	= DRIVER_MODESET | DRIVER_ATOMIC | DRIVER_GEM,
 	.release		= vkms_release,
@@ -80,6 +100,10 @@ static const struct drm_mode_config_funcs vkms_mode_funcs = {
 	.atomic_commit = drm_atomic_helper_commit,
 };
 
+static const struct drm_mode_config_helper_funcs vkms_mode_config_helpers = {
+	.atomic_commit_tail = vkms_atomic_commit_tail,
+};
+
 static int vkms_modeset_init(struct vkms_device *vkmsdev)
 {
 	struct drm_device *dev = &vkmsdev->drm;
@@ -91,6 +115,7 @@ static int vkms_modeset_init(struct vkms_device *vkmsdev)
 	dev->mode_config.max_width = XRES_MAX;
 	dev->mode_config.max_height = YRES_MAX;
 	dev->mode_config.preferred_depth = 24;
+	dev->mode_config.helper_private = &vkms_mode_config_helpers;
 
 	return vkms_output_init(vkmsdev);
 }
-- 
2.20.1


* [PATCH 06/10] drm/vkms: flush crc workers earlier in commit flow
  2019-06-06 22:27 [PATCH 00/10] drm/vkms: rework crc worker Daniel Vetter
                   ` (4 preceding siblings ...)
  2019-06-06 22:27 ` [PATCH 05/10] drm/vkms: Add our own commit_tail Daniel Vetter
@ 2019-06-06 22:27 ` Daniel Vetter
  2019-06-12 13:42   ` Rodrigo Siqueira
  2019-06-06 22:27 ` [PATCH 07/10] drm/vkms: Dont flush crc worker when we change crc status Daniel Vetter
                   ` (4 subsequent siblings)
  10 siblings, 1 reply; 44+ messages in thread
From: Daniel Vetter @ 2019-06-06 22:27 UTC (permalink / raw)
  To: DRI Development
  Cc: Daniel Vetter, Haneen Mohammed, Rodrigo Siqueira, Daniel Vetter

Currently we flush pending crc workers very late in the commit flow,
when we destroy all the old crtc states. Unfortunately at that point
the framebuffers are already unpinned (and our vaddr possibly gone),
so this isn't good. Also, the plane_states we need might already be
cleaned up, since the cleanup order of state structures isn't well
defined.

Fix this by waiting for all crc workers of the old state to complete
before we start any of the cleanup work.

Note that this is not yet race-free, because the hrtimer and crc
worker look at the wrong state pointers, but that will be fixed in
subsequent patches.

Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Cc: Rodrigo Siqueira <rodrigosiqueiramelo@gmail.com>
Cc: Haneen Mohammed <hamohammed.sa@gmail.com>
Cc: Daniel Vetter <daniel@ffwll.ch>
---
 drivers/gpu/drm/vkms/vkms_crtc.c |  2 +-
 drivers/gpu/drm/vkms/vkms_drv.c  | 10 ++++++++++
 2 files changed, 11 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/vkms/vkms_crtc.c b/drivers/gpu/drm/vkms/vkms_crtc.c
index 55b16d545fe7..b6987d90805f 100644
--- a/drivers/gpu/drm/vkms/vkms_crtc.c
+++ b/drivers/gpu/drm/vkms/vkms_crtc.c
@@ -125,7 +125,7 @@ static void vkms_atomic_crtc_destroy_state(struct drm_crtc *crtc,
 	__drm_atomic_helper_crtc_destroy_state(state);
 
 	if (vkms_state) {
-		flush_work(&vkms_state->crc_work);
+		WARN_ON(work_pending(&vkms_state->crc_work));
 		kfree(vkms_state);
 	}
 }
diff --git a/drivers/gpu/drm/vkms/vkms_drv.c b/drivers/gpu/drm/vkms/vkms_drv.c
index f677ab1d0094..cc53ef88a331 100644
--- a/drivers/gpu/drm/vkms/vkms_drv.c
+++ b/drivers/gpu/drm/vkms/vkms_drv.c
@@ -62,6 +62,9 @@ static void vkms_release(struct drm_device *dev)
 static void vkms_atomic_commit_tail(struct drm_atomic_state *old_state)
 {
 	struct drm_device *dev = old_state->dev;
+	struct drm_crtc *crtc;
+	struct drm_crtc_state *old_crtc_state;
+	int i;
 
 	drm_atomic_helper_commit_modeset_disables(dev, old_state);
 
@@ -75,6 +78,13 @@ static void vkms_atomic_commit_tail(struct drm_atomic_state *old_state)
 
 	drm_atomic_helper_wait_for_vblanks(dev, old_state);
 
+	for_each_old_crtc_in_state(old_state, crtc, old_crtc_state, i) {
+		struct vkms_crtc_state *vkms_state =
+			to_vkms_crtc_state(old_crtc_state);
+
+		flush_work(&vkms_state->crc_work);
+	}
+
 	drm_atomic_helper_cleanup_planes(dev, old_state);
 }
 
-- 
2.20.1


* [PATCH 07/10] drm/vkms: Dont flush crc worker when we change crc status
  2019-06-06 22:27 [PATCH 00/10] drm/vkms: rework crc worker Daniel Vetter
                   ` (5 preceding siblings ...)
  2019-06-06 22:27 ` [PATCH 06/10] drm/vkms: flush crc workers earlier in commit flow Daniel Vetter
@ 2019-06-06 22:27 ` Daniel Vetter
  2019-06-19  2:17   ` Rodrigo Siqueira
  2019-06-06 22:27 ` [PATCH 08/10] drm/vkms: No _irqsave within spin_lock_irq needed Daniel Vetter
                   ` (3 subsequent siblings)
  10 siblings, 1 reply; 44+ messages in thread
From: Daniel Vetter @ 2019-06-06 22:27 UTC (permalink / raw)
  To: DRI Development
  Cc: Daniel Vetter, Haneen Mohammed, Rodrigo Siqueira, Daniel Vetter

The crc core code can cope with some late crc entries; the race is
kinda unavoidable. So there's no need to flush pending workers, they'll
complete in time.

Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Cc: Rodrigo Siqueira <rodrigosiqueiramelo@gmail.com>
Cc: Haneen Mohammed <hamohammed.sa@gmail.com>
Cc: Daniel Vetter <daniel@ffwll.ch>
---
 drivers/gpu/drm/vkms/vkms_crc.c | 3 ---
 1 file changed, 3 deletions(-)

diff --git a/drivers/gpu/drm/vkms/vkms_crc.c b/drivers/gpu/drm/vkms/vkms_crc.c
index 96806cd35ad4..9d15e5e85830 100644
--- a/drivers/gpu/drm/vkms/vkms_crc.c
+++ b/drivers/gpu/drm/vkms/vkms_crc.c
@@ -249,9 +249,6 @@ int vkms_set_crc_source(struct drm_crtc *crtc, const char *src_name)
 
 	ret = vkms_crc_parse_source(src_name, &enabled);
 
-	/* make sure nothing is scheduled on crtc workq */
-	flush_workqueue(out->crc_workq);
-
 	spin_lock_irq(&out->lock);
 	out->crc_enabled = enabled;
 	spin_unlock_irq(&out->lock);
-- 
2.20.1


* [PATCH 08/10] drm/vkms: No _irqsave within spin_lock_irq needed
  2019-06-06 22:27 [PATCH 00/10] drm/vkms: rework crc worker Daniel Vetter
                   ` (6 preceding siblings ...)
  2019-06-06 22:27 ` [PATCH 07/10] drm/vkms: Dont flush crc worker when we change crc status Daniel Vetter
@ 2019-06-06 22:27 ` Daniel Vetter
  2019-06-12 13:43   ` Rodrigo Siqueira
  2019-06-06 22:27 ` [PATCH 09/10] drm/vkms: totally reworked crc data tracking Daniel Vetter
                   ` (2 subsequent siblings)
  10 siblings, 1 reply; 44+ messages in thread
From: Daniel Vetter @ 2019-06-06 22:27 UTC (permalink / raw)
  To: DRI Development; +Cc: Daniel Vetter, Daniel Vetter

irqs are already off: vkms_crtc_atomic_flush() runs with
vkms_output->lock held via spin_lock_irq(), so the nested
event_lock doesn't need the _irqsave variant.

Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
---
 drivers/gpu/drm/vkms/vkms_crtc.c | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/vkms/vkms_crtc.c b/drivers/gpu/drm/vkms/vkms_crtc.c
index b6987d90805f..48a793ba4030 100644
--- a/drivers/gpu/drm/vkms/vkms_crtc.c
+++ b/drivers/gpu/drm/vkms/vkms_crtc.c
@@ -183,17 +183,16 @@ static void vkms_crtc_atomic_flush(struct drm_crtc *crtc,
 				   struct drm_crtc_state *old_crtc_state)
 {
 	struct vkms_output *vkms_output = drm_crtc_to_vkms_output(crtc);
-	unsigned long flags;
 
 	if (crtc->state->event) {
-		spin_lock_irqsave(&crtc->dev->event_lock, flags);
+		spin_lock(&crtc->dev->event_lock);
 
 		if (drm_crtc_vblank_get(crtc) != 0)
 			drm_crtc_send_vblank_event(crtc, crtc->state->event);
 		else
 			drm_crtc_arm_vblank_event(crtc, crtc->state->event);
 
-		spin_unlock_irqrestore(&crtc->dev->event_lock, flags);
+		spin_unlock(&crtc->dev->event_lock);
 
 		crtc->state->event = NULL;
 	}
-- 
2.20.1


* [PATCH 09/10] drm/vkms: totally reworked crc data tracking
  2019-06-06 22:27 [PATCH 00/10] drm/vkms: rework crc worker Daniel Vetter
                   ` (7 preceding siblings ...)
  2019-06-06 22:27 ` [PATCH 08/10] drm/vkms: No _irqsave within spin_lock_irq needed Daniel Vetter
@ 2019-06-06 22:27 ` Daniel Vetter
  2019-06-12 13:46   ` Rodrigo Siqueira
  2019-06-06 22:27 ` [PATCH 10/10] drm/vkms: No need for ->pages_lock in crc work anymore Daniel Vetter
  2019-06-12 13:28 ` [PATCH 00/10] drm/vkms: rework crc worker Rodrigo Siqueira
  10 siblings, 1 reply; 44+ messages in thread
From: Daniel Vetter @ 2019-06-06 22:27 UTC (permalink / raw)
  To: DRI Development
  Cc: Daniel Vetter, Haneen Mohammed, Rodrigo Siqueira, Daniel Vetter

The crc computation worker needs to be able to get at some data
structures and framebuffer mappings, while potentially more atomic
updates are going on. The solution thus far is to copy relevant bits
around, but that's very tedious.

Here's a new approach, which tries to be more clever, but relies on a
few not-so-obvious things:
- crtc_state is always updated when a plane_state changes. Therefore
  we can just stuff plane_state pointers into a crtc_state. That
  solves the problem of easily getting at the needed plane_states.
- with the flushing changes from the previous patches the above also
  holds without races, because the next atomic update is now eager
  about flushing pending work - we always wait for all crc work items
  to complete before unmapping framebuffers.
- we also need to make sure that the hrtimer fires off the right
  worker. Keep a new distinct crc_state pointer, under the
  vkms_output->lock protection for this. Note that crtc->state is
  updated very early in the atomic commit, way before we arm the
  vblank event - the vblank event should always match the buffers we
  use to compute the crc. This also solves an issue in the hrtimer,
  where we've accessed drm_crtc->state without holding the right locks
  (we held none - oops).
- in the worker itself we can then just access the plane states we
  need, again solving a bunch of ordering and locking issues.
  Accessing plane->state requires locks, accessing the private
  vkms_crtc_state->active_planes pointer only requires that the memory
  doesn't get freed too early.

The idea behind vkms_crtc_state->active_planes is that this would
contain all visible planes, in z-order, as a first step towards a more
generic blending implementation.
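
A future blending loop could then simply walk that array bottom-up; a
hypothetical sketch (blend_plane() doesn't exist, it's only meant to
illustrate the idea):

	/* active_planes[] is in z order, so composite bottom-up */
	for (i = 0; i < crtc_state->num_active_planes; i++)
		blend_plane(vaddr_out, crtc_state->active_planes[i]->crc_data);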

Note that this patch also fixes races between prepare_fb/cleanup_fb
and the crc worker accessing ->vaddr.

Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Cc: Rodrigo Siqueira <rodrigosiqueiramelo@gmail.com>
Cc: Haneen Mohammed <hamohammed.sa@gmail.com>
Cc: Daniel Vetter <daniel@ffwll.ch>
---
 drivers/gpu/drm/vkms/vkms_crc.c  | 21 +++--------
 drivers/gpu/drm/vkms/vkms_crtc.c | 60 +++++++++++++++++++++++++++++---
 drivers/gpu/drm/vkms/vkms_drv.h  |  9 ++++-
 3 files changed, 67 insertions(+), 23 deletions(-)

diff --git a/drivers/gpu/drm/vkms/vkms_crc.c b/drivers/gpu/drm/vkms/vkms_crc.c
index 9d15e5e85830..0d31cfc32042 100644
--- a/drivers/gpu/drm/vkms/vkms_crc.c
+++ b/drivers/gpu/drm/vkms/vkms_crc.c
@@ -159,11 +159,8 @@ void vkms_crc_work_handle(struct work_struct *work)
 						crc_work);
 	struct drm_crtc *crtc = crtc_state->base.crtc;
 	struct vkms_output *out = drm_crtc_to_vkms_output(crtc);
-	struct vkms_device *vdev = container_of(out, struct vkms_device,
-						output);
 	struct vkms_crc_data *primary_crc = NULL;
 	struct vkms_crc_data *cursor_crc = NULL;
-	struct drm_plane *plane;
 	u32 crc32 = 0;
 	u64 frame_start, frame_end;
 	bool crc_pending;
@@ -184,21 +181,11 @@ void vkms_crc_work_handle(struct work_struct *work)
 	if (!crc_pending)
 		return;
 
-	drm_for_each_plane(plane, &vdev->drm) {
-		struct vkms_plane_state *vplane_state;
-		struct vkms_crc_data *crc_data;
+	if (crtc_state->num_active_planes >= 1)
+		primary_crc = crtc_state->active_planes[0]->crc_data;
 
-		vplane_state = to_vkms_plane_state(plane->state);
-		crc_data = vplane_state->crc_data;
-
-		if (drm_framebuffer_read_refcount(&crc_data->fb) == 0)
-			continue;
-
-		if (plane->type == DRM_PLANE_TYPE_PRIMARY)
-			primary_crc = crc_data;
-		else
-			cursor_crc = crc_data;
-	}
+	if (crtc_state->num_active_planes == 2)
+		cursor_crc = crtc_state->active_planes[1]->crc_data;
 
 	if (primary_crc)
 		crc32 = _vkms_get_crc(primary_crc, cursor_crc);
diff --git a/drivers/gpu/drm/vkms/vkms_crtc.c b/drivers/gpu/drm/vkms/vkms_crtc.c
index 48a793ba4030..14156ed70415 100644
--- a/drivers/gpu/drm/vkms/vkms_crtc.c
+++ b/drivers/gpu/drm/vkms/vkms_crtc.c
@@ -1,6 +1,7 @@
 // SPDX-License-Identifier: GPL-2.0+
 
 #include "vkms_drv.h"
+#include <drm/drm_atomic.h>
 #include <drm/drm_atomic_helper.h>
 #include <drm/drm_probe_helper.h>
 
@@ -9,7 +10,7 @@ static enum hrtimer_restart vkms_vblank_simulate(struct hrtimer *timer)
 	struct vkms_output *output = container_of(timer, struct vkms_output,
 						  vblank_hrtimer);
 	struct drm_crtc *crtc = &output->crtc;
-	struct vkms_crtc_state *state = to_vkms_crtc_state(crtc->state);
+	struct vkms_crtc_state *state;
 	u64 ret_overrun;
 	bool ret;
 
@@ -23,6 +24,7 @@ static enum hrtimer_restart vkms_vblank_simulate(struct hrtimer *timer)
 	if (!ret)
 		DRM_ERROR("vkms failure on handling vblank");
 
+	state = output->crc_state;
 	if (state && output->crc_enabled) {
 		u64 frame = drm_crtc_accurate_vblank_count(crtc);
 
@@ -124,10 +126,9 @@ static void vkms_atomic_crtc_destroy_state(struct drm_crtc *crtc,
 
 	__drm_atomic_helper_crtc_destroy_state(state);
 
-	if (vkms_state) {
-		WARN_ON(work_pending(&vkms_state->crc_work));
-		kfree(vkms_state);
-	}
+	WARN_ON(work_pending(&vkms_state->crc_work));
+	kfree(vkms_state->active_planes);
+	kfree(vkms_state);
 }
 
 static void vkms_atomic_crtc_reset(struct drm_crtc *crtc)
@@ -156,6 +157,52 @@ static const struct drm_crtc_funcs vkms_crtc_funcs = {
 	.verify_crc_source	= vkms_verify_crc_source,
 };
 
+static int vkms_crtc_atomic_check(struct drm_crtc *crtc,
+				   struct drm_crtc_state *state)
+{
+	struct vkms_crtc_state *vkms_state = to_vkms_crtc_state(state);
+	struct drm_plane *plane;
+	struct drm_plane_state *plane_state;
+	int i = 0, ret;
+
+	if (vkms_state->active_planes)
+		return 0;
+
+	ret = drm_atomic_add_affected_planes(state->state, crtc);
+	if (ret < 0)
+		return ret;
+
+	drm_for_each_plane_mask(plane, crtc->dev, state->plane_mask) {
+		plane_state = drm_atomic_get_existing_plane_state(state->state,
+								  plane);
+		WARN_ON(!plane_state);
+
+		if (!plane_state->visible)
+			continue;
+
+		i++;
+	}
+
+	vkms_state->active_planes = kcalloc(i, sizeof(plane), GFP_KERNEL);
+	if (!vkms_state->active_planes)
+		return -ENOMEM;
+	vkms_state->num_active_planes = i;
+
+	i = 0;
+	drm_for_each_plane_mask(plane, crtc->dev, state->plane_mask) {
+		plane_state = drm_atomic_get_existing_plane_state(state->state,
+								  plane);
+
+		if (!plane_state->visible)
+			continue;
+
+		vkms_state->active_planes[i++] =
+			to_vkms_plane_state(plane_state);
+	}
+
+	return 0;
+}
+
 static void vkms_crtc_atomic_enable(struct drm_crtc *crtc,
 				    struct drm_crtc_state *old_state)
 {
@@ -197,10 +244,13 @@ static void vkms_crtc_atomic_flush(struct drm_crtc *crtc,
 		crtc->state->event = NULL;
 	}
 
+	vkms_output->crc_state = to_vkms_crtc_state(crtc->state);
+
 	spin_unlock_irq(&vkms_output->lock);
 }
 
 static const struct drm_crtc_helper_funcs vkms_crtc_helper_funcs = {
+	.atomic_check	= vkms_crtc_atomic_check,
 	.atomic_begin	= vkms_crtc_atomic_begin,
 	.atomic_flush	= vkms_crtc_atomic_flush,
 	.atomic_enable	= vkms_crtc_atomic_enable,
diff --git a/drivers/gpu/drm/vkms/vkms_drv.h b/drivers/gpu/drm/vkms/vkms_drv.h
index 2a35299bfb89..4e7450111d45 100644
--- a/drivers/gpu/drm/vkms/vkms_drv.h
+++ b/drivers/gpu/drm/vkms/vkms_drv.h
@@ -49,6 +49,10 @@ struct vkms_crtc_state {
 	struct drm_crtc_state base;
 	struct work_struct crc_work;
 
+	int num_active_planes;
+	/* stack of active planes for crc computation, should be in z order */
+	struct vkms_plane_state **active_planes;
+
 	/* below three are protected by vkms_output.crc_lock */
 	bool crc_pending;
 	u64 frame_start;
@@ -62,12 +66,15 @@ struct vkms_output {
 	struct hrtimer vblank_hrtimer;
 	ktime_t period_ns;
 	struct drm_pending_vblank_event *event;
-	bool crc_enabled;
 	/* ordered wq for crc_work */
 	struct workqueue_struct *crc_workq;
 	/* protects concurrent access to crc_data */
 	spinlock_t lock;
 
+	/* protected by @lock */
+	bool crc_enabled;
+	struct vkms_crtc_state *crc_state;
+
 	spinlock_t crc_lock;
 };
 
-- 
2.20.1


* [PATCH 10/10] drm/vkms: No need for ->pages_lock in crc work anymore
  2019-06-06 22:27 [PATCH 00/10] drm/vkms: rework crc worker Daniel Vetter
                   ` (8 preceding siblings ...)
  2019-06-06 22:27 ` [PATCH 09/10] drm/vkms: totally reworked crc data tracking Daniel Vetter
@ 2019-06-06 22:27 ` Daniel Vetter
  2019-06-12 13:47   ` Rodrigo Siqueira
  2019-06-12 13:28 ` [PATCH 00/10] drm/vkms: rework crc worker Rodrigo Siqueira
  10 siblings, 1 reply; 44+ messages in thread
From: Daniel Vetter @ 2019-06-06 22:27 UTC (permalink / raw)
  To: DRI Development
  Cc: Daniel Vetter, Haneen Mohammed, Rodrigo Siqueira, Daniel Vetter

We're now guaranteed to no longer race against prepare_fb/cleanup_fb,
which means we can access ->vaddr without having to hold a lock.

Before the previous patches it was fairly easy to observe the cursor
->vaddr being invalid, but that's now gone, so we can upgrade to a
full WARN_ON.

Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Cc: Rodrigo Siqueira <rodrigosiqueiramelo@gmail.com>
Cc: Haneen Mohammed <hamohammed.sa@gmail.com>
Cc: Daniel Vetter <daniel@ffwll.ch>
---
 drivers/gpu/drm/vkms/vkms_crc.c | 13 ++-----------
 1 file changed, 2 insertions(+), 11 deletions(-)

diff --git a/drivers/gpu/drm/vkms/vkms_crc.c b/drivers/gpu/drm/vkms/vkms_crc.c
index 0d31cfc32042..4b3146d83265 100644
--- a/drivers/gpu/drm/vkms/vkms_crc.c
+++ b/drivers/gpu/drm/vkms/vkms_crc.c
@@ -97,16 +97,10 @@ static void compose_cursor(struct vkms_crc_data *cursor_crc,
 	cursor_obj = drm_gem_fb_get_obj(&cursor_crc->fb, 0);
 	cursor_vkms_obj = drm_gem_to_vkms_gem(cursor_obj);
 
-	mutex_lock(&cursor_vkms_obj->pages_lock);
-	if (!cursor_vkms_obj->vaddr) {
-		DRM_WARN("cursor plane vaddr is NULL");
-		goto out;
-	}
+	if (WARN_ON(!cursor_vkms_obj->vaddr))
+		return;
 
 	blend(vaddr_out, cursor_vkms_obj->vaddr, primary_crc, cursor_crc);
-
-out:
-	mutex_unlock(&cursor_vkms_obj->pages_lock);
 }
 
 static uint32_t _vkms_get_crc(struct vkms_crc_data *primary_crc,
@@ -123,15 +117,12 @@ static uint32_t _vkms_get_crc(struct vkms_crc_data *primary_crc,
 		return 0;
 	}
 
-	mutex_lock(&vkms_obj->pages_lock);
 	if (WARN_ON(!vkms_obj->vaddr)) {
-		mutex_unlock(&vkms_obj->pages_lock);
 		kfree(vaddr_out);
 		return crc;
 	}
 
 	memcpy(vaddr_out, vkms_obj->vaddr, vkms_obj->gem.size);
-	mutex_unlock(&vkms_obj->pages_lock);
 
 	if (cursor_crc)
 		compose_cursor(cursor_crc, primary_crc, vaddr_out);
-- 
2.20.1


* Re: [PATCH 00/10] drm/vkms: rework crc worker
  2019-06-06 22:27 [PATCH 00/10] drm/vkms: rework crc worker Daniel Vetter
                   ` (9 preceding siblings ...)
  2019-06-06 22:27 ` [PATCH 10/10] drm/vkms: No need for ->pages_lock in crc work anymore Daniel Vetter
@ 2019-06-12 13:28 ` Rodrigo Siqueira
  2019-06-12 14:42   ` Daniel Vetter
  10 siblings, 1 reply; 44+ messages in thread
From: Rodrigo Siqueira @ 2019-06-12 13:28 UTC (permalink / raw)
  To: Daniel Vetter; +Cc: DRI Development

Hi Daniel,

First of all, thank you very much for your patchset.

I tried to make a detailed review of your series, and you can see my
comments in each patch. You’ll notice that I asked many things related
to the DRM subsystem with the hope that I could learn a little bit
more about DRM from your comments.

Before you go through the review, I would like to start a discussion
about the vkms race conditions. First, I have to admit that I did not
understand the race conditions that you described before, because I
couldn't reproduce them. Now I suspect that I could not reproduce the
problem because I'm using QEMU with KVM; with this in mind, I suppose
we have two scenarios for using vkms in a virtual machine:

* Scenario 1: The user has hardware virtualization support; in this
case, it is a little bit harder to experience race conditions with
vkms.

* Scenario 2: Without hardware virtualization support, it is much
easier to experience race conditions.

With these two scenarios in mind, I conducted a bunch of experiments
to try to shed light on this issue. I did:

1. Enabled lockdep

2. Defined two different environments for testing both using QEMU with
and without kvm. See below the QEMU hardware options:

* env_kvm: -enable-kvm -daemonize -m 1G -smp cores=2,cpus=2
* env_no_kvm: -daemonize -m 2G -smp cores=4,cpus=4

3. My test protocol: I) turn on the vm, II) clean /proc/lock_stat,
III) execute kms_cursor_crc, IV) save the lock_stat file, and V)
turn off the vm.

4. From the lock_stat output, I just highlighted the rows related to
vkms and the columns holdtime-total and holdtime-avg

I would like to highlight that the following data was not gathered
with any statistical rigor; its intention is solely to assist our
discussion. See below the summary of the collected data:

Summary of the experiment results:

+----------------+----------------+----------------+
|                |     env_kvm    |   env_no_kvm   |
+                +----------------+----------------+
| Test           | Before | After | Before | After |
+----------------+--------+-------+--------+-------+
| kms_cursor_crc |   S1   |   S2  |   M1   |   M2  |
+----------------+--------+-------+--------+-------+

* Before: before apply this patchset
* After: after apply this patchset

-----------------------------------------+----------------+-------------
S1: Without this patchset and with kvm   | holdtime-total | holdtime-avg
-----------------------------------------+----------------+-------------
&(&vkms_out->lock)->rlock:               |  21983.52      |  6.21
(work_completion)(&vkms_state->crc_wo:   |  20.47         |  20.47
(wq_completion)vkms_crc_workq:           |  3999507.87    |  3352.48
&(&vkms_out->state_lock)->rlock:         |  378.47        |  0.30
(work_completion)(&vkms_state->crc_#2:   |  3999066.30    |  2848.34
-----------------------------------------+----------------+-------------
S2: With this patchset and with kvm      |                |
-----------------------------------------+----------------+-------------
&(&vkms_out->lock)->rlock:               |  23262.83      |  6.34
(work_completion)(&vkms_state->crc_wo:   |  8.98          |  8.98
&(&vkms_out->crc_lock)->rlock:           |  307.28        |  0.32
(wq_completion)vkms_crc_workq:           |  6567727.05    |  12345.35
(work_completion)(&vkms_state->crc_#2:   |  6567135.15    |  4488.81
-----------------------------------------+----------------+-------------
M1: Without this patchset and without kvm|                |
-----------------------------------------+----------------+-------------
&(&vkms_out->state_lock)->rlock:         |  4994.72       |  1.61
&(&vkms_out->lock)->rlock:               |  247190.04     |  39.39
(work_completion)(&vkms_state->crc_wo:   |  31.32         |  31.32
(wq_completion)vkms_crc_workq:           |  20991073.78   |  13525.18
(work_completion)(&vkms_state->crc_#2:   |  20988347.18   |  11904.90
-----------------------------------------+----------------+-------------
M2: With this patchset and without kvm   |                |
-----------------------------------------+----------------+-------------
(wq_completion)vkms_crc_workq:           |  42819161.68   |  36597.57
&(&vkms_out->lock)->rlock:               |  251257.06     |  35.80
(work_completion)(&vkms_state->crc_wo:   |  69.37         |  69.37
&(&vkms_out->crc_lock)->rlock:           |  3620.92       |  1.54
(work_completion)(&vkms_state->crc_#2:   |  42803419.59   |  24306.31

First, I analyzed the scenarios with KVM (S1 and S2); more
specifically, I focused on these two classes:

1. (work_completion)(&vkms_state->crc_wo
2. (work_completion)(&vkms_state->crc_#2

After taking a look at the data, it looks like this patchset
greatly reduces the hold time average for crc_wo. On the other hand,
it increases the hold time average for crc_#2. I didn't fully
understand the reason for the difference. Could you help me here?

When I looked at the second set of scenarios (M1 and M2, both without
KVM), the results diverge much more; basically, this patchset
increased the hold time averages. Again, could you help me understand
this issue a little better?

Finally, after these tests, I have some questions:

1. VKMS is a software-only driver; because of this, how about defining
minimal system requirements for using it?

2. During my experiments, I noticed that running tests in a VM that
uses KVM gave consistent results. For example, kms_flip never fails
with QEMU+KVM; however, without KVM, two or three tests failed (one of
them looks random). If we use vkms to test DRM stuff, should we
recommend the use of KVM?

Best Regards

On Thu, Jun 6, 2019 at 7:28 PM Daniel Vetter <daniel.vetter@ffwll.ch> wrote:
>
> Hi all,
>
> This here is the first part of a rework for the vkms crc worker. I think
> this should fix all the locking/races/use-after-free issues I spotted in
> the code. There's more work we can do in the future as a follow-up:
>
> - directly access vkms_plane_state->base in the crc worker; with the
>   approach in this series that should be safe now. Follow-up patches
>   could switch over and remove all the separate crc_data infrastructure.
>
> - I think some kerneldoc for the vkms structures would be nice. Documenting
>   the various functions is probably overkill.
>
> - Implementing a more generic blending engine, as prep for adding more
>   pixel formats, more planes, and more features in general.
>
> Tested with kms_pipe_crc, kms_flip and kms_cursor_crc. Seems to not make
> things worse, but I didn't do a full run.
>
> Cheers, Daniel
>
> Daniel Vetter (10):
>   drm/vkms: Fix crc worker races
>   drm/vkms: Use spin_lock_irq in process context
>   drm/vkms: Rename vkms_output.state_lock to crc_lock
>   drm/vkms: Move format arrays to vkms_plane.c
>   drm/vkms: Add our own commit_tail
>   drm/vkms: flush crc workers earlier in commit flow
>   drm/vkms: Dont flush crc worker when we change crc status
>   drm/vkms: No _irqsave within spin_lock_irq needed
>   drm/vkms: totally reworked crc data tracking
>   drm/vkms: No need for ->pages_lock in crc work anymore
>
>  drivers/gpu/drm/vkms/vkms_crc.c   | 74 +++++++++-------------------
>  drivers/gpu/drm/vkms/vkms_crtc.c  | 80 ++++++++++++++++++++++++++-----
>  drivers/gpu/drm/vkms/vkms_drv.c   | 35 ++++++++++++++
>  drivers/gpu/drm/vkms/vkms_drv.h   | 24 +++++-----
>  drivers/gpu/drm/vkms/vkms_plane.c |  8 ++++
>  5 files changed, 146 insertions(+), 75 deletions(-)
>
> --
> 2.20.1
>



-- 

Rodrigo Siqueira
https://siqueira.tech

* Re: [PATCH 01/10] drm/vkms: Fix crc worker races
  2019-06-06 22:27 ` [PATCH 01/10] drm/vkms: Fix crc worker races Daniel Vetter
@ 2019-06-12 13:33   ` Rodrigo Siqueira
  2019-06-12 14:48     ` Daniel Vetter
  0 siblings, 1 reply; 44+ messages in thread
From: Rodrigo Siqueira @ 2019-06-12 13:33 UTC (permalink / raw)
  To: Daniel Vetter
  Cc: Shayenne Moura, Haneen Mohammed, DRI Development, Daniel Vetter

On Thu, Jun 6, 2019 at 7:28 PM Daniel Vetter <daniel.vetter@ffwll.ch> wrote:
>
> The issue we have is that the crc worker might fall behind. We've
> tried to handle this by tracking both the earliest frame for which it
> still needs to compute a crc, and the last one. Plus when the
> crtc_state changes, we get a new work item; the work items are all run
> in order due to the ordered workqueue we allocate for each vkms crtc.
>
> Trouble is, there have been a few small issues in the current code:
> - we need to capture frame_end in the vblank hrtimer, not in the
>   worker. The worker might run much later, and then we generate a lot
>   of crc entries for which there's already a different worker queued up.
> - frame number might be 0, so create a new crc_pending boolean to
>   track this without confusion.
> - we need to atomically grab frame_start/end and clear it, so do that
>   all in one go. This is not going to create a new race, because if we
>   race with the hrtimer then our work will be re-run.
> - only race that can happen is the following:
>   1. worker starts
>   2. hrtimer runs and updates frame_end
>   3. worker grabs frame_start/end, already reading the new frame_end,
>   and clears crc_pending
>   4. hrtimer calls queue_work()
>   5. worker completes
>   6. worker gets re-run, crc_pending is false
>   Explain this case a bit better by rewording the comment.
>
> v2: Demote warning level output to debug when we fail to requeue; this
> is expected under high load when the crc worker can't quite keep up.
>
> Cc: Shayenne Moura <shayenneluzmoura@gmail.com>
> Cc: Rodrigo Siqueira <rodrigosiqueiramelo@gmail.com>
> Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
> Cc: Haneen Mohammed <hamohammed.sa@gmail.com>
> Cc: Daniel Vetter <daniel@ffwll.ch>
> ---
>  drivers/gpu/drm/vkms/vkms_crc.c  | 27 +++++++++++++--------------
>  drivers/gpu/drm/vkms/vkms_crtc.c |  9 +++++++--
>  drivers/gpu/drm/vkms/vkms_drv.h  |  2 ++
>  3 files changed, 22 insertions(+), 16 deletions(-)
>
> diff --git a/drivers/gpu/drm/vkms/vkms_crc.c b/drivers/gpu/drm/vkms/vkms_crc.c
> index d7b409a3c0f8..66603da634fe 100644
> --- a/drivers/gpu/drm/vkms/vkms_crc.c
> +++ b/drivers/gpu/drm/vkms/vkms_crc.c
> @@ -166,16 +166,24 @@ void vkms_crc_work_handle(struct work_struct *work)
>         struct drm_plane *plane;
>         u32 crc32 = 0;
>         u64 frame_start, frame_end;
> +       bool crc_pending;
>         unsigned long flags;
>
>         spin_lock_irqsave(&out->state_lock, flags);
>         frame_start = crtc_state->frame_start;
>         frame_end = crtc_state->frame_end;
> +       crc_pending = crtc_state->crc_pending;
> +       crtc_state->frame_start = 0;
> +       crtc_state->frame_end = 0;
> +       crtc_state->crc_pending = false;
>         spin_unlock_irqrestore(&out->state_lock, flags);
>
> -       /* _vblank_handle() hasn't updated frame_start yet */
> -       if (!frame_start || frame_start == frame_end)
> -               goto out;
> +       /*
> +        * We raced with the vblank hrtimer and previous work already computed
> +        * the crc, nothing to do.
> +        */
> +       if (!crc_pending)
> +               return;

I think this condition is not reachable because crc_pending will be
set to true in `vkms_vblank_simulate()`, which in turn schedules
`vkms_crc_work_handle()`. Just to check, I tried to reach this
condition by running kms_flip, kms_pipe_crc_basic, and kms_cursor_crc
with two different VM setups[1], but I couldn't reach it. What do you
think?

[1] Qemu parameters
VM1: -m 1G -smp cores=2,cpus=2
VM2: -enable-kvm -m 2G -smp cores=4,cpus=4

>         drm_for_each_plane(plane, &vdev->drm) {
>                 struct vkms_plane_state *vplane_state;
> @@ -196,20 +204,11 @@ void vkms_crc_work_handle(struct work_struct *work)
>         if (primary_crc)
>                 crc32 = _vkms_get_crc(primary_crc, cursor_crc);
>
> -       frame_end = drm_crtc_accurate_vblank_count(crtc);
> -
> -       /* queue_work can fail to schedule crc_work; add crc for
> -        * missing frames
> +       /*
> +        * The worker can fall behind the vblank hrtimer, make sure we catch up.
>          */
>         while (frame_start <= frame_end)
>                 drm_crtc_add_crc_entry(crtc, true, frame_start++, &crc32);

I want to take this opportunity to ask about this while loop; it's
not really specific to this patch.

I have to admit that I never fully got the idea behind this 'while';
it looks like we just fill in the missed frames with a repeated
value. FWIU, `drm_crtc_add_crc_entry()` will add an entry with the CRC
information for a frame, but in this case, we are adding the same CRC
for a different set of frames. I agree that nearby frames have similar
CRC values, but can we rely on this all the time? What could happen if
there is a great difference between frame_start and frame_end?

> -
> -out:
> -       /* to avoid using the same value for frame number again */
> -       spin_lock_irqsave(&out->state_lock, flags);
> -       crtc_state->frame_end = frame_end;
> -       crtc_state->frame_start = 0;
> -       spin_unlock_irqrestore(&out->state_lock, flags);
>  }
>
>  static int vkms_crc_parse_source(const char *src_name, bool *enabled)
> diff --git a/drivers/gpu/drm/vkms/vkms_crtc.c b/drivers/gpu/drm/vkms/vkms_crtc.c
> index 1bbe099b7db8..c727d8486e97 100644
> --- a/drivers/gpu/drm/vkms/vkms_crtc.c
> +++ b/drivers/gpu/drm/vkms/vkms_crtc.c
> @@ -30,13 +30,18 @@ static enum hrtimer_restart vkms_vblank_simulate(struct hrtimer *timer)
>                  * has read the data
>                  */
>                 spin_lock(&output->state_lock);
> -               if (!state->frame_start)
> +               if (!state->crc_pending)
>                         state->frame_start = frame;
> +               else
> +                       DRM_DEBUG_DRIVER("crc worker falling behind, frame_start: %llu, frame_end: %llu\n",
> +                                        state->frame_start, frame);
> +               state->frame_end = frame;
> +               state->crc_pending = true;
>                 spin_unlock(&output->state_lock);
>
>                 ret = queue_work(output->crc_workq, &state->crc_work);
>                 if (!ret)
> -                       DRM_WARN("failed to queue vkms_crc_work_handle");
> +                       DRM_DEBUG_DRIVER("vkms_crc_work_handle already queued\n");
>         }
>
>         spin_unlock(&output->lock);
> diff --git a/drivers/gpu/drm/vkms/vkms_drv.h b/drivers/gpu/drm/vkms/vkms_drv.h
> index 81f1cfbeb936..3c7e06b19efd 100644
> --- a/drivers/gpu/drm/vkms/vkms_drv.h
> +++ b/drivers/gpu/drm/vkms/vkms_drv.h
> @@ -56,6 +56,8 @@ struct vkms_plane_state {
>  struct vkms_crtc_state {
>         struct drm_crtc_state base;
>         struct work_struct crc_work;
> +
> +       bool crc_pending;
>         u64 frame_start;
>         u64 frame_end;
>  };
> --
> 2.20.1
>


-- 

Rodrigo Siqueira
https://siqueira.tech

* Re: [PATCH 02/10] drm/vkms: Use spin_lock_irq in process context
  2019-06-06 22:27 ` [PATCH 02/10] drm/vkms: Use spin_lock_irq in process context Daniel Vetter
@ 2019-06-12 13:34   ` Rodrigo Siqueira
  2019-06-12 14:54     ` Daniel Vetter
  0 siblings, 1 reply; 44+ messages in thread
From: Rodrigo Siqueira @ 2019-06-12 13:34 UTC (permalink / raw)
  To: Daniel Vetter
  Cc: Shayenne Moura, Haneen Mohammed, DRI Development, Daniel Vetter

On Thu, Jun 6, 2019 at 7:28 PM Daniel Vetter <daniel.vetter@ffwll.ch> wrote:
>
> The worker always runs in process context, so there's no need for the
> _irqsave version. Same for the set_source callback, which is only
> called from the debugfs handler in a syscall.
>
> Cc: Shayenne Moura <shayenneluzmoura@gmail.com>
> Cc: Rodrigo Siqueira <rodrigosiqueiramelo@gmail.com>
> Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
> Cc: Haneen Mohammed <hamohammed.sa@gmail.com>
> Cc: Daniel Vetter <daniel@ffwll.ch>
> ---
>  drivers/gpu/drm/vkms/vkms_crc.c | 10 ++++------
>  1 file changed, 4 insertions(+), 6 deletions(-)
>
> diff --git a/drivers/gpu/drm/vkms/vkms_crc.c b/drivers/gpu/drm/vkms/vkms_crc.c
> index 66603da634fe..883e36fe7b6e 100644
> --- a/drivers/gpu/drm/vkms/vkms_crc.c
> +++ b/drivers/gpu/drm/vkms/vkms_crc.c
> @@ -167,16 +167,15 @@ void vkms_crc_work_handle(struct work_struct *work)
>         u32 crc32 = 0;
>         u64 frame_start, frame_end;
>         bool crc_pending;
> -       unsigned long flags;
>
> -       spin_lock_irqsave(&out->state_lock, flags);
> +       spin_lock_irq(&out->state_lock);
>         frame_start = crtc_state->frame_start;
>         frame_end = crtc_state->frame_end;
>         crc_pending = crtc_state->crc_pending;
>         crtc_state->frame_start = 0;
>         crtc_state->frame_end = 0;
>         crtc_state->crc_pending = false;
> -       spin_unlock_irqrestore(&out->state_lock, flags);
> +       spin_unlock_irq(&out->state_lock);
>
>         /*
>          * We raced with the vblank hrtimer and previous work already computed
> @@ -246,7 +245,6 @@ int vkms_set_crc_source(struct drm_crtc *crtc, const char *src_name)
>  {
>         struct vkms_output *out = drm_crtc_to_vkms_output(crtc);
>         bool enabled = false;
> -       unsigned long flags;
>         int ret = 0;
>
>         ret = vkms_crc_parse_source(src_name, &enabled);
> @@ -254,9 +252,9 @@ int vkms_set_crc_source(struct drm_crtc *crtc, const char *src_name)
>         /* make sure nothing is scheduled on crtc workq */
>         flush_workqueue(out->crc_workq);
>
> -       spin_lock_irqsave(&out->lock, flags);
> +       spin_lock_irq(&out->lock);
>         out->crc_enabled = enabled;
> -       spin_unlock_irqrestore(&out->lock, flags);

I was wondering if we could use atomic_t for crc_enabled and avoid
this sort of lock. I did not try to change the data type; this is just
an idea.

Reviewed-by: Rodrigo Siqueira <rodrigosiqueiramelo@gmail.com>
Tested-by: Rodrigo Siqueira <rodrigosiqueiramelo@gmail.com>

> +       spin_unlock_irq(&out->lock);
>
>         return ret;
>  }
> --
> 2.20.1
>


-- 

Rodrigo Siqueira
https://siqueira.tech

* Re: [PATCH 03/10] drm/vkms: Rename vkms_output.state_lock to crc_lock
  2019-06-06 22:27 ` [PATCH 03/10] drm/vkms: Rename vkms_output.state_lock to crc_lock Daniel Vetter
@ 2019-06-12 13:38   ` Rodrigo Siqueira
  2019-06-13  7:48     ` Daniel Vetter
  0 siblings, 1 reply; 44+ messages in thread
From: Rodrigo Siqueira @ 2019-06-12 13:38 UTC (permalink / raw)
  To: Daniel Vetter; +Cc: Daniel Vetter, DRI Development, Haneen Mohammed

On Thu, Jun 6, 2019 at 7:28 PM Daniel Vetter <daniel.vetter@ffwll.ch> wrote:
>
> Plus add a comment about what it actually protects. It's very little.
>
> Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
> Cc: Rodrigo Siqueira <rodrigosiqueiramelo@gmail.com>
> Cc: Haneen Mohammed <hamohammed.sa@gmail.com>
> Cc: Daniel Vetter <daniel@ffwll.ch>
> ---
>  drivers/gpu/drm/vkms/vkms_crc.c  | 4 ++--
>  drivers/gpu/drm/vkms/vkms_crtc.c | 6 +++---
>  drivers/gpu/drm/vkms/vkms_drv.h  | 5 +++--
>  3 files changed, 8 insertions(+), 7 deletions(-)
>
> diff --git a/drivers/gpu/drm/vkms/vkms_crc.c b/drivers/gpu/drm/vkms/vkms_crc.c
> index 883e36fe7b6e..96806cd35ad4 100644
> --- a/drivers/gpu/drm/vkms/vkms_crc.c
> +++ b/drivers/gpu/drm/vkms/vkms_crc.c
> @@ -168,14 +168,14 @@ void vkms_crc_work_handle(struct work_struct *work)
>         u64 frame_start, frame_end;
>         bool crc_pending;
>
> -       spin_lock_irq(&out->state_lock);
> +       spin_lock_irq(&out->crc_lock);
>         frame_start = crtc_state->frame_start;
>         frame_end = crtc_state->frame_end;
>         crc_pending = crtc_state->crc_pending;
>         crtc_state->frame_start = 0;
>         crtc_state->frame_end = 0;
>         crtc_state->crc_pending = false;
> -       spin_unlock_irq(&out->state_lock);
> +       spin_unlock_irq(&out->crc_lock);
>
>         /*
>          * We raced with the vblank hrtimer and previous work already computed
> diff --git a/drivers/gpu/drm/vkms/vkms_crtc.c b/drivers/gpu/drm/vkms/vkms_crtc.c
> index c727d8486e97..55b16d545fe7 100644
> --- a/drivers/gpu/drm/vkms/vkms_crtc.c
> +++ b/drivers/gpu/drm/vkms/vkms_crtc.c
> @@ -29,7 +29,7 @@ static enum hrtimer_restart vkms_vblank_simulate(struct hrtimer *timer)
>                 /* update frame_start only if a queued vkms_crc_work_handle()
>                  * has read the data
>                  */
> -               spin_lock(&output->state_lock);
> +               spin_lock(&output->crc_lock);
>                 if (!state->crc_pending)
>                         state->frame_start = frame;
>                 else
> @@ -37,7 +37,7 @@ static enum hrtimer_restart vkms_vblank_simulate(struct hrtimer *timer)
>                                          state->frame_start, frame);
>                 state->frame_end = frame;
>                 state->crc_pending = true;
> -               spin_unlock(&output->state_lock);
> +               spin_unlock(&output->crc_lock);
>
>                 ret = queue_work(output->crc_workq, &state->crc_work);
>                 if (!ret)
> @@ -224,7 +224,7 @@ int vkms_crtc_init(struct drm_device *dev, struct drm_crtc *crtc,
>         drm_crtc_helper_add(crtc, &vkms_crtc_helper_funcs);
>
>         spin_lock_init(&vkms_out->lock);
> -       spin_lock_init(&vkms_out->state_lock);
> +       spin_lock_init(&vkms_out->crc_lock);
>
>         vkms_out->crc_workq = alloc_ordered_workqueue("vkms_crc_workq", 0);
>         if (!vkms_out->crc_workq)
> diff --git a/drivers/gpu/drm/vkms/vkms_drv.h b/drivers/gpu/drm/vkms/vkms_drv.h
> index 3c7e06b19efd..43d3f98289fe 100644
> --- a/drivers/gpu/drm/vkms/vkms_drv.h
> +++ b/drivers/gpu/drm/vkms/vkms_drv.h
> @@ -57,6 +57,7 @@ struct vkms_crtc_state {
>         struct drm_crtc_state base;
>         struct work_struct crc_work;
>
> +       /* below three are protected by vkms_output.crc_lock */
>         bool crc_pending;
>         u64 frame_start;
>         u64 frame_end;
> @@ -74,8 +75,8 @@ struct vkms_output {
>         struct workqueue_struct *crc_workq;
>         /* protects concurrent access to crc_data */
>         spinlock_t lock;

It's not really specific to this patch, but after reviewing it, I was
thinking about the use of the 'lock' field in struct vkms_output.
Do we really need it? It looks like crc_lock just replaced it.

Additionally, going a little bit deeper into the lock field in struct
vkms_output, I was also asking myself: what critical section do we want
to protect with this lock? After inspecting the use of this lock, I
noticed two different places that use it:

1. In the function vkms_vblank_simulate()

Note that we cover a vast area with ‘output->lock’. IMHO we just need
to protect the state variable (line “state = output->crc_state;”) and
we can also use crc_lock for this case. Make sense?

2. In the function vkms_crtc_atomic_begin()

We also hold output->lock in the vkms_crtc_atomic_begin() and just
release it at vkms_crtc_atomic_flush(). Do we still need it?

> -       /* protects concurrent access to crtc_state */
> -       spinlock_t state_lock;
> +
> +       spinlock_t crc_lock;
>  };

Maybe add a kernel doc on top of crc_lock?
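
Something along these lines, just as a sketch (based on the comment the
patch adds to vkms_crtc_state):

	/**
	 * @crc_lock: protects the crc_pending, frame_start and frame_end
	 * fields in vkms_crtc_state
	 */
	spinlock_t crc_lock;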

>
>  struct vkms_device {
> --
> 2.20.1
>


-- 

Rodrigo Siqueira
https://siqueira.tech

* Re: [PATCH 04/10] drm/vkms: Move format arrays to vkms_plane.c
  2019-06-06 22:27 ` [PATCH 04/10] drm/vkms: Move format arrays to vkms_plane.c Daniel Vetter
@ 2019-06-12 13:39   ` Rodrigo Siqueira
  2019-06-19  2:12   ` Rodrigo Siqueira
  1 sibling, 0 replies; 44+ messages in thread
From: Rodrigo Siqueira @ 2019-06-12 13:39 UTC (permalink / raw)
  To: Daniel Vetter; +Cc: Daniel Vetter, DRI Development, Haneen Mohammed

On Thu, Jun 6, 2019 at 7:28 PM Daniel Vetter <daniel.vetter@ffwll.ch> wrote:
>
> No need to have them multiple times.
>
> Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
> Cc: Rodrigo Siqueira <rodrigosiqueiramelo@gmail.com>
> Cc: Haneen Mohammed <hamohammed.sa@gmail.com>
> Cc: Daniel Vetter <daniel@ffwll.ch>
> ---
>  drivers/gpu/drm/vkms/vkms_drv.h   | 8 --------
>  drivers/gpu/drm/vkms/vkms_plane.c | 8 ++++++++
>  2 files changed, 8 insertions(+), 8 deletions(-)
>
> diff --git a/drivers/gpu/drm/vkms/vkms_drv.h b/drivers/gpu/drm/vkms/vkms_drv.h
> index 43d3f98289fe..2a35299bfb89 100644
> --- a/drivers/gpu/drm/vkms/vkms_drv.h
> +++ b/drivers/gpu/drm/vkms/vkms_drv.h
> @@ -20,14 +20,6 @@
>
>  extern bool enable_cursor;
>
> -static const u32 vkms_formats[] = {
> -       DRM_FORMAT_XRGB8888,
> -};
> -
> -static const u32 vkms_cursor_formats[] = {
> -       DRM_FORMAT_ARGB8888,
> -};
> -
>  struct vkms_crc_data {
>         struct drm_framebuffer fb;
>         struct drm_rect src, dst;
> diff --git a/drivers/gpu/drm/vkms/vkms_plane.c b/drivers/gpu/drm/vkms/vkms_plane.c
> index 0e67d2d42f0c..0fceb6258422 100644
> --- a/drivers/gpu/drm/vkms/vkms_plane.c
> +++ b/drivers/gpu/drm/vkms/vkms_plane.c
> @@ -6,6 +6,14 @@
>  #include <drm/drm_atomic_helper.h>
>  #include <drm/drm_gem_framebuffer_helper.h>
>
> +static const u32 vkms_formats[] = {
> +       DRM_FORMAT_XRGB8888,
> +};
> +
> +static const u32 vkms_cursor_formats[] = {
> +       DRM_FORMAT_ARGB8888,
> +};
> +
>  static struct drm_plane_state *
>  vkms_plane_duplicate_state(struct drm_plane *plane)
>  {
> --
> 2.20.1
>
Reviewed-by: Rodrigo Siqueira <rodrigosiqueiramelo@gmail.com>
Tested-by: Rodrigo Siqueira <rodrigosiqueiramelo@gmail.com>

-- 

Rodrigo Siqueira
https://siqueira.tech

* Re: [PATCH 06/10] drm/vkms: flush crc workers earlier in commit flow
  2019-06-06 22:27 ` [PATCH 06/10] drm/vkms: flush crc workers earlier in commit flow Daniel Vetter
@ 2019-06-12 13:42   ` Rodrigo Siqueira
  2019-06-13  7:53     ` Daniel Vetter
  0 siblings, 1 reply; 44+ messages in thread
From: Rodrigo Siqueira @ 2019-06-12 13:42 UTC (permalink / raw)
  To: Daniel Vetter; +Cc: Daniel Vetter, DRI Development, Haneen Mohammed

On Thu, Jun 6, 2019 at 7:28 PM Daniel Vetter <daniel.vetter@ffwll.ch> wrote:
>
> Currently we flush pending crc workers very late in the commit flow,
> when we destry all the old crtc states. Unfortunately at that point

destry -> destroy

> the framebuffers are already unpinned (and our vaddr possible gone),
> so this isn't good. Also, the plane_states we need might also already
> be cleaned up, since cleanup order of state structures isn't well
> defined.
>
> Fix this by waiting for all crc workers of the old state to complete
> before we start any of the cleanup work.
>
> Note that this is not yet race-free, because the hrtimer and crc
> worker look at the wrong state pointers, but that will be fixed in
> subsequent patches.
>
> Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
> Cc: Rodrigo Siqueira <rodrigosiqueiramelo@gmail.com>
> Cc: Haneen Mohammed <hamohammed.sa@gmail.com>
> Cc: Daniel Vetter <daniel@ffwll.ch>
> ---
>  drivers/gpu/drm/vkms/vkms_crtc.c |  2 +-
>  drivers/gpu/drm/vkms/vkms_drv.c  | 10 ++++++++++
>  2 files changed, 11 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/vkms/vkms_crtc.c b/drivers/gpu/drm/vkms/vkms_crtc.c
> index 55b16d545fe7..b6987d90805f 100644
> --- a/drivers/gpu/drm/vkms/vkms_crtc.c
> +++ b/drivers/gpu/drm/vkms/vkms_crtc.c
> @@ -125,7 +125,7 @@ static void vkms_atomic_crtc_destroy_state(struct drm_crtc *crtc,
>         __drm_atomic_helper_crtc_destroy_state(state);
>
>         if (vkms_state) {
> -               flush_work(&vkms_state->crc_work);
> +               WARN_ON(work_pending(&vkms_state->crc_work));
>                 kfree(vkms_state);
>         }
>  }
> diff --git a/drivers/gpu/drm/vkms/vkms_drv.c b/drivers/gpu/drm/vkms/vkms_drv.c
> index f677ab1d0094..cc53ef88a331 100644
> --- a/drivers/gpu/drm/vkms/vkms_drv.c
> +++ b/drivers/gpu/drm/vkms/vkms_drv.c
> @@ -62,6 +62,9 @@ static void vkms_release(struct drm_device *dev)
>  static void vkms_atomic_commit_tail(struct drm_atomic_state *old_state)
>  {
>         struct drm_device *dev = old_state->dev;
> +       struct drm_crtc *crtc;
> +       struct drm_crtc_state *old_crtc_state;
> +       int i;
>
>         drm_atomic_helper_commit_modeset_disables(dev, old_state);
>
> @@ -75,6 +78,13 @@ static void vkms_atomic_commit_tail(struct drm_atomic_state *old_state)
>
>         drm_atomic_helper_wait_for_vblanks(dev, old_state);
>
> +       for_each_old_crtc_in_state(old_state, crtc, old_crtc_state, i) {
> +               struct vkms_crtc_state *vkms_state =
> +                       to_vkms_crtc_state(old_crtc_state);
> +
> +               flush_work(&vkms_state->crc_work);
> +       }
> +
>         drm_atomic_helper_cleanup_planes(dev, old_state);
>  }

why not use drm_atomic_helper_commit_tail() here? I mean:

for_each_old_crtc_in_state(old_state, crtc, old_crtc_state, i) {
…
}

drm_atomic_helper_commit_tail(old_state);
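
In full, the idea would look something like this (this is what I tried):

	static void vkms_atomic_commit_tail(struct drm_atomic_state *old_state)
	{
		struct drm_crtc *crtc;
		struct drm_crtc_state *old_crtc_state;
		int i;

		for_each_old_crtc_in_state(old_state, crtc, old_crtc_state, i) {
			struct vkms_crtc_state *vkms_state =
				to_vkms_crtc_state(old_crtc_state);

			flush_work(&vkms_state->crc_work);
		}

		drm_atomic_helper_commit_tail(old_state);
	}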

After looking at drm_atomic_helper_cleanup_planes(), using the above
code sounds safe to me; I tested it with two tests from
kms_cursor_crc. Maybe I missed something, could you help me here?

Finally, IMHO patches 05, 06 and 07 could be squashed into a single
patch to make the change easier to understand.

> --
> 2.20.1
>


-- 

Rodrigo Siqueira
https://siqueira.tech

* Re: [PATCH 08/10] drm/vkms: No _irqsave within spin_lock_irq needed
  2019-06-06 22:27 ` [PATCH 08/10] drm/vkms: No _irqsave within spin_lock_irq needed Daniel Vetter
@ 2019-06-12 13:43   ` Rodrigo Siqueira
  0 siblings, 0 replies; 44+ messages in thread
From: Rodrigo Siqueira @ 2019-06-12 13:43 UTC (permalink / raw)
  To: Daniel Vetter; +Cc: Daniel Vetter, DRI Development

On Thu, Jun 6, 2019 at 7:28 PM Daniel Vetter <daniel.vetter@ffwll.ch> wrote:
>
> irqs are already off.
>
> Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
> ---
>  drivers/gpu/drm/vkms/vkms_crtc.c | 5 ++---
>  1 file changed, 2 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/gpu/drm/vkms/vkms_crtc.c b/drivers/gpu/drm/vkms/vkms_crtc.c
> index b6987d90805f..48a793ba4030 100644
> --- a/drivers/gpu/drm/vkms/vkms_crtc.c
> +++ b/drivers/gpu/drm/vkms/vkms_crtc.c
> @@ -183,17 +183,16 @@ static void vkms_crtc_atomic_flush(struct drm_crtc *crtc,
>                                    struct drm_crtc_state *old_crtc_state)
>  {
>         struct vkms_output *vkms_output = drm_crtc_to_vkms_output(crtc);
> -       unsigned long flags;
>
>         if (crtc->state->event) {
> -               spin_lock_irqsave(&crtc->dev->event_lock, flags);
> +               spin_lock(&crtc->dev->event_lock);
>
>                 if (drm_crtc_vblank_get(crtc) != 0)
>                         drm_crtc_send_vblank_event(crtc, crtc->state->event);
>                 else
>                         drm_crtc_arm_vblank_event(crtc, crtc->state->event);
>
> -               spin_unlock_irqrestore(&crtc->dev->event_lock, flags);
> +               spin_unlock(&crtc->dev->event_lock);
>
>                 crtc->state->event = NULL;
>         }
> --
> 2.20.1
>

Reviewed-by: Rodrigo Siqueira <rodrigosiqueiramelo@gmail.com>
Tested-by: Rodrigo Siqueira <rodrigosiqueiramelo@gmail.com>

-- 

Rodrigo Siqueira
https://siqueira.tech

* Re: [PATCH 09/10] drm/vkms: totally reworked crc data tracking
  2019-06-06 22:27 ` [PATCH 09/10] drm/vkms: totally reworked crc data tracking Daniel Vetter
@ 2019-06-12 13:46   ` Rodrigo Siqueira
  2019-06-13  7:59     ` Daniel Vetter
  0 siblings, 1 reply; 44+ messages in thread
From: Rodrigo Siqueira @ 2019-06-12 13:46 UTC (permalink / raw)
  To: Daniel Vetter; +Cc: Daniel Vetter, DRI Development, Haneen Mohammed

On Thu, Jun 6, 2019 at 7:28 PM Daniel Vetter <daniel.vetter@ffwll.ch> wrote:
>
> The crc computation worker needs to be able to get at some data
> structures and framebuffer mappings, while potentially more atomic
> updates are going on. The solution thus far is to copy relevant bits
> around, but that's very tedious.
>
> Here's a new approach, which tries to be more clever, but relies on a
> few not-so-obvious things:
> - crtc_state is always updated when a plane_state changes. Therefore
>   we can just stuff plane_state pointers into a crtc_state. That
>   solves the problem of easily getting at the needed plane_states.

Just out of curiosity, where exactly is the crtc_state updated? If
possible, could you elaborate a little bit on this?

> - with the flushing changes from previous patches the above also holds
>   without races due to the next atomic update being a bit eager with
>   cleaning up pending work - we always wait for all crc work items to
>   complete before unmapping framebuffers.
> - we also need to make sure that the hrtimer fires off the right
>   worker. Keep a new distinct crc_state pointer, under the
>   vkms_output->lock protection for this. Note that crtc->state is
>   updated very early in the atomic commit, way before we arm the
>   vblank event - the vblank event should always match the buffers we
>   use to compute the crc. This also solves an issue in the hrtimer,
>   where we've accessed drm_crtc->state without holding the right locks
>   (we held none - oops).
> - in the worker itself we can then just access the plane states we
>   need, again solving a bunch of ordering and locking issues.
>   Accessing plane->state requires locks, accessing the private
>   vkms_crtc_state->active_planes pointer only requires that the memory
>   doesn't get freed too early.
>
> The idea behind vkms_crtc_state->active_planes is that this would
> contain all visible planes, in z-order, as a first step towards a more
> generic blending implementation.
>
> Note that this patch also fixes races between prepare_fb/cleanup_fb
> and the crc worker accessing ->vaddr.
>
> Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
> Cc: Rodrigo Siqueira <rodrigosiqueiramelo@gmail.com>
> Cc: Haneen Mohammed <hamohammed.sa@gmail.com>
> Cc: Daniel Vetter <daniel@ffwll.ch>
> ---
>  drivers/gpu/drm/vkms/vkms_crc.c  | 21 +++--------
>  drivers/gpu/drm/vkms/vkms_crtc.c | 60 +++++++++++++++++++++++++++++---
>  drivers/gpu/drm/vkms/vkms_drv.h  |  9 ++++-
>  3 files changed, 67 insertions(+), 23 deletions(-)
>
> diff --git a/drivers/gpu/drm/vkms/vkms_crc.c b/drivers/gpu/drm/vkms/vkms_crc.c
> index 9d15e5e85830..0d31cfc32042 100644
> --- a/drivers/gpu/drm/vkms/vkms_crc.c
> +++ b/drivers/gpu/drm/vkms/vkms_crc.c
> @@ -159,11 +159,8 @@ void vkms_crc_work_handle(struct work_struct *work)
>                                                 crc_work);
>         struct drm_crtc *crtc = crtc_state->base.crtc;
>         struct vkms_output *out = drm_crtc_to_vkms_output(crtc);
> -       struct vkms_device *vdev = container_of(out, struct vkms_device,
> -                                               output);
>         struct vkms_crc_data *primary_crc = NULL;
>         struct vkms_crc_data *cursor_crc = NULL;
> -       struct drm_plane *plane;
>         u32 crc32 = 0;
>         u64 frame_start, frame_end;
>         bool crc_pending;
> @@ -184,21 +181,11 @@ void vkms_crc_work_handle(struct work_struct *work)
>         if (!crc_pending)
>                 return;
>
> -       drm_for_each_plane(plane, &vdev->drm) {
> -               struct vkms_plane_state *vplane_state;
> -               struct vkms_crc_data *crc_data;
> +       if (crtc_state->num_active_planes >= 1)
> +               primary_crc = crtc_state->active_planes[0]->crc_data;
>
> -               vplane_state = to_vkms_plane_state(plane->state);
> -               crc_data = vplane_state->crc_data;
> -
> -               if (drm_framebuffer_read_refcount(&crc_data->fb) == 0)
> -                       continue;
> -
> -               if (plane->type == DRM_PLANE_TYPE_PRIMARY)
> -                       primary_crc = crc_data;
> -               else
> -                       cursor_crc = crc_data;
> -       }
> +       if (crtc_state->num_active_planes == 2)
> +               cursor_crc = crtc_state->active_planes[1]->crc_data;
>
>         if (primary_crc)
>                 crc32 = _vkms_get_crc(primary_crc, cursor_crc);
> diff --git a/drivers/gpu/drm/vkms/vkms_crtc.c b/drivers/gpu/drm/vkms/vkms_crtc.c
> index 48a793ba4030..14156ed70415 100644
> --- a/drivers/gpu/drm/vkms/vkms_crtc.c
> +++ b/drivers/gpu/drm/vkms/vkms_crtc.c
> @@ -1,6 +1,7 @@
>  // SPDX-License-Identifier: GPL-2.0+
>
>  #include "vkms_drv.h"
> +#include <drm/drm_atomic.h>
>  #include <drm/drm_atomic_helper.h>
>  #include <drm/drm_probe_helper.h>
>
> @@ -9,7 +10,7 @@ static enum hrtimer_restart vkms_vblank_simulate(struct hrtimer *timer)
>         struct vkms_output *output = container_of(timer, struct vkms_output,
>                                                   vblank_hrtimer);
>         struct drm_crtc *crtc = &output->crtc;
> -       struct vkms_crtc_state *state = to_vkms_crtc_state(crtc->state);
> +       struct vkms_crtc_state *state;
>         u64 ret_overrun;
>         bool ret;
>
> @@ -23,6 +24,7 @@ static enum hrtimer_restart vkms_vblank_simulate(struct hrtimer *timer)
>         if (!ret)
>                 DRM_ERROR("vkms failure on handling vblank");
>
> +       state = output->crc_state;
>         if (state && output->crc_enabled) {
>                 u64 frame = drm_crtc_accurate_vblank_count(crtc);
>
> @@ -124,10 +126,9 @@ static void vkms_atomic_crtc_destroy_state(struct drm_crtc *crtc,
>
>         __drm_atomic_helper_crtc_destroy_state(state);
>
> -       if (vkms_state) {
> -               WARN_ON(work_pending(&vkms_state->crc_work));
> -               kfree(vkms_state);
> -       }
> +       WARN_ON(work_pending(&vkms_state->crc_work));
> +       kfree(vkms_state->active_planes);
> +       kfree(vkms_state);
>  }
>
>  static void vkms_atomic_crtc_reset(struct drm_crtc *crtc)
> @@ -156,6 +157,52 @@ static const struct drm_crtc_funcs vkms_crtc_funcs = {
>         .verify_crc_source      = vkms_verify_crc_source,
>  };
>
> +static int vkms_crtc_atomic_check(struct drm_crtc *crtc,
> +                                  struct drm_crtc_state *state)
> +{
> +       struct vkms_crtc_state *vkms_state = to_vkms_crtc_state(state);
> +       struct drm_plane *plane;
> +       struct drm_plane_state *plane_state;
> +       int i = 0, ret;
> +
> +       if (vkms_state->active_planes)
> +               return 0;
> +
> +       ret = drm_atomic_add_affected_planes(state->state, crtc);
> +       if (ret < 0)
> +               return ret;
> +
> +       drm_for_each_plane_mask(plane, crtc->dev, state->plane_mask) {

The drm_for_each_plane_mask documentation says: “Iterate over all
planes specified by bitmask”. I did not understand what it means by
“specified by bitmask”, nor the use of this macro in this context. I
tried to replace it with drm_for_each_plane, but the tests just break.
Could you explain a little bit of the magic behind
drm_for_each_plane_mask?
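
For reference, this is roughly how the macro is defined in
include/drm/drm_plane.h (quoting from memory, so double-check):

	#define drm_for_each_plane_mask(plane, dev, plane_mask) \
		list_for_each_entry((plane), &(dev)->mode_config.plane_list, head) \
			for_each_if ((plane_mask) & drm_plane_mask(plane))

i.e., it walks the device's full plane list but skips every plane whose
bit is not set in plane_mask - in this case the bits in
crtc_state->plane_mask.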

> +               plane_state = drm_atomic_get_existing_plane_state(state->state,
> +                                                                 plane);
> +               WARN_ON(!plane_state);
> +
> +               if (!plane_state->visible)
> +                       continue;
> +
> +               i++;
> +       }
> +
> +       vkms_state->active_planes = kcalloc(i, sizeof(plane), GFP_KERNEL);
> +       if (!vkms_state->active_planes)
> +               return -ENOMEM;
> +       vkms_state->num_active_planes = i;
> +
> +       i = 0;
> +       drm_for_each_plane_mask(plane, crtc->dev, state->plane_mask) {
> +               plane_state = drm_atomic_get_existing_plane_state(state->state,
> +                                                                 plane);
> +
> +               if (!plane_state->visible)
> +                       continue;
> +
> +               vkms_state->active_planes[i++] =
> +                       to_vkms_plane_state(plane_state);
> +       }
> +
> +       return 0;
> +}
> +
>  static void vkms_crtc_atomic_enable(struct drm_crtc *crtc,
>                                     struct drm_crtc_state *old_state)
>  {
> @@ -197,10 +244,13 @@ static void vkms_crtc_atomic_flush(struct drm_crtc *crtc,
>                 crtc->state->event = NULL;
>         }
>
> +       vkms_output->crc_state = to_vkms_crtc_state(crtc->state);
> +
>         spin_unlock_irq(&vkms_output->lock);
>  }
>
>  static const struct drm_crtc_helper_funcs vkms_crtc_helper_funcs = {
> +       .atomic_check   = vkms_crtc_atomic_check,
>         .atomic_begin   = vkms_crtc_atomic_begin,
>         .atomic_flush   = vkms_crtc_atomic_flush,
>         .atomic_enable  = vkms_crtc_atomic_enable,
> diff --git a/drivers/gpu/drm/vkms/vkms_drv.h b/drivers/gpu/drm/vkms/vkms_drv.h
> index 2a35299bfb89..4e7450111d45 100644
> --- a/drivers/gpu/drm/vkms/vkms_drv.h
> +++ b/drivers/gpu/drm/vkms/vkms_drv.h
> @@ -49,6 +49,10 @@ struct vkms_crtc_state {
>         struct drm_crtc_state base;
>         struct work_struct crc_work;
>
> +       int num_active_planes;
> +       /* stack of active planes for crc computation, should be in z order */
> +       struct vkms_plane_state **active_planes;
> +
>         /* below three are protected by vkms_output.crc_lock */
>         bool crc_pending;
>         u64 frame_start;
> @@ -62,12 +66,15 @@ struct vkms_output {
>         struct hrtimer vblank_hrtimer;
>         ktime_t period_ns;
>         struct drm_pending_vblank_event *event;
> -       bool crc_enabled;
>         /* ordered wq for crc_work */
>         struct workqueue_struct *crc_workq;
>         /* protects concurrent access to crc_data */
>         spinlock_t lock;
>
> +       /* protected by @lock */
> +       bool crc_enabled;
> +       struct vkms_crtc_state *crc_state;
> +
>         spinlock_t crc_lock;
>  };
>
> --
> 2.20.1
>


-- 

Rodrigo Siqueira
https://siqueira.tech

* Re: [PATCH 10/10] drm/vkms: No need for ->pages_lock in crc work anymore
  2019-06-06 22:27 ` [PATCH 10/10] drm/vkms: No need for ->pages_lock in crc work anymore Daniel Vetter
@ 2019-06-12 13:47   ` Rodrigo Siqueira
  0 siblings, 0 replies; 44+ messages in thread
From: Rodrigo Siqueira @ 2019-06-12 13:47 UTC (permalink / raw)
  To: Daniel Vetter; +Cc: Daniel Vetter, DRI Development, Haneen Mohammed

On Thu, Jun 6, 2019 at 7:28 PM Daniel Vetter <daniel.vetter@ffwll.ch> wrote:
>
> We're now guaranteed to no longer race against prepare_fb/cleanup_fb,
> which means we can access ->vaddr without having to hold a lock.
>
> Before the previous patches it was fairly easy to observe the cursor
> ->vaddr being invalid, but that's now gone, so we can upgrade to a
> full WARN_ON.
>
> Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
> Cc: Rodrigo Siqueira <rodrigosiqueiramelo@gmail.com>
> Cc: Haneen Mohammed <hamohammed.sa@gmail.com>
> Cc: Daniel Vetter <daniel@ffwll.ch>
> ---
>  drivers/gpu/drm/vkms/vkms_crc.c | 13 ++-----------
>  1 file changed, 2 insertions(+), 11 deletions(-)
>
> diff --git a/drivers/gpu/drm/vkms/vkms_crc.c b/drivers/gpu/drm/vkms/vkms_crc.c
> index 0d31cfc32042..4b3146d83265 100644
> --- a/drivers/gpu/drm/vkms/vkms_crc.c
> +++ b/drivers/gpu/drm/vkms/vkms_crc.c
> @@ -97,16 +97,10 @@ static void compose_cursor(struct vkms_crc_data *cursor_crc,
>         cursor_obj = drm_gem_fb_get_obj(&cursor_crc->fb, 0);
>         cursor_vkms_obj = drm_gem_to_vkms_gem(cursor_obj);
>
> -       mutex_lock(&cursor_vkms_obj->pages_lock);
> -       if (!cursor_vkms_obj->vaddr) {
> -               DRM_WARN("cursor plane vaddr is NULL");
> -               goto out;
> -       }
> +       if (WARN_ON(!cursor_vkms_obj->vaddr))
> +               return;
>
>         blend(vaddr_out, cursor_vkms_obj->vaddr, primary_crc, cursor_crc);
> -
> -out:
> -       mutex_unlock(&cursor_vkms_obj->pages_lock);
>  }
>
>  static uint32_t _vkms_get_crc(struct vkms_crc_data *primary_crc,
> @@ -123,15 +117,12 @@ static uint32_t _vkms_get_crc(struct vkms_crc_data *primary_crc,
>                 return 0;
>         }
>
> -       mutex_lock(&vkms_obj->pages_lock);
>         if (WARN_ON(!vkms_obj->vaddr)) {
> -               mutex_unlock(&vkms_obj->pages_lock);
>                 kfree(vaddr_out);
>                 return crc;
>         }
>
>         memcpy(vaddr_out, vkms_obj->vaddr, vkms_obj->gem.size);
> -       mutex_unlock(&vkms_obj->pages_lock);
>
>         if (cursor_crc)
>                 compose_cursor(cursor_crc, primary_crc, vaddr_out);
> --
> 2.20.1
>
Reviewed-by: Rodrigo Siqueira <rodrigosiqueiramelo@gmail.com>
Tested-by: Rodrigo Siqueira <rodrigosiqueiramelo@gmail.com>

-- 

Rodrigo Siqueira
https://siqueira.tech

* Re: [PATCH 00/10] drm/vkms: rework crc worker
  2019-06-12 13:28 ` [PATCH 00/10] drm/vkms: rework crc worker Rodrigo Siqueira
@ 2019-06-12 14:42   ` Daniel Vetter
  2019-06-18  2:49     ` Rodrigo Siqueira
  0 siblings, 1 reply; 44+ messages in thread
From: Daniel Vetter @ 2019-06-12 14:42 UTC (permalink / raw)
  To: Rodrigo Siqueira; +Cc: Daniel Vetter, DRI Development

On Wed, Jun 12, 2019 at 10:28:41AM -0300, Rodrigo Siqueira wrote:
> Hi Daniel,
> 
> First of all, thank you very much for your patchset.
> 
> I tried to make a detailed review of your series, and you can see my
> comments in each patch. You’ll notice that I asked many things related
> to the DRM subsystem with the hope that I could learn a little bit
> more about DRM from your comments.
> 
> Before you go through the review, I would like to start a discussion
> about the vkms race conditions. First, I have to admit that I did not
> understand the race conditions that you described before because I
> couldn’t reproduce them. Now, I suspect that I could not
> trigger the problem because I'm using QEMU with KVM; with this idea
> in mind, I suppose that we have two scenarios for using vkms in a
> virtual machine:
> 
> * Scenario 1: The user has hardware virtualization support; in this
> case, it is a little bit harder to experience race conditions with
> vkms.
> 
> * Scenario 2: Without hardware virtualization support, it is much
> easier to experience race conditions.
> 
> With these two scenarios in mind, I conducted a bunch of experiments
> for trying to shed light on this issue. I did:
> 
> 1. Enabled lockdep
> 
> 2. Defined two different environments for testing, both using QEMU:
> one with and one without KVM. See the QEMU hardware options below:
> 
> * env_kvm: -enable-kvm -daemonize -m 1G -smp cores=2,cpus=2
> * env_no_kvm: -daemonize -m 2G -smp cores=4,cpus=4
> 
> 3. My test protocol: I) turn on the VM, II) clear /proc/lock_stat,
> III) execute kms_cursor_crc, IV) save the lock_stat file, and V)
> turn off the VM.
> 
> 4. From the lock_stat output, I just highlighted the rows related to
> vkms and the columns holdtime-total and holdtime-avg
> 
> I would like to highlight that the following data has no statistical
> rigor; its intention is solely to assist our discussion. See below
> the summary of the collected data:
> 
> Summary of the experiment results:
> 
> +----------------+----------------+----------------+
> |                |     env_kvm    |   env_no_kvm   |
> +                +----------------+----------------+
> | Test           | Before | After | Before | After |
> +----------------+--------+-------+--------+-------+
> | kms_cursor_crc |   S1   |   S2  |   M1   |   M2  |
> +----------------+--------+-------+--------+-------+
> 
> * Before: before apply this patchset
> * After: after apply this patchset
> 
> -----------------------------------------+------------------+-----------
> S1: Without this patchset and with kvm   | holdtime-total | holdtime-avg
> -----------------------------------------+----------------+-------------
> &(&vkms_out->lock)->rlock:               |  21983.52      |  6.21
> (work_completion)(&vkms_state->crc_wo:   |  20.47         |  20.47
> (wq_completion)vkms_crc_workq:           |  3999507.87    |  3352.48
> &(&vkms_out->state_lock)->rlock:         |  378.47        |  0.30
> (work_completion)(&vkms_state->crc_#2:   |  3999066.30    |  2848.34
> -----------------------------------------+----------------+-------------
> S2: With this patchset and with kvm      |                |
> -----------------------------------------+----------------+-------------
> &(&vkms_out->lock)->rlock:               |  23262.83      |  6.34
> (work_completion)(&vkms_state->crc_wo:   |  8.98          |  8.98
> &(&vkms_out->crc_lock)->rlock:           |  307.28        |  0.32
> (wq_completion)vkms_crc_workq:           |  6567727.05    |  12345.35
> (work_completion)(&vkms_state->crc_#2:   |  6567135.15    |  4488.81
> -----------------------------------------+----------------+-------------
> M1: Without this patchset and without kvm|                |
> -----------------------------------------+----------------+-------------
> &(&vkms_out->state_lock)->rlock:         |  4994.72       |  1.61
> &(&vkms_out->lock)->rlock:               |  247190.04     |  39.39
> (work_completion)(&vkms_state->crc_wo:   |  31.32         |  31.32
> (wq_completion)vkms_crc_workq:           |  20991073.78   |  13525.18
> (work_completion)(&vkms_state->crc_#2:   |  20988347.18   |  11904.90
> -----------------------------------------+----------------+-------------
> M2: With this patchset and without kvm   |                |
> -----------------------------------------+----------------+-------------
> (wq_completion)vkms_crc_workq:           |  42819161.68   |  36597.57
> &(&vkms_out->lock)->rlock:               |  251257.06     |  35.80
> (work_completion)(&vkms_state->crc_wo:   |  69.37         |  69.37
> &(&vkms_out->crc_lock)->rlock:           |  3620.92       |  1.54
> (work_completion)(&vkms_state->crc_#2:   |  42803419.59   |  24306.31
> 
> First, I analyzed the scenarios with KVM (S1 and S2); more
> specifically, I focused on these two classes:
> 
> 1. (work_completion)(&vkms_state->crc_wo
> 2. (work_completion)(&vkms_state->crc_#2
> 
> After taking a look at the data, it looks like this patchset
> greatly reduces the hold time average for crc_wo. On the other hand,
> it increases the hold time average for crc_#2. I didn’t fully
> understand the reason for the difference. Could you help me here?

So there's two real locks here from our code, the ones you can see as
spinlocks:

&(&vkms_out->state_lock)->rlock:         |  4994.72       |  1.61
&(&vkms_out->lock)->rlock:               |  247190.04     |  39.39

All the others are fake locks that the workqueue adds, which only exist in
lockdep. They are used to catch special kinds of deadlocks like the below:

thread A:
1. mutex_lock(mutex_A)
2. flush_work(work_A)

thread B
1. running work_A:
2. mutex_lock(mutex_A)

thread B can't continue because mutex_A is already held by thread A.
thread A can't continue because thread B is blocked and the work never
finishes.
-> deadlock
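
In (made-up) code the same deadlock looks like this; mutex_A and work_A
are hypothetical names, not vkms code:

	static DEFINE_MUTEX(mutex_A);

	static void work_A_fn(struct work_struct *work)
	{
		/* this runs in thread B */
		mutex_lock(&mutex_A);	/* blocks: thread A holds mutex_A */
		/* ... */
		mutex_unlock(&mutex_A);
	}
	static DECLARE_WORK(work_A, work_A_fn);

	static void thread_A_path(void)
	{
		mutex_lock(&mutex_A);
		flush_work(&work_A);	/* waits forever for work_A_fn */
		mutex_unlock(&mutex_A);
	}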

I haven't checked which is which, but essentially what you measure with
the hold times of these fake locks is how long a work execution takes on
average.

Since my patches are supposed to fix races where the worker can't keep up
with the vblank hrtimer, the average worker run will (probably) do more,
so that number going up is expected. I think.

I'm honestly not sure what's going on here, I've never looked into this
in detail.

> When I looked at the second set of scenarios (M1 and M2, both without
> KVM), the results are much further apart; basically, this patchset
> increased the hold time average. Again, could you help me understand
> this issue a little bit better?
> 
> Finally, after these tests, I have some questions:
> 
> 1. VKMS is a software-only driver; because of this, how about defining
> minimal system requirements for using it?

No idea, in reality it's probably "if it fails too often, buy faster cpu".
I do think we should make the code robust against a slow cpu, since atm
that's needed even on pretty fast machines (because our blending code is
really, really slow and inefficient).

> 2. During my experiments, I noticed that running tests in a VM that
> uses KVM gave consistent results. For example, kms_flip never fails
> with QEMU+KVM; however, without KVM, two or three tests failed (one of
> them looks random). If we use vkms for testing DRM stuff, should we
> recommend the use of KVM?

What do you mean without kvm? In general running without kvm shouldn't be
slower, so I'm a bit confused ... I'm running vkms directly on the
machine, by booting into new kernels (and controlling the machine over the
network).
-Daniel

> Best Regards
> 
> On Thu, Jun 6, 2019 at 7:28 PM Daniel Vetter <daniel.vetter@ffwll.ch> wrote:
> >
> > Hi all,
> >
> > This here is the first part of a rework for the vkms crc worker. I think
> > this should fix all the locking/races/use-after-free issues I spotted in
> > the code. There's more work we can do in the future as a follow-up:
> >
> > - directly access vkms_plane_state->base in the crc worker, with this
> >   approach in this series here that should be safe now. Follow-up patches
> >   could switch and remove all the separate crc_data infrastructure.
> >
> > - I think some kerneldoc for vkms structures would be nice. Documentation
> >   the various functions is probably overkill.
> >
> > - Implementing a more generic blending engine, as prep for adding more
> >   pixel formats, more planes, and more features in general.
> >
> > Tested with kms_pipe_crc, kms_flip and kms_cursor_crc. Seems to not make
> > things worse, but I didn't do a full run.
> >
> > Cheers, Daniel
> >
> > Daniel Vetter (10):
> >   drm/vkms: Fix crc worker races
> >   drm/vkms: Use spin_lock_irq in process context
> >   drm/vkms: Rename vkms_output.state_lock to crc_lock
> >   drm/vkms: Move format arrays to vkms_plane.c
> >   drm/vkms: Add our own commit_tail
> >   drm/vkms: flush crc workers earlier in commit flow
> >   drm/vkms: Dont flush crc worker when we change crc status
> >   drm/vkms: No _irqsave within spin_lock_irq needed
> >   drm/vkms: totally reworked crc data tracking
> >   drm/vkms: No need for ->pages_lock in crc work anymore
> >
> >  drivers/gpu/drm/vkms/vkms_crc.c   | 74 +++++++++-------------------
> >  drivers/gpu/drm/vkms/vkms_crtc.c  | 80 ++++++++++++++++++++++++++-----
> >  drivers/gpu/drm/vkms/vkms_drv.c   | 35 ++++++++++++++
> >  drivers/gpu/drm/vkms/vkms_drv.h   | 24 +++++-----
> >  drivers/gpu/drm/vkms/vkms_plane.c |  8 ++++
> >  5 files changed, 146 insertions(+), 75 deletions(-)
> >
> > --
> > 2.20.1
> >
> 
> 
> 
> -- 
> 
> Rodrigo Siqueira
> https://siqueira.tech

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

* Re: [PATCH 01/10] drm/vkms: Fix crc worker races
  2019-06-12 13:33   ` Rodrigo Siqueira
@ 2019-06-12 14:48     ` Daniel Vetter
  2019-06-18  2:39       ` Rodrigo Siqueira
  0 siblings, 1 reply; 44+ messages in thread
From: Daniel Vetter @ 2019-06-12 14:48 UTC (permalink / raw)
  To: Rodrigo Siqueira
  Cc: Haneen Mohammed, Daniel Vetter, DRI Development, Shayenne Moura,
	Daniel Vetter

On Wed, Jun 12, 2019 at 10:33:11AM -0300, Rodrigo Siqueira wrote:
> On Thu, Jun 6, 2019 at 7:28 PM Daniel Vetter <daniel.vetter@ffwll.ch> wrote:
> >
> > The issue we have is that the crc worker might fall behind. We've
> > tried to handle this by tracking both the earliest frame for which it
> > still needs to compute a crc, and the last one. Plus when the
> > crtc_state changes, we have a new work item, which are all run in
> > order due to the ordered workqueue we allocate for each vkms crtc.
> >
> > Trouble is there's been a few small issues in the current code:
> > - we need to capture frame_end in the vblank hrtimer, not in the
> >   worker. The worker might run much later, and then we generate a lot
> >   of crc for which there's already a different worker queued up.
> > - frame number might be 0, so create a new crc_pending boolean to
> >   track this without confusion.
> > - we need to atomically grab frame_start/end and clear it, so do that
> >   all in one go. This is not going to create a new race, because if we
> >   race with the hrtimer then our work will be re-run.
> > - only race that can happen is the following:
> >   1. worker starts
> >   2. hrtimer runs and updates frame_end
> >   3. worker grabs frame_start/end, already reading the new frame_end,
> >   and clears crc_pending
> >   4. hrtimer calls queue_work()
> >   5. worker completes
> >   6. worker gets  re-run, crc_pending is false
> >   Explain this case a bit better by rewording the comment.
> >
> > v2: Demote warning level output to debug when we fail to requeue, this
> > is expected under high load when the crc worker can't quite keep up.
> >
> > Cc: Shayenne Moura <shayenneluzmoura@gmail.com>
> > Cc: Rodrigo Siqueira <rodrigosiqueiramelo@gmail.com>
> > Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
> > Cc: Haneen Mohammed <hamohammed.sa@gmail.com>
> > Cc: Daniel Vetter <daniel@ffwll.ch>
> > ---
> >  drivers/gpu/drm/vkms/vkms_crc.c  | 27 +++++++++++++--------------
> >  drivers/gpu/drm/vkms/vkms_crtc.c |  9 +++++++--
> >  drivers/gpu/drm/vkms/vkms_drv.h  |  2 ++
> >  3 files changed, 22 insertions(+), 16 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/vkms/vkms_crc.c b/drivers/gpu/drm/vkms/vkms_crc.c
> > index d7b409a3c0f8..66603da634fe 100644
> > --- a/drivers/gpu/drm/vkms/vkms_crc.c
> > +++ b/drivers/gpu/drm/vkms/vkms_crc.c
> > @@ -166,16 +166,24 @@ void vkms_crc_work_handle(struct work_struct *work)
> >         struct drm_plane *plane;
> >         u32 crc32 = 0;
> >         u64 frame_start, frame_end;
> > +       bool crc_pending;
> >         unsigned long flags;
> >
> >         spin_lock_irqsave(&out->state_lock, flags);
> >         frame_start = crtc_state->frame_start;
> >         frame_end = crtc_state->frame_end;
> > +       crc_pending = crtc_state->crc_pending;
> > +       crtc_state->frame_start = 0;
> > +       crtc_state->frame_end = 0;
> > +       crtc_state->crc_pending = false;
> >         spin_unlock_irqrestore(&out->state_lock, flags);
> >
> > -       /* _vblank_handle() hasn't updated frame_start yet */
> > -       if (!frame_start || frame_start == frame_end)
> > -               goto out;
> > +       /*
> > +        * We raced with the vblank hrtimer and previous work already computed
> > +        * the crc, nothing to do.
> > +        */
> > +       if (!crc_pending)
> > +               return;
> 
> I think this condition is not reachable because crc_pending will be
> set to true in `vkms_vblank_simulate()`, which in turn schedules
> the function `vkms_crc_work_handle()`. Just to check, I tried to
> reach this condition by running kms_flip, kms_pipe_crc_basic, and
> kms_cursor_crc with two different VM setups[1], but I couldn't reach
> it. What do you think?

thread A			thread B
1. run vblank hrtimer

				2. starts running crc work (from previous
				vblank)

3. spin_lock()			-> gets stalled on the spin_lock() because
				   thread A has it already

4. update frame_end (only in
later patches, atm this is
impossible). crc_pending is set
already.

5. schedule_work: since the work
is running already, this means it
is scheduled to run once more.

6. spin_unlock

				7. compute crc, clear crc_pending
				8. work finishes
				9. work gets run again
				10. crc_pending=false

Since the spin_lock critical section is _very_ short (less than 1 usec I
bet), this race is very hard to hit.

Exercise: Figure out why schedule_work _must_ schedule the work item to
re-run if it's running already. If it doesn't do that there's another
race.

> 
> [1] Qemu parameters
> VM1: -m 1G -smp cores=2,cpus=2
> VM2: -enable-kvm -m 2G -smp cores=4,cpus=4
> 
> >         drm_for_each_plane(plane, &vdev->drm) {
> >                 struct vkms_plane_state *vplane_state;
> > @@ -196,20 +204,11 @@ void vkms_crc_work_handle(struct work_struct *work)
> >         if (primary_crc)
> >                 crc32 = _vkms_get_crc(primary_crc, cursor_crc);
> >
> > -       frame_end = drm_crtc_accurate_vblank_count(crtc);
> > -
> > -       /* queue_work can fail to schedule crc_work; add crc for
> > -        * missing frames
> > +       /*
> > +        * The worker can fall behind the vblank hrtimer, make sure we catch up.
> >          */
> >         while (frame_start <= frame_end)
> >                 drm_crtc_add_crc_entry(crtc, true, frame_start++, &crc32);
> 
> I want to take this opportunity to ask about this while loop; it's not
> really specific to this patch.
>
> I have to admit that I never fully got the idea behind this 'while';
> it looks like we just fill in the missed frames with a repeated
> value. FWIU, `drm_crtc_add_crc_entry()` will add an entry with the CRC
> information for a frame, but in this case, we are adding the same CRC
> for a different set of frames. I agree that nearby frames have similar
> CRC values, but can we rely on this all the time? What could happen
> if there is a great difference between frame_start and frame_end?

It's a cheap trick for slow cpus: if the crc worker falls behind the vblank
hrtimer, we need to somehow catch up. With real hw this is not possible,
but with vkms we simulate the hw. The only quick way to catch up is to
fill in the same crc for everything. It's a lie, and it will make some
kms_atomic tests fail, but it's the only thing we can really do, aside
from trying to make the crc computation code as fast as possible.
-Daniel

> 
> > -
> > -out:
> > -       /* to avoid using the same value for frame number again */
> > -       spin_lock_irqsave(&out->state_lock, flags);
> > -       crtc_state->frame_end = frame_end;
> > -       crtc_state->frame_start = 0;
> > -       spin_unlock_irqrestore(&out->state_lock, flags);
> >  }
> >
> >  static int vkms_crc_parse_source(const char *src_name, bool *enabled)
> > diff --git a/drivers/gpu/drm/vkms/vkms_crtc.c b/drivers/gpu/drm/vkms/vkms_crtc.c
> > index 1bbe099b7db8..c727d8486e97 100644
> > --- a/drivers/gpu/drm/vkms/vkms_crtc.c
> > +++ b/drivers/gpu/drm/vkms/vkms_crtc.c
> > @@ -30,13 +30,18 @@ static enum hrtimer_restart vkms_vblank_simulate(struct hrtimer *timer)
> >                  * has read the data
> >                  */
> >                 spin_lock(&output->state_lock);
> > -               if (!state->frame_start)
> > +               if (!state->crc_pending)
> >                         state->frame_start = frame;
> > +               else
> > +                       DRM_DEBUG_DRIVER("crc worker falling behind, frame_start: %llu, frame_end: %llu\n",
> > +                                        state->frame_start, frame);
> > +               state->frame_end = frame;
> > +               state->crc_pending = true;
> >                 spin_unlock(&output->state_lock);
> >
> >                 ret = queue_work(output->crc_workq, &state->crc_work);
> >                 if (!ret)
> > -                       DRM_WARN("failed to queue vkms_crc_work_handle");
> > +                       DRM_DEBUG_DRIVER("vkms_crc_work_handle already queued\n");
> >         }
> >
> >         spin_unlock(&output->lock);
> > diff --git a/drivers/gpu/drm/vkms/vkms_drv.h b/drivers/gpu/drm/vkms/vkms_drv.h
> > index 81f1cfbeb936..3c7e06b19efd 100644
> > --- a/drivers/gpu/drm/vkms/vkms_drv.h
> > +++ b/drivers/gpu/drm/vkms/vkms_drv.h
> > @@ -56,6 +56,8 @@ struct vkms_plane_state {
> >  struct vkms_crtc_state {
> >         struct drm_crtc_state base;
> >         struct work_struct crc_work;
> > +
> > +       bool crc_pending;
> >         u64 frame_start;
> >         u64 frame_end;
> >  };
> > --
> > 2.20.1
> >
> 
> 
> -- 
> 
> Rodrigo Siqueira
> https://siqueira.tech

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

* Re: [PATCH 02/10] drm/vkms: Use spin_lock_irq in process context
  2019-06-12 13:34   ` Rodrigo Siqueira
@ 2019-06-12 14:54     ` Daniel Vetter
  0 siblings, 0 replies; 44+ messages in thread
From: Daniel Vetter @ 2019-06-12 14:54 UTC (permalink / raw)
  To: Rodrigo Siqueira
  Cc: Haneen Mohammed, Daniel Vetter, DRI Development, Shayenne Moura,
	Daniel Vetter

On Wed, Jun 12, 2019 at 10:34:55AM -0300, Rodrigo Siqueira wrote:
> On Thu, Jun 6, 2019 at 7:28 PM Daniel Vetter <daniel.vetter@ffwll.ch> wrote:
> >
> > The worker is always in process context, no need for the _irqsafe
> > version. Same for the set_source callback, that's only called from the
> > debugfs handler in a syscall.
> >
> > Cc: Shayenne Moura <shayenneluzmoura@gmail.com>
> > Cc: Rodrigo Siqueira <rodrigosiqueiramelo@gmail.com>
> > Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
> > Cc: Haneen Mohammed <hamohammed.sa@gmail.com>
> > Cc: Daniel Vetter <daniel@ffwll.ch>
> > ---
> >  drivers/gpu/drm/vkms/vkms_crc.c | 10 ++++------
> >  1 file changed, 4 insertions(+), 6 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/vkms/vkms_crc.c b/drivers/gpu/drm/vkms/vkms_crc.c
> > index 66603da634fe..883e36fe7b6e 100644
> > --- a/drivers/gpu/drm/vkms/vkms_crc.c
> > +++ b/drivers/gpu/drm/vkms/vkms_crc.c
> > @@ -167,16 +167,15 @@ void vkms_crc_work_handle(struct work_struct *work)
> >         u32 crc32 = 0;
> >         u64 frame_start, frame_end;
> >         bool crc_pending;
> > -       unsigned long flags;
> >
> > -       spin_lock_irqsave(&out->state_lock, flags);
> > +       spin_lock_irq(&out->state_lock);
> >         frame_start = crtc_state->frame_start;
> >         frame_end = crtc_state->frame_end;
> >         crc_pending = crtc_state->crc_pending;
> >         crtc_state->frame_start = 0;
> >         crtc_state->frame_end = 0;
> >         crtc_state->crc_pending = false;
> > -       spin_unlock_irqrestore(&out->state_lock, flags);
> > +       spin_unlock_irq(&out->state_lock);
> >
> >         /*
> >          * We raced with the vblank hrtimer and previous work already computed
> > @@ -246,7 +245,6 @@ int vkms_set_crc_source(struct drm_crtc *crtc, const char *src_name)
> >  {
> >         struct vkms_output *out = drm_crtc_to_vkms_output(crtc);
> >         bool enabled = false;
> > -       unsigned long flags;
> >         int ret = 0;
> >
> >         ret = vkms_crc_parse_source(src_name, &enabled);
> > @@ -254,9 +252,9 @@ int vkms_set_crc_source(struct drm_crtc *crtc, const char *src_name)
> >         /* make sure nothing is scheduled on crtc workq */
> >         flush_workqueue(out->crc_workq);
> >
> > -       spin_lock_irqsave(&out->lock, flags);
> > +       spin_lock_irq(&out->lock);
> >         out->crc_enabled = enabled;
> > -       spin_unlock_irqrestore(&out->lock, flags);
> 
> I was wondering if we could use atomic_t for crc_enabled and avoid
> this sort of lock. I did not try to change the data type; this is just
> an idea.

tldr; atomic_t does not do what you think it does. The rule of thumb is you
can use it for reference counting and statistics book-keeping, but nothing
else.

The long explanation is that atomic_t in the linux kernel does not have
barriers (unlike atomic values in almost all other languages); they are
weakly ordered. If you want to use them for logic (like here with this
bool) you need to think very carefully about barriers, document those
barriers, prove you got it all right, all for maybe a tiny speed-up
(spinlocks are extremely well optimized). In almost all cases that's not
worth it, and fairly often you end up with more atomic operations and so
overall slower code.
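
To make the ordering problem concrete, a made-up example (the
composition field and compose() are invented, not actual vkms code):

	/* thread A: publish data, then set the flag */
	out->composition = new_composition;
	atomic_set(&out->crc_enabled, 1);	/* implies no barrier */

	/* thread B: check the flag, then use the data */
	if (atomic_read(&out->crc_enabled))	/* can see the flag set ... */
		compose(out->composition);	/* ... yet read stale data */

Without explicit barriers (smp_wmb()/smp_rmb(), or better the
_release/_acquire variants) nothing guarantees that thread B sees the
new data just because it saw the flag.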

btw for these reasons atomic_t is wrapped as refcount_t for those cases
where it's safe to use (plus the code is even more optimized for the
refcount use-case). Except for statistics (like how many crcs did we
compute) you shouldn't ever use atomic_t, at least as a good rule of
thumb.
-Daniel


> 
> Reviewed-by: Rodrigo Siqueira <rodrigosiqueiramelo@gmail.com>
> Tested-by: Rodrigo Siqueira <rodrigosiqueiramelo@gmail.com>
> 
> > +       spin_unlock_irq(&out->lock);
> >
> >         return ret;
> >  }
> > --
> > 2.20.1
> >
> 
> 
> -- 
> 
> Rodrigo Siqueira
> https://siqueira.tech

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

* Re: [PATCH 03/10] drm/vkms: Rename vkms_output.state_lock to crc_lock
  2019-06-12 13:38   ` Rodrigo Siqueira
@ 2019-06-13  7:48     ` Daniel Vetter
  0 siblings, 0 replies; 44+ messages in thread
From: Daniel Vetter @ 2019-06-13  7:48 UTC (permalink / raw)
  To: Rodrigo Siqueira
  Cc: Daniel Vetter, Haneen Mohammed, DRI Development, Daniel Vetter

On Wed, Jun 12, 2019 at 10:38:23AM -0300, Rodrigo Siqueira wrote:
> On Thu, Jun 6, 2019 at 7:28 PM Daniel Vetter <daniel.vetter@ffwll.ch> wrote:
> >
> > Plus add a comment about what it actually protects. It's very little.
> >
> > Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
> > Cc: Rodrigo Siqueira <rodrigosiqueiramelo@gmail.com>
> > Cc: Haneen Mohammed <hamohammed.sa@gmail.com>
> > Cc: Daniel Vetter <daniel@ffwll.ch>
> > ---
> >  drivers/gpu/drm/vkms/vkms_crc.c  | 4 ++--
> >  drivers/gpu/drm/vkms/vkms_crtc.c | 6 +++---
> >  drivers/gpu/drm/vkms/vkms_drv.h  | 5 +++--
> >  3 files changed, 8 insertions(+), 7 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/vkms/vkms_crc.c b/drivers/gpu/drm/vkms/vkms_crc.c
> > index 883e36fe7b6e..96806cd35ad4 100644
> > --- a/drivers/gpu/drm/vkms/vkms_crc.c
> > +++ b/drivers/gpu/drm/vkms/vkms_crc.c
> > @@ -168,14 +168,14 @@ void vkms_crc_work_handle(struct work_struct *work)
> >         u64 frame_start, frame_end;
> >         bool crc_pending;
> >
> > -       spin_lock_irq(&out->state_lock);
> > +       spin_lock_irq(&out->crc_lock);
> >         frame_start = crtc_state->frame_start;
> >         frame_end = crtc_state->frame_end;
> >         crc_pending = crtc_state->crc_pending;
> >         crtc_state->frame_start = 0;
> >         crtc_state->frame_end = 0;
> >         crtc_state->crc_pending = false;
> > -       spin_unlock_irq(&out->state_lock);
> > +       spin_unlock_irq(&out->crc_lock);
> >
> >         /*
> >          * We raced with the vblank hrtimer and previous work already computed
> > diff --git a/drivers/gpu/drm/vkms/vkms_crtc.c b/drivers/gpu/drm/vkms/vkms_crtc.c
> > index c727d8486e97..55b16d545fe7 100644
> > --- a/drivers/gpu/drm/vkms/vkms_crtc.c
> > +++ b/drivers/gpu/drm/vkms/vkms_crtc.c
> > @@ -29,7 +29,7 @@ static enum hrtimer_restart vkms_vblank_simulate(struct hrtimer *timer)
> >                 /* update frame_start only if a queued vkms_crc_work_handle()
> >                  * has read the data
> >                  */
> > -               spin_lock(&output->state_lock);
> > +               spin_lock(&output->crc_lock);
> >                 if (!state->crc_pending)
> >                         state->frame_start = frame;
> >                 else
> > @@ -37,7 +37,7 @@ static enum hrtimer_restart vkms_vblank_simulate(struct hrtimer *timer)
> >                                          state->frame_start, frame);
> >                 state->frame_end = frame;
> >                 state->crc_pending = true;
> > -               spin_unlock(&output->state_lock);
> > +               spin_unlock(&output->crc_lock);
> >
> >                 ret = queue_work(output->crc_workq, &state->crc_work);
> >                 if (!ret)
> > @@ -224,7 +224,7 @@ int vkms_crtc_init(struct drm_device *dev, struct drm_crtc *crtc,
> >         drm_crtc_helper_add(crtc, &vkms_crtc_helper_funcs);
> >
> >         spin_lock_init(&vkms_out->lock);
> > -       spin_lock_init(&vkms_out->state_lock);
> > +       spin_lock_init(&vkms_out->crc_lock);
> >
> >         vkms_out->crc_workq = alloc_ordered_workqueue("vkms_crc_workq", 0);
> >         if (!vkms_out->crc_workq)
> > diff --git a/drivers/gpu/drm/vkms/vkms_drv.h b/drivers/gpu/drm/vkms/vkms_drv.h
> > index 3c7e06b19efd..43d3f98289fe 100644
> > --- a/drivers/gpu/drm/vkms/vkms_drv.h
> > +++ b/drivers/gpu/drm/vkms/vkms_drv.h
> > @@ -57,6 +57,7 @@ struct vkms_crtc_state {
> >         struct drm_crtc_state base;
> >         struct work_struct crc_work;
> >
> > +       /* below three are protected by vkms_output.crc_lock */
> >         bool crc_pending;
> >         u64 frame_start;
> >         u64 frame_end;
> > @@ -74,8 +75,8 @@ struct vkms_output {
> >         struct workqueue_struct *crc_workq;
> >         /* protects concurrent access to crc_data */
> >         spinlock_t lock;
> 
> It's not really specific to this patch, but after reviewing it, I was
> thinking about the use of the 'lock' field in the struct vkms_output.
> Do we really need it? It looks like crc_lock just replaced it.

Yeah they're a bit redundant, but the other way round: crc_lock protects
data that lives in vkms_crtc_state, and we constantly recreate that one.
So if we'd want to remove one of those locks, that's the one that needs
to go. Only vkms_output->lock is global.

I figured not something we absolutely have to do right away :-)

> Additionally, going a little bit deeper in the lock field at struct
> vkms_output, I was also asking myself: what critical area do we want
> to protect with this lock? After inspecting the use of this lock, I
> noticed two different places that use it:
> 
> 1. In the function vkms_vblank_simulate()
> 
> Note that we cover a vast area with ‘output->lock’. IMHO we just need
> to protect the state variable (line “state = output->crc_state;”) and
> we can also use crc_lock for this case. Make sense?

Hm very tricky, but good point. Going through that entire function, as it
is after my patch series:

- hrtimer_forward_now: I think with all the patches to fix the ordering it
  should be safe to take that out of the vkms_output->lock. In the
  get_vblank_timestamp function we also don't take the lock, so it really
  doesn't protect anything here.

- drm_crtc_handle_vblank: This must be protected by output->lock to avoid
  races against drm_crtc_arm_vblank.

- state = output->crc_state: Obviously needs the lock.

- lifetime of state memory, i.e. what guarantees no one calls kfree on it
  before we're done. We already rely on this not disappearing until the
  worker has finished (using the flush_work before we tear down old state
  structures), but moving the queue_work out from under the state->lock
  protection might open up new race windows. I think from a logic pov it's
  all fine: we wait for the vblank to signal, which means after that point
  any queue_work will be for the next state, not the one we're about to
  free in vkms_atomic_commit_tail. Well, the freeing is actually done in
  drm_atomic_helper.c, which calls commit_tail.

  But we might be missing some barriers in drm_vblank.c.

tl;dr: we can move the critical section down a bit, but until drm_vblank.c
is fully audited I think we need to keep the bottom part as-is.
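
Just to make this concrete, here's roughly the narrowed version I have in
mind (an untested sketch on top of this series, keeping the bottom part
under the lock until drm_vblank.c is audited):

static enum hrtimer_restart vkms_vblank_simulate(struct hrtimer *timer)
{
	struct vkms_output *output = container_of(timer, struct vkms_output,
						  vblank_hrtimer);
	struct drm_crtc *crtc = &output->crtc;
	struct vkms_crtc_state *state;
	u64 ret_overrun;
	bool ret;

	/* doesn't protect anything, safe outside the lock */
	ret_overrun = hrtimer_forward_now(timer, output->period_ns);

	spin_lock(&output->lock);

	/* must stay under output->lock to not race the vblank arming */
	ret = drm_crtc_handle_vblank(crtc);
	if (!ret)
		DRM_ERROR("vkms failure on handling vblank");

	state = output->crc_state;

	/* ... bottom part unchanged, still under output->lock ... */
}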

> 2. In the function vkms_crtc_atomic_begin()
> 
> We also hold output->lock in the vkms_crtc_atomic_begin() and just
> release it at vkms_crtc_atomic_flush(). Do we still need it?
> 
> > -       /* protects concurrent access to crtc_state */
> > -       spinlock_t state_lock;
> > +
> > +       spinlock_t crc_lock;
> >  };
> 
> Maybe add a kernel doc on top of crc_lock?

Atm we have 0 kerneldoc for vkms structures. I planned to fix that once
this series has landed (well, it maybe needs a bit more work still before
it makes sense to start typing docs).
-Daniel

> 
> >
> >  struct vkms_device {
> > --
> > 2.20.1
> >
> 
> 
> -- 
> 
> Rodrigo Siqueira
> https://siqueira.tech

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH 06/10] drm/vkms: flush crc workers earlier in commit flow
  2019-06-12 13:42   ` Rodrigo Siqueira
@ 2019-06-13  7:53     ` Daniel Vetter
  2019-06-13  7:55       ` Daniel Vetter
  2019-06-18  2:31       ` Rodrigo Siqueira
  0 siblings, 2 replies; 44+ messages in thread
From: Daniel Vetter @ 2019-06-13  7:53 UTC (permalink / raw)
  To: Rodrigo Siqueira
  Cc: Daniel Vetter, Haneen Mohammed, DRI Development, Daniel Vetter

On Wed, Jun 12, 2019 at 10:42:42AM -0300, Rodrigo Siqueira wrote:
> On Thu, Jun 6, 2019 at 7:28 PM Daniel Vetter <daniel.vetter@ffwll.ch> wrote:
> >
> > Currently we flush pending crc workers very late in the commit flow,
> > when we destry all the old crtc states. Unfortunately at that point
> 
> destry -> destroy
> 
> > the framebuffers are already unpinned (and our vaddr possible gone),
> > so this isn't good. Also, the plane_states we need might also already
> > be cleaned up, since cleanup order of state structures isn't well
> > defined.
> >
> > Fix this by waiting for all crc workers of the old state to complete
> > before we start any of the cleanup work.
> >
> > Note that this is not yet race-free, because the hrtimer and crc
> > worker look at the wrong state pointers, but that will be fixed in
> > subsequent patches.
> >
> > Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
> > Cc: Rodrigo Siqueira <rodrigosiqueiramelo@gmail.com>
> > Cc: Haneen Mohammed <hamohammed.sa@gmail.com>
> > Cc: Daniel Vetter <daniel@ffwll.ch>
> > ---
> >  drivers/gpu/drm/vkms/vkms_crtc.c |  2 +-
> >  drivers/gpu/drm/vkms/vkms_drv.c  | 10 ++++++++++
> >  2 files changed, 11 insertions(+), 1 deletion(-)
> >
> > diff --git a/drivers/gpu/drm/vkms/vkms_crtc.c b/drivers/gpu/drm/vkms/vkms_crtc.c
> > index 55b16d545fe7..b6987d90805f 100644
> > --- a/drivers/gpu/drm/vkms/vkms_crtc.c
> > +++ b/drivers/gpu/drm/vkms/vkms_crtc.c
> > @@ -125,7 +125,7 @@ static void vkms_atomic_crtc_destroy_state(struct drm_crtc *crtc,
> >         __drm_atomic_helper_crtc_destroy_state(state);
> >
> >         if (vkms_state) {
> > -               flush_work(&vkms_state->crc_work);
> > +               WARN_ON(work_pending(&vkms_state->crc_work));
> >                 kfree(vkms_state);
> >         }
> >  }
> > diff --git a/drivers/gpu/drm/vkms/vkms_drv.c b/drivers/gpu/drm/vkms/vkms_drv.c
> > index f677ab1d0094..cc53ef88a331 100644
> > --- a/drivers/gpu/drm/vkms/vkms_drv.c
> > +++ b/drivers/gpu/drm/vkms/vkms_drv.c
> > @@ -62,6 +62,9 @@ static void vkms_release(struct drm_device *dev)
> >  static void vkms_atomic_commit_tail(struct drm_atomic_state *old_state)
> >  {
> >         struct drm_device *dev = old_state->dev;
> > +       struct drm_crtc *crtc;
> > +       struct drm_crtc_state *old_crtc_state;
> > +       int i;
> >
> >         drm_atomic_helper_commit_modeset_disables(dev, old_state);
> >
> > @@ -75,6 +78,13 @@ static void vkms_atomic_commit_tail(struct drm_atomic_state *old_state)
> >
> >         drm_atomic_helper_wait_for_vblanks(dev, old_state);
> >
> > +       for_each_old_crtc_in_state(old_state, crtc, old_crtc_state, i) {
> > +               struct vkms_crtc_state *vkms_state =
> > +                       to_vkms_crtc_state(old_crtc_state);
> > +
> > +               flush_work(&vkms_state->crc_work);
> > +       }
> > +
> >         drm_atomic_helper_cleanup_planes(dev, old_state);
> >  }
> 
> why not use drm_atomic_helper_commit_tail() here? I mean:
> 
> for_each_old_crtc_in_state(old_state, crtc, old_crtc_state, i) {
> …
> }
> 
> drm_atomic_helper_commit_tail(old_state);
> 
> After looking at drm_atomic_helper_cleanup_planes() it sounds safe for
> me to use the above code; I just test it with two tests from
> crc_cursor. Maybe I missed something, could you help me here?
> 
> Finally, IMHO, I think that Patch 05, 06 and 07 could be squashed in a
> single patch to make it easier to understand the change.

I wanted to highlight all the bits a bit more, because this is a lot more
tricky than it looks. For correct ordering and avoiding races we can't do
what you suggested. Only after

	drm_atomic_helper_wait_for_vblanks()

do we know that all subsequent queue_work will be for the _new_ state.
Only once that's done is flush_work() actually useful; before that we
might flush the work, and then right after the hrtimer that simulates
vblank queues it again. Every time you have a flush_work before cleaning
up the work structure, the following sequence must be obeyed, or it can go
wrong:

1. Make sure no one else can requeue the work anymore (in our case that's
done by a combination of first updating output->crc_state and then waiting
for the vblank to pass to make sure the hrtimer has noticed that change).

2. flush_work()

3. Actually clean up stuff (which isn't done here).

Doing the flush_work before we've even completed the output->state update,
much less waited for the vblank to make sure that's happened, misses the
point.
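
In (pseudo-)code the pattern is simply this (a minimal sketch;
stop_requeueing() is made up and stands for whatever mechanism the driver
uses, for vkms it's the crc_state update plus the vblank wait):

	/* 1. guarantee no one can queue_work() this item anymore */
	stop_requeueing(&state->crc_work);

	/* 2. only now does flushing guarantee the work stays idle */
	flush_work(&state->crc_work);

	/* 3. only now is it safe to free what the work looks at */
	kfree(state);
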
-Daniel

> 
> > --
> > 2.20.1
> >
> 
> 
> -- 
> 
> Rodrigo Siqueira
> https://siqueira.tech

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH 06/10] drm/vkms: flush crc workers earlier in commit flow
  2019-06-13  7:53     ` Daniel Vetter
@ 2019-06-13  7:55       ` Daniel Vetter
  2019-06-18  2:31       ` Rodrigo Siqueira
  1 sibling, 0 replies; 44+ messages in thread
From: Daniel Vetter @ 2019-06-13  7:55 UTC (permalink / raw)
  To: Rodrigo Siqueira
  Cc: Daniel Vetter, Haneen Mohammed, DRI Development, Daniel Vetter

On Thu, Jun 13, 2019 at 09:53:55AM +0200, Daniel Vetter wrote:
> On Wed, Jun 12, 2019 at 10:42:42AM -0300, Rodrigo Siqueira wrote:
> > On Thu, Jun 6, 2019 at 7:28 PM Daniel Vetter <daniel.vetter@ffwll.ch> wrote:
> > >
> > > Currently we flush pending crc workers very late in the commit flow,
> > > when we destry all the old crtc states. Unfortunately at that point
> > 
> > destry -> destroy
> > 
> > > the framebuffers are already unpinned (and our vaddr possible gone),
> > > so this isn't good. Also, the plane_states we need might also already
> > > be cleaned up, since cleanup order of state structures isn't well
> > > defined.
> > >
> > > Fix this by waiting for all crc workers of the old state to complete
> > > before we start any of the cleanup work.
> > >
> > > Note that this is not yet race-free, because the hrtimer and crc
> > > worker look at the wrong state pointers, but that will be fixed in
> > > subsequent patches.
> > >
> > > Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
> > > Cc: Rodrigo Siqueira <rodrigosiqueiramelo@gmail.com>
> > > Cc: Haneen Mohammed <hamohammed.sa@gmail.com>
> > > Cc: Daniel Vetter <daniel@ffwll.ch>
> > > ---
> > >  drivers/gpu/drm/vkms/vkms_crtc.c |  2 +-
> > >  drivers/gpu/drm/vkms/vkms_drv.c  | 10 ++++++++++
> > >  2 files changed, 11 insertions(+), 1 deletion(-)
> > >
> > > diff --git a/drivers/gpu/drm/vkms/vkms_crtc.c b/drivers/gpu/drm/vkms/vkms_crtc.c
> > > index 55b16d545fe7..b6987d90805f 100644
> > > --- a/drivers/gpu/drm/vkms/vkms_crtc.c
> > > +++ b/drivers/gpu/drm/vkms/vkms_crtc.c
> > > @@ -125,7 +125,7 @@ static void vkms_atomic_crtc_destroy_state(struct drm_crtc *crtc,
> > >         __drm_atomic_helper_crtc_destroy_state(state);
> > >
> > >         if (vkms_state) {
> > > -               flush_work(&vkms_state->crc_work);
> > > +               WARN_ON(work_pending(&vkms_state->crc_work));
> > >                 kfree(vkms_state);
> > >         }
> > >  }
> > > diff --git a/drivers/gpu/drm/vkms/vkms_drv.c b/drivers/gpu/drm/vkms/vkms_drv.c
> > > index f677ab1d0094..cc53ef88a331 100644
> > > --- a/drivers/gpu/drm/vkms/vkms_drv.c
> > > +++ b/drivers/gpu/drm/vkms/vkms_drv.c
> > > @@ -62,6 +62,9 @@ static void vkms_release(struct drm_device *dev)
> > >  static void vkms_atomic_commit_tail(struct drm_atomic_state *old_state)
> > >  {
> > >         struct drm_device *dev = old_state->dev;
> > > +       struct drm_crtc *crtc;
> > > +       struct drm_crtc_state *old_crtc_state;
> > > +       int i;
> > >
> > >         drm_atomic_helper_commit_modeset_disables(dev, old_state);
> > >
> > > @@ -75,6 +78,13 @@ static void vkms_atomic_commit_tail(struct drm_atomic_state *old_state)
> > >
> > >         drm_atomic_helper_wait_for_vblanks(dev, old_state);
> > >
> > > +       for_each_old_crtc_in_state(old_state, crtc, old_crtc_state, i) {
> > > +               struct vkms_crtc_state *vkms_state =
> > > +                       to_vkms_crtc_state(old_crtc_state);
> > > +
> > > +               flush_work(&vkms_state->crc_work);
> > > +       }
> > > +
> > >         drm_atomic_helper_cleanup_planes(dev, old_state);
> > >  }
> > 
> > why not use drm_atomic_helper_commit_tail() here? I mean:
> > 
> > for_each_old_crtc_in_state(old_state, crtc, old_crtc_state, i) {
> > …
> > }
> > 
> > drm_atomic_helper_commit_tail(old_state);
> > 
> > After looking at drm_atomic_helper_cleanup_planes() it sounds safe for
> > me to use the above code; I just test it with two tests from
> > crc_cursor. Maybe I missed something, could you help me here?
> > 
> > Finally, IMHO, I think that Patch 05, 06 and 07 could be squashed in a
> > single patch to make it easier to understand the change.

Ah, just realized that patch 07 is entirely unrelated to this work here.
Squashing that in would be a bad idea; we could merge patch 07
independently of this stuff here. So it should be a separate patch.
-Daniel

> 
> I wanted to highlight all the bits a bit more, because this is a lot more
> tricky than it looks. For correct ordering and avoiding races we can't do
> what you suggested. Only after
> 
> 	drm_atomic_helper_wait_for_vblanks()
> 
> do we know that all subsequent queue_work will be for the _new_ state.
> Only once that's done is flush_work() actually useful; before that we
> might flush the work, and then right after the hrtimer that simulates
> vblank queues it again. Every time you have a flush_work before cleaning
> up the work structure, the following sequence must be obeyed, or it can go
> wrong:
> 
> 1. Make sure no one else can requeue the work anymore (in our case that's
> done by a combination of first updating output->crc_state and then waiting
> for the vblank to pass to make sure the hrtimer has noticed that change).
> 
> 2. flush_work()
> 
> 3. Actually clean up stuff (which isn't done here).
> 
> Doing the flush_work before we've even completed the output->state update,
> much less waited for the vblank to make sure that's happened, misses the
> point.
> -Daniel
> 
> > 
> > > --
> > > 2.20.1
> > >
> > 
> > 
> > -- 
> > 
> > Rodrigo Siqueira
> > https://siqueira.tech
> 
> -- 
> Daniel Vetter
> Software Engineer, Intel Corporation
> http://blog.ffwll.ch

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH 09/10] drm/vkms: totally reworked crc data tracking
  2019-06-12 13:46   ` Rodrigo Siqueira
@ 2019-06-13  7:59     ` Daniel Vetter
  0 siblings, 0 replies; 44+ messages in thread
From: Daniel Vetter @ 2019-06-13  7:59 UTC (permalink / raw)
  To: Rodrigo Siqueira
  Cc: Daniel Vetter, Haneen Mohammed, DRI Development, Daniel Vetter

On Wed, Jun 12, 2019 at 10:46:28AM -0300, Rodrigo Siqueira wrote:
> On Thu, Jun 6, 2019 at 7:28 PM Daniel Vetter <daniel.vetter@ffwll.ch> wrote:
> >
> > The crc computation worker needs to be able to get at some data
> > structures and framebuffer mappings, while potentially more atomic
> > updates are going on. The solution thus far is to copy relevant bits
> > around, but that's very tedious.
> >
> > Here's a new approach, which tries to be more clever, but relies on a
> > few not-so-obvious things:
> > - crtc_state is always updated when a plane_state changes. Therefore
> >   we can just stuff plane_state pointers into a crtc_state. That
> >   solves the problem of easily getting at the needed plane_states.
> 
> Just out of curiosity, where exactly is the crtc_state updated? If
> possible, could you elaborate a little bit on this?

drm_atomic_get_plane_state always gets the crtc_state too, which means if
we duplicate a plane_state, we will _always_ duplicate the crtc_state too.
This is a guarantee of the atomic kms design.
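
From memory the relevant bit in drm_atomic.c looks roughly like this
(simplified sketch, not the verbatim source):

struct drm_plane_state *
drm_atomic_get_plane_state(struct drm_atomic_state *state,
			   struct drm_plane *plane)
{
	/* ... lock the plane and duplicate plane->state ... */

	if (plane_state->crtc) {
		struct drm_crtc_state *crtc_state;

		/* this pulls the crtc state into the same atomic update */
		crtc_state = drm_atomic_get_crtc_state(state,
						       plane_state->crtc);
		if (IS_ERR(crtc_state))
			return ERR_CAST(crtc_state);
	}

	return plane_state;
}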

> 
> > - with the flushing changes from previous patches the above also holds
> >   without races due to the next atomic update being a bit eager with
> >   cleaning up pending work - we always wait for all crc work items to
> >   complete before unmapping framebuffers.
> > - we also need to make sure that the hrtimer fires off the right
> >   worker. Keep a new distinct crc_state pointer, under the
> >   vkms_output->lock protection for this. Note that crtc->state is
> >   updated very early in the atomic commit, way before we arm the
> >   vblank event - the vblank event should always match the buffers we
> >   use to compute the crc. This also solves an issue in the hrtimer,
> >   where we've accessed drm_crtc->state without holding the right locks
> >   (we held none - oops).
> > - in the worker itself we can then just access the plane states we
> >   need, again solving a bunch of ordering and locking issues.
> >   Accessing plane->state requires locks, accessing the private
> >   vkms_crtc_state->active_planes pointer only requires that the memory
> >   doesn't get freed too early.
> >
> > The idea behind vkms_crtc_state->active_planes is that this would
> > contain all visible planes, in z-order, as a first step towards a more
> > generic blending implementation.
> >
> > Note that this patch also fixes races between prepare_fb/cleanup_fb
> > and the crc worker accessing ->vaddr.
> >
> > Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
> > Cc: Rodrigo Siqueira <rodrigosiqueiramelo@gmail.com>
> > Cc: Haneen Mohammed <hamohammed.sa@gmail.com>
> > Cc: Daniel Vetter <daniel@ffwll.ch>
> > ---
> >  drivers/gpu/drm/vkms/vkms_crc.c  | 21 +++--------
> >  drivers/gpu/drm/vkms/vkms_crtc.c | 60 +++++++++++++++++++++++++++++---
> >  drivers/gpu/drm/vkms/vkms_drv.h  |  9 ++++-
> >  3 files changed, 67 insertions(+), 23 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/vkms/vkms_crc.c b/drivers/gpu/drm/vkms/vkms_crc.c
> > index 9d15e5e85830..0d31cfc32042 100644
> > --- a/drivers/gpu/drm/vkms/vkms_crc.c
> > +++ b/drivers/gpu/drm/vkms/vkms_crc.c
> > @@ -159,11 +159,8 @@ void vkms_crc_work_handle(struct work_struct *work)
> >                                                 crc_work);
> >         struct drm_crtc *crtc = crtc_state->base.crtc;
> >         struct vkms_output *out = drm_crtc_to_vkms_output(crtc);
> > -       struct vkms_device *vdev = container_of(out, struct vkms_device,
> > -                                               output);
> >         struct vkms_crc_data *primary_crc = NULL;
> >         struct vkms_crc_data *cursor_crc = NULL;
> > -       struct drm_plane *plane;
> >         u32 crc32 = 0;
> >         u64 frame_start, frame_end;
> >         bool crc_pending;
> > @@ -184,21 +181,11 @@ void vkms_crc_work_handle(struct work_struct *work)
> >         if (!crc_pending)
> >                 return;
> >
> > -       drm_for_each_plane(plane, &vdev->drm) {
> > -               struct vkms_plane_state *vplane_state;
> > -               struct vkms_crc_data *crc_data;
> > +       if (crtc_state->num_active_planes >= 1)
> > +               primary_crc = crtc_state->active_planes[0]->crc_data;
> >
> > -               vplane_state = to_vkms_plane_state(plane->state);
> > -               crc_data = vplane_state->crc_data;
> > -
> > -               if (drm_framebuffer_read_refcount(&crc_data->fb) == 0)
> > -                       continue;
> > -
> > -               if (plane->type == DRM_PLANE_TYPE_PRIMARY)
> > -                       primary_crc = crc_data;
> > -               else
> > -                       cursor_crc = crc_data;
> > -       }
> > +       if (crtc_state->num_active_planes == 2)
> > +               cursor_crc = crtc_state->active_planes[1]->crc_data;
> >
> >         if (primary_crc)
> >                 crc32 = _vkms_get_crc(primary_crc, cursor_crc);
> > diff --git a/drivers/gpu/drm/vkms/vkms_crtc.c b/drivers/gpu/drm/vkms/vkms_crtc.c
> > index 48a793ba4030..14156ed70415 100644
> > --- a/drivers/gpu/drm/vkms/vkms_crtc.c
> > +++ b/drivers/gpu/drm/vkms/vkms_crtc.c
> > @@ -1,6 +1,7 @@
> >  // SPDX-License-Identifier: GPL-2.0+
> >
> >  #include "vkms_drv.h"
> > +#include <drm/drm_atomic.h>
> >  #include <drm/drm_atomic_helper.h>
> >  #include <drm/drm_probe_helper.h>
> >
> > @@ -9,7 +10,7 @@ static enum hrtimer_restart vkms_vblank_simulate(struct hrtimer *timer)
> >         struct vkms_output *output = container_of(timer, struct vkms_output,
> >                                                   vblank_hrtimer);
> >         struct drm_crtc *crtc = &output->crtc;
> > -       struct vkms_crtc_state *state = to_vkms_crtc_state(crtc->state);
> > +       struct vkms_crtc_state *state;
> >         u64 ret_overrun;
> >         bool ret;
> >
> > @@ -23,6 +24,7 @@ static enum hrtimer_restart vkms_vblank_simulate(struct hrtimer *timer)
> >         if (!ret)
> >                 DRM_ERROR("vkms failure on handling vblank");
> >
> > +       state = output->crc_state;
> >         if (state && output->crc_enabled) {
> >                 u64 frame = drm_crtc_accurate_vblank_count(crtc);
> >
> > @@ -124,10 +126,9 @@ static void vkms_atomic_crtc_destroy_state(struct drm_crtc *crtc,
> >
> >         __drm_atomic_helper_crtc_destroy_state(state);
> >
> > -       if (vkms_state) {
> > -               WARN_ON(work_pending(&vkms_state->crc_work));
> > -               kfree(vkms_state);
> > -       }
> > +       WARN_ON(work_pending(&vkms_state->crc_work));
> > +       kfree(vkms_state->active_planes);
> > +       kfree(vkms_state);
> >  }
> >
> >  static void vkms_atomic_crtc_reset(struct drm_crtc *crtc)
> > @@ -156,6 +157,52 @@ static const struct drm_crtc_funcs vkms_crtc_funcs = {
> >         .verify_crc_source      = vkms_verify_crc_source,
> >  };
> >
> > +static int vkms_crtc_atomic_check(struct drm_crtc *crtc,
> > +                                  struct drm_crtc_state *state)
> > +{
> > +       struct vkms_crtc_state *vkms_state = to_vkms_crtc_state(state);
> > +       struct drm_plane *plane;
> > +       struct drm_plane_state *plane_state;
> > +       int i = 0, ret;
> > +
> > +       if (vkms_state->active_planes)
> > +               return 0;
> > +
> > +       ret = drm_atomic_add_affected_planes(state->state, crtc);
> > +       if (ret < 0)
> > +               return ret;
> > +
> > +       drm_for_each_plane_mask(plane, crtc->dev, state->plane_mask) {
> 
> The drm_for_each_plane_mask documentation says: “Iterate over all
> planes specified by bitmask”. I did not understand what it means by
> “specified by bitmask” and the use of this macro in this context. I

state->plane_mask is the bitmask. Each bit in there represents a plane, so
it can be used as a very compact representation for a subset of planes (as
long as we don't have more than 32 planes in total).
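
The macro itself isn't magic, it boils down to roughly this (paraphrasing
drm_plane.h from memory):

#define drm_for_each_plane_mask(plane, dev, plane_mask) \
	list_for_each_entry((plane), &(dev)->mode_config.plane_list, head) \
		for_each_if((plane_mask) & drm_plane_mask(plane))

where drm_plane_mask(plane) is just BIT(plane->index). It walks all planes
of the device and skips every plane whose bit isn't set in the mask.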

> tried to replace it with drm_for_each_plane, but the test just breaks.

Hm, right now this should work, because we only have one crtc, so there's
no problem with walking over planes we shouldn't even look at.

> Could you explain a little bit the magic behind
> drm_for_each_plane_mask?

Hope the above quick summary of what we use the bitmask for helps to get
you started.
-Daniel

> 
> > +               plane_state = drm_atomic_get_existing_plane_state(state->state,
> > +                                                                 plane);
> > +               WARN_ON(!plane_state);
> > +
> > +               if (!plane_state->visible)
> > +                       continue;
> > +
> > +               i++;
> > +       }
> > +
> > +       vkms_state->active_planes = kcalloc(i, sizeof(plane), GFP_KERNEL);
> > +       if (!vkms_state->active_planes)
> > +               return -ENOMEM;
> > +       vkms_state->num_active_planes = i;
> > +
> > +       i = 0;
> > +       drm_for_each_plane_mask(plane, crtc->dev, state->plane_mask) {
> > +               plane_state = drm_atomic_get_existing_plane_state(state->state,
> > +                                                                 plane);
> > +
> > +               if (!plane_state->visible)
> > +                       continue;
> > +
> > +               vkms_state->active_planes[i++] =
> > +                       to_vkms_plane_state(plane_state);
> > +       }
> > +
> > +       return 0;
> > +}
> > +
> >  static void vkms_crtc_atomic_enable(struct drm_crtc *crtc,
> >                                     struct drm_crtc_state *old_state)
> >  {
> > @@ -197,10 +244,13 @@ static void vkms_crtc_atomic_flush(struct drm_crtc *crtc,
> >                 crtc->state->event = NULL;
> >         }
> >
> > +       vkms_output->crc_state = to_vkms_crtc_state(crtc->state);
> > +
> >         spin_unlock_irq(&vkms_output->lock);
> >  }
> >
> >  static const struct drm_crtc_helper_funcs vkms_crtc_helper_funcs = {
> > +       .atomic_check   = vkms_crtc_atomic_check,
> >         .atomic_begin   = vkms_crtc_atomic_begin,
> >         .atomic_flush   = vkms_crtc_atomic_flush,
> >         .atomic_enable  = vkms_crtc_atomic_enable,
> > diff --git a/drivers/gpu/drm/vkms/vkms_drv.h b/drivers/gpu/drm/vkms/vkms_drv.h
> > index 2a35299bfb89..4e7450111d45 100644
> > --- a/drivers/gpu/drm/vkms/vkms_drv.h
> > +++ b/drivers/gpu/drm/vkms/vkms_drv.h
> > @@ -49,6 +49,10 @@ struct vkms_crtc_state {
> >         struct drm_crtc_state base;
> >         struct work_struct crc_work;
> >
> > +       int num_active_planes;
> > +       /* stack of active planes for crc computation, should be in z order */
> > +       struct vkms_plane_state **active_planes;
> > +
> >         /* below three are protected by vkms_output.crc_lock */
> >         bool crc_pending;
> >         u64 frame_start;
> > @@ -62,12 +66,15 @@ struct vkms_output {
> >         struct hrtimer vblank_hrtimer;
> >         ktime_t period_ns;
> >         struct drm_pending_vblank_event *event;
> > -       bool crc_enabled;
> >         /* ordered wq for crc_work */
> >         struct workqueue_struct *crc_workq;
> >         /* protects concurrent access to crc_data */
> >         spinlock_t lock;
> >
> > +       /* protected by @lock */
> > +       bool crc_enabled;
> > +       struct vkms_crtc_state *crc_state;
> > +
> >         spinlock_t crc_lock;
> >  };
> >
> > --
> > 2.20.1
> >
> 
> 
> -- 
> 
> Rodrigo Siqueira
> https://siqueira.tech

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH 06/10] drm/vkms: flush crc workers earlier in commit flow
  2019-06-13  7:53     ` Daniel Vetter
  2019-06-13  7:55       ` Daniel Vetter
@ 2019-06-18  2:31       ` Rodrigo Siqueira
  1 sibling, 0 replies; 44+ messages in thread
From: Rodrigo Siqueira @ 2019-06-18  2:31 UTC (permalink / raw)
  To: Daniel Vetter
  Cc: Daniel Vetter, Haneen Mohammed, DRI Development, Daniel Vetter

On 06/13, Daniel Vetter wrote:
> On Wed, Jun 12, 2019 at 10:42:42AM -0300, Rodrigo Siqueira wrote:
> > On Thu, Jun 6, 2019 at 7:28 PM Daniel Vetter <daniel.vetter@ffwll.ch> wrote:
> > >
> > > Currently we flush pending crc workers very late in the commit flow,
> > > when we destry all the old crtc states. Unfortunately at that point
> > 
> > destry -> destroy
> > 
> > > the framebuffers are already unpinned (and our vaddr possible gone),
> > > so this isn't good. Also, the plane_states we need might also already
> > > be cleaned up, since cleanup order of state structures isn't well
> > > defined.
> > >
> > > Fix this by waiting for all crc workers of the old state to complete
> > > before we start any of the cleanup work.
> > >
> > > Note that this is not yet race-free, because the hrtimer and crc
> > > worker look at the wrong state pointers, but that will be fixed in
> > > subsequent patches.
> > >
> > > Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
> > > Cc: Rodrigo Siqueira <rodrigosiqueiramelo@gmail.com>
> > > Cc: Haneen Mohammed <hamohammed.sa@gmail.com>
> > > Cc: Daniel Vetter <daniel@ffwll.ch>
> > > ---
> > >  drivers/gpu/drm/vkms/vkms_crtc.c |  2 +-
> > >  drivers/gpu/drm/vkms/vkms_drv.c  | 10 ++++++++++
> > >  2 files changed, 11 insertions(+), 1 deletion(-)
> > >
> > > diff --git a/drivers/gpu/drm/vkms/vkms_crtc.c b/drivers/gpu/drm/vkms/vkms_crtc.c
> > > index 55b16d545fe7..b6987d90805f 100644
> > > --- a/drivers/gpu/drm/vkms/vkms_crtc.c
> > > +++ b/drivers/gpu/drm/vkms/vkms_crtc.c
> > > @@ -125,7 +125,7 @@ static void vkms_atomic_crtc_destroy_state(struct drm_crtc *crtc,
> > >         __drm_atomic_helper_crtc_destroy_state(state);
> > >
> > >         if (vkms_state) {
> > > -               flush_work(&vkms_state->crc_work);
> > > +               WARN_ON(work_pending(&vkms_state->crc_work));
> > >                 kfree(vkms_state);
> > >         }
> > >  }
> > > diff --git a/drivers/gpu/drm/vkms/vkms_drv.c b/drivers/gpu/drm/vkms/vkms_drv.c
> > > index f677ab1d0094..cc53ef88a331 100644
> > > --- a/drivers/gpu/drm/vkms/vkms_drv.c
> > > +++ b/drivers/gpu/drm/vkms/vkms_drv.c
> > > @@ -62,6 +62,9 @@ static void vkms_release(struct drm_device *dev)
> > >  static void vkms_atomic_commit_tail(struct drm_atomic_state *old_state)
> > >  {
> > >         struct drm_device *dev = old_state->dev;
> > > +       struct drm_crtc *crtc;

I forgot to point out that crtc is set but not used, which makes gcc
complain.

And thanks for the explanation below.

> > > +       struct drm_crtc_state *old_crtc_state;
> > > +       int i;
> > >
> > >         drm_atomic_helper_commit_modeset_disables(dev, old_state);
> > >
> > > @@ -75,6 +78,13 @@ static void vkms_atomic_commit_tail(struct drm_atomic_state *old_state)
> > >
> > >         drm_atomic_helper_wait_for_vblanks(dev, old_state);
> > >
> > > +       for_each_old_crtc_in_state(old_state, crtc, old_crtc_state, i) {
> > > +               struct vkms_crtc_state *vkms_state =
> > > +                       to_vkms_crtc_state(old_crtc_state);
> > > +
> > > +               flush_work(&vkms_state->crc_work);
> > > +       }
> > > +
> > >         drm_atomic_helper_cleanup_planes(dev, old_state);
> > >  }
> > 
> > why not use drm_atomic_helper_commit_tail() here? I mean:
> > 
> > for_each_old_crtc_in_state(old_state, crtc, old_crtc_state, i) {
> > …
> > }
> > 
> > drm_atomic_helper_commit_tail(old_state);
> > 
> > After looking at drm_atomic_helper_cleanup_planes() it sounds safe for
> > me to use the above code; I just test it with two tests from
> > crc_cursor. Maybe I missed something, could you help me here?
> > 
> > Finally, IMHO, I think that Patch 05, 06 and 07 could be squashed in a
> > single patch to make it easier to understand the change.
> 
> I wanted to highlight all the bits a bit more, because this is a lot more
> tricky than it looks. For correct ordering and avoiding races we can't do
> what you suggested. Only after
> 
> 	drm_atomic_helper_wait_for_vblanks()
> 
> do we know that all subsequent queue_work will be for the _new_ state.
> Only once that's done is flush_work() actually useful; before that we
> might flush the work, and then right after the hrtimer that simulates
> vblank queues it again. Every time you have a flush_work before cleaning
> up the work structure, the following sequence must be obeyed, or it can go
> wrong:
> 
> 1. Make sure no one else can requeue the work anymore (in our case that's
> done by a combination of first updating output->crc_state and then waiting
> for the vblank to pass to make sure the hrtimer has noticed that change).
> 
> 2. flush_work()
> 
> 3. Actually clean up stuff (which isn't done here).
> 
> Doing the flush_work before we've even completed the output->state update,
> much less waited for the vblank to make sure that's happened, misses the
> point.
> -Daniel
> 
> > 
> > > --
> > > 2.20.1
> > >
> > 
> > 
> > -- 
> > 
> > Rodrigo Siqueira
> > https://siqueira.tech
> 
> -- 
> Daniel Vetter
> Software Engineer, Intel Corporation
> http://blog.ffwll.ch

-- 
Rodrigo Siqueira
https://siqueira.tech
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH 01/10] drm/vkms: Fix crc worker races
  2019-06-12 14:48     ` Daniel Vetter
@ 2019-06-18  2:39       ` Rodrigo Siqueira
  2019-06-18  8:49         ` Daniel Vetter
  0 siblings, 1 reply; 44+ messages in thread
From: Rodrigo Siqueira @ 2019-06-18  2:39 UTC (permalink / raw)
  To: Daniel Vetter
  Cc: Daniel Vetter, Daniel Vetter, Haneen Mohammed, DRI Development,
	Shayenne Moura

On 06/12, Daniel Vetter wrote:
> On Wed, Jun 12, 2019 at 10:33:11AM -0300, Rodrigo Siqueira wrote:
> > On Thu, Jun 6, 2019 at 7:28 PM Daniel Vetter <daniel.vetter@ffwll.ch> wrote:
> > >
> > > The issue we have is that the crc worker might fall behind. We've
> > > tried to handle this by tracking both the earliest frame for which it
> > > still needs to compute a crc, and the last one. Plus when the
> > > crtc_state changes, we have a new work item, which are all run in
> > > order due to the ordered workqueue we allocate for each vkms crtc.
> > >
> > > Trouble is there's been a few small issues in the current code:
> > > - we need to capture frame_end in the vblank hrtimer, not in the
> > >   worker. The worker might run much later, and then we generate a lot
> > >   of crc for which there's already a different worker queued up.
> > > - frame number might be 0, so create a new crc_pending boolean to
> > >   track this without confusion.
> > > - we need to atomically grab frame_start/end and clear it, so do that
> > >   all in one go. This is not going to create a new race, because if we
> > >   race with the hrtimer then our work will be re-run.
> > > - only race that can happen is the following:
> > >   1. worker starts
> > >   2. hrtimer runs and updates frame_end
> > >   3. worker grabs frame_start/end, already reading the new frame_end,
> > >   and clears crc_pending
> > >   4. hrtimer calls queue_work()
> > >   5. worker completes
> > >   6. worker gets  re-run, crc_pending is false
> > >   Explain this case a bit better by rewording the comment.
> > >
> > > v2: Demote warning level output to debug when we fail to requeue, this
> > > is expected under high load when the crc worker can't quite keep up.
> > >
> > > Cc: Shayenne Moura <shayenneluzmoura@gmail.com>
> > > Cc: Rodrigo Siqueira <rodrigosiqueiramelo@gmail.com>
> > > Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
> > > Cc: Haneen Mohammed <hamohammed.sa@gmail.com>
> > > Cc: Daniel Vetter <daniel@ffwll.ch>
> > > ---
> > >  drivers/gpu/drm/vkms/vkms_crc.c  | 27 +++++++++++++--------------
> > >  drivers/gpu/drm/vkms/vkms_crtc.c |  9 +++++++--
> > >  drivers/gpu/drm/vkms/vkms_drv.h  |  2 ++
> > >  3 files changed, 22 insertions(+), 16 deletions(-)
> > >
> > > diff --git a/drivers/gpu/drm/vkms/vkms_crc.c b/drivers/gpu/drm/vkms/vkms_crc.c
> > > index d7b409a3c0f8..66603da634fe 100644
> > > --- a/drivers/gpu/drm/vkms/vkms_crc.c
> > > +++ b/drivers/gpu/drm/vkms/vkms_crc.c
> > > @@ -166,16 +166,24 @@ void vkms_crc_work_handle(struct work_struct *work)
> > >         struct drm_plane *plane;
> > >         u32 crc32 = 0;
> > >         u64 frame_start, frame_end;
> > > +       bool crc_pending;
> > >         unsigned long flags;
> > >
> > >         spin_lock_irqsave(&out->state_lock, flags);
> > >         frame_start = crtc_state->frame_start;
> > >         frame_end = crtc_state->frame_end;
> > > +       crc_pending = crtc_state->crc_pending;
> > > +       crtc_state->frame_start = 0;
> > > +       crtc_state->frame_end = 0;
> > > +       crtc_state->crc_pending = false;
> > >         spin_unlock_irqrestore(&out->state_lock, flags);
> > >
> > > -       /* _vblank_handle() hasn't updated frame_start yet */
> > > -       if (!frame_start || frame_start == frame_end)
> > > -               goto out;
> > > +       /*
> > > +        * We raced with the vblank hrtimer and previous work already computed
> > > +        * the crc, nothing to do.
> > > +        */
> > > +       if (!crc_pending)
> > > +               return;
> > 
> > > I think this condition is not reachable because crc_pending will be
> > > filled with true in `vkms_vblank_simulate()`, which in turn schedules
> > > the function `vkms_crc_work_handle()`. Just for checking, I tried to
> > reach this condition by running kms_flip, kms_pipe_crc_basic, and
> > kms_cursor_crc with two different VM setups[1], but I couldn't reach
> > it. What do you think?
> 
> thread A			thread B
> 1. run vblank hrtimer
> 
> 				2. starts running crc work (from previous
> 				vblank)
> 
> 3. spin_lock()			-> gets stalled on the spin_lock() because
> 				   thread A has it already
> 
> 4. update frame_end (only in
> later patches, atm this is
> impossible). crc_pending is set
> already.
> 
> 5. schedule_work: since the work
> is running already, this means it
> is scheduled to run once more.
> 
> 6. spin_unlock
> 
> 				7. compute crc, clear crc_pending
> 				8. work finishes
> 				9. work gets run again
> 				10. crc_pending=false
> 
> Since the spin_lock critical section is _very_ short (less than 1 usec I
> bet), this race is very hard to hit.

First of all, thank you very much for all of your detailed explanation,
and sorry for my delay in replying; I was 'processing' all of your
comments. I believe that I understood the issues related to this
patchset, and I just want to check with you if the diagram and the cases
below make sense:

timer   |------|------|------|------|------|...

Case 1:        +----x +---x  +-----x

Case 2:     A  +----------x
            B         +----x

At the top of this diagram, I illustrated the vblank periods along the
timeline. In the bottom lines, I highlighted two cases; the '+' represents
when the worker is queued (queue_work()), and the 'x' denotes when the
CRC work finishes its data processing. Before describing each case from
the diagram, I want to highlight that I'm focused on these two snippets
of code:

static enum hrtimer_restart vkms_vblank_simulate(struct hrtimer *timer)
{
	[..]
	spin_lock(&output->crc_lock);
	[..] data [..]
	spin_unlock(&output->crc_lock);
	[..]
}

void vkms_crc_work_handle(struct work_struct *work)
{
	[..]
	spin_lock_irq(&out->crc_lock);
	crtc_state->crc_pending = false;
	[..] data [..]
	spin_unlock_irq(&out->crc_lock);
	[..]
}

Cases:

1) This is the best scenario; each CRC worker finishes before the next
vblank.

2) In this scenario, one of the CRC workers extends across multiple
vblanks. If worker A already collected the sensitive data inside
vkms_crc_work_handle(), workers A and B will finish without problems
(thanks to your changes). However, if for any reason worker A did not
start before worker B, the new work will take care of its own CRC and
the CRC from worker A. Finally, since worker B sets crc_pending to
false, when worker A eventually starts it'll just return because of the
following code:

if (!crc_pending)
  return;

Make sense?
 
> Exercise: Figure out why schedule_work _must_ schedule the work item to
> re-run if it's running already. If it doesn't do that there's another
> race.
> 
> > 
> > [1] Qemu parameters
> > VM1: -m 1G -smp cores=2,cpus=2
> > VM2: -enable-kvm -m 2G -smp cores=4,cpus=4
> > 
> > >         drm_for_each_plane(plane, &vdev->drm) {
> > >                 struct vkms_plane_state *vplane_state;
> > > @@ -196,20 +204,11 @@ void vkms_crc_work_handle(struct work_struct *work)
> > >         if (primary_crc)
> > >                 crc32 = _vkms_get_crc(primary_crc, cursor_crc);
> > >
> > > -       frame_end = drm_crtc_accurate_vblank_count(crtc);
> > > -
> > > -       /* queue_work can fail to schedule crc_work; add crc for
> > > -        * missing frames
> > > +       /*
> > > +        * The worker can fall behind the vblank hrtimer, make sure we catch up.
> > >          */
> > >         while (frame_start <= frame_end)
> > >                 drm_crtc_add_crc_entry(crtc, true, frame_start++, &crc32);
> > 
> > I want to take this opportunity to ask about this while; It's not
> > really specific to this patch.
> > 
> > I have to admit that I never fully got the idea behind this 'while';
> > it looks like that we just fill out the missed frames with a repeated
> > value. FWIU, `drm_crtc_add_crc_entry()` will add an entry with the CRC
> > information for a frame, but in this case, we are adding the same CRC
> > for a different set of frames. I agree that near frame has a similar
> > CRC value, but could we rely on this all the time? What could happen
> > if we have a great difference from the frame_start and frame_end?
> 
It's a cheap trick for slow cpus: if the crc work gets behind the vblank
hrtimer, we need to somehow catch up. With real hw this is not possible,
but with vkms we simulate the hw. The only quick way to catch up is to
fill out the same crc for everything. It's a lie, and it will make some
kms_atomic tests fail, but it's the only thing we can really do, aside
from trying to make the crc computation code as fast as possible.
> -Daniel
> 
> > 
> > > -
> > > -out:
> > > -       /* to avoid using the same value for frame number again */
> > > -       spin_lock_irqsave(&out->state_lock, flags);
> > > -       crtc_state->frame_end = frame_end;
> > > -       crtc_state->frame_start = 0;
> > > -       spin_unlock_irqrestore(&out->state_lock, flags);
> > >  }
> > >
> > >  static int vkms_crc_parse_source(const char *src_name, bool *enabled)
> > > diff --git a/drivers/gpu/drm/vkms/vkms_crtc.c b/drivers/gpu/drm/vkms/vkms_crtc.c
> > > index 1bbe099b7db8..c727d8486e97 100644
> > > --- a/drivers/gpu/drm/vkms/vkms_crtc.c
> > > +++ b/drivers/gpu/drm/vkms/vkms_crtc.c
> > > @@ -30,13 +30,18 @@ static enum hrtimer_restart vkms_vblank_simulate(struct hrtimer *timer)
> > >                  * has read the data
> > >                  */
> > >                 spin_lock(&output->state_lock);
> > > -               if (!state->frame_start)
> > > +               if (!state->crc_pending)
> > >                         state->frame_start = frame;
> > > +               else
> > > +                       DRM_DEBUG_DRIVER("crc worker falling behind, frame_start: %llu, frame_end: %llu\n",
> > > +                                        state->frame_start, frame);
> > > +               state->frame_end = frame;
> > > +               state->crc_pending = true;
> > >                 spin_unlock(&output->state_lock);
> > >
> > >                 ret = queue_work(output->crc_workq, &state->crc_work);
> > >                 if (!ret)
> > > -                       DRM_WARN("failed to queue vkms_crc_work_handle");
> > > +                       DRM_DEBUG_DRIVER("vkms_crc_work_handle already queued\n");
> > >         }
> > >
> > >         spin_unlock(&output->lock);
> > > diff --git a/drivers/gpu/drm/vkms/vkms_drv.h b/drivers/gpu/drm/vkms/vkms_drv.h
> > > index 81f1cfbeb936..3c7e06b19efd 100644
> > > --- a/drivers/gpu/drm/vkms/vkms_drv.h
> > > +++ b/drivers/gpu/drm/vkms/vkms_drv.h
> > > @@ -56,6 +56,8 @@ struct vkms_plane_state {
> > >  struct vkms_crtc_state {
> > >         struct drm_crtc_state base;
> > >         struct work_struct crc_work;
> > > +
> > > +       bool crc_pending;
> > >         u64 frame_start;
> > >         u64 frame_end;
> > >  };
> > > --
> > > 2.20.1
> > >
> > 
> > 
> > -- 
> > 
> > Rodrigo Siqueira
> > https://siqueira.tech
> 
> -- 
> Daniel Vetter
> Software Engineer, Intel Corporation
> http://blog.ffwll.ch

-- 
Rodrigo Siqueira
https://siqueira.tech
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH 00/10] drm/vkms: rework crc worker
  2019-06-12 14:42   ` Daniel Vetter
@ 2019-06-18  2:49     ` Rodrigo Siqueira
  2019-06-18  8:56       ` Daniel Vetter
  0 siblings, 1 reply; 44+ messages in thread
From: Rodrigo Siqueira @ 2019-06-18  2:49 UTC (permalink / raw)
  To: Daniel Vetter; +Cc: Daniel Vetter, DRI Development

On 06/12, Daniel Vetter wrote:
> On Wed, Jun 12, 2019 at 10:28:41AM -0300, Rodrigo Siqueira wrote:
> > Hi Daniel,
> > 
> > First of all, thank you very much for your patchset.
> > 
> > I tried to make a detailed review of your series, and you can see my
> > comments in each patch. You’ll notice that I asked many things related
> > to the DRM subsystem with the hope that I could learn a little bit
> > more about DRM from your comments.
> > 
> > Before you go through the review, I would like to start a discussion
> > about the vkms race conditions. First, I have to admit that I did not
> > understand the race conditions that you described before because I
> > couldn’t reproduce them. Now, I'm suspecting that I could not
> > experience the problem because I'm using QEMU with KVM; with this idea
> > in mind, I suppose that we have two scenarios for using vkms in a
> > virtual machine:
> > 
> > * Scenario 1: The user has hardware virtualization support; in this
> > case, it is a little bit harder to experience race conditions with
> > vkms.
> > 
> > * Scenario 2: Without hardware virtualization support, it is much
> > easier to experience race conditions.
> > 
> > With these two scenarios in mind, I conducted a bunch of experiments
> > to try to shed light on this issue. I did:
> > 
> > 1. Enabled lockdep
> > 
> > 2. Defined two different environments for testing both using QEMU with
> > and without kvm. See below the QEMU hardware options:
> > 
> > * env_kvm: -enable-kvm -daemonize -m 1G -smp cores=2,cpus=2
> > * env_no_kvm: -daemonize -m 2G -smp cores=4,cpus=4
> > 
> > 3. My test protocol: I) turn on the vm, II) clean /proc/lock_stat,
> > III) execute kms_cursor_crc, IV) save the lock_stat file, and V)
> > turn off the vm.
> > 
> > 4. From the lockdep_stat, I just highlighted the rows related to vkms
> > and the columns holdtime-total and holdtime-avg.
> > 
> > I would like to highlight that the following data does not have any
> > statistical approach, and its intention is solely to assist our
> > discussion. See below the summary of the collected data:
> > 
> > Summary of the experiment results:
> > 
> > +----------------+----------------+----------------+
> > |                |     env_kvm    |   env_no_kvm   |
> > +                +----------------+----------------+
> > | Test           | Before | After | Before | After |
> > +----------------+--------+-------+--------+-------+
> > | kms_cursor_crc |   S1   |   S2  |   M1   |   M2  |
> > +----------------+--------+-------+--------+-------+
> > 
> > * Before: before apply this patchset
> > * After: after apply this patchset
> > 
> > -----------------------------------------+----------------+-------------
> > S1: Without this patchset and with kvm   | holdtime-total | holdtime-avg
> > -----------------------------------------+----------------+-------------
> > &(&vkms_out->lock)->rlock:               |  21983.52      |  6.21
> > (work_completion)(&vkms_state->crc_wo:   |  20.47         |  20.47
> > (wq_completion)vkms_crc_workq:           |  3999507.87    |  3352.48
> > &(&vkms_out->state_lock)->rlock:         |  378.47        |  0.30
> > (work_completion)(&vkms_state->crc_#2:   |  3999066.30    |  2848.34
> > -----------------------------------------+----------------+-------------
> > S2: With this patchset and with kvm      |                |
> > -----------------------------------------+----------------+-------------
> > &(&vkms_out->lock)->rlock:               |  23262.83      |  6.34
> > (work_completion)(&vkms_state->crc_wo:   |  8.98          |  8.98
> > &(&vkms_out->crc_lock)->rlock:           |  307.28        |  0.32
> > (wq_completion)vkms_crc_workq:           |  6567727.05    |  12345.35
> > (work_completion)(&vkms_state->crc_#2:   |  6567135.15    |  4488.81
> > -----------------------------------------+----------------+-------------
> > M1: Without this patchset and without kvm|                |
> > -----------------------------------------+----------------+-------------
> > &(&vkms_out->state_lock)->rlock:         |  4994.72       |  1.61
> > &(&vkms_out->lock)->rlock:               |  247190.04     |  39.39
> > (work_completion)(&vkms_state->crc_wo:   |  31.32         |  31.32
> > (wq_completion)vkms_crc_workq:           |  20991073.78   |  13525.18
> > (work_completion)(&vkms_state->crc_#2:   |  20988347.18   |  11904.90
> > -----------------------------------------+----------------+-------------
> > M2: With this patchset and without kvm   |                |
> > -----------------------------------------+----------------+-------------
> > (wq_completion)vkms_crc_workq:           |  42819161.68   |  36597.57
> > &(&vkms_out->lock)->rlock:               |  251257.06     |  35.80
> > (work_completion)(&vkms_state->crc_wo:   |  69.37         |  69.37
> > &(&vkms_out->crc_lock)->rlock:           |  3620.92       |  1.54
> > (work_completion)(&vkms_state->crc_#2:   |  42803419.59   |  24306.31
> > 
> > First, I analyzed the scenarios with KVM (S1 and S2); more
> > specifically, I focused on these two classes:
> > 
> > 1. (work_completion)(&vkms_state->crc_wo
> > 2. (work_completion)(&vkms_state->crc_#2
> > 
> > After taking a look at the data, it looks like this patchset
> > greatly reduces the hold time average for crc_wo. On the other hand,
> > it increases the hold time average for crc_#2. I didn't quite
> > understand the reason for the difference. Could you help me here?
> 
> So there are two real locks here from our code, the ones you can see as
> spinlocks:
> 
> &(&vkms_out->state_lock)->rlock:         |  4994.72       |  1.61
> &(&vkms_out->lock)->rlock:               |  247190.04     |  39.39
> 
> All the others are fake locks that the workqueue adds, which only exist in
> lockdep. They are used to catch special kinds of deadlocks like the below:
> 
> thread A:
> 1. mutex_lock(mutex_A)
> 2. flush_work(work_A)
> 
> thread B
> 1. running work_A:
> 2. mutex_lock(mutex_A)
> 
> thread B can't continue becuase mutex_A is already held by thread A.
> thread A can't continue because thread B is blocked and the work never
> finishes.
> -> deadlock
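> 
> As a stand-alone sketch (mutex_A/work_A are made up, nothing
> vkms-specific):
> 
> static DEFINE_MUTEX(mutex_A);
> static struct work_struct work_A;
> 
> static void work_A_func(struct work_struct *work)
> {
> 	mutex_lock(&mutex_A);	/* thread B blocks here forever ... */
> 	mutex_unlock(&mutex_A);
> }
> 
> void thread_A(void)
> {
> 	mutex_lock(&mutex_A);
> 	/* ... while thread A waits here for work_A to finish: deadlock.
> 	 * the fake lockdep locks on the work/wq catch exactly this.
> 	 */
> 	flush_work(&work_A);
> 	mutex_unlock(&mutex_A);
> }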
> 
> I haven't checked which is which, but essentially what you measure with
> the hold times of these fake locks is how long a work execution takes on
> average.
> 
> Since my patches are supposed to fix races where the worker can't keep up
> with the vblank hrtimer, the average worker will (probably) do more,
> so that going up is expected. I think.
> 
> I'm honestly not sure what's going on here, never having looked into this
> in detail.
> 
> > When I looked at the second set of scenarios (M1 and M2, both without
> > KVM), the results look much more distant; basically, this patchset
> > increased the hold time average. Again, could you help me understand
> > this issue a little bit better?
> > 
> > Finally, after these tests, I have some questions:
> > 
> > 1. VKMS is a software-only driver; because of this, how about defining
> > minimal system requirements for using it?
> 
> No idea, in reality it's probably "if it fails too often, buy a faster
> cpu". I do think we should make the code robust against a slow cpu, since
> atm that's needed even on pretty fast machines (because our blending code
> is really, really slow and inefficient).
> 
> > 2. During my experiments, I noticed that running tests in a VM that
> > uses KVM had consistent results. For example, kms_flip never fails
> > with QEMU+KVM; however, without KVM, two or three tests failed (one of
> > them looks random). If we use vkms for testing DRM stuff, should we
> > recommend the use of KVM?
> 
> What do you mean without kvm? In general running without kvm shouldn't be
> slower, so I'm a bit confused ... I'm running vkms directly on the
> machine, by booting into new kernels (and controlling the machine over the
> network).

Sorry, I should have detailed my point.

Basically, I do all my testing with vkms in a QEMU VM. In that sense, I
did some experiments in which I enabled and disabled KVM (i.e., the flag
'-enable-kvm') to check the vkms behaviour in these two scenarios. I
noticed that the tests are consistent when I use the '-enable-kvm' flag;
in that context, it seemed a good idea to recommend the use of KVM for
those users who want to test vkms with igt. Anyway, don't worry about
that; I'll try to add more documentation for vkms in the future, and we
can discuss this again then.

Anyway, from my side, I think we should merge this series. Do you want
to prepare a V2 with the fixes and maybe update the commit messages by
using some of your explanations? Or, if you want, I can fix the tiny
details and merge it.

> -Daniel
> 
> > Best Regards
> > 
> > On Thu, Jun 6, 2019 at 7:28 PM Daniel Vetter <daniel.vetter@ffwll.ch> wrote:
> > >
> > > Hi all,
> > >
> > > This here is the first part of a rework for the vkms crc worker. I think
> > > this should fix all the locking/races/use-after-free issues I spotted in
> > > the code. There's more work we can do in the future as a follow-up:
> > >
> > > - directly access vkms_plane_state->base in the crc worker, with this
> > >   approach in this series here that should be safe now. Follow-up patches
> > >   could switch and remove all the separate crc_data infrastructure.
> > >
> > > - I think some kerneldoc for vkms structures would be nice. Documentation
> > >   the various functions is probably overkill.
> > >
> > > - Implementing a more generic blending engine, as prep for adding more
> > >   pixel formats, more planes, and more features in general.
> > >
> > > Tested with kms_pipe_crc, kms_flip and kms_cursor_crc. Seems to not make
> > > things worse, but I didn't do a full run.
> > >
> > > Cheers, Daniel
> > >
> > > Daniel Vetter (10):
> > >   drm/vkms: Fix crc worker races
> > >   drm/vkms: Use spin_lock_irq in process context
> > >   drm/vkms: Rename vkms_output.state_lock to crc_lock
> > >   drm/vkms: Move format arrays to vkms_plane.c
> > >   drm/vkms: Add our own commit_tail
> > >   drm/vkms: flush crc workers earlier in commit flow
> > >   drm/vkms: Dont flush crc worker when we change crc status
> > >   drm/vkms: No _irqsave within spin_lock_irq needed
> > >   drm/vkms: totally reworked crc data tracking
> > >   drm/vkms: No need for ->pages_lock in crc work anymore
> > >
> > >  drivers/gpu/drm/vkms/vkms_crc.c   | 74 +++++++++-------------------
> > >  drivers/gpu/drm/vkms/vkms_crtc.c  | 80 ++++++++++++++++++++++++++-----
> > >  drivers/gpu/drm/vkms/vkms_drv.c   | 35 ++++++++++++++
> > >  drivers/gpu/drm/vkms/vkms_drv.h   | 24 +++++-----
> > >  drivers/gpu/drm/vkms/vkms_plane.c |  8 ++++
> > >  5 files changed, 146 insertions(+), 75 deletions(-)
> > >
> > > --
> > > 2.20.1
> > >
> > > _______________________________________________
> > > dri-devel mailing list
> > > dri-devel@lists.freedesktop.org
> > > https://lists.freedesktop.org/mailman/listinfo/dri-devel
> > 
> > 
> > 
> > -- 
> > 
> > Rodrigo Siqueira
> > https://siqueira.tech
> 
> -- 
> Daniel Vetter
> Software Engineer, Intel Corporation
> http://blog.ffwll.ch

-- 
Rodrigo Siqueira
https://siqueira.tech
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH 01/10] drm/vkms: Fix crc worker races
  2019-06-18  2:39       ` Rodrigo Siqueira
@ 2019-06-18  8:49         ` Daniel Vetter
  0 siblings, 0 replies; 44+ messages in thread
From: Daniel Vetter @ 2019-06-18  8:49 UTC (permalink / raw)
  To: Rodrigo Siqueira
  Cc: Haneen Mohammed, Daniel Vetter, DRI Development, Shayenne Moura,
	Daniel Vetter

On Mon, Jun 17, 2019 at 11:39:31PM -0300, Rodrigo Siqueira wrote:
> On 06/12, Daniel Vetter wrote:
> > On Wed, Jun 12, 2019 at 10:33:11AM -0300, Rodrigo Siqueira wrote:
> > > On Thu, Jun 6, 2019 at 7:28 PM Daniel Vetter <daniel.vetter@ffwll.ch> wrote:
> > > >
> > > > The issue we have is that the crc worker might fall behind. We've
> > > > tried to handle this by tracking both the earliest frame for which it
> > > > still needs to compute a crc, and the last one. Plus when the
> > > > crtc_state changes, we have a new work item, which are all run in
> > > > order due to the ordered workqueue we allocate for each vkms crtc.
> > > >
> > > > Trouble is there's been a few small issues in the current code:
> > > > - we need to capture frame_end in the vblank hrtimer, not in the
> > > >   worker. The worker might run much later, and then we generate a lot
> > > >   of crc for which there's already a different worker queued up.
> > > > - frame number might be 0, so create a new crc_pending boolean to
> > > >   track this without confusion.
> > > > - we need to atomically grab frame_start/end and clear it, so do that
> > > >   all in one go. This is not going to create a new race, because if we
> > > >   race with the hrtimer then our work will be re-run.
> > > > - only race that can happen is the following:
> > > >   1. worker starts
> > > >   2. hrtimer runs and updates frame_end
> > > >   3. worker grabs frame_start/end, already reading the new frame_end,
> > > >   and clears crc_pending
> > > >   4. hrtimer calls queue_work()
> > > >   5. worker completes
> > > >   6. worker gets  re-run, crc_pending is false
> > > >   Explain this case a bit better by rewording the comment.
> > > >
> > > > v2: Demote warning level output to debug when we fail to requeue, this
> > > > is expected under high load when the crc worker can't quite keep up.
> > > >
> > > > Cc: Shayenne Moura <shayenneluzmoura@gmail.com>
> > > > Cc: Rodrigo Siqueira <rodrigosiqueiramelo@gmail.com>
> > > > Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
> > > > Cc: Haneen Mohammed <hamohammed.sa@gmail.com>
> > > > Cc: Daniel Vetter <daniel@ffwll.ch>
> > > > ---
> > > >  drivers/gpu/drm/vkms/vkms_crc.c  | 27 +++++++++++++--------------
> > > >  drivers/gpu/drm/vkms/vkms_crtc.c |  9 +++++++--
> > > >  drivers/gpu/drm/vkms/vkms_drv.h  |  2 ++
> > > >  3 files changed, 22 insertions(+), 16 deletions(-)
> > > >
> > > > diff --git a/drivers/gpu/drm/vkms/vkms_crc.c b/drivers/gpu/drm/vkms/vkms_crc.c
> > > > index d7b409a3c0f8..66603da634fe 100644
> > > > --- a/drivers/gpu/drm/vkms/vkms_crc.c
> > > > +++ b/drivers/gpu/drm/vkms/vkms_crc.c
> > > > @@ -166,16 +166,24 @@ void vkms_crc_work_handle(struct work_struct *work)
> > > >         struct drm_plane *plane;
> > > >         u32 crc32 = 0;
> > > >         u64 frame_start, frame_end;
> > > > +       bool crc_pending;
> > > >         unsigned long flags;
> > > >
> > > >         spin_lock_irqsave(&out->state_lock, flags);
> > > >         frame_start = crtc_state->frame_start;
> > > >         frame_end = crtc_state->frame_end;
> > > > +       crc_pending = crtc_state->crc_pending;
> > > > +       crtc_state->frame_start = 0;
> > > > +       crtc_state->frame_end = 0;
> > > > +       crtc_state->crc_pending = false;
> > > >         spin_unlock_irqrestore(&out->state_lock, flags);
> > > >
> > > > -       /* _vblank_handle() hasn't updated frame_start yet */
> > > > -       if (!frame_start || frame_start == frame_end)
> > > > -               goto out;
> > > > +       /*
> > > > +        * We raced with the vblank hrtimer and previous work already computed
> > > > +        * the crc, nothing to do.
> > > > +        */
> > > > +       if (!crc_pending)
> > > > +               return;
> > > 
> > > I think this condition is not reachable because crc_pending will be
> > > set to true in `vkms_vblank_simulate()`, which in turn schedules
> > > the function `vkms_crc_work_handle()`. Just for checking, I tried to
> > > reach this condition by running kms_flip, kms_pipe_crc_basic, and
> > > kms_cursor_crc with two different VM setups[1], but I couldn't reach
> > > it. What do you think?
> > 
> > thread A			thread B
> > 1. run vblank hrtimer
> > 
> > 				2. starts running crc work (from previous
> > 				vblank)
> > 
> > 3. spin_lock()			-> gets stalled on the spin_lock() because
> > 				   thread A has it already
> > 
> > 4. update frame_end (only in
> > later patches, atm this is
> > impossible). crc_pending is set
> > already.
> > 
> > 5. schedule_work: since the work
> > is running already, this means it
> > is scheduled to run once more.
> > 
> > 6. spin_unlock
> > 
> > 				7. compute crc, clear crc_pending
> > 				8. work finishes
> > 				9. work gets run again
> > 				10. crc_pending=false
> > 
> > Since the spin_lock critical section is _very_ short (less than 1 usec I
> > bet), this race is very hard to hit.
> 
> First of all, thank you very much for all of your detailed explanations,
> and sorry for my delay in replying; I was 'processing' all of your
> comments. I believe that I understood the issues related to this
> patchset, and I just want to check with you whether the diagram and the
> cases below make sense:
> 
> timer   |------|------|------|------|------|...
> 
> Case 1:        +----x +---x  +-----x
> 
> Case 2:     A  +----------x
>             B         +----x
> 
> At the top of this diagram, I illustrated the vblank periods over time.
> In the bottom lines, I highlighted two cases; the '+' represents when
> the worker is queued (queue_work()), and the 'x' denotes when the CRC
> work finishes its data processing. Before describing each case from the
> diagram, I want to highlight that I'm focused on these two snippets of
> code:
> 
> static enum hrtimer_restart vkms_vblank_simulate(struct hrtimer *timer)
> {
> 	[..]
> 	spin_lock(&output->crc_lock);
> 	[..] data [..]
> 	spin_unlock(&output->crc_lock);
> 	[..]
> }
> 
> void vkms_crc_work_handle(struct work_struct *work)
> {
> 	[..]
> 	spin_lock_irq(&out->crc_lock);
> 	crtc_state->crc_pending = false;
> 	[..] data [..]
> 	spin_unlock_irq(&out->crc_lock);
> 	[..]
> }
> 
> Cases:
> 
> 1) This is the best scenario; each CRC worker finishes before the next
> vblank.
> 
> 2) In this scenario, one of the CRC workers extends along multiple
> vblanks.  If worker A already collected the sensitive data inside
> vkms_crc_work_handle(), worker A and B will finish without problems
> (thanks to your changes). However, if for any reason, the worker A did
> not start before the worker B, the new work will take care of its own
> CRC and the CRC from worker A. Finally, since worker B will set
> crc_pending equal false when the worker A starts, it'll just return
> because of the following code:
> 
> if (!crc_pending)
>   return;
> 
> Make sense?

Almost correct, but I think you mixed up the workers here. Since we have a
single-threaded, ordered workqueue, B will only run once A has finished. So
if A has consumed the 2nd vblank already (because it got delayed for some
reason), then it's worker B which will observe crc_pending == false
(because A has already taken care of that).
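
To make that ordering guarantee concrete, here's a minimal sketch. This is
not the driver code; it just assumes the crc queue is created with
alloc_ordered_workqueue(), as the cover letter's mention of an ordered
workqueue implies, and the state_A/state_B names are made up:

	/* an ordered workqueue runs at most one work item at a time,
	 * in queueing order */
	out->crc_workq = alloc_ordered_workqueue("vkms_crc_workq", 0);

	queue_work(out->crc_workq, &state_A->crc_work);
	queue_work(out->crc_workq, &state_B->crc_work);
	/* B's handler cannot start until A's handler has returned */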

So in a way there's an additional event per worker that you need to
incorporate into your scenarios (and this means there are going to be a
few more variants). You already have two:

- the spin_lock critical section in vkms_vblank_simulate, which also
  (re)queues the work. In your diagram that's represented with a '+'

- the worker finishes and has uploaded the crc; in your diagram that's an
  'x'

What's missing is when exactly the worker starts, and more precisely, its
spinlock-protected critical section, where the work reads frame_start/end.
That's the part that can race in interesting ways with the vblank
handler.
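
Condensed, the two racing critical sections look roughly like this with
the whole series applied (using the crc_lock name from patch 3 and
eliding everything else; treat this as a sketch, not the literal code):

	/* vblank hrtimer */
	spin_lock(&output->crc_lock);
	if (!state->crc_pending)
		state->frame_start = frame;
	state->frame_end = frame;
	state->crc_pending = true;
	spin_unlock(&output->crc_lock);
	queue_work(output->crc_workq, &state->crc_work);

	/* crc worker */
	spin_lock_irq(&out->crc_lock);
	frame_start = crtc_state->frame_start;
	frame_end = crtc_state->frame_end;
	crc_pending = crtc_state->crc_pending;
	crtc_state->frame_start = 0;
	crtc_state->frame_end = 0;
	crtc_state->crc_pending = false;
	spin_unlock_irq(&out->crc_lock);

Every interesting interleaving is some ordering of these two blocks.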

But looking at your diagrams and explanations I think you're getting the
hang of thinking through race conditions.

btw another interesting exercise for thinking these through:
- write up the critical steps of the functions you want to analyze
- cut them up into pieces (number them so there's no chaos)
- put them on the table as columns (each column represents a cpu or thread
  doing stuff)
- then try to make gaps and interleave execution in interesting ways, and
  try to figure out whether everything is still computed correctly.

Like in your diagram, you might need a few copies of the functions to
really make things fun. For me, being able to quickly rearrange how the
code is executed, and even to play out extreme scenarios (like "what
happens if the entire worker executes between these two instructions over
there", which is possible with preemption and the linux scheduler running
something else), helps a lot for understanding this.

It's still really hard to convince yourself of correctness, but at least
it's easier to get an intuitive understanding.

Cheers, Daniel

> > Exercise: Figure out why schedule_work _must_ schedule the work item to
> > re-run if it's running already. If it doesn't do that, there's another
> > race.
> > 
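
Spelled out, one interleaving that would lose a crc if queue_work()
didn't re-queue an already-running work item (it does, which is the
point of the exercise; the steps below are illustrative):

worker				hrtimer
1. spin_lock()
2. grab frame_start/end,
   clear crc_pending
3. spin_unlock()
				4. spin_lock()
				5. update frame_end,
				   set crc_pending
				6. spin_unlock()
				7. queue_work() -> assume it did nothing
8. compute crc, return

Nothing would process the frames recorded in step 5 until the next
vblank fires. Because queue_work() does re-queue a work item that is
running but no longer pending, step 7 guarantees another worker pass
that observes crc_pending.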
> > > 
> > > [1] Qemu parameters
> > > VM1: -m 1G -smp cores=2,cpus=2
> > > VM2: -enable-kvm -m 2G -smp cores=4,cpus=4
> > > 
> > > >         drm_for_each_plane(plane, &vdev->drm) {
> > > >                 struct vkms_plane_state *vplane_state;
> > > > @@ -196,20 +204,11 @@ void vkms_crc_work_handle(struct work_struct *work)
> > > >         if (primary_crc)
> > > >                 crc32 = _vkms_get_crc(primary_crc, cursor_crc);
> > > >
> > > > -       frame_end = drm_crtc_accurate_vblank_count(crtc);
> > > > -
> > > > -       /* queue_work can fail to schedule crc_work; add crc for
> > > > -        * missing frames
> > > > +       /*
> > > > +        * The worker can fall behind the vblank hrtimer, make sure we catch up.
> > > >          */
> > > >         while (frame_start <= frame_end)
> > > >                 drm_crtc_add_crc_entry(crtc, true, frame_start++, &crc32);
> > > 
> > > I want to take this opportunity to ask about this 'while'; it's not
> > > really specific to this patch.
> > > 
> > > I have to admit that I never fully got the idea behind this 'while';
> > > it looks like we just fill out the missed frames with a repeated
> > > value. FWIU, `drm_crtc_add_crc_entry()` will add an entry with the CRC
> > > information for a frame, but in this case, we are adding the same CRC
> > > for a different set of frames. I agree that nearby frames have similar
> > > CRC values, but could we rely on this all the time? What could happen
> > > if we have a great difference between frame_start and frame_end?
> > 
> > It's a cheap trick for slow cpu: If the crc work gets behind the vblank
> > hrtimer, we need to somehow catch up. With real hw this is not possible,
> > but with vkms we simulate the hw. The only quick way to catch up is to
> > fill out the same crc for everything. It's a lie, it will make some
> > kms_atomic tests fail, but it's the only thing we can really do. Aside
> > from trying to make the crc computation code as fast as possible.
> > -Daniel
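
As a worked example of that catch-up loop: if the worker wakes up with
frame_start == 100 and frame_end == 103, it computes one crc and then
reports that same value for frames 100, 101, 102 and 103 via

	while (frame_start <= frame_end)
		drm_crtc_add_crc_entry(crtc, true, frame_start++, &crc32);

which keeps the crc queue in step with the vblank count, at the cost of
accuracy for the frames that were skipped.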
> > 
> > > 
> > > > -
> > > > -out:
> > > > -       /* to avoid using the same value for frame number again */
> > > > -       spin_lock_irqsave(&out->state_lock, flags);
> > > > -       crtc_state->frame_end = frame_end;
> > > > -       crtc_state->frame_start = 0;
> > > > -       spin_unlock_irqrestore(&out->state_lock, flags);
> > > >  }
> > > >
> > > >  static int vkms_crc_parse_source(const char *src_name, bool *enabled)
> > > > diff --git a/drivers/gpu/drm/vkms/vkms_crtc.c b/drivers/gpu/drm/vkms/vkms_crtc.c
> > > > index 1bbe099b7db8..c727d8486e97 100644
> > > > --- a/drivers/gpu/drm/vkms/vkms_crtc.c
> > > > +++ b/drivers/gpu/drm/vkms/vkms_crtc.c
> > > > @@ -30,13 +30,18 @@ static enum hrtimer_restart vkms_vblank_simulate(struct hrtimer *timer)
> > > >                  * has read the data
> > > >                  */
> > > >                 spin_lock(&output->state_lock);
> > > > -               if (!state->frame_start)
> > > > +               if (!state->crc_pending)
> > > >                         state->frame_start = frame;
> > > > +               else
> > > > +                       DRM_DEBUG_DRIVER("crc worker falling behind, frame_start: %llu, frame_end: %llu\n",
> > > > +                                        state->frame_start, frame);
> > > > +               state->frame_end = frame;
> > > > +               state->crc_pending = true;
> > > >                 spin_unlock(&output->state_lock);
> > > >
> > > >                 ret = queue_work(output->crc_workq, &state->crc_work);
> > > >                 if (!ret)
> > > > -                       DRM_WARN("failed to queue vkms_crc_work_handle");
> > > > +                       DRM_DEBUG_DRIVER("vkms_crc_work_handle already queued\n");
> > > >         }
> > > >
> > > >         spin_unlock(&output->lock);
> > > > diff --git a/drivers/gpu/drm/vkms/vkms_drv.h b/drivers/gpu/drm/vkms/vkms_drv.h
> > > > index 81f1cfbeb936..3c7e06b19efd 100644
> > > > --- a/drivers/gpu/drm/vkms/vkms_drv.h
> > > > +++ b/drivers/gpu/drm/vkms/vkms_drv.h
> > > > @@ -56,6 +56,8 @@ struct vkms_plane_state {
> > > >  struct vkms_crtc_state {
> > > >         struct drm_crtc_state base;
> > > >         struct work_struct crc_work;
> > > > +
> > > > +       bool crc_pending;
> > > >         u64 frame_start;
> > > >         u64 frame_end;
> > > >  };
> > > > --
> > > > 2.20.1
> > > >
> > > 
> > > 
> > > -- 
> > > 
> > > Rodrigo Siqueira
> > > https://siqueira.tech
> > 
> > -- 
> > Daniel Vetter
> > Software Engineer, Intel Corporation
> > http://blog.ffwll.ch
> 
> -- 
> Rodrigo Siqueira
> https://siqueira.tech

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH 00/10] drm/vkms: rework crc worker
  2019-06-18  2:49     ` Rodrigo Siqueira
@ 2019-06-18  8:56       ` Daniel Vetter
  2019-06-18 21:54         ` Rodrigo Siqueira
  0 siblings, 1 reply; 44+ messages in thread
From: Daniel Vetter @ 2019-06-18  8:56 UTC (permalink / raw)
  To: Rodrigo Siqueira; +Cc: Daniel Vetter, DRI Development

On Mon, Jun 17, 2019 at 11:49:04PM -0300, Rodrigo Siqueira wrote:
> On 06/12, Daniel Vetter wrote:
> > On Wed, Jun 12, 2019 at 10:28:41AM -0300, Rodrigo Siqueira wrote:
> > > Hi Daniel,
> > > 
> > > First of all, thank you very much for your patchset.
> > > 
> > > I tried to make a detailed review of your series, and you can see my
> > > comments in each patch. You’ll notice that I asked many things related
> > > to the DRM subsystem with the hope that I could learn a little bit
> > > more about DRM from your comments.
> > > 
> > > Before you go through the review, I would like to start a discussion
> > > about the vkms race conditions. First, I have to admit that I did not
> > > understand the race conditions that you described before because I
> > > couldn’t reproduce them. Now, I'm suspecting that I could not
> > > experience the problem because I'm using QEMU with KVM; with this idea
> > > in mind, I suppose that we have two scenarios for using vkms in a
> > > virtual machine:
> > > 
> > > * Scenario 1: The user has hardware virtualization support; in this
> > > case, it is a little bit harder to experience race conditions with
> > > vkms.
> > > 
> > > * Scenario 2: Without hardware virtualization support, it is much
> > > easier to experience race conditions.
> > > 
> > > With these two scenarios in mind, I conducted a bunch of experiments
> > > for trying to shed light on this issue. I did:
> > > 
> > > 1. Enabled lockdep
> > > 
> > > 2. Defined two different environments for testing both using QEMU with
> > > and without kvm. See below the QEMU hardware options:
> > > 
> > > * env_kvm: -enable-kvm -daemonize -m 1G -smp cores=2,cpus=2
> > > * env_no_kvm: -daemonize -m 2G -smp cores=4,cpus=4
> > > 
> > > 3. My test protocol: I) turn on the vm, II) clean /proc/lock_stat,
> > > III) execute kms_cursor_crc, IV) save the lock_stat file, and V)
> > > turn off the vm.
> > > 
> > > 4. From the lockdep_stat, I just highlighted the row related to vkms
> > > and the columns holdtime-total and holdtime-avg
> > > 
> > > I would like to highlight that the following data does not have any
> > > statistical approach, and its intention is solely to assist our
> > > discussion. See below the summary of the collected data:
> > > 
> > > Summary of the experiment results:
> > > 
> > > +----------------+----------------+----------------+
> > > |                |     env_kvm    |   env_no_kvm   |
> > > +                +----------------+----------------+
> > > | Test           | Before | After | Before | After |
> > > +----------------+--------+-------+--------+-------+
> > > | kms_cursor_crc |   S1   |   S2  |   M1   |   M2  |
> > > +----------------+--------+-------+--------+-------+
> > > 
> > > * Before: before apply this patchset
> > > * After: after apply this patchset
> > > 
> > > -----------------------------------------+----------------+-------------
> > > S1: Without this patchset and with kvm   | holdtime-total | holdtime-avg
> > > -----------------------------------------+----------------+-------------
> > > &(&vkms_out->lock)->rlock:               |  21983.52      |  6.21
> > > (work_completion)(&vkms_state->crc_wo:   |  20.47         |  20.47
> > > (wq_completion)vkms_crc_workq:           |  3999507.87    |  3352.48
> > > &(&vkms_out->state_lock)->rlock:         |  378.47        |  0.30
> > > (work_completion)(&vkms_state->crc_#2:   |  3999066.30    |  2848.34
> > > -----------------------------------------+----------------+-------------
> > > S2: With this patchset and with kvm      |                |
> > > -----------------------------------------+----------------+-------------
> > > &(&vkms_out->lock)->rlock:               |  23262.83      |  6.34
> > > (work_completion)(&vkms_state->crc_wo:   |  8.98          |  8.98
> > > &(&vkms_out->crc_lock)->rlock:           |  307.28        |  0.32
> > > (wq_completion)vkms_crc_workq:           |  6567727.05    |  12345.35
> > > (work_completion)(&vkms_state->crc_#2:   |  6567135.15    |  4488.81
> > > -----------------------------------------+----------------+-------------
> > > M1: Without this patchset and without kvm|                |
> > > -----------------------------------------+----------------+-------------
> > > &(&vkms_out->state_lock)->rlock:         |  4994.72       |  1.61
> > > &(&vkms_out->lock)->rlock:               |  247190.04     |  39.39
> > > (work_completion)(&vkms_state->crc_wo:   |  31.32         |  31.32
> > > (wq_completion)vkms_crc_workq:           |  20991073.78   |  13525.18
> > > (work_completion)(&vkms_state->crc_#2:   |  20988347.18   |  11904.90
> > > -----------------------------------------+----------------+-------------
> > > M2: With this patchset and without kvm   |                |
> > > -----------------------------------------+----------------+-------------
> > > (wq_completion)vkms_crc_workq:           |  42819161.68   |  36597.57
> > > &(&vkms_out->lock)->rlock:               |  251257.06     |  35.80
> > > (work_completion)(&vkms_state->crc_wo:   |  69.37         |  69.37
> > > &(&vkms_out->crc_lock)->rlock:           |  3620.92       |  1.54
> > > (work_completion)(&vkms_state->crc_#2:   |  42803419.59   |  24306.31
> > > 
> > > First, I analyzed the scenarios with KVM (S1 and S2); more
> > > specifically, I focused on these two classes:
> > > 
> > > 1. (work_completion)(&vkms_state->crc_wo
> > > 2. (work_completion)(&vkms_state->crc_#2
> > > 
> > > After taking a look at the data, it looks like this patchset
> > > greatly reduces the hold time average for crc_wo. On the other hand,
> > > it increases the hold time average for crc_#2. I didn’t quite
> > > understand the reason for the difference. Could you help me here?
> > 
> > So there's two real locks here from our code, the ones you can see as
> > spinlocks:
> > 
> > &(&vkms_out->state_lock)->rlock:         |  4994.72       |  1.61
> > &(&vkms_out->lock)->rlock:               |  247190.04     |  39.39
> > 
> > All the others are fake locks that the workqueue adds, which only exist in
> > lockdep. They are used to catch special kinds of deadlocks like the below:
> > 
> > thread A:
> > 1. mutex_lock(mutex_A)
> > 2. flush_work(work_A)
> > 
> > thread B
> > 1. running work_A:
> > 2. mutex_lock(mutex_A)
> > 
> > thread B can't continue because mutex_A is already held by thread A.
> > thread A can't continue because thread B is blocked and the work never
> > finishes.
> > -> deadlock
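
In code, that pattern would look something like this (a hypothetical
sketch, nothing vkms-specific; all names are made up):

	static DEFINE_MUTEX(mutex_A);
	static struct work_struct work_A;

	static void work_A_fn(struct work_struct *work)
	{
		mutex_lock(&mutex_A);	/* thread B blocks here ... */
		mutex_unlock(&mutex_A);
	}

	/* thread A */
	mutex_lock(&mutex_A);
	flush_work(&work_A);		/* ... so this wait never ends */
	mutex_unlock(&mutex_A);

The fake lock lockdep attaches to the work item is what lets it flag
this pattern even on runs where the timing happens to work out.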
> > 
> > I haven't checked which is which, but essentially what you measure with
> > the hold times of these fake locks is how long a work execution takes on
> > average.
> > 
> > Since my patches are supposed to fix races where the worker can't keep up
> > with the vblank hrtimer, the average worker will (probably) do more, so
> > that going up is expected. I think.
> > 
> > I'm honestly not sure what's going on here, having never looked into
> > this in detail.
> > 
> > > When I looked at the second set of scenarios (M1 and M2, both without
> > > KVM), the results are much further apart; basically, this patchset
> > > increased the hold time average. Again, could you help me understand
> > > this issue a little bit better?
> > > 
> > > Finally, after these tests, I have some questions:
> > > 
> > > 1. VKMS is a software-only driver; because of this, how about defining
> > > a minimal system resource for using it?
> > 
> > No idea, in reality it's probably "if it fails too often, buy faster cpu".
> > I do think we should make the code robust against a slow cpu, since atm
> > that's needed even on pretty fast machines (because our blending code is
> > really, really slow and inefficient).
> > 
> > > 2. During my experiments, I noticed that running tests with a VM that
> > > uses KVM had consistent results. For example, kms_flip never fails
> > > with QEMU+KVM; however, without KVM, two or three tests failed (one of
> > > them looks random). If we use vkms for test DRM stuff, should we
> > > recommend the use of KVM?
> > 
> > What do you mean without kvm? In general running without kvm shouldn't be
> > slower, so I'm a bit confused ... I'm running vgem directly on the
> > machine, by booting into new kernels (and controlling the machine over the
> > network).
> 
> Sorry, I should have detailed my point.
> 
> Basically, I do all my testing with vkms in a QEMU VM. In that sense, I
> did some experiments in which I enabled and disabled KVM (i.e., the
> '-enable-kvm' flag) to check the vkms behaviour in these two scenarios. I
> noticed that the tests are consistent when I use the '-enable-kvm' flag,
> so in that context it seemed a good idea to recommend the use of KVM for
> those users who want to test vkms with igt. Anyway, don't worry about
> that; I'll try to add more documentation for vkms in the future, and at
> that time we can discuss this again.

Ah, qemu without kvm is going to use software emulation for a lot of the
kernel stuff. That's going to be terribly slow indeed.

> Anyway, from my side, I think we should merge this series. Do you want
> to prepare a V2 with the fixes and maybe update the commit messages by
> using some of your explanations? Or, if you want, I can fix the tiny
> details and merge it.

I'm deeply buried in my prime cleanup/doc series right now, so it will
take a few days until I get around to this again. If you want, please go
ahead with merging.

btw if you edit a patch when merging, please add a comment about that to
the commit message. And ime it's best to only augment the commit message
and maybe delete an unused variable or so that got forgotten. For
anything more, it's better to do the edits and resubmit.

Thanks, Daniel

> 
> > -Daniel
> > 
> > > Best Regards
> > > 
> > > On Thu, Jun 6, 2019 at 7:28 PM Daniel Vetter <daniel.vetter@ffwll.ch> wrote:
> > > >
> > > > Hi all,
> > > >
> > > > This here is the first part of a rework for the vkms crc worker. I think
> > > > this should fix all the locking/races/use-after-free issues I spotted in
> > > > the code. There's more work we can do in the future as a follow-up:
> > > >
> > > > - directly access vkms_plane_state->base in the crc worker, with this
> > > >   approach in this series here that should be safe now. Follow-up patches
> > > >   could switch and remove all the separate crc_data infrastructure.
> > > >
> > > > - I think some kerneldoc for vkms structures would be nice. Documentation
> > > >   the various functions is probably overkill.
> > > >
> > > > - Implementing a more generic blending engine, as prep for adding more
> > > >   pixel formats, more planes, and more features in general.
> > > >
> > > > Tested with kms_pipe_crc, kms_flip and kms_cursor_crc. Seems to not make
> > > > things worse, but I didn't do a full run.
> > > >
> > > > Cheers, Daniel
> > > >
> > > > Daniel Vetter (10):
> > > >   drm/vkms: Fix crc worker races
> > > >   drm/vkms: Use spin_lock_irq in process context
> > > >   drm/vkms: Rename vkms_output.state_lock to crc_lock
> > > >   drm/vkms: Move format arrays to vkms_plane.c
> > > >   drm/vkms: Add our own commit_tail
> > > >   drm/vkms: flush crc workers earlier in commit flow
> > > >   drm/vkms: Dont flush crc worker when we change crc status
> > > >   drm/vkms: No _irqsave within spin_lock_irq needed
> > > >   drm/vkms: totally reworked crc data tracking
> > > >   drm/vkms: No need for ->pages_lock in crc work anymore
> > > >
> > > >  drivers/gpu/drm/vkms/vkms_crc.c   | 74 +++++++++-------------------
> > > >  drivers/gpu/drm/vkms/vkms_crtc.c  | 80 ++++++++++++++++++++++++++-----
> > > >  drivers/gpu/drm/vkms/vkms_drv.c   | 35 ++++++++++++++
> > > >  drivers/gpu/drm/vkms/vkms_drv.h   | 24 +++++-----
> > > >  drivers/gpu/drm/vkms/vkms_plane.c |  8 ++++
> > > >  5 files changed, 146 insertions(+), 75 deletions(-)
> > > >
> > > > --
> > > > 2.20.1
> > > >
> > > > _______________________________________________
> > > > dri-devel mailing list
> > > > dri-devel@lists.freedesktop.org
> > > > https://lists.freedesktop.org/mailman/listinfo/dri-devel
> > > 
> > > 
> > > 
> > > -- 
> > > 
> > > Rodrigo Siqueira
> > > https://siqueira.tech
> > 
> > -- 
> > Daniel Vetter
> > Software Engineer, Intel Corporation
> > http://blog.ffwll.ch
> 
> -- 
> Rodrigo Siqueira
> https://siqueira.tech

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH 00/10] drm/vkms: rework crc worker
  2019-06-18  8:56       ` Daniel Vetter
@ 2019-06-18 21:54         ` Rodrigo Siqueira
  2019-06-18 22:06           ` Daniel Vetter
  0 siblings, 1 reply; 44+ messages in thread
From: Rodrigo Siqueira @ 2019-06-18 21:54 UTC (permalink / raw)
  To: Daniel Vetter; +Cc: Daniel Vetter, DRI Development

On Tue, Jun 18, 2019 at 5:56 AM Daniel Vetter <daniel@ffwll.ch> wrote:
>
> On Mon, Jun 17, 2019 at 11:49:04PM -0300, Rodrigo Siqueira wrote:
> > On 06/12, Daniel Vetter wrote:
> > > On Wed, Jun 12, 2019 at 10:28:41AM -0300, Rodrigo Siqueira wrote:
> > > > Hi Daniel,
> > > >
> > > > First of all, thank you very much for your patchset.
> > > >
> > > > I tried to make a detailed review of your series, and you can see my
> > > > comments in each patch. You’ll notice that I asked many things related
> > > > to the DRM subsystem with the hope that I could learn a little bit
> > > > more about DRM from your comments.
> > > >
> > > > Before you go through the review, I would like to start a discussion
> > > > about the vkms race conditions. First, I have to admit that I did not
> > > > understand the race conditions that you described before because I
> > > > couldn’t reproduce them. Now, I'm suspecting that I could not
> > > > experience the problem because I'm using QEMU with KVM; with this idea
> > > > in mind, I suppose that we have two scenarios for using vkms in a
> > > > virtual machine:
> > > >
> > > > * Scenario 1: The user has hardware virtualization support; in this
> > > > case, it is a little bit harder to experience race conditions with
> > > > vkms.
> > > >
> > > > * Scenario 2: Without hardware virtualization support, it is much
> > > > easier to experience race conditions.
> > > >
> > > > With these two scenarios in mind, I conducted a bunch of experiments
> > > > for trying to shed light on this issue. I did:
> > > >
> > > > 1. Enabled lockdep
> > > >
> > > > 2. Defined two different environments for testing both using QEMU with
> > > > and without kvm. See below the QEMU hardware options:
> > > >
> > > > * env_kvm: -enable-kvm -daemonize -m 1G -smp cores=2,cpus=2
> > > > * env_no_kvm: -daemonize -m 2G -smp cores=4,cpus=4
> > > >
> > > > 3. My test protocol: I) turn on the vm, II) clean /proc/lock_stat,
> > > > III) execute kms_cursor_crc, IV) save the lock_stat file, and V)
> > > > turn off the vm.
> > > >
> > > > 4. From the lockdep_stat, I just highlighted the row related to vkms
> > > > and the columns holdtime-total and holdtime-avg
> > > >
> > > > I would like to highlight that the following data does not have any
> > > > statistical approach, and its intention is solely to assist our
> > > > discussion. See below the summary of the collected data:
> > > >
> > > > Summary of the experiment results:
> > > >
> > > > +----------------+----------------+----------------+
> > > > |                |     env_kvm    |   env_no_kvm   |
> > > > +                +----------------+----------------+
> > > > | Test           | Before | After | Before | After |
> > > > +----------------+--------+-------+--------+-------+
> > > > | kms_cursor_crc |   S1   |   S2  |   M1   |   M2  |
> > > > +----------------+--------+-------+--------+-------+
> > > >
> > > > * Before: before apply this patchset
> > > > * After: after apply this patchset
> > > >
> > > > -----------------------------------------+----------------+-------------
> > > > S1: Without this patchset and with kvm   | holdtime-total | holdtime-avg
> > > > -----------------------------------------+----------------+-------------
> > > > &(&vkms_out->lock)->rlock:               |  21983.52      |  6.21
> > > > (work_completion)(&vkms_state->crc_wo:   |  20.47         |  20.47
> > > > (wq_completion)vkms_crc_workq:           |  3999507.87    |  3352.48
> > > > &(&vkms_out->state_lock)->rlock:         |  378.47        |  0.30
> > > > (work_completion)(&vkms_state->crc_#2:   |  3999066.30    |  2848.34
> > > > -----------------------------------------+----------------+-------------
> > > > S2: With this patchset and with kvm      |                |
> > > > -----------------------------------------+----------------+-------------
> > > > &(&vkms_out->lock)->rlock:               |  23262.83      |  6.34
> > > > (work_completion)(&vkms_state->crc_wo:   |  8.98          |  8.98
> > > > &(&vkms_out->crc_lock)->rlock:           |  307.28        |  0.32
> > > > (wq_completion)vkms_crc_workq:           |  6567727.05    |  12345.35
> > > > (work_completion)(&vkms_state->crc_#2:   |  6567135.15    |  4488.81
> > > > -----------------------------------------+----------------+-------------
> > > > M1: Without this patchset and without kvm|                |
> > > > -----------------------------------------+----------------+-------------
> > > > &(&vkms_out->state_lock)->rlock:         |  4994.72       |  1.61
> > > > &(&vkms_out->lock)->rlock:               |  247190.04     |  39.39
> > > > (work_completion)(&vkms_state->crc_wo:   |  31.32         |  31.32
> > > > (wq_completion)vkms_crc_workq:           |  20991073.78   |  13525.18
> > > > (work_completion)(&vkms_state->crc_#2:   |  20988347.18   |  11904.90
> > > > -----------------------------------------+----------------+-------------
> > > > M2: With this patchset and without kvm   |                |
> > > > -----------------------------------------+----------------+-------------
> > > > (wq_completion)vkms_crc_workq:           |  42819161.68   |  36597.57
> > > > &(&vkms_out->lock)->rlock:               |  251257.06     |  35.80
> > > > (work_completion)(&vkms_state->crc_wo:   |  69.37         |  69.37
> > > > &(&vkms_out->crc_lock)->rlock:           |  3620.92       |  1.54
> > > > (work_completion)(&vkms_state->crc_#2:   |  42803419.59   |  24306.31
> > > >
> > > > First, I analyzed the scenarios with KVM (S1 and S2); more
> > > > specifically, I focused on these two classes:
> > > >
> > > > 1. (work_completion)(&vkms_state->crc_wo
> > > > 2. (work_completion)(&vkms_state->crc_#2
> > > >
> > > > After taking a look at the data, it looks like this patchset
> > > > greatly reduces the hold time average for crc_wo. On the other hand,
> > > > it increases the hold time average for crc_#2. I didn’t quite
> > > > understand the reason for the difference. Could you help me here?
> > >
> > > So there's two real locks here from our code, the ones you can see as
> > > spinlocks:
> > >
> > > &(&vkms_out->state_lock)->rlock:         |  4994.72       |  1.61
> > > &(&vkms_out->lock)->rlock:               |  247190.04     |  39.39
> > >
> > > All the others are fake locks that the workqueue adds, which only exist in
> > > lockdep. They are used to catch special kinds of deadlocks like the below:
> > >
> > > thread A:
> > > 1. mutex_lock(mutex_A)
> > > 2. flush_work(work_A)
> > >
> > > thread B
> > > 1. running work_A:
> > > 2. mutex_lock(mutex_A)
> > >
> > > thread B can't continue because mutex_A is already held by thread A.
> > > thread A can't continue because thread B is blocked and the work never
> > > finishes.
> > > -> deadlock
> > >
> > > I haven't checked which is which, but essentially what you measure with
> > > the hold times of these fake locks is how long a work execution takes on
> > > average.
> > >
> > > Since my patches are supposed to fix races where the worker can't keep up
> > > with the vblank hrtimer, the average worker will (probably) do more, so
> > > that going up is expected. I think.
> > > 
> > > I'm honestly not sure what's going on here, having never looked into
> > > this in detail.
> > >
> > > > When I looked at the second set of scenarios (M1 and M2, both without
> > > > KVM), the results are much further apart; basically, this patchset
> > > > increased the hold time average. Again, could you help me understand
> > > > this issue a little bit better?
> > > >
> > > > Finally, after these tests, I have some questions:
> > > >
> > > > 1. VKMS is a software-only driver; because of this, how about defining
> > > > a minimal system resource for using it?
> > >
> > > No idea, in reality it's probably "if it fails too often, buy faster cpu".
> > > I do think we should make the code robust against a slow cpu, since atm
> > > that's needed even on pretty fast machines (because our blending code is
> > > really, really slow and inefficient).
> > >
> > > > 2. During my experiments, I noticed that running tests with a VM that
> > > > uses KVM had consistent results. For example, kms_flip never fails
> > > > with QEMU+KVM; however, without KVM, two or three tests failed (one of
> > > > them looks random). If we use vkms for test DRM stuff, should we
> > > > recommend the use of KVM?
> > >
> > > What do you mean without kvm? In general running without kvm shouldn't be
> > > slower, so I'm a bit confused ... I'm running vgem directly on the
> > > machine, by booting into new kernels (and controlling the machine over the
> > > network).
> >
> > Sorry, I should have detailed my point.
> >
> > Basically, I do all my testing with vkms in a QEMU VM. In that sense, I
> > did some experiments in which I enabled and disabled KVM (i.e., the
> > '-enable-kvm' flag) to check the vkms behaviour in these two scenarios. I
> > noticed that the tests are consistent when I use the '-enable-kvm' flag,
> > so in that context it seemed a good idea to recommend the use of KVM for
> > those users who want to test vkms with igt. Anyway, don't worry about
> > that; I'll try to add more documentation for vkms in the future, and at
> > that time we can discuss this again.
>
> Ah, qemu without kvm is going to use software emulation for a lot of the
> kernel stuff. That's going to be terribly slow indeed.
>
> > Anyway, from my side, I think we should merge this series. Do you want
> > to prepare a V2 with the fixes and maybe update the commit messages by
> > using some of your explanations? Or, if you want, I can fix the tiny
> > details and merge it.
>
> I'm deeply buried in my prime cleanup/doc series right now, so it will
> take a few days until I get around to this again. If you want, please go
> ahead with merging.
> 
> btw if you edit a patch when merging, please add a comment about that to
> the commit message. And ime it's best to only augment the commit message
> and maybe delete an unused variable or so that got forgotten. For
> anything more, it's better to do the edits and resubmit.

First of all, thank you very much for all your reviews and
explanations. I’ll try the exercise that you proposed for patch 1.

I’ll merge patches [4] and [7] since they’re not related to this
series. The other patches I’ll merge after I finish the new version of
the writeback series. I’ll ping you later.

4. https://patchwork.freedesktop.org/patch/309031/?series=61737&rev=1
7. https://patchwork.freedesktop.org/patch/309029/?series=61737&rev=1

Finally, not related to this patchset: can I apply the patch
“drm/drm_vblank: Change EINVAL by the correct errno” [1], or do I need
more SoBs? I’ll also apply Oleg’s patch (drm/vkms: add crc sources list).

1. https://patchwork.freedesktop.org/patch/310006/?series=50697&rev=4

Thanks

> Thanks, Daniel
>
> >
> > > -Daniel
> > >
> > > > Best Regards
> > > >
> > > > On Thu, Jun 6, 2019 at 7:28 PM Daniel Vetter <daniel.vetter@ffwll.ch> wrote:
> > > > >
> > > > > Hi all,
> > > > >
> > > > > This here is the first part of a rework for the vkms crc worker. I think
> > > > > this should fix all the locking/races/use-after-free issues I spotted in
> > > > > the code. There's more work we can do in the future as a follow-up:
> > > > >
> > > > > - directly access vkms_plane_state->base in the crc worker, with this
> > > > >   approach in this series here that should be safe now. Follow-up patches
> > > > >   could switch and remove all the separate crc_data infrastructure.
> > > > >
> > > > > - I think some kerneldoc for vkms structures would be nice. Documentation
> > > > >   the various functions is probably overkill.
> > > > >
> > > > > - Implementing a more generic blending engine, as prep for adding more
> > > > >   pixel formats, more planes, and more features in general.
> > > > >
> > > > > Tested with kms_pipe_crc, kms_flip and kms_cursor_crc. Seems to not make
> > > > > things worse, but I didn't do a full run.
> > > > >
> > > > > Cheers, Daniel
> > > > >
> > > > > Daniel Vetter (10):
> > > > >   drm/vkms: Fix crc worker races
> > > > >   drm/vkms: Use spin_lock_irq in process context
> > > > >   drm/vkms: Rename vkms_output.state_lock to crc_lock
> > > > >   drm/vkms: Move format arrays to vkms_plane.c
> > > > >   drm/vkms: Add our own commit_tail
> > > > >   drm/vkms: flush crc workers earlier in commit flow
> > > > >   drm/vkms: Dont flush crc worker when we change crc status
> > > > >   drm/vkms: No _irqsave within spin_lock_irq needed
> > > > >   drm/vkms: totally reworked crc data tracking
> > > > >   drm/vkms: No need for ->pages_lock in crc work anymore
> > > > >
> > > > >  drivers/gpu/drm/vkms/vkms_crc.c   | 74 +++++++++-------------------
> > > > >  drivers/gpu/drm/vkms/vkms_crtc.c  | 80 ++++++++++++++++++++++++++-----
> > > > >  drivers/gpu/drm/vkms/vkms_drv.c   | 35 ++++++++++++++
> > > > >  drivers/gpu/drm/vkms/vkms_drv.h   | 24 +++++-----
> > > > >  drivers/gpu/drm/vkms/vkms_plane.c |  8 ++++
> > > > >  5 files changed, 146 insertions(+), 75 deletions(-)
> > > > >
> > > > > --
> > > > > 2.20.1
> > > > >
> > > > > _______________________________________________
> > > > > dri-devel mailing list
> > > > > dri-devel@lists.freedesktop.org
> > > > > https://lists.freedesktop.org/mailman/listinfo/dri-devel
> > > >
> > > >
> > > >
> > > > --
> > > >
> > > > Rodrigo Siqueira
> > > > https://siqueira.tech
> > >
> > > --
> > > Daniel Vetter
> > > Software Engineer, Intel Corporation
> > > http://blog.ffwll.ch
> >
> > --
> > Rodrigo Siqueira
> > https://siqueira.tech
>
> --
> Daniel Vetter
> Software Engineer, Intel Corporation
> http://blog.ffwll.ch



-- 

Rodrigo Siqueira
https://siqueira.tech
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH 00/10] drm/vkms: rework crc worker
  2019-06-18 21:54         ` Rodrigo Siqueira
@ 2019-06-18 22:06           ` Daniel Vetter
  2019-06-18 22:07             ` Daniel Vetter
  2019-06-26  1:44             ` Rodrigo Siqueira
  0 siblings, 2 replies; 44+ messages in thread
From: Daniel Vetter @ 2019-06-18 22:06 UTC (permalink / raw)
  To: Rodrigo Siqueira; +Cc: DRI Development

On Tue, Jun 18, 2019 at 11:54 PM Rodrigo Siqueira
<rodrigosiqueiramelo@gmail.com> wrote:
>
> On Tue, Jun 18, 2019 at 5:56 AM Daniel Vetter <daniel@ffwll.ch> wrote:
> >
> > On Mon, Jun 17, 2019 at 11:49:04PM -0300, Rodrigo Siqueira wrote:
> > > On 06/12, Daniel Vetter wrote:
> > > > On Wed, Jun 12, 2019 at 10:28:41AM -0300, Rodrigo Siqueira wrote:
> > > > > Hi Daniel,
> > > > >
> > > > > First of all, thank you very much for your patchset.
> > > > >
> > > > > I tried to make a detailed review of your series, and you can see my
> > > > > comments in each patch. You’ll notice that I asked many things related
> > > > > to the DRM subsystem with the hope that I could learn a little bit
> > > > > more about DRM from your comments.
> > > > >
> > > > > Before you go through the review, I would like to start a discussion
> > > > > about the vkms race conditions. First, I have to admit that I did not
> > > > > understand the race conditions that you described before because I
> > > > > couldn’t reproduce them. Now, I'm suspecting that I could not
> > > > > experience the problem because I'm using QEMU with KVM; with this idea
> > > > > in mind, I suppose that we have two scenarios for using vkms in a
> > > > > virtual machine:
> > > > >
> > > > > * Scenario 1: The user has hardware virtualization support; in this
> > > > > case, it is a little bit harder to experience race conditions with
> > > > > vkms.
> > > > >
> > > > > * Scenario 2: Without hardware virtualization support, it is much
> > > > > easier to experience race conditions.
> > > > >
> > > > > With these two scenarios in mind, I conducted a bunch of experiments
> > > > > for trying to shed light on this issue. I did:
> > > > >
> > > > > 1. Enabled lockdep
> > > > >
> > > > > 2. Defined two different environments for testing both using QEMU with
> > > > > and without kvm. See below the QEMU hardware options:
> > > > >
> > > > > * env_kvm: -enable-kvm -daemonize -m 1G -smp cores=2,cpus=2
> > > > > * env_no_kvm: -daemonize -m 2G -smp cores=4,cpus=4
> > > > >
> > > > > 3. My test protocol: I) turn on the vm, II) clean /proc/lock_stat,
> > > > > III) execute kms_cursor_crc, IV) save the lock_stat file, and V)
> > > > > turn off the vm.
> > > > >
> > > > > 4. From the lockdep_stat, I just highlighted the row related to vkms
> > > > > and the columns holdtime-total and holdtime-avg
> > > > >
> > > > > I would like to highlight that the following data does not have any
> > > > > statistical approach, and its intention is solely to assist our
> > > > > discussion. See below the summary of the collected data:
> > > > >
> > > > > Summary of the experiment results:
> > > > >
> > > > > +----------------+----------------+----------------+
> > > > > |                |     env_kvm    |   env_no_kvm   |
> > > > > +                +----------------+----------------+
> > > > > | Test           | Before | After | Before | After |
> > > > > +----------------+--------+-------+--------+-------+
> > > > > | kms_cursor_crc |   S1   |   S2  |   M1   |   M2  |
> > > > > +----------------+--------+-------+--------+-------+
> > > > >
> > > > > * Before: before apply this patchset
> > > > > * After: after apply this patchset
> > > > >
> > > > > -----------------------------------------+----------------+-------------
> > > > > S1: Without this patchset and with kvm   | holdtime-total | holdtime-avg
> > > > > -----------------------------------------+----------------+-------------
> > > > > &(&vkms_out->lock)->rlock:               |  21983.52      |  6.21
> > > > > (work_completion)(&vkms_state->crc_wo:   |  20.47         |  20.47
> > > > > (wq_completion)vkms_crc_workq:           |  3999507.87    |  3352.48
> > > > > &(&vkms_out->state_lock)->rlock:         |  378.47        |  0.30
> > > > > (work_completion)(&vkms_state->crc_#2:   |  3999066.30    |  2848.34
> > > > > -----------------------------------------+----------------+-------------
> > > > > S2: With this patchset and with kvm      |                |
> > > > > -----------------------------------------+----------------+-------------
> > > > > &(&vkms_out->lock)->rlock:               |  23262.83      |  6.34
> > > > > (work_completion)(&vkms_state->crc_wo:   |  8.98          |  8.98
> > > > > &(&vkms_out->crc_lock)->rlock:           |  307.28        |  0.32
> > > > > (wq_completion)vkms_crc_workq:           |  6567727.05    |  12345.35
> > > > > (work_completion)(&vkms_state->crc_#2:   |  6567135.15    |  4488.81
> > > > > -----------------------------------------+----------------+-------------
> > > > > M1: Without this patchset and without kvm|                |
> > > > > -----------------------------------------+----------------+-------------
> > > > > &(&vkms_out->state_lock)->rlock:         |  4994.72       |  1.61
> > > > > &(&vkms_out->lock)->rlock:               |  247190.04     |  39.39
> > > > > (work_completion)(&vkms_state->crc_wo:   |  31.32         |  31.32
> > > > > (wq_completion)vkms_crc_workq:           |  20991073.78   |  13525.18
> > > > > (work_completion)(&vkms_state->crc_#2:   |  20988347.18   |  11904.90
> > > > > -----------------------------------------+----------------+-------------
> > > > > M2: With this patchset and without kvm   |                |
> > > > > -----------------------------------------+----------------+-------------
> > > > > (wq_completion)vkms_crc_workq:           |  42819161.68   |  36597.57
> > > > > &(&vkms_out->lock)->rlock:               |  251257.06     |  35.80
> > > > > (work_completion)(&vkms_state->crc_wo:   |  69.37         |  69.37
> > > > > &(&vkms_out->crc_lock)->rlock:           |  3620.92       |  1.54
> > > > > (work_completion)(&vkms_state->crc_#2:   |  42803419.59   |  24306.31
> > > > >
> > > > > First, I analyzed the scenarios with KVM (S1 and S2); more
> > > > > specifically, I focused on these two classes:
> > > > >
> > > > > 1. (work_completion)(&vkms_state->crc_wo
> > > > > 2. (work_completion)(&vkms_state->crc_#2
> > > > >
> > > > > After taking a look at the data, it looks like this patchset
> > > > > greatly reduces the hold time average for crc_wo. On the other hand,
> > > > > it increases the hold time average for crc_#2. I didn’t quite
> > > > > understand the reason for the difference. Could you help me here?
> > > >
> > > > So there's two real locks here from our code, the ones you can see as
> > > > spinlocks:
> > > >
> > > > &(&vkms_out->state_lock)->rlock:         |  4994.72       |  1.61
> > > > &(&vkms_out->lock)->rlock:               |  247190.04     |  39.39
> > > >
> > > > All the others are fake locks that the workqueue adds, which only exist in
> > > > lockdep. They are used to catch special kinds of deadlocks like the below:
> > > >
> > > > thread A:
> > > > 1. mutex_lock(mutex_A)
> > > > 2. flush_work(work_A)
> > > >
> > > > thread B
> > > > 1. running work_A:
> > > > 2. mutex_lock(mutex_A)
> > > >
> > > > thread B can't continue because mutex_A is already held by thread A.
> > > > thread A can't continue because thread B is blocked and the work never
> > > > finishes.
> > > > -> deadlock
> > > >
> > > > I haven't checked which is which, but essentially what you measure with
> > > > the hold times of these fake locks is how long a work execution takes on
> > > > average.
> > > >
> > > > Since my patches are supposed to fix races where the worker can't keep up
> > > > with the vblank hrtimer, the average worker will (probably) do more, so
> > > > that going up is expected. I think.
> > > > 
> > > > I'm honestly not sure what's going on here, having never looked into
> > > > this in detail.
> > > >
> > > > > When I looked at the second set of scenarios (M1 and M2, both without
> > > > > KVM), the results are much further apart; basically, this patchset
> > > > > increased the hold time average. Again, could you help me understand
> > > > > this issue a little bit better?
> > > > >
> > > > > Finally, after these tests, I have some questions:
> > > > >
> > > > > 1. VKMS is a software-only driver; because of this, how about defining
> > > > > a minimal system resource for using it?
> > > >
> > > > No idea, in reality it's probably "if it fails too often, buy faster cpu".
> > > > I do think we should make the code robust against a slow cpu, since atm
> > > > that's needed even on pretty fast machines (because our blending code is
> > > > really, really slow and inefficient).
> > > >
> > > > > 2. During my experiments, I noticed that running tests with a VM that
> > > > > uses KVM had consistent results. For example, kms_flip never fails
> > > > > with QEMU+KVM; however, without KVM, two or three tests failed (one of
> > > > > them looks random). If we use vkms for test DRM stuff, should we
> > > > > recommend the use of KVM?
> > > >
> > > > What do you mean without kvm? In general running without kvm shouldn't be
> > > > slower, so I'm a bit confused ... I'm running vgem directly on the
> > > > machine, by booting into new kernels (and controlling the machine over the
> > > > network).
> > >
> > > Sorry, I should have detailed my point.
> > >
> > > Basically, I do all my testing with vkms in a QEMU VM. In that sense, I
> > > did some experiments in which I enabled and disabled KVM (i.e., the
> > > '-enable-kvm' flag) to check the vkms behaviour in these two scenarios. I
> > > noticed that the tests are consistent when I use the '-enable-kvm' flag,
> > > so in that context it seemed a good idea to recommend the use of KVM for
> > > those users who want to test vkms with igt. Anyway, don't worry about
> > > that; I'll try to add more documentation for vkms in the future, and at
> > > that time we can discuss this again.
> >
> > Ah, qemu without kvm is going to use software emulation for a lot of the
> > kernel stuff. That's going to be terribly slow indeed.
> >
> > > Anyway, from my side, I think we should merge this series. Do you want
> > > to prepare a V2 with the fixes and maybe update the commit messages by
> > > using some of your explanations? Or, if you want, I can fix the tiny
> > > details and merge it.
> >
> > I'm deeply buried in my prime cleanup/doc series right now, so it will
> > take a few days until I get around to this again. If you want, please go
> > ahead with merging.
> > 
> > btw if you edit a patch when merging, please add a comment about that to
> > the commit message. And ime it's best to only augment the commit message
> > and maybe delete an unused variable or so that got forgotten. For
> > anything more, it's better to do the edits and resubmit.
>
> First of all, thank you very much for all your reviews and
> explanations. I’ll try the exercise that you proposed for patch 1.
> 
> I’ll merge patches [4] and [7] since they’re not related to this
> series. The other patches I’ll merge after I finish the new version of
> the writeback series. I’ll ping you later.
>
> 4. https://patchwork.freedesktop.org/patch/309031/?series=61737&rev=1
> 7. https://patchwork.freedesktop.org/patch/309029/?series=61737&rev=1

Can you merge them quicker? I have another 3 vkms patches here
touching that area with some follow-up stuff ...

> Finally, not related to this patchset: can I apply the patch
> “drm/drm_vblank: Change EINVAL by the correct errno” [1], or do I need
> more SoBs? I’ll also apply Oleg’s patch (drm/vkms: add crc sources list).
>
> 1. https://patchwork.freedesktop.org/patch/310006/?series=50697&rev=4

If you want, get some acks from igt maintainers (those patches have
landed now, right?), but this is good enough.
-Daniel


> Thanks
>
> > Thanks, Daniel
> >
> > >
> > > > -Daniel
> > > >
> > > > > Best Regards
> > > > >
> > > > > On Thu, Jun 6, 2019 at 7:28 PM Daniel Vetter <daniel.vetter@ffwll.ch> wrote:
> > > > > >
> > > > > > Hi all,
> > > > > >
> > > > > > This here is the first part of a rework for the vkms crc worker. I think
> > > > > > this should fix all the locking/races/use-after-free issues I spotted in
> > > > > > the code. There's more work we can do in the future as a follow-up:
> > > > > >
> > > > > > - directly access vkms_plane_state->base in the crc worker, with this
> > > > > >   approach in this series here that should be safe now. Follow-up patches
> > > > > >   could switch and remove all the separate crc_data infrastructure.
> > > > > >
> > > > > > - I think some kerneldoc for vkms structures would be nice. Documentation
> > > > > >   the various functions is probably overkill.
> > > > > >
> > > > > > - Implementing a more generic blending engine, as prep for adding more
> > > > > >   pixel formats, more planes, and more features in general.
> > > > > >
> > > > > > Tested with kms_pipe_crc, kms_flip and kms_cursor_crc. Seems to not make
> > > > > > things worse, but I didn't do a full run.
> > > > > >
> > > > > > Cheers, Daniel
> > > > > >
> > > > > > Daniel Vetter (10):
> > > > > >   drm/vkms: Fix crc worker races
> > > > > >   drm/vkms: Use spin_lock_irq in process context
> > > > > >   drm/vkms: Rename vkms_output.state_lock to crc_lock
> > > > > >   drm/vkms: Move format arrays to vkms_plane.c
> > > > > >   drm/vkms: Add our own commit_tail
> > > > > >   drm/vkms: flush crc workers earlier in commit flow
> > > > > >   drm/vkms: Dont flush crc worker when we change crc status
> > > > > >   drm/vkms: No _irqsave within spin_lock_irq needed
> > > > > >   drm/vkms: totally reworked crc data tracking
> > > > > >   drm/vkms: No need for ->pages_lock in crc work anymore
> > > > > >
> > > > > >  drivers/gpu/drm/vkms/vkms_crc.c   | 74 +++++++++-------------------
> > > > > >  drivers/gpu/drm/vkms/vkms_crtc.c  | 80 ++++++++++++++++++++++++++-----
> > > > > >  drivers/gpu/drm/vkms/vkms_drv.c   | 35 ++++++++++++++
> > > > > >  drivers/gpu/drm/vkms/vkms_drv.h   | 24 +++++-----
> > > > > >  drivers/gpu/drm/vkms/vkms_plane.c |  8 ++++
> > > > > >  5 files changed, 146 insertions(+), 75 deletions(-)
> > > > > >
> > > > > > --
> > > > > > 2.20.1
> > > > > >
> > > > > > _______________________________________________
> > > > > > dri-devel mailing list
> > > > > > dri-devel@lists.freedesktop.org
> > > > > > https://lists.freedesktop.org/mailman/listinfo/dri-devel
> > > > >
> > > > >
> > > > >
> > > > > --
> > > > >
> > > > > Rodrigo Siqueira
> > > > > https://siqueira.tech
> > > >
> > > > --
> > > > Daniel Vetter
> > > > Software Engineer, Intel Corporation
> > > > http://blog.ffwll.ch
> > >
> > > --
> > > Rodrigo Siqueira
> > > https://siqueira.tech
> >
> > --
> > Daniel Vetter
> > Software Engineer, Intel Corporation
> > http://blog.ffwll.ch
>
>
>
> --
>
> Rodrigo Siqueira
> https://siqueira.tech



-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH 00/10] drm/vkms: rework crc worker
  2019-06-18 22:06           ` Daniel Vetter
@ 2019-06-18 22:07             ` Daniel Vetter
  2019-06-18 22:25               ` Rodrigo Siqueira
  2019-06-26  1:44             ` Rodrigo Siqueira
  1 sibling, 1 reply; 44+ messages in thread
From: Daniel Vetter @ 2019-06-18 22:07 UTC (permalink / raw)
  To: Rodrigo Siqueira; +Cc: DRI Development

On Wed, Jun 19, 2019 at 12:06 AM Daniel Vetter <daniel@ffwll.ch> wrote:
>
> On Tue, Jun 18, 2019 at 11:54 PM Rodrigo Siqueira
> <rodrigosiqueiramelo@gmail.com> wrote:
> >
> > On Tue, Jun 18, 2019 at 5:56 AM Daniel Vetter <daniel@ffwll.ch> wrote:
> > >
> > > On Mon, Jun 17, 2019 at 11:49:04PM -0300, Rodrigo Siqueira wrote:
> > > > On 06/12, Daniel Vetter wrote:
> > > > > On Wed, Jun 12, 2019 at 10:28:41AM -0300, Rodrigo Siqueira wrote:
> > > > > > Hi Daniel,
> > > > > >
> > > > > > First of all, thank you very much for your patchset.
> > > > > >
> > > > > > I tried to make a detailed review of your series, and you can see my
> > > > > > comments in each patch. You’ll notice that I asked many things related
> > > > > > to the DRM subsystem with the hope that I could learn a little bit
> > > > > > more about DRM from your comments.
> > > > > >
> > > > > > Before you go through the review, I would like to start a discussion
> > > > > > about the vkms race conditions. First, I have to admit that I did not
> > > > > > understand the race conditions that you described before because I
> > > > > > couldn’t reproduce them. Now, I'm suspecting that I could not
> > > > > > experience the problem because I'm using QEMU with KVM; with this idea
> > > > > > in mind, I suppose that we have two scenarios for using vkms in a
> > > > > > virtual machine:
> > > > > >
> > > > > > * Scenario 1: The user has hardware virtualization support; in this
> > > > > > case, it is a little bit harder to experience race conditions with
> > > > > > vkms.
> > > > > >
> > > > > > * Scenario 2: Without hardware virtualization support, it is much
> > > > > > easier to experience race conditions.
> > > > > >
> > > > > > With these two scenarios in mind, I conducted a bunch of experiments
> > > > > > for trying to shed light on this issue. I did:
> > > > > >
> > > > > > 1. Enabled lockdep
> > > > > >
> > > > > > 2. Defined two different environments for testing, both using QEMU
> > > > > > with and without kvm. See below the QEMU hardware options:
> > > > > >
> > > > > > * env_kvm: -enable-kvm -daemonize -m 1G -smp cores=2,cpus=2
> > > > > > * env_no_kvm: -daemonize -m 2G -smp cores=4,cpus=4
> > > > > >
> > > > > > 3. My test protocol: I) turn on the vm, II) clean /proc/lock_stat,
> > > > > > III) execute kms_cursor_crc, IV) save the lock_stat file, and V)
> > > > > > turn off the vm.
> > > > > >
> > > > > > 4. From the lockdep_stat, I just highlighted the row related to vkms
> > > > > > and the columns holdtime-total and holdtime-avg
> > > > > >
> > > > > > I would like to highlight that the following data does not have any
> > > > > > statistical approach, and its intention is solely to assist our
> > > > > > discussion. See below the summary of the collected data:
> > > > > >
> > > > > > Summary of the experiment results:
> > > > > >
> > > > > > +----------------+----------------+----------------+
> > > > > > |                |     env_kvm    |   env_no_kvm   |
> > > > > > +                +----------------+----------------+
> > > > > > | Test           | Before | After | Before | After |
> > > > > > +----------------+--------+-------+--------+-------+
> > > > > > | kms_cursor_crc |   S1   |   S2  |   M1   |   M2  |
> > > > > > +----------------+--------+-------+--------+-------+
> > > > > >
> > > > > > * Before: before apply this patchset
> > > > > > * After: after apply this patchset
> > > > > >
> > > > > > -----------------------------------------+----------------+-------------
> > > > > > S1: Without this patchset and with kvm   | holdtime-total | holdtime-avg
> > > > > > -----------------------------------------+----------------+-------------
> > > > > > &(&vkms_out->lock)->rlock:               |  21983.52      |  6.21
> > > > > > (work_completion)(&vkms_state->crc_wo:   |  20.47         |  20.47
> > > > > > (wq_completion)vkms_crc_workq:           |  3999507.87    |  3352.48
> > > > > > &(&vkms_out->state_lock)->rlock:         |  378.47        |  0.30
> > > > > > (work_completion)(&vkms_state->crc_#2:   |  3999066.30    |  2848.34
> > > > > > -----------------------------------------+----------------+-------------
> > > > > > S2: With this patchset and with kvm      |                |
> > > > > > -----------------------------------------+----------------+-------------
> > > > > > &(&vkms_out->lock)->rlock:               |  23262.83      |  6.34
> > > > > > (work_completion)(&vkms_state->crc_wo:   |  8.98          |  8.98
> > > > > > &(&vkms_out->crc_lock)->rlock:           |  307.28        |  0.32
> > > > > > (wq_completion)vkms_crc_workq:           |  6567727.05    |  12345.35
> > > > > > (work_completion)(&vkms_state->crc_#2:   |  6567135.15    |  4488.81
> > > > > > -----------------------------------------+----------------+-------------
> > > > > > M1: Without this patchset and without kvm|                |
> > > > > > -----------------------------------------+----------------+-------------
> > > > > > &(&vkms_out->state_lock)->rlock:         |  4994.72       |  1.61
> > > > > > &(&vkms_out->lock)->rlock:               |  247190.04     |  39.39
> > > > > > (work_completion)(&vkms_state->crc_wo:   |  31.32         |  31.32
> > > > > > (wq_completion)vkms_crc_workq:           |  20991073.78   |  13525.18
> > > > > > (work_completion)(&vkms_state->crc_#2:   |  20988347.18   |  11904.90
> > > > > > -----------------------------------------+----------------+-------------
> > > > > > M2: With this patchset and without kvm   |                |
> > > > > > -----------------------------------------+----------------+-------------
> > > > > > (wq_completion)vkms_crc_workq:           |  42819161.68   |  36597.57
> > > > > > &(&vkms_out->lock)->rlock:               |  251257.06     |  35.80
> > > > > > (work_completion)(&vkms_state->crc_wo:   |  69.37         |  69.37
> > > > > > &(&vkms_out->crc_lock)->rlock:           |  3620.92       |  1.54
> > > > > > (work_completion)(&vkms_state->crc_#2:   |  42803419.59   |  24306.31
> > > > > >
> > > > > > First, I analyzed the scenarios with KVM (S1 and S2); more
> > > > > > specifically, I focused on these two classes:
> > > > > >
> > > > > > 1. (work_completion)(&vkms_state->crc_wo
> > > > > > 2. (work_completion)(&vkms_state->crc_#2
> > > > > >
> > > > > > After taking a look at the data, it looks like this patchset
> > > > > > greatly reduces the hold time average for crc_wo. On the other hand,
> > > > > > it increases the hold time average for crc_#2. I didn’t quite
> > > > > > understand the reason for the difference. Could you help me here?
> > > > >
> > > > > So there's two real locks here from our code, the ones you can see as
> > > > > spinlocks:
> > > > >
> > > > > &(&vkms_out->state_lock)->rlock:         |  4994.72       |  1.61
> > > > > &(&vkms_out->lock)->rlock:               |  247190.04     |  39.39
> > > > >
> > > > > All the others are fake locks that the workqueue adds, which only exist in
> > > > > lockdep. They are used to catch special kinds of deadlocks like the below:
> > > > >
> > > > > thread A:
> > > > > 1. mutex_lock(mutex_A)
> > > > > 2. flush_work(work_A)
> > > > >
> > > > > thread B
> > > > > 1. running work_A:
> > > > > 2. mutex_lock(mutex_A)
> > > > >
> > > > > thread B can't continue because mutex_A is already held by thread A.
> > > > > thread A can't continue because thread B is blocked and the work never
> > > > > finishes.
> > > > > -> deadlock
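
[For reference, a minimal sketch of the pattern just described; the names
are illustrative rather than actual vkms code, and work_A is assumed to
have been set up with INIT_WORK(&work_A, work_A_fn):

#include <linux/mutex.h>
#include <linux/workqueue.h>

static DEFINE_MUTEX(mutex_A);
static struct work_struct work_A;

static void work_A_fn(struct work_struct *work)
{
	mutex_lock(&mutex_A);	/* thread B blocks: thread A holds mutex_A */
	/* ... */
	mutex_unlock(&mutex_A);
}

static void thread_A(void)
{
	mutex_lock(&mutex_A);
	flush_work(&work_A);	/* thread A waits forever: work_A_fn() can
				 * never acquire mutex_A and complete */
	mutex_unlock(&mutex_A);
}

The fake locks on the work item and on the workqueue let lockdep report
the inverted dependency between mutex_A and the work even on runs where
the two threads never actually hit the deadlock.]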
> > > > >
> > > > > I haven't checked which is which, but essentially what you measure with
> > > > > the hold times of these fake locks is how long a work execution takes on
> > > > > average.
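
[Worked through on the S1 numbers above: on that reading, the crc_#2 work
item ran roughly holdtime-total / holdtime-avg = 3999066.30 / 2848.34
≈ 1404 times over the test run.]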
> > > > >
> > > > > Since my patches are supposed to fix races where the worker can't keep up
> > > > > with the vblank hrtimer, the average worker will (probably) do more,
> > > > > so that number going up is expected. I think.
> > > > >
> > > > > I'm honestly not sure what's going on here; I've never looked into this
> > > > > in detail.
> > > > >
> > > > > > When I looked at the second set of scenarios (M1 and M2, both without
> > > > > > KVM), the results look much more distant; basically, this patchset
> > > > > > increased the hold time average. Again, could you help me understand
> > > > > > this issue a little bit better?
> > > > > >
> > > > > > Finally, after these tests, I have some questions:
> > > > > >
> > > > > > 1. VKMS is a software-only driver; because of this, how about defining
> > > > > > minimal system requirements for using it?
> > > > >
> > > > > No idea, in reality it's probably "if it fails too often, buy a faster cpu".
> > > > > I do think we should make the code robust against a slow cpu, since atm
> > > > > that's needed even on pretty fast machines (because our blending code is
> > > > > really, really slow and inefficient).
> > > > >
> > > > > > 2. During my experiments, I noticed that running tests with a VM that
> > > > > > uses KVM had consistent results. For example, kms_flip never fails
> > > > > > with QEMU+KVM; however, without KVM, two or three tests failed (one of
> > > > > > them looks random). If we use vkms to test DRM stuff, should we
> > > > > > recommend the use of KVM?
> > > > >
> > > > > What do you mean without kvm? In general running without kvm shouldn't be
> > > > > slower, so I'm a bit confused ... I'm running vgem directly on the
> > > > > machine, by booting into new kernels (and controlling the machine over the
> > > > > network).
> > > >
> > > > Sorry, I should have detailed my point.
> > > >
> > > > Basically, I do all my testing with vkms in a QEMU VM. In that sense, I
> > > > did some experiments in which I enabled and disabled KVM (i.e., the flag
> > > > '-enable-kvm') to check the vkms behaviour in these two scenarios. I
> > > > noticed that the tests are consistent when I use the '-enable-kvm' flag;
> > > > in that context it seemed a good idea to recommend the use of KVM for
> > > > those users who want to test vkms with igt. Anyway, don't worry about
> > > > that; I'll try to add more documentation for vkms in the future, and at
> > > > that time we can discuss this again.
> > >
> > > Ah, qemu without kvm is going to use software emulation for a lot of the
> > > kernel stuff. That's going to be terribly slow indeed.
> > >
> > > > Anyway, from my side, I think we should merge this series. Do you want
> > > > to prepare a V2 with the fixes and maybe update the commit messages by
> > > > using some of your explanations? Or, if you want, I can fix the tiny
> > > > details and merge it.
> > >
> > > I'm deeply buried in my prime cleanup/doc series right now, so it will take
> > > a few days until I get around to this again. If you want, please go ahead
> > > with merging.
> > >
> > > btw if you edit a patch when merging, please add a comment about that to
> > > the commit message. And ime it's best to only augment the commit message
> > > and maybe delete an unused variable or so that got forgotten. For
> > > everything more, it's better to do the edits and resubmit.
> >
> > First of all, thank you very much for all your reviews and
> > explanation. I’ll try the exercise that you proposed on Patch 1.
> >
> > I’ll merge patches [4] and [7] since they’re not related to this
> > series. For the other patches, I’ll merge them after I finish the new
> > version of the writeback series. I’ll ping you later.
> >
> > 4. https://patchwork.freedesktop.org/patch/309031/?series=61737&rev=1
> > 7. https://patchwork.freedesktop.org/patch/309029/?series=61737&rev=1
>
> Can you merge them quicker? I have another 3 vkms patches here
> touching that area with some follow-up stuff ...
>
> > Finally, not related to this patchset, can I apply the patch
> > “drm/drm_vblank: Change EINVAL by the correct errno” [1] or do I need
> > more SoB? I’ll also apply Oleg patch (drm/vkms: add crc sources list).
> >
> > 1. https://patchwork.freedesktop.org/patch/310006/?series=50697&rev=4
>
> If you want, get some acks from igt maintainers (those patches have landed
> now, right?), but this is good enough.

Oh wait, correction: my review is conditional on you changing that one
thing, so it needs another version. Since this is a functional change, it's
imo too much to fix up while applying.
-Daniel

-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch

* Re: [PATCH 00/10] drm/vkms: rework crc worker
  2019-06-18 22:07             ` Daniel Vetter
@ 2019-06-18 22:25               ` Rodrigo Siqueira
  2019-06-18 22:39                 ` Daniel Vetter
  0 siblings, 1 reply; 44+ messages in thread
From: Rodrigo Siqueira @ 2019-06-18 22:25 UTC (permalink / raw)
  To: Daniel Vetter; +Cc: DRI Development

On Tue, Jun 18, 2019 at 7:08 PM Daniel Vetter <daniel@ffwll.ch> wrote:
>
> On Wed, Jun 19, 2019 at 12:06 AM Daniel Vetter <daniel@ffwll.ch> wrote:
> >
> > On Tue, Jun 18, 2019 at 11:54 PM Rodrigo Siqueira
> > <rodrigosiqueiramelo@gmail.com> wrote:
> > >
> > > [snip]
> > >
> > > First of all, thank you very much for all your reviews and
> > > explanation. I’ll try the exercise that you proposed on Patch 1.
> > >
> > > I’ll merge patches [4] and [7] since they’re not related to this
> > > series. For the other patches, I’ll merge them after I finish the new
> > > version of the writeback series. I’ll ping you later.
> > >
> > > 4. https://patchwork.freedesktop.org/patch/309031/?series=61737&rev=1
> > > 7. https://patchwork.freedesktop.org/patch/309029/?series=61737&rev=1
> >
> > Can you merge them quicker? I have another 3 vkms patches here
> > touching that area with some follow-up stuff ...

Do you mean patches 4 and 7, right? I cannot merge them right now, but I
can merge them tonight; however, I'm fine if you want to merge them.

> > > Finally, not related to this patchset, can I apply the patch
> > > “drm/drm_vblank: Change EINVAL by the correct errno” [1] or do I need
> > > more SoB? I’ll also apply Oleg patch (drm/vkms: add crc sources list).
> > >
> > > 1. https://patchwork.freedesktop.org/patch/310006/?series=50697&rev=4
> >
> > If you want, get some acks from igt maintainers (those patches have landed
> > now, right?), but this is good enough.
>
> Oh wait, correction: my review is conditional on you changing that one
> thing, so it needs another version. Since this is a functional change, it's
> imo too much to fix up while applying.

In your comment you said:

  >   if (vblwait->request.type & _DRM_VBLANK_SIGNAL)
  > - return -EINVAL;
  > + return -EOPNOTSUPP;

  Not sure we want EINVAL here, that's kinda a "parameters are wrong"
  version too. With that changed:

I think I did not get your point here, sorry for that... so, do you
want me to change EOPNOTSUPP to EINVAL in the above code?

> -Daniel



-- 

Rodrigo Siqueira
https://siqueira.tech

* Re: [PATCH 00/10] drm/vkms: rework crc worker
  2019-06-18 22:25               ` Rodrigo Siqueira
@ 2019-06-18 22:39                 ` Daniel Vetter
  0 siblings, 0 replies; 44+ messages in thread
From: Daniel Vetter @ 2019-06-18 22:39 UTC (permalink / raw)
  To: Rodrigo Siqueira; +Cc: DRI Development

On Wed, Jun 19, 2019 at 12:25 AM Rodrigo Siqueira
<rodrigosiqueiramelo@gmail.com> wrote:
>
> On Tue, Jun 18, 2019 at 7:08 PM Daniel Vetter <daniel@ffwll.ch> wrote:
> >
> > On Wed, Jun 19, 2019 at 12:06 AM Daniel Vetter <daniel@ffwll.ch> wrote:
> > >
> > > On Tue, Jun 18, 2019 at 11:54 PM Rodrigo Siqueira
> > > <rodrigosiqueiramelo@gmail.com> wrote:
> > > > Finally, not related to this patchset, can I apply the patch
> > > > “drm/drm_vblank: Change EINVAL by the correct errno” [1] or do I need
> > > > more SoB? I’ll also apply Oleg patch (drm/vkms: add crc sources list).
> > > >
> > > > 1. https://patchwork.freedesktop.org/patch/310006/?series=50697&rev=4
> > >
> > > If you want, get some acks from igt maintainers (those patches have landed
> > > now, right?), but this is good enough.
> >
> > Oh wait, correction: my review is conditional on you changing that one
> > thing, so it needs another version. Since this is a functional change, it's
> > imo too much to fix up while applying.
>
> In your comment you said:
>
>   >   if (vblwait->request.type & _DRM_VBLANK_SIGNAL)
>   > - return -EINVAL;
>   > + return -EOPNOTSUPP;
>
>   Not sure we want EINVAL here, that's kinda a "parameters are wrong"
>   version too. With that changed:
>
> I think I did not get your point here, sorry for that... so, do you
> want me to change EOPNOTSUPP to EINVAL in the above code?

Oops, that was wrong. I meant to say that I don't see why we should
use EOPNOTSUPP here; EINVAL, indicating a wrong argument, seems more
fitting to me. It's been pretty much forever (if we ever supported
this at all) since vblank signals worked on linux. Ok, I did a quick
check: it died in 2009. That's before the kms stuff landed, so there's
definitely no userspace around anymore that ever expected this to work
:-) Hence why I think EINVAL is more fitting ...
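
[Concretely, the hunk quoted further up would then keep the existing errno.
A sketch of the resulting check (presumably in drm_wait_vblank_ioctl(),
going by the patch title):

	/* Userspace asked for the long-dead vblank signal support: treat
	 * it as an invalid argument, not an unsupported operation. */
	if (vblwait->request.type & _DRM_VBLANK_SIGNAL)
		return -EINVAL;

That is, the check stays exactly as it is today.]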
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch

* Re: [PATCH 04/10] drm/vkms: Move format arrays to vkms_plane.c
  2019-06-06 22:27 ` [PATCH 04/10] drm/vkms: Move format arrays to vkms_plane.c Daniel Vetter
  2019-06-12 13:39   ` Rodrigo Siqueira
@ 2019-06-19  2:12   ` Rodrigo Siqueira
  1 sibling, 0 replies; 44+ messages in thread
From: Rodrigo Siqueira @ 2019-06-19  2:12 UTC (permalink / raw)
  To: Daniel Vetter; +Cc: Daniel Vetter, DRI Development, Haneen Mohammed



On 06/07, Daniel Vetter wrote:
> No need to have them multiple times.
> 
> Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
> Cc: Rodrigo Siqueira <rodrigosiqueiramelo@gmail.com>
> Cc: Haneen Mohammed <hamohammed.sa@gmail.com>
> Cc: Daniel Vetter <daniel@ffwll.ch>
> ---
>  drivers/gpu/drm/vkms/vkms_drv.h   | 8 --------
>  drivers/gpu/drm/vkms/vkms_plane.c | 8 ++++++++
>  2 files changed, 8 insertions(+), 8 deletions(-)
> 
> diff --git a/drivers/gpu/drm/vkms/vkms_drv.h b/drivers/gpu/drm/vkms/vkms_drv.h
> index 43d3f98289fe..2a35299bfb89 100644
> --- a/drivers/gpu/drm/vkms/vkms_drv.h
> +++ b/drivers/gpu/drm/vkms/vkms_drv.h
> @@ -20,14 +20,6 @@
>  
>  extern bool enable_cursor;
>  
> -static const u32 vkms_formats[] = {
> -	DRM_FORMAT_XRGB8888,
> -};
> -
> -static const u32 vkms_cursor_formats[] = {
> -	DRM_FORMAT_ARGB8888,
> -};
> -
>  struct vkms_crc_data {
>  	struct drm_framebuffer fb;
>  	struct drm_rect src, dst;
> diff --git a/drivers/gpu/drm/vkms/vkms_plane.c b/drivers/gpu/drm/vkms/vkms_plane.c
> index 0e67d2d42f0c..0fceb6258422 100644
> --- a/drivers/gpu/drm/vkms/vkms_plane.c
> +++ b/drivers/gpu/drm/vkms/vkms_plane.c
> @@ -6,6 +6,14 @@
>  #include <drm/drm_atomic_helper.h>
>  #include <drm/drm_gem_framebuffer_helper.h>
>  
> +static const u32 vkms_formats[] = {
> +	DRM_FORMAT_XRGB8888,
> +};
> +
> +static const u32 vkms_cursor_formats[] = {
> +	DRM_FORMAT_ARGB8888,
> +};
> +
>  static struct drm_plane_state *
>  vkms_plane_duplicate_state(struct drm_plane *plane)
>  {
> -- 
> 2.20.1
> 

Applied to the drm-misc-next branch.

Thanks.

-- 
Rodrigo Siqueira
https://siqueira.tech


* Re: [PATCH 07/10] drm/vkms: Dont flush crc worker when we change crc status
  2019-06-06 22:27 ` [PATCH 07/10] drm/vkms: Dont flush crc worker when we change crc status Daniel Vetter
@ 2019-06-19  2:17   ` Rodrigo Siqueira
  2019-06-19  7:47     ` Daniel Vetter
  0 siblings, 1 reply; 44+ messages in thread
From: Rodrigo Siqueira @ 2019-06-19  2:17 UTC (permalink / raw)
  To: Daniel Vetter; +Cc: Daniel Vetter, DRI Development, Haneen Mohammed



On 06/07, Daniel Vetter wrote:
> The crc core code can cope with some late crc, the race is kinda
> unavoidable. So no need to flush pending workers, they'll complete in
> time.
> 
> Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
> Cc: Rodrigo Siqueira <rodrigosiqueiramelo@gmail.com>
> Cc: Haneen Mohammed <hamohammed.sa@gmail.com>
> Cc: Daniel Vetter <daniel@ffwll.ch>
> ---
>  drivers/gpu/drm/vkms/vkms_crc.c | 3 ---
>  1 file changed, 3 deletions(-)
> 
> diff --git a/drivers/gpu/drm/vkms/vkms_crc.c b/drivers/gpu/drm/vkms/vkms_crc.c
> index 96806cd35ad4..9d15e5e85830 100644
> --- a/drivers/gpu/drm/vkms/vkms_crc.c
> +++ b/drivers/gpu/drm/vkms/vkms_crc.c
> @@ -249,9 +249,6 @@ int vkms_set_crc_source(struct drm_crtc *crtc, const char *src_name)
>  
>  	ret = vkms_crc_parse_source(src_name, &enabled);
>  
> -	/* make sure nothing is scheduled on crtc workq */
> -	flush_workqueue(out->crc_workq);
> -
>  	spin_lock_irq(&out->lock);
>  	out->crc_enabled = enabled;
>  	spin_unlock_irq(&out->lock);
> -- 
> 2.20.1
> 
Hi,

I tried to apply this patch, but git complained about it. I fixed the
problem manually (it was very simple), but I noticed that dim did not
add the "Link" tag. Because of this, I decided to check with you before
applying this patch. Is it ok to fix a conflict without dim? Is it ok to
apply a patch without the Link tag?

-- 
Rodrigo Siqueira
https://siqueira.tech


* Re: [PATCH 07/10] drm/vkms: Dont flush crc worker when we change crc status
  2019-06-19  2:17   ` Rodrigo Siqueira
@ 2019-06-19  7:47     ` Daniel Vetter
  0 siblings, 0 replies; 44+ messages in thread
From: Daniel Vetter @ 2019-06-19  7:47 UTC (permalink / raw)
  To: Rodrigo Siqueira
  Cc: Daniel Vetter, Haneen Mohammed, DRI Development, Daniel Vetter

On Tue, Jun 18, 2019 at 11:17:34PM -0300, Rodrigo Siqueira wrote:
> On 06/07, Daniel Vetter wrote:
> > The crc core code can cope with some late crc, the race is kinda
> > unavoidable. So no need to flush pending workers, they'll complete in
> > time.
> >
> > [snip]
> >
> Hi,
> 
> I tried to apply this patch, but git complained about it. I fixed the
> problem manually (it was very simple), but I noticed that dim did not
> add the "Link" tag. Because of this, I decided to check with you before
> applying this patch. Is it ok to fix a conflict without dim? Is it ok to
> apply a patch without the Link tag?

If you've manually resolved a conflict, use dim apply-link to just extract
the Link: tag from the same patch file, and apply it to the topmost
commit. If you don't have a Link: tag then dim push will refuse to work.

In general resolving conflicts is ok, but again, except for extremely
trivial things, I prefer not to. For this I'd just wait until you're ready
to pull in the entire series in sequence. Otherwise you'll need to resolve
even more conflicts, since the other patches also won't apply cleanly
anymore.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

* Re: [PATCH 00/10] drm/vkms: rework crc worker
  2019-06-18 22:06           ` Daniel Vetter
  2019-06-18 22:07             ` Daniel Vetter
@ 2019-06-26  1:44             ` Rodrigo Siqueira
  2019-06-26  7:54               ` Daniel Vetter
  1 sibling, 1 reply; 44+ messages in thread
From: Rodrigo Siqueira @ 2019-06-26  1:44 UTC (permalink / raw)
  To: Daniel Vetter; +Cc: DRI Development



On 06/19, Daniel Vetter wrote:
> On Tue, Jun 18, 2019 at 11:54 PM Rodrigo Siqueira
> <rodrigosiqueiramelo@gmail.com> wrote:
> >
> > [snip]
> >
> > First of all, thank you very much for all your reviews and
> > explanation. I’ll try the exercise that you proposed on Patch 1.
> >
> > I’ll merge patches [4] and [7] since they’re not related to this
> > series. For the other patches, I’ll merge them after I finish the new
> > version of the writeback series. I’ll ping you later.
> >

Hi,

I already sent the new version of the writeback patchset. So, how do you
want to proceed with this series? Do you prefer to send a V2 or should I
apply the patchset and make the tiny fixes?

Thanks

> > 4. https://patchwork.freedesktop.org/patch/309031/?series=61737&rev=1
> > 7. https://patchwork.freedesktop.org/patch/309029/?series=61737&rev=1
> 
> Can you merge them quicker? I have another 3 vkms patches here
> touching that area with some follow-up stuff ...
> 
> > Finally, not related to this patchset, can I apply the patch
> > “drm/drm_vblank: Change EINVAL by the correct errno” [1] or do I need
> > more SoB? I’ll also apply Oleg patch (drm/vkms: add crc sources list).
> >
> > 1. https://patchwork.freedesktop.org/patch/310006/?series=50697&rev=4
> 
> If you want, get some acks from igt maintainers (those patches have landed
> now, right?), but this is good enough.
> -Daniel

-- 
Rodrigo Siqueira
https://siqueira.tech

[-- Attachment #1.2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

[-- Attachment #2: Type: text/plain, Size: 159 bytes --]

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH 00/10] drm/vkms: rework crc worker
  2019-06-26  1:44             ` Rodrigo Siqueira
@ 2019-06-26  7:54               ` Daniel Vetter
  2019-06-26 13:46                 ` Rodrigo Siqueira
  2019-07-01  3:30                 ` Rodrigo Siqueira
  0 siblings, 2 replies; 44+ messages in thread
From: Daniel Vetter @ 2019-06-26  7:54 UTC (permalink / raw)
  To: Rodrigo Siqueira; +Cc: DRI Development

On Wed, Jun 26, 2019 at 3:44 AM Rodrigo Siqueira
<rodrigosiqueiramelo@gmail.com> wrote:
>
> On 06/19, Daniel Vetter wrote:
> > On Tue, Jun 18, 2019 at 11:54 PM Rodrigo Siqueira
> > <rodrigosiqueiramelo@gmail.com> wrote:
> > >
> > > On Tue, Jun 18, 2019 at 5:56 AM Daniel Vetter <daniel@ffwll.ch> wrote:
> > > >
> > > > On Mon, Jun 17, 2019 at 11:49:04PM -0300, Rodrigo Siqueira wrote:
> > > > > On 06/12, Daniel Vetter wrote:
> > > > > > On Wed, Jun 12, 2019 at 10:28:41AM -0300, Rodrigo Siqueira wrote:
> > > > > > > Hi Daniel,
> > > > > > >
> > > > > > > First of all, thank you very much for your patchset.
> > > > > > >
> > > > > > > I tried to make a detailed review of your series, and you can see my
> > > > > > > comments in each patch. You’ll notice that I asked many things related
> > > > > > > to the DRM subsystem with the hope that I could learn a little bit
> > > > > > > more about DRM from your comments.
> > > > > > >
> > > > > > > Before you go through the review, I would like to start a discussion
> > > > > > > about the vkms race conditions. First, I have to admit that I did not
> > > > > > > understand the race conditions that you described before because I
> > > > > > > couldn’t reproduce them. Now, I'm suspecting that I could not
> > > > > > > experience the problem because I'm using QEMU with KVM; with this idea
> > > > > > > in mind, I suppose that we have two scenarios for using vkms in a
> > > > > > > virtual machine:
> > > > > > >
> > > > > > > * Scenario 1: The user has hardware virtualization support; in this
> > > > > > > case, it is a little bit harder to experience race conditions with
> > > > > > > vkms.
> > > > > > >
> > > > > > > * Scenario 2: Without hardware virtualization support, it is much
> > > > > > > easier to experience race conditions.
> > > > > > >
> > > > > > > With these two scenarios in mind, I conducted a bunch of experiments
> > > > > > > to try to shed light on this issue. I did:
> > > > > > >
> > > > > > > 1. Enabled lockdep
> > > > > > >
> > > > > > > 2. Defined two different environments for testing both using QEMU with
> > > > > > > and without kvm. See below the QEMU hardware options:
> > > > > > >
> > > > > > > * env_kvm: -enable-kvm -daemonize -m 1G -smp cores=2,cpus=2
> > > > > > > * env_no_kvm: -daemonize -m 2G -smp cores=4,cpus=4
> > > > > > >
> > > > > > > 3. My test protocol: I) turn on the vm, II) clean /proc/lock_stat,
> > > > > > > III) execute kms_cursor_crc, IV) save the lock_stat file, and V)
> > > > > > > turn off the vm.
> > > > > > >
> > > > > > > 4. From the lockdep_stat, I just highlighted the rows related to vkms
> > > > > > > and the columns holdtime-total and holdtime-avg
> > > > > > >
> > > > > > > I would like to highlight that the following data does not have any
> > > > > > > statistical approach, and its intention is solely to assist our
> > > > > > > discussion. See below the summary of the collected data:
> > > > > > >
> > > > > > > Summary of the experiment results:
> > > > > > >
> > > > > > > +----------------+----------------+----------------+
> > > > > > > |                |     env_kvm    |   env_no_kvm   |
> > > > > > > +                +----------------+----------------+
> > > > > > > | Test           | Before | After | Before | After |
> > > > > > > +----------------+--------+-------+--------+-------+
> > > > > > > | kms_cursor_crc |   S1   |   S2  |   M1   |   M2  |
> > > > > > > +----------------+--------+-------+--------+-------+
> > > > > > >
> > > > > > > * Before: before applying this patchset
> > > > > > > * After: after applying this patchset
> > > > > > >
> > > > > > > -----------------------------------------+------------------+-----------
> > > > > > > S1: Without this patchset and with kvm   | holdtime-total | holdtime-avg
> > > > > > > -----------------------------------------+----------------+-------------
> > > > > > > &(&vkms_out->lock)->rlock:               |  21983.52      |  6.21
> > > > > > > (work_completion)(&vkms_state->crc_wo:   |  20.47         |  20.47
> > > > > > > (wq_completion)vkms_crc_workq:           |  3999507.87    |  3352.48
> > > > > > > &(&vkms_out->state_lock)->rlock:         |  378.47        |  0.30
> > > > > > > (work_completion)(&vkms_state->crc_#2:   |  3999066.30    |  2848.34
> > > > > > > -----------------------------------------+----------------+-------------
> > > > > > > S2: With this patchset and with kvm      |                |
> > > > > > > -----------------------------------------+----------------+-------------
> > > > > > > &(&vkms_out->lock)->rlock:               |  23262.83      |  6.34
> > > > > > > (work_completion)(&vkms_state->crc_wo:   |  8.98          |  8.98
> > > > > > > &(&vkms_out->crc_lock)->rlock:           |  307.28        |  0.32
> > > > > > > (wq_completion)vkms_crc_workq:           |  6567727.05    |  12345.35
> > > > > > > (work_completion)(&vkms_state->crc_#2:   |  6567135.15    |  4488.81
> > > > > > > -----------------------------------------+----------------+-------------
> > > > > > > M1: Without this patchset and without kvm|                |
> > > > > > > -----------------------------------------+----------------+-------------
> > > > > > > &(&vkms_out->state_lock)->rlock:         |  4994.72       |  1.61
> > > > > > > &(&vkms_out->lock)->rlock:               |  247190.04     |  39.39
> > > > > > > (work_completion)(&vkms_state->crc_wo:   |  31.32         |  31.32
> > > > > > > (wq_completion)vkms_crc_workq:           |  20991073.78   |  13525.18
> > > > > > > (work_completion)(&vkms_state->crc_#2:   |  20988347.18   |  11904.90
> > > > > > > -----------------------------------------+----------------+-------------
> > > > > > > M2: With this patchset and without kvm   |                |
> > > > > > > -----------------------------------------+----------------+-------------
> > > > > > > (wq_completion)vkms_crc_workq:           |  42819161.68   |  36597.57
> > > > > > > &(&vkms_out->lock)->rlock:               |  251257.06     |  35.80
> > > > > > > (work_completion)(&vkms_state->crc_wo:   |  69.37         |  69.37
> > > > > > > &(&vkms_out->crc_lock)->rlock:           |  3620.92       |  1.54
> > > > > > > (work_completion)(&vkms_state->crc_#2:   |  42803419.59   |  24306.31
> > > > > > >
> > > > > > > First, I analyzed the scenarios with KVM (S1 and S2); more
> > > > > > > specifically, I focused on these two classes:
> > > > > > >
> > > > > > > 1. (work_completion)(&vkms_state->crc_wo
> > > > > > > 2. (work_completion)(&vkms_state->crc_#2
> > > > > > >
> > > > > > > After taking a look at the data, it looks like this patchset
> > > > > > > greatly reduces the hold time average for crc_wo. On the other hand,
> > > > > > > it increases the hold time average for crc_#2. I didn’t quite
> > > > > > > understand the reason for the difference. Could you help me here?
> > > > > >
> > > > > > So there's two real locks here from our code, the ones you can see as
> > > > > > spinlocks:
> > > > > >
> > > > > > &(&vkms_out->state_lock)->rlock:         |  4994.72       |  1.61
> > > > > > &(&vkms_out->lock)->rlock:               |  247190.04     |  39.39
> > > > > >
> > > > > > All the others are fake locks that the workqueue adds, which only exist in
> > > > > > lockdep. They are used to catch special kinds of deadlocks like the below:
> > > > > >
> > > > > > thread A:
> > > > > > 1. mutex_lock(mutex_A)
> > > > > > 2. flush_work(work_A)
> > > > > >
> > > > > > thread B
> > > > > > 1. running work_A:
> > > > > > 2. mutex_lock(mutex_A)
> > > > > >
> > > > > > thread B can't continue because mutex_A is already held by thread A.
> > > > > > thread A can't continue because thread B is blocked and the work never
> > > > > > finishes.
> > > > > > -> deadlock
> > > > > >
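To make that pattern concrete, here's a minimal kernel-style sketch of it
(demo_mutex, demo_work and demo_work_fn are made-up names for
illustration, not vkms code):

  #include <linux/mutex.h>
  #include <linux/workqueue.h>

  static DEFINE_MUTEX(demo_mutex);

  static void demo_work_fn(struct work_struct *work)
  {
          mutex_lock(&demo_mutex);        /* thread B blocks here ... */
          /* ... do whatever needs the lock ... */
          mutex_unlock(&demo_mutex);
  }
  static DECLARE_WORK(demo_work, demo_work_fn);

  static void thread_a(void)
  {
          mutex_lock(&demo_mutex);
          schedule_work(&demo_work);
          /* ... while thread A waits here forever: */
          flush_work(&demo_work);
          mutex_unlock(&demo_mutex);
  }

Note that no two real locks are ever acquired in a conflicting order
here, so tracking the real locks alone would miss it; the fake
(work_completion)/(wq_completion) entries let lockdep treat flush_work()
as an acquisition of the work's lock, which makes the
mutex -> work -> mutex cycle visible.
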
> > > > > > I haven't checked which is which, but essentially what you measure with
> > > > > > the hold times of these fake locks is how long a work execution takes on
> > > > > > average.
> > > > > >
> > > > > > Since my patches are supposed to fix races where the worker can't keep up
> > > > > > with the vblank hrtimer, the average worker will (probably) do more,
> > > > > > so that going up is expected. I think.
> > > > > >
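The shape of the race is roughly the following (a simplified sketch; the
demo_* names are made up for illustration, crc_workq is assumed to be an
ordered workqueue, and this is not the actual vkms code): the vblank
hrtimer queues a crc work item every frame, so a worker that is slower
than the vblank period has to cover more and more frames per run.

  #include <linux/hrtimer.h>
  #include <linux/kernel.h>
  #include <linux/workqueue.h>

  struct demo_output {
          struct hrtimer vblank_hrtimer;
          ktime_t period_ns;                   /* frame duration */
          struct workqueue_struct *crc_workq;  /* assumed ordered wq */
          struct work_struct crc_work;
  };

  static enum hrtimer_restart demo_vblank(struct hrtimer *timer)
  {
          struct demo_output *out =
                  container_of(timer, struct demo_output, vblank_hrtimer);

          /* Returns false when the work is still pending from an earlier
           * vblank: the worker has fallen behind and must process more
           * than one frame in its next run. */
          queue_work(out->crc_workq, &out->crc_work);

          hrtimer_forward_now(timer, out->period_ns);
          return HRTIMER_RESTART;
  }
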
> > > > > > I'm honestly not sure what's going on here, never having looked into
> > > > > > this in detail.
> > > > > >
> > > > > > > When I looked at the second set of scenarios (M1 and M2, both without
> > > > > > > KVM), the results look much more distant; basically, this patchset
> > > > > > > increased the hold time average. Again, could you help me understand
> > > > > > > this issue a little bit better?
> > > > > > >
> > > > > > > Finally, after these tests, I have some questions:
> > > > > > >
> > > > > > > 1. VKMS is a software-only driver; because of this, how about defining
> > > > > > > minimal system requirements for using it?
> > > > > >
> > > > > > No idea, in reality it's probably "if it fails too often, buy a faster cpu".
> > > > > > I do think we should make the code robust against a slow cpu, since atm
> > > > > > that's needed even on pretty fast machines (because our blending code is
> > > > > > really, really slow and inefficient).
> > > > > >
> > > > > > > 2. During my experiments, I noticed that running tests with a VM that
> > > > > > > uses KVM had consistent results. For example, kms_flip never fails
> > > > > > > with QEMU+KVM; however, without KVM, two or three tests failed (one of
> > > > > > > them looks random). If we use vkms for testing DRM stuff, should we
> > > > > > > recommend the use of KVM?
> > > > > >
> > > > > > What do you mean without kvm? In general running without kvm shouldn't be
> > > > > > slower, so I'm a bit confused ... I'm running vgem directly on the
> > > > > > machine, by booting into new kernels (and controlling the machine over the
> > > > > > network).
> > > > >
> > > > > Sorry, I should have detailed my point.
> > > > >
> > > > > Basically, I do all my testing with vkms in a QEMU VM. In that sense, I
> > > > > did some experiments in which I enabled and disabled KVM (i.e., the flag
> > > > > '-enable-kvm') to check the vkms behaviour in these two scenarios. I
> > > > > noticed that the tests are consistent when I use the '-enable-kvm' flag;
> > > > > in that context it seemed a good idea to recommend the use of KVM for
> > > > > those users who want to test vkms with igt. Anyway, don't worry about
> > > > > it; I'll try to add more documentation for vkms in the future, and at
> > > > > that time we can discuss this again.
> > > >
> > > > Ah, qemu without kvm is going to use software emulation for a lot of the
> > > > kernel stuff. That's going to be terribly slow indeed.
> > > >
> > > > > Anyway, from my side, I think we should merge this series. Do you want
> > > > > to prepare a V2 with the fixes and maybe update the commit messages by
> > > > > using some of your explanations? Or, if you want, I can fix the tiny
> > > > > details and merge it.
> > > >
> > > > I'm deeply buried in my prime cleanup/doc series right now, so it will take
> > > > a few days until I get around to this again. If you want, please go ahead
> > > > with merging.
> > > >
> > > > btw if you edit a patch when merging, please add a comment about that to
> > > > the commit message. And ime it's best to only augment the commit message
> > > > and maybe delete an unused variable or so that got forgotten. For
> > > > anything more, it's better to do the edits and resubmit.
> > >
> > > First of all, thank you very much for all your reviews and
> > > explanation. I’ll try the exercise that you proposed on Patch 1.
> > >
> > > I’ll merge patches [4] and [7] since they’re not related to this
> > > series. The other patches I’ll merge after I finish the new
> > > version of the writeback series. I’ll ping you later.
> > >
>
> Hi,
>
> I already sent the new version of the writeback patchset. So, how do you
> want to proceed with this series? Do you prefer to send a V2 or should I
> apply the patchset and make the tiny fixes?

I don't have r-b tags from you (usually if it's just nits then you can
do "r-b with those tiny details addressed"), so I can't merge. I
thought you wanted to merge them, which I've been waiting for you to
do ... so that's why they're stuck right now. Just decide and we'll move
them; up to you really.
-Daniel

>
> Thanks
>
> > > 4. https://patchwork.freedesktop.org/patch/309031/?series=61737&rev=1
> > > 7. https://patchwork.freedesktop.org/patch/309029/?series=61737&rev=1
> >
> > Can you merge them quicker? I have another 3 vkms patches here
> > touching that area with some follow-up stuff ...
> >
> > > Finally, not related to this patchset, can I apply the patch
> > > “drm/drm_vblank: Change EINVAL by the correct errno” [1] or do I need
> > > more SoB? I’ll also apply Oleg's patch (drm/vkms: add crc sources list).
> > >
> > > 1. https://patchwork.freedesktop.org/patch/310006/?series=50697&rev=4
> >
> > If you want, get some acks from igt maintainers (those patches landed
> > now, right?), but this is good enough.
> > -Daniel
> >
> >
> > > Thanks
> > >
> > > > Thanks, Daniel
> > > >
> > > > >
> > > > > > -Daniel
> > > > > >
> > > > > > > Best Regards
> > > > > > >
> > > > > > > On Thu, Jun 6, 2019 at 7:28 PM Daniel Vetter <daniel.vetter@ffwll.ch> wrote:
> > > > > > > >
> > > > > > > > Hi all,
> > > > > > > >
> > > > > > > > This here is the first part of a rework for the vkms crc worker. I think
> > > > > > > > this should fix all the locking/races/use-after-free issues I spotted in
> > > > > > > > the code. There's more work we can do in the future as a follow-up:
> > > > > > > >
> > > > > > > > - directly access vkms_plane_state->base in the crc worker, with this
> > > > > > > >   approach in this series here that should be safe now. Follow-up patches
> > > > > > > >   could switch and remove all the separate crc_data infrastructure.
> > > > > > > >
> > > > > > > > - I think some kerneldoc for vkms structures would be nice. Documenting
> > > > > > > >   the various functions is probably overkill.
> > > > > > > >
> > > > > > > > - Implementing a more generic blending engine, as prep for adding more
> > > > > > > >   pixel formats, more planes, and more features in general.
> > > > > > > >
> > > > > > > > Tested with kms_pipe_crc, kms_flip and kms_cursor_crc. Seems to not make
> > > > > > > > things worse, but I didn't do a full run.
> > > > > > > >
> > > > > > > > Cheers, Daniel
> > > > > > > >
> > > > > > > > Daniel Vetter (10):
> > > > > > > >   drm/vkms: Fix crc worker races
> > > > > > > >   drm/vkms: Use spin_lock_irq in process context
> > > > > > > >   drm/vkms: Rename vkms_output.state_lock to crc_lock
> > > > > > > >   drm/vkms: Move format arrays to vkms_plane.c
> > > > > > > >   drm/vkms: Add our own commit_tail
> > > > > > > >   drm/vkms: flush crc workers earlier in commit flow
> > > > > > > >   drm/vkms: Dont flush crc worker when we change crc status
> > > > > > > >   drm/vkms: No _irqsave within spin_lock_irq needed
> > > > > > > >   drm/vkms: totally reworked crc data tracking
> > > > > > > >   drm/vkms: No need for ->pages_lock in crc work anymore
> > > > > > > >
> > > > > > > >  drivers/gpu/drm/vkms/vkms_crc.c   | 74 +++++++++-------------------
> > > > > > > >  drivers/gpu/drm/vkms/vkms_crtc.c  | 80 ++++++++++++++++++++++++++-----
> > > > > > > >  drivers/gpu/drm/vkms/vkms_drv.c   | 35 ++++++++++++++
> > > > > > > >  drivers/gpu/drm/vkms/vkms_drv.h   | 24 +++++-----
> > > > > > > >  drivers/gpu/drm/vkms/vkms_plane.c |  8 ++++
> > > > > > > >  5 files changed, 146 insertions(+), 75 deletions(-)
> > > > > > > >
> > > > > > > > --
> > > > > > > > 2.20.1
> > > > > > > >
> > > > > > > > _______________________________________________
> > > > > > > > dri-devel mailing list
> > > > > > > > dri-devel@lists.freedesktop.org
> > > > > > > > https://lists.freedesktop.org/mailman/listinfo/dri-devel
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > > --
> > > > > > >
> > > > > > > Rodrigo Siqueira
> > > > > > > https://siqueira.tech
> > > > > >
> > > > > > --
> > > > > > Daniel Vetter
> > > > > > Software Engineer, Intel Corporation
> > > > > > http://blog.ffwll.ch
> > > > >
> > > > > --
> > > > > Rodrigo Siqueira
> > > > > https://siqueira.tech
> > > >
> > > > --
> > > > Daniel Vetter
> > > > Software Engineer, Intel Corporation
> > > > http://blog.ffwll.ch
> > >
> > >
> > >
> > > --
> > >
> > > Rodrigo Siqueira
> > > https://siqueira.tech
> >
> >
> >
> > --
> > Daniel Vetter
> > Software Engineer, Intel Corporation
> > +41 (0) 79 365 57 48 - http://blog.ffwll.ch
>
> --
> Rodrigo Siqueira
> https://siqueira.tech



-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH 00/10] drm/vkms: rework crc worker
  2019-06-26  7:54               ` Daniel Vetter
@ 2019-06-26 13:46                 ` Rodrigo Siqueira
  2019-07-01  3:30                 ` Rodrigo Siqueira
  1 sibling, 0 replies; 44+ messages in thread
From: Rodrigo Siqueira @ 2019-06-26 13:46 UTC (permalink / raw)
  To: Daniel Vetter; +Cc: DRI Development

On Wed, Jun 26, 2019 at 4:55 AM Daniel Vetter <daniel@ffwll.ch> wrote:
>
> On Wed, Jun 26, 2019 at 3:44 AM Rodrigo Siqueira
> <rodrigosiqueiramelo@gmail.com> wrote:
> >
> > On 06/19, Daniel Vetter wrote:
> > > On Tue, Jun 18, 2019 at 11:54 PM Rodrigo Siqueira
> > > <rodrigosiqueiramelo@gmail.com> wrote:
> > > >
> > > > On Tue, Jun 18, 2019 at 5:56 AM Daniel Vetter <daniel@ffwll.ch> wrote:
> > > > >
> > > > > On Mon, Jun 17, 2019 at 11:49:04PM -0300, Rodrigo Siqueira wrote:
> > > > > > On 06/12, Daniel Vetter wrote:
> > > > > > > On Wed, Jun 12, 2019 at 10:28:41AM -0300, Rodrigo Siqueira wrote:
> > > > > > > > Hi Daniel,
> > > > > > > >
> > > > > > > > First of all, thank you very much for your patchset.
> > > > > > > >
> > > > > > > > I tried to make a detailed review of your series, and you can see my
> > > > > > > > comments in each patch. You’ll notice that I asked many things related
> > > > > > > > to the DRM subsystem with the hope that I could learn a little bit
> > > > > > > > more about DRM from your comments.
> > > > > > > >
> > > > > > > > Before you go through the review, I would like to start a discussion
> > > > > > > > about the vkms race conditions. First, I have to admit that I did not
> > > > > > > > understand the race conditions that you described before because I
> > > > > > > > couldn’t reproduce them. Now, I'm suspecting that I could not
> > > > > > > > experience the problem because I'm using QEMU with KVM; with this idea
> > > > > > > > in mind, I suppose that we have two scenarios for using vkms in a
> > > > > > > > virtual machine:
> > > > > > > >
> > > > > > > > * Scenario 1: The user has hardware virtualization support; in this
> > > > > > > > case, it is a little bit harder to experience race conditions with
> > > > > > > > vkms.
> > > > > > > >
> > > > > > > > * Scenario 2: Without hardware virtualization support, it is much
> > > > > > > > easier to experience race conditions.
> > > > > > > >
> > > > > > > > With these two scenarios in mind, I conducted a bunch of experiments
> > > > > > > > to try to shed light on this issue. I did:
> > > > > > > >
> > > > > > > > 1. Enabled lockdep
> > > > > > > >
> > > > > > > > 2. Defined two different environments for testing both using QEMU with
> > > > > > > > and without kvm. See below the QEMU hardware options:
> > > > > > > >
> > > > > > > > * env_kvm: -enable-kvm -daemonize -m 1G -smp cores=2,cpus=2
> > > > > > > > * env_no_kvm: -daemonize -m 2G -smp cores=4,cpus=4
> > > > > > > >
> > > > > > > > 3. My test protocol: I) turn on the vm, II) clean /proc/lock_stat,
> > > > > > > > III) execute kms_cursor_crc, IV) save the lock_stat file, and V)
> > > > > > > > turn off the vm.
> > > > > > > >
> > > > > > > > 4. From the lockdep_stat, I just highlighted the rows related to vkms
> > > > > > > > and the columns holdtime-total and holdtime-avg
> > > > > > > >
> > > > > > > > I would like to highlight that the following data does not have any
> > > > > > > > statistical approach, and its intention is solely to assist our
> > > > > > > > discussion. See below the summary of the collected data:
> > > > > > > >
> > > > > > > > Summary of the experiment results:
> > > > > > > >
> > > > > > > > +----------------+----------------+----------------+
> > > > > > > > |                |     env_kvm    |   env_no_kvm   |
> > > > > > > > +                +----------------+----------------+
> > > > > > > > | Test           | Before | After | Before | After |
> > > > > > > > +----------------+--------+-------+--------+-------+
> > > > > > > > | kms_cursor_crc |   S1   |   S2  |   M1   |   M2  |
> > > > > > > > +----------------+--------+-------+--------+-------+
> > > > > > > >
> > > > > > > > * Before: before applying this patchset
> > > > > > > > * After: after applying this patchset
> > > > > > > >
> > > > > > > > -----------------------------------------+------------------+-----------
> > > > > > > > S1: Without this patchset and with kvm   | holdtime-total | holdtime-avg
> > > > > > > > -----------------------------------------+----------------+-------------
> > > > > > > > &(&vkms_out->lock)->rlock:               |  21983.52      |  6.21
> > > > > > > > (work_completion)(&vkms_state->crc_wo:   |  20.47         |  20.47
> > > > > > > > (wq_completion)vkms_crc_workq:           |  3999507.87    |  3352.48
> > > > > > > > &(&vkms_out->state_lock)->rlock:         |  378.47        |  0.30
> > > > > > > > (work_completion)(&vkms_state->crc_#2:   |  3999066.30    |  2848.34
> > > > > > > > -----------------------------------------+----------------+-------------
> > > > > > > > S2: With this patchset and with kvm      |                |
> > > > > > > > -----------------------------------------+----------------+-------------
> > > > > > > > &(&vkms_out->lock)->rlock:               |  23262.83      |  6.34
> > > > > > > > (work_completion)(&vkms_state->crc_wo:   |  8.98          |  8.98
> > > > > > > > &(&vkms_out->crc_lock)->rlock:           |  307.28        |  0.32
> > > > > > > > (wq_completion)vkms_crc_workq:           |  6567727.05    |  12345.35
> > > > > > > > (work_completion)(&vkms_state->crc_#2:   |  6567135.15    |  4488.81
> > > > > > > > -----------------------------------------+----------------+-------------
> > > > > > > > M1: Without this patchset and without kvm|                |
> > > > > > > > -----------------------------------------+----------------+-------------
> > > > > > > > &(&vkms_out->state_lock)->rlock:         |  4994.72       |  1.61
> > > > > > > > &(&vkms_out->lock)->rlock:               |  247190.04     |  39.39
> > > > > > > > (work_completion)(&vkms_state->crc_wo:   |  31.32         |  31.32
> > > > > > > > (wq_completion)vkms_crc_workq:           |  20991073.78   |  13525.18
> > > > > > > > (work_completion)(&vkms_state->crc_#2:   |  20988347.18   |  11904.90
> > > > > > > > -----------------------------------------+----------------+-------------
> > > > > > > > M2: With this patchset and without kvm   |                |
> > > > > > > > -----------------------------------------+----------------+-------------
> > > > > > > > (wq_completion)vkms_crc_workq:           |  42819161.68   |  36597.57
> > > > > > > > &(&vkms_out->lock)->rlock:               |  251257.06     |  35.80
> > > > > > > > (work_completion)(&vkms_state->crc_wo:   |  69.37         |  69.37
> > > > > > > > &(&vkms_out->crc_lock)->rlock:           |  3620.92       |  1.54
> > > > > > > > (work_completion)(&vkms_state->crc_#2:   |  42803419.59   |  24306.31
> > > > > > > >
> > > > > > > > First, I analyzed the scenarios with KVM (S1 and S2); more
> > > > > > > > specifically, I focused on these two classes:
> > > > > > > >
> > > > > > > > 1. (work_completion)(&vkms_state->crc_wo
> > > > > > > > 2. (work_completion)(&vkms_state->crc_#2
> > > > > > > >
> > > > > > > > After taking a look at the data, it looks like this patchset
> > > > > > > > greatly reduces the hold time average for crc_wo. On the other hand,
> > > > > > > > it increases the hold time average for crc_#2. I didn’t quite
> > > > > > > > understand the reason for the difference. Could you help me here?
> > > > > > >
> > > > > > > So there's two real locks here from our code, the ones you can see as
> > > > > > > spinlocks:
> > > > > > >
> > > > > > > &(&vkms_out->state_lock)->rlock:         |  4994.72       |  1.61
> > > > > > > &(&vkms_out->lock)->rlock:               |  247190.04     |  39.39
> > > > > > >
> > > > > > > All the others are fake locks that the workqueue adds, which only exist in
> > > > > > > lockdep. They are used to catch special kinds of deadlocks like the below:
> > > > > > >
> > > > > > > thread A:
> > > > > > > 1. mutex_lock(mutex_A)
> > > > > > > 2. flush_work(work_A)
> > > > > > >
> > > > > > > thread B
> > > > > > > 1. running work_A:
> > > > > > > 2. mutex_lock(mutex_A)
> > > > > > >
> > > > > > > thread B can't continue because mutex_A is already held by thread A.
> > > > > > > thread A can't continue because thread B is blocked and the work never
> > > > > > > finishes.
> > > > > > > -> deadlock
> > > > > > >
> > > > > > > I haven't checked which is which, but essentially what you measure with
> > > > > > > the hold times of these fake locks is how long a work execution takes on
> > > > > > > average.
> > > > > > >
> > > > > > > Since my patches are supposed to fix races where the worker can't keep up
> > > > > > > with the vblank hrtimer, the average worker will (probably) do more,
> > > > > > > so that going up is expected. I think.
> > > > > > >
> > > > > > > I'm honestly not sure what's going on here, never having looked into
> > > > > > > this in detail.
> > > > > > >
> > > > > > > > When I looked at the second set of scenarios (M1 and M2, both without
> > > > > > > > KVM), the results look much more distant; basically, this patchset
> > > > > > > > increased the hold time average. Again, could you help me understand
> > > > > > > > this issue a little bit better?
> > > > > > > >
> > > > > > > > Finally, after these tests, I have some questions:
> > > > > > > >
> > > > > > > > 1. VKMS is a software-only driver; because of this, how about defining
> > > > > > > > minimal system requirements for using it?
> > > > > > >
> > > > > > > No idea, in reality it's probably "if it fails too often, buy a faster cpu".
> > > > > > > I do think we should make the code robust against a slow cpu, since atm
> > > > > > > that's needed even on pretty fast machines (because our blending code is
> > > > > > > really, really slow and inefficient).
> > > > > > >
> > > > > > > > 2. During my experiments, I noticed that running tests with a VM that
> > > > > > > > uses KVM had consistent results. For example, kms_flip never fails
> > > > > > > > with QEMU+KVM; however, without KVM, two or three tests failed (one of
> > > > > > > > them looks random). If we use vkms for testing DRM stuff, should we
> > > > > > > > recommend the use of KVM?
> > > > > > >
> > > > > > > What do you mean without kvm? In general running without kvm shouldn't be
> > > > > > > slower, so I'm a bit confused ... I'm running vgem directly on the
> > > > > > > machine, by booting into new kernels (and controlling the machine over the
> > > > > > > network).
> > > > > >
> > > > > > Sorry, I should have detailed my point.
> > > > > >
> > > > > > Basically, I do all my testing with vkms in a QEMU VM. In that sense, I
> > > > > > did some experiments in which I enabled and disabled KVM (i.e., the flag
> > > > > > '-enable-kvm') to check the vkms behaviour in these two scenarios. I
> > > > > > noticed that the tests are consistent when I use the '-enable-kvm' flag;
> > > > > > in that context it seemed a good idea to recommend the use of KVM for
> > > > > > those users who want to test vkms with igt. Anyway, don't worry about
> > > > > > it; I'll try to add more documentation for vkms in the future, and at
> > > > > > that time we can discuss this again.
> > > > >
> > > > > Ah, qemu without kvm is going to use software emulation for a lot of the
> > > > > kernel stuff. That's going to be terribly slow indeed.
> > > > >
> > > > > > Anyway, from my side, I think we should merge this series. Do you want
> > > > > > to prepare a V2 with the fixes and maybe update the commit messages by
> > > > > > using some of your explanations? Or, if you want, I can fix the tiny
> > > > > > details and merge it.
> > > > >
> > > > > I'm deeply buried in my prime cleanup/doc series right now, so it will take
> > > > > a few days until I get around to this again. If you want, please go ahead
> > > > > with merging.
> > > > >
> > > > > btw if you edit a patch when merging, please add a comment about that to
> > > > > the commit message. And ime it's best to only augment the commit message
> > > > > and maybe delete an unused variable or so that got forgotten. For
> > > > > anything more, it's better to do the edits and resubmit.
> > > >
> > > > First of all, thank you very much for all your reviews and
> > > > explanation. I’ll try the exercise that you proposed on Patch 1.
> > > >
> > > > I’ll merge patches [4] and [7] since they’re not related to this
> > > > series. The other patches I’ll merge after I finish the new
> > > > version of the writeback series. I’ll ping you later.
> > > >
> >
> > Hi,
> >
> > I already sent the new version of the writeback patchset. So, how do you
> > want to proceed with this series? Do you prefer to send a V2 or should I
> > apply the patchset and make the tiny fixes?
>
> I don't have r-b tags from you (usually if it's just nits then you can
> do "r-b with those tiny details addressed"), so I can't merge. I
> thought you wanted to merge them, which I've been waiting for you to
> do ... so that's why they're stuck right now. Just decide and we'll move
> them; up to you really.
> -Daniel

Sorry for forgetting to add my r-b and t-b; I missed it due to my
questions. Also, sorry for not merging it yet... I was focused on fixing
the writeback implementation in vkms.

Anyway, I'll merge the patchset tonight and make the tiny fixes in the
code. I'll try to be online on IRC to check some small details
with you.

> >
> > Thanks
> >
> > > > 4. https://patchwork.freedesktop.org/patch/309031/?series=61737&rev=1
> > > > 7. https://patchwork.freedesktop.org/patch/309029/?series=61737&rev=1
> > >
> > > Can you merge them quicker? I have another 3 vkms patches here
> > > touching that area with some follow-up stuff ...
> > >
> > > > Finally, not related to this patchset, can I apply the patch
> > > > “drm/drm_vblank: Change EINVAL by the correct errno” [1] or do I need
> > > > more SoB? I’ll also apply Oleg's patch (drm/vkms: add crc sources list).
> > > >
> > > > 1. https://patchwork.freedesktop.org/patch/310006/?series=50697&rev=4
> > >
> > > If you want, get some acks from igt maintainers (those patches landed
> > > now, right?), but this is good enough.
> > > -Daniel
> > >
> > >
> > > > Thanks
> > > >
> > > > > Thanks, Daniel
> > > > >
> > > > > >
> > > > > > > -Daniel
> > > > > > >
> > > > > > > > Best Regards
> > > > > > > >
> > > > > > > > On Thu, Jun 6, 2019 at 7:28 PM Daniel Vetter <daniel.vetter@ffwll.ch> wrote:
> > > > > > > > >
> > > > > > > > > Hi all,
> > > > > > > > >
> > > > > > > > > This here is the first part of a rework for the vkms crc worker. I think
> > > > > > > > > this should fix all the locking/races/use-after-free issues I spotted in
> > > > > > > > > the code. There's more work we can do in the future as a follow-up:
> > > > > > > > >
> > > > > > > > > - directly access vkms_plane_state->base in the crc worker, with this
> > > > > > > > >   approach in this series here that should be safe now. Follow-up patches
> > > > > > > > >   could switch and remove all the separate crc_data infrastructure.
> > > > > > > > >
> > > > > > > > > - I think some kerneldoc for vkms structures would be nice. Documenting
> > > > > > > > >   the various functions is probably overkill.
> > > > > > > > >
> > > > > > > > > - Implementing a more generic blending engine, as prep for adding more
> > > > > > > > >   pixel formats, more planes, and more features in general.
> > > > > > > > >
> > > > > > > > > Tested with kms_pipe_crc, kms_flip and kms_cursor_crc. Seems to not make
> > > > > > > > > things worse, but I didn't do a full run.
> > > > > > > > >
> > > > > > > > > Cheers, Daniel
> > > > > > > > >
> > > > > > > > > Daniel Vetter (10):
> > > > > > > > >   drm/vkms: Fix crc worker races
> > > > > > > > >   drm/vkms: Use spin_lock_irq in process context
> > > > > > > > >   drm/vkms: Rename vkms_output.state_lock to crc_lock
> > > > > > > > >   drm/vkms: Move format arrays to vkms_plane.c
> > > > > > > > >   drm/vkms: Add our own commit_tail
> > > > > > > > >   drm/vkms: flush crc workers earlier in commit flow
> > > > > > > > >   drm/vkms: Dont flush crc worker when we change crc status
> > > > > > > > >   drm/vkms: No _irqsave within spin_lock_irq needed
> > > > > > > > >   drm/vkms: totally reworked crc data tracking
> > > > > > > > >   drm/vkms: No need for ->pages_lock in crc work anymore
> > > > > > > > >
> > > > > > > > >  drivers/gpu/drm/vkms/vkms_crc.c   | 74 +++++++++-------------------
> > > > > > > > >  drivers/gpu/drm/vkms/vkms_crtc.c  | 80 ++++++++++++++++++++++++++-----
> > > > > > > > >  drivers/gpu/drm/vkms/vkms_drv.c   | 35 ++++++++++++++
> > > > > > > > >  drivers/gpu/drm/vkms/vkms_drv.h   | 24 +++++-----
> > > > > > > > >  drivers/gpu/drm/vkms/vkms_plane.c |  8 ++++
> > > > > > > > >  5 files changed, 146 insertions(+), 75 deletions(-)
> > > > > > > > >
> > > > > > > > > --
> > > > > > > > > 2.20.1
> > > > > > > > >
> > > > > > > > > _______________________________________________
> > > > > > > > > dri-devel mailing list
> > > > > > > > > dri-devel@lists.freedesktop.org
> > > > > > > > > https://lists.freedesktop.org/mailman/listinfo/dri-devel
> > > > > > > >
> > > > > > > >
> > > > > > > >
> > > > > > > > --
> > > > > > > >
> > > > > > > > Rodrigo Siqueira
> > > > > > > > https://siqueira.tech
> > > > > > >
> > > > > > > --
> > > > > > > Daniel Vetter
> > > > > > > Software Engineer, Intel Corporation
> > > > > > > http://blog.ffwll.ch
> > > > > >
> > > > > > --
> > > > > > Rodrigo Siqueira
> > > > > > https://siqueira.tech
> > > > >
> > > > > --
> > > > > Daniel Vetter
> > > > > Software Engineer, Intel Corporation
> > > > > http://blog.ffwll.ch
> > > >
> > > >
> > > >
> > > > --
> > > >
> > > > Rodrigo Siqueira
> > > > https://siqueira.tech
> > >
> > >
> > >
> > > --
> > > Daniel Vetter
> > > Software Engineer, Intel Corporation
> > > +41 (0) 79 365 57 48 - http://blog.ffwll.ch
> >
> > --
> > Rodrigo Siqueira
> > https://siqueira.tech
>
>
>
> --
> Daniel Vetter
> Software Engineer, Intel Corporation
> +41 (0) 79 365 57 48 - http://blog.ffwll.ch



-- 

Rodrigo Siqueira
https://siqueira.tech
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH 00/10] drm/vkms: rework crc worker
  2019-06-26  7:54               ` Daniel Vetter
  2019-06-26 13:46                 ` Rodrigo Siqueira
@ 2019-07-01  3:30                 ` Rodrigo Siqueira
  1 sibling, 0 replies; 44+ messages in thread
From: Rodrigo Siqueira @ 2019-07-01  3:30 UTC (permalink / raw)
  To: Daniel Vetter; +Cc: DRI Development


[-- Attachment #1.1: Type: text/plain, Size: 18177 bytes --]

On 06/26, Daniel Vetter wrote:
> On Wed, Jun 26, 2019 at 3:44 AM Rodrigo Siqueira
> <rodrigosiqueiramelo@gmail.com> wrote:
> >
> > On 06/19, Daniel Vetter wrote:
> > > On Tue, Jun 18, 2019 at 11:54 PM Rodrigo Siqueira
> > > <rodrigosiqueiramelo@gmail.com> wrote:
> > > >
> > > > On Tue, Jun 18, 2019 at 5:56 AM Daniel Vetter <daniel@ffwll.ch> wrote:
> > > > >
> > > > > On Mon, Jun 17, 2019 at 11:49:04PM -0300, Rodrigo Siqueira wrote:
> > > > > > On 06/12, Daniel Vetter wrote:
> > > > > > > On Wed, Jun 12, 2019 at 10:28:41AM -0300, Rodrigo Siqueira wrote:
> > > > > > > > Hi Daniel,
> > > > > > > >
> > > > > > > > First of all, thank you very much for your patchset.
> > > > > > > >
> > > > > > > > I tried to make a detailed review of your series, and you can see my
> > > > > > > > comments in each patch. You’ll notice that I asked many things related
> > > > > > > > to the DRM subsystem with the hope that I could learn a little bit
> > > > > > > > more about DRM from your comments.
> > > > > > > >
> > > > > > > > Before you go through the review, I would like to start a discussion
> > > > > > > > about the vkms race conditions. First, I have to admit that I did not
> > > > > > > > understand the race conditions that you described before because I
> > > > > > > > couldn’t reproduce them. Now, I'm suspecting that I could not
> > > > > > > > experience the problem because I'm using QEMU with KVM; with this idea
> > > > > > > > in mind, I suppose that we have two scenarios for using vkms in a
> > > > > > > > virtual machine:
> > > > > > > >
> > > > > > > > * Scenario 1: The user has hardware virtualization support; in this
> > > > > > > > case, it is a little bit harder to experience race conditions with
> > > > > > > > vkms.
> > > > > > > >
> > > > > > > > * Scenario 2: Without hardware virtualization support, it is much
> > > > > > > > easier to experience race conditions.
> > > > > > > >
> > > > > > > > With these two scenarios in mind, I conducted a bunch of experiments
> > > > > > > > to try to shed light on this issue. I did:
> > > > > > > >
> > > > > > > > 1. Enabled lockdep
> > > > > > > >
> > > > > > > > 2. Defined two different environments for testing both using QEMU with
> > > > > > > > and without kvm. See below the QEMU hardware options:
> > > > > > > >
> > > > > > > > * env_kvm: -enable-kvm -daemonize -m 1G -smp cores=2,cpus=2
> > > > > > > > * env_no_kvm: -daemonize -m 2G -smp cores=4,cpus=4
> > > > > > > >
> > > > > > > > 3. My test protocol: I) turn on the vm, II) clean /proc/lock_stat,
> > > > > > > > III) execute kms_cursor_crc, IV) save the lock_stat file, and V)
> > > > > > > > turn off the vm.
> > > > > > > >
> > > > > > > > 4. From the lockdep_stat, I just highlighted the rows related to vkms
> > > > > > > > and the columns holdtime-total and holdtime-avg
> > > > > > > >
> > > > > > > > I would like to highlight that the following data does not have any
> > > > > > > > statistical approach, and its intention is solely to assist our
> > > > > > > > discussion. See below the summary of the collected data:
> > > > > > > >
> > > > > > > > Summary of the experiment results:
> > > > > > > >
> > > > > > > > +----------------+----------------+----------------+
> > > > > > > > |                |     env_kvm    |   env_no_kvm   |
> > > > > > > > +                +----------------+----------------+
> > > > > > > > | Test           | Before | After | Before | After |
> > > > > > > > +----------------+--------+-------+--------+-------+
> > > > > > > > | kms_cursor_crc |   S1   |   S2  |   M1   |   M2  |
> > > > > > > > +----------------+--------+-------+--------+-------+
> > > > > > > >
> > > > > > > > * Before: before applying this patchset
> > > > > > > > * After: after applying this patchset
> > > > > > > >
> > > > > > > > -----------------------------------------+------------------+-----------
> > > > > > > > S1: Without this patchset and with kvm   | holdtime-total | holdtime-avg
> > > > > > > > -----------------------------------------+----------------+-------------
> > > > > > > > &(&vkms_out->lock)->rlock:               |  21983.52      |  6.21
> > > > > > > > (work_completion)(&vkms_state->crc_wo:   |  20.47         |  20.47
> > > > > > > > (wq_completion)vkms_crc_workq:           |  3999507.87    |  3352.48
> > > > > > > > &(&vkms_out->state_lock)->rlock:         |  378.47        |  0.30
> > > > > > > > (work_completion)(&vkms_state->crc_#2:   |  3999066.30    |  2848.34
> > > > > > > > -----------------------------------------+----------------+-------------
> > > > > > > > S2: With this patchset and with kvm      |                |
> > > > > > > > -----------------------------------------+----------------+-------------
> > > > > > > > &(&vkms_out->lock)->rlock:               |  23262.83      |  6.34
> > > > > > > > (work_completion)(&vkms_state->crc_wo:   |  8.98          |  8.98
> > > > > > > > &(&vkms_out->crc_lock)->rlock:           |  307.28        |  0.32
> > > > > > > > (wq_completion)vkms_crc_workq:           |  6567727.05    |  12345.35
> > > > > > > > (work_completion)(&vkms_state->crc_#2:   |  6567135.15    |  4488.81
> > > > > > > > -----------------------------------------+----------------+-------------
> > > > > > > > M1: Without this patchset and without kvm|                |
> > > > > > > > -----------------------------------------+----------------+-------------
> > > > > > > > &(&vkms_out->state_lock)->rlock:         |  4994.72       |  1.61
> > > > > > > > &(&vkms_out->lock)->rlock:               |  247190.04     |  39.39
> > > > > > > > (work_completion)(&vkms_state->crc_wo:   |  31.32         |  31.32
> > > > > > > > (wq_completion)vkms_crc_workq:           |  20991073.78   |  13525.18
> > > > > > > > (work_completion)(&vkms_state->crc_#2:   |  20988347.18   |  11904.90
> > > > > > > > -----------------------------------------+----------------+-------------
> > > > > > > > M2: With this patchset and without kvm   |                |
> > > > > > > > -----------------------------------------+----------------+-------------
> > > > > > > > (wq_completion)vkms_crc_workq:           |  42819161.68   |  36597.57
> > > > > > > > &(&vkms_out->lock)->rlock:               |  251257.06     |  35.80
> > > > > > > > (work_completion)(&vkms_state->crc_wo:   |  69.37         |  69.37
> > > > > > > > &(&vkms_out->crc_lock)->rlock:           |  3620.92       |  1.54
> > > > > > > > (work_completion)(&vkms_state->crc_#2:   |  42803419.59   |  24306.31
> > > > > > > >
> > > > > > > > First, I analyzed the scenarios with KVM (S1 and S2); more
> > > > > > > > specifically, I focused on these two classes:
> > > > > > > >
> > > > > > > > 1. (work_completion)(&vkms_state->crc_wo
> > > > > > > > 2. (work_completion)(&vkms_state->crc_#2
> > > > > > > >
> > > > > > > > After taking a look at the data, it looks like this patchset
> > > > > > > > greatly reduces the hold time average for crc_wo. On the other hand,
> > > > > > > > it increases the hold time average for crc_#2. I didn’t quite
> > > > > > > > understand the reason for the difference. Could you help me here?
> > > > > > >
> > > > > > > So there's two real locks here from our code, the ones you can see as
> > > > > > > spinlocks:
> > > > > > >
> > > > > > > &(&vkms_out->state_lock)->rlock:         |  4994.72       |  1.61
> > > > > > > &(&vkms_out->lock)->rlock:               |  247190.04     |  39.39
> > > > > > >
> > > > > > > All the others are fake locks that the workqueue adds, which only exist in
> > > > > > > lockdep. They are used to catch special kinds of deadlocks like the below:
> > > > > > >
> > > > > > > thread A:
> > > > > > > 1. mutex_lock(mutex_A)
> > > > > > > 2. flush_work(work_A)
> > > > > > >
> > > > > > > thread B
> > > > > > > 1. running work_A:
> > > > > > > 2. mutex_lock(mutex_A)
> > > > > > >
> > > > > > > thread B can't continue because mutex_A is already held by thread A.
> > > > > > > thread A can't continue because thread B is blocked and the work never
> > > > > > > finishes.
> > > > > > > -> deadlock
> > > > > > >
> > > > > > > I haven't checked which is which, but essentially what you measure with
> > > > > > > the hold times of these fake locks is how long a work execution takes on
> > > > > > > average.
> > > > > > >
> > > > > > > Since my patches are supposed to fix races where the worker can't keep up
> > > > > > > with the vblank hrtimer, the average worker will (probably) do more,
> > > > > > > so that going up is expected. I think.
> > > > > > >
> > > > > > > I'm honestly not sure what's going on here, never having looked into
> > > > > > > this in detail.
> > > > > > >
> > > > > > > > When I looked at the second set of scenarios (M1 and M2, both without
> > > > > > > > KVM), the results look much more distant; basically, this patchset
> > > > > > > > increased the hold time average. Again, could you help me understand
> > > > > > > > this issue a little bit better?
> > > > > > > >
> > > > > > > > Finally, after these tests, I have some questions:
> > > > > > > >
> > > > > > > > 1. VKMS is a software-only driver; because of this, how about defining
> > > > > > > > minimal system requirements for using it?
> > > > > > >
> > > > > > > No idea, in reality it's probably "if it fails too often, buy a faster cpu".
> > > > > > > I do think we should make the code robust against a slow cpu, since atm
> > > > > > > that's needed even on pretty fast machines (because our blending code is
> > > > > > > really, really slow and inefficient).
> > > > > > >
> > > > > > > > 2. During my experiments, I noticed that running tests with a VM that
> > > > > > > > uses KVM had consistent results. For example, kms_flip never fails
> > > > > > > > with QEMU+KVM; however, without KVM, two or three tests failed (one of
> > > > > > > > them looks random). If we use vkms for testing DRM stuff, should we
> > > > > > > > recommend the use of KVM?
> > > > > > >
> > > > > > > What do you mean without kvm? In general running without kvm shouldn't be
> > > > > > > slower, so I'm a bit confused ... I'm running vgem directly on the
> > > > > > > machine, by booting into new kernels (and controlling the machine over the
> > > > > > > network).
> > > > > >
> > > > > > Sorry, I should have detailed my point.
> > > > > >
> > > > > > Basically, I do all my testing with vkms in a QEMU VM. In that sense, I
> > > > > > did some experiments in which I enabled and disabled KVM (i.e., the flag
> > > > > > '-enable-kvm') to check the vkms behaviour in these two scenarios. I
> > > > > > noticed that the tests are consistent when I use the '-enable-kvm' flag;
> > > > > > in that context it seemed a good idea to recommend the use of KVM for
> > > > > > those users who want to test vkms with igt. Anyway, don't worry about
> > > > > > it; I'll try to add more documentation for vkms in the future, and at
> > > > > > that time we can discuss this again.
> > > > >
> > > > > Ah, qemu without kvm is going to use software emulation for a lot of the
> > > > > kernel stuff. That's going to be terribly slow indeed.
> > > > >
> > > > > > Anyway, from my side, I think we should merge this series. Do you want
> > > > > > to prepare a V2 with the fixes and maybe update the commit messages by
> > > > > > using some of your explanations? Or, if you want, I can fix the tiny
> > > > > > details and merge it.
> > > > >
> > > > > I'm deeply buried in my prime cleanup/doc series right now, so it will take
> > > > > a few days until I get around to this again. If you want, please go ahead
> > > > > with merging.
> > > > >
> > > > > btw if you edit a patch when merging, please add a comment about that to
> > > > > the commit message. And ime it's best to only augment the commit message
> > > > > and maybe delete an unused variable or so that got forgotten. For
> > > > > anything more, it's better to do the edits and resubmit.
> > > >
> > > > First of all, thank you very much for all your reviews and
> > > > explanation. I’ll try the exercise that you proposed on Patch 1.
> > > >
> > > > I’ll merge patches [4] and [7] since they’re not related to this
> > > > series. The other patches I’ll merge after I finish the new
> > > > version of the writeback series. I’ll ping you later.
> > > >
> >
> > Hi,
> >
> > I already sent the new version of the writeback patchset. So, how do you
> > want to proceed with this series? Do you prefer to send a V2 or should I
> > apply the patchset and make the tiny fixes?
> 
> I don't have r-b tags from you (usually if it's just nits then you can
> do "r-b with those tiny details addressed"), so I can't merge. I
> thought you wanted to merge them, which I've been waiting for you to
> do ... so that's why they're stuck right now. Just decide and we'll move
> them; up to you really.
> -Daniel

Hi Daniel,

I think I forgot to tell you that I already applied this series.

If you have time, could you take a look at the following two series
built on top of this patchset?

1. Writeback: https://patchwork.freedesktop.org/series/61738/
2. Configfs: https://patchwork.freedesktop.org/series/63010/

Thanks
 
> >
> > Thanks
> >
> > > > 4. https://patchwork.freedesktop.org/patch/309031/?series=61737&rev=1
> > > > 7. https://patchwork.freedesktop.org/patch/309029/?series=61737&rev=1
> > >
> > > Can you merge them quicker? I have another 3 vkms patches here
> > > touching that area with some follow-up stuff ...
> > >
> > > > Finally, not related to this patchset, can I apply the patch
> > > > “drm/drm_vblank: Change EINVAL by the correct errno” [1] or do I need
> > > > more SoB? I’ll also apply Oleg's patch (drm/vkms: add crc sources list).
> > > >
> > > > 1. https://patchwork.freedesktop.org/patch/310006/?series=50697&rev=4
> > >
> > > If you want, get some acks from igt maintainers (those patches landed
> > > now, right?), but this is good enough.
> > > -Daniel
> > >
> > >
> > > > Thanks
> > > >
> > > > > Thanks, Daniel
> > > > >
> > > > > >
> > > > > > > -Daniel
> > > > > > >
> > > > > > > > Best Regards
> > > > > > > >
> > > > > > > > On Thu, Jun 6, 2019 at 7:28 PM Daniel Vetter <daniel.vetter@ffwll.ch> wrote:
> > > > > > > > >
> > > > > > > > > Hi all,
> > > > > > > > >
> > > > > > > > > This here is the first part of a rework for the vkms crc worker. I think
> > > > > > > > > this should fix all the locking/races/use-after-free issues I spotted in
> > > > > > > > > the code. There's more work we can do in the future as a follow-up:
> > > > > > > > >
> > > > > > > > > - directly access vkms_plane_state->base in the crc worker, with this
> > > > > > > > >   approach in this series here that should be safe now. Follow-up patches
> > > > > > > > >   could switch and remove all the separate crc_data infrastructure.
> > > > > > > > >
> > > > > > > > > - I think some kerneldoc for vkms structures would be nice. Documenting
> > > > > > > > >   the various functions is probably overkill.
> > > > > > > > >
> > > > > > > > > - Implementing a more generic blending engine, as prep for adding more
> > > > > > > > >   pixel formats, more planes, and more features in general.
> > > > > > > > >
> > > > > > > > > Tested with kms_pipe_crc, kms_flip and kms_cursor_crc. Seems to not make
> > > > > > > > > things worse, but I didn't do a full run.
> > > > > > > > >
> > > > > > > > > Cheers, Daniel
> > > > > > > > >
> > > > > > > > > Daniel Vetter (10):
> > > > > > > > >   drm/vkms: Fix crc worker races
> > > > > > > > >   drm/vkms: Use spin_lock_irq in process context
> > > > > > > > >   drm/vkms: Rename vkms_output.state_lock to crc_lock
> > > > > > > > >   drm/vkms: Move format arrays to vkms_plane.c
> > > > > > > > >   drm/vkms: Add our own commit_tail
> > > > > > > > >   drm/vkms: flush crc workers earlier in commit flow
> > > > > > > > >   drm/vkms: Dont flush crc worker when we change crc status
> > > > > > > > >   drm/vkms: No _irqsave within spin_lock_irq needed
> > > > > > > > >   drm/vkms: totally reworked crc data tracking
> > > > > > > > >   drm/vkms: No need for ->pages_lock in crc work anymore
> > > > > > > > >
> > > > > > > > >  drivers/gpu/drm/vkms/vkms_crc.c   | 74 +++++++++-------------------
> > > > > > > > >  drivers/gpu/drm/vkms/vkms_crtc.c  | 80 ++++++++++++++++++++++++++-----
> > > > > > > > >  drivers/gpu/drm/vkms/vkms_drv.c   | 35 ++++++++++++++
> > > > > > > > >  drivers/gpu/drm/vkms/vkms_drv.h   | 24 +++++-----
> > > > > > > > >  drivers/gpu/drm/vkms/vkms_plane.c |  8 ++++
> > > > > > > > >  5 files changed, 146 insertions(+), 75 deletions(-)
> > > > > > > > >
> > > > > > > > > --
> > > > > > > > > 2.20.1
> > > > > > > > >
> > > > > > > > > _______________________________________________
> > > > > > > > > dri-devel mailing list
> > > > > > > > > dri-devel@lists.freedesktop.org
> > > > > > > > > https://lists.freedesktop.org/mailman/listinfo/dri-devel
> > > > > > > >
> > > > > > > >
> > > > > > > >
> > > > > > > > --
> > > > > > > >
> > > > > > > > Rodrigo Siqueira
> > > > > > > > https://siqueira.tech
> > > > > > >
> > > > > > > --
> > > > > > > Daniel Vetter
> > > > > > > Software Engineer, Intel Corporation
> > > > > > > http://blog.ffwll.ch
> > > > > >
> > > > > > --
> > > > > > Rodrigo Siqueira
> > > > > > https://siqueira.tech
> > > > >
> > > > > --
> > > > > Daniel Vetter
> > > > > Software Engineer, Intel Corporation
> > > > > http://blog.ffwll.ch
> > > >
> > > >
> > > >
> > > > --
> > > >
> > > > Rodrigo Siqueira
> > > > https://siqueira.tech
> > >
> > >
> > >
> > > --
> > > Daniel Vetter
> > > Software Engineer, Intel Corporation
> > > +41 (0) 79 365 57 48 - http://blog.ffwll.ch
> >
> > --
> > Rodrigo Siqueira
> > https://siqueira.tech
> 
> 
> 
> -- 
> Daniel Vetter
> Software Engineer, Intel Corporation
> +41 (0) 79 365 57 48 - http://blog.ffwll.ch

-- 
Rodrigo Siqueira
https://siqueira.tech

[-- Attachment #1.2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

[-- Attachment #2: Type: text/plain, Size: 159 bytes --]

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 44+ messages in thread

end of thread, other threads:[~2019-07-01  3:30 UTC | newest]

Thread overview: 44+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-06-06 22:27 [PATCH 00/10] drm/vkms: rework crc worker Daniel Vetter
2019-06-06 22:27 ` [PATCH 01/10] drm/vkms: Fix crc worker races Daniel Vetter
2019-06-12 13:33   ` Rodrigo Siqueira
2019-06-12 14:48     ` Daniel Vetter
2019-06-18  2:39       ` Rodrigo Siqueira
2019-06-18  8:49         ` Daniel Vetter
2019-06-06 22:27 ` [PATCH 02/10] drm/vkms: Use spin_lock_irq in process context Daniel Vetter
2019-06-12 13:34   ` Rodrigo Siqueira
2019-06-12 14:54     ` Daniel Vetter
2019-06-06 22:27 ` [PATCH 03/10] drm/vkms: Rename vkms_output.state_lock to crc_lock Daniel Vetter
2019-06-12 13:38   ` Rodrigo Siqueira
2019-06-13  7:48     ` Daniel Vetter
2019-06-06 22:27 ` [PATCH 04/10] drm/vkms: Move format arrays to vkms_plane.c Daniel Vetter
2019-06-12 13:39   ` Rodrigo Siqueira
2019-06-19  2:12   ` Rodrigo Siqueira
2019-06-06 22:27 ` [PATCH 05/10] drm/vkms: Add our own commit_tail Daniel Vetter
2019-06-06 22:27 ` [PATCH 06/10] drm/vkms: flush crc workers earlier in commit flow Daniel Vetter
2019-06-12 13:42   ` Rodrigo Siqueira
2019-06-13  7:53     ` Daniel Vetter
2019-06-13  7:55       ` Daniel Vetter
2019-06-18  2:31       ` Rodrigo Siqueira
2019-06-06 22:27 ` [PATCH 07/10] drm/vkms: Dont flush crc worker when we change crc status Daniel Vetter
2019-06-19  2:17   ` Rodrigo Siqueira
2019-06-19  7:47     ` Daniel Vetter
2019-06-06 22:27 ` [PATCH 08/10] drm/vkms: No _irqsave within spin_lock_irq needed Daniel Vetter
2019-06-12 13:43   ` Rodrigo Siqueira
2019-06-06 22:27 ` [PATCH 09/10] drm/vkms: totally reworked crc data tracking Daniel Vetter
2019-06-12 13:46   ` Rodrigo Siqueira
2019-06-13  7:59     ` Daniel Vetter
2019-06-06 22:27 ` [PATCH 10/10] drm/vkms: No need for ->pages_lock in crc work anymore Daniel Vetter
2019-06-12 13:47   ` Rodrigo Siqueira
2019-06-12 13:28 ` [PATCH 00/10] drm/vkms: rework crc worker Rodrigo Siqueira
2019-06-12 14:42   ` Daniel Vetter
2019-06-18  2:49     ` Rodrigo Siqueira
2019-06-18  8:56       ` Daniel Vetter
2019-06-18 21:54         ` Rodrigo Siqueira
2019-06-18 22:06           ` Daniel Vetter
2019-06-18 22:07             ` Daniel Vetter
2019-06-18 22:25               ` Rodrigo Siqueira
2019-06-18 22:39                 ` Daniel Vetter
2019-06-26  1:44             ` Rodrigo Siqueira
2019-06-26  7:54               ` Daniel Vetter
2019-06-26 13:46                 ` Rodrigo Siqueira
2019-07-01  3:30                 ` Rodrigo Siqueira

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.