From: Daniel Vetter <daniel@ffwll.ch>
To: Rodrigo Siqueira <rodrigosiqueiramelo@gmail.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>,
	DRI Development <dri-devel@lists.freedesktop.org>
Subject: Re: [PATCH 00/10] drm/vkms: rework crc worker
Date: Wed, 12 Jun 2019 16:42:20 +0200	[thread overview]
Message-ID: <20190612144220.GO2458@phenom.ffwll.local> (raw)
In-Reply-To: <CADKXj+4MOQv1KZqPG=SpwsynLRBoFVdNXKt1qXZwxUFZVAYybQ@mail.gmail.com>

On Wed, Jun 12, 2019 at 10:28:41AM -0300, Rodrigo Siqueira wrote:
> Hi Daniel,
> 
> First of all, thank you very much for your patchset.
> 
> I tried to make a detailed review of your series, and you can see my
> comments on each patch. You'll notice that I asked many questions about
> the DRM subsystem, in the hope of learning a little bit more about DRM
> from your comments.
> 
> Before you go through the review, I would like to start a discussion
> about the vkms race conditions. First, I have to admit that I did not
> understand the race conditions you described before, because I could
> not reproduce them. Now I suspect that I did not hit the problem
> because I am using QEMU with KVM; with this idea in mind, I suppose we
> have two scenarios for using vkms in a virtual machine:
> 
> * Scenario 1: The user has hardware virtualization support; in this
> case, it is a little bit harder to experience race conditions with
> vkms.
> 
> * Scenario 2: Without hardware virtualization support, it is much
> easier to experience race conditions.
> 
> With these two scenarios in mind, I conducted a series of experiments
> to try to shed light on this issue. I did the following:
> 
> 1. Enabled lockdep
> 
> 2. Defined two test environments, both using QEMU, one with and one
> without KVM. See the QEMU hardware options below:
> 
> * env_kvm: -enable-kvm -daemonize -m 1G -smp cores=2,cpus=2
> * env_no_kvm: -daemonize -m 2G -smp cores=4,cpus=4
> 
> 3. My test protocol: I) turn on the VM, II) clean /proc/lock_stat,
> III) execute kms_cursor_crc, IV) save the lock_stat file, and V)
> turn off the VM (see the sketch after this list).
> 
> 4. From the lock_stat output, I highlighted only the rows related to
> vkms and the columns holdtime-total and holdtime-avg.
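> 
> In shell form, the protocol would look roughly like this (a sketch
> only: the QEMU disk image name and the IGT invocation are assumptions
> on my side; the lock_stat steps follow the kernel's lock_stat
> documentation):
> 
>   qemu-system-x86_64 -enable-kvm -daemonize -m 1G \
>       -smp cores=2,cpus=2 -hda drm-test.qcow2  # I) turn on the VM
>   # inside the VM:
>   echo 0 > /proc/lock_stat                     # II) clean the statistics
>   ./kms_cursor_crc                             # III) execute the IGT test
>   cp /proc/lock_stat lock_stat.log             # IV) save the lock_stat file
>   # V) turn off the VM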
> 
> I would like to highlight that the following data was not collected
> with any statistical rigor; its intention is solely to assist our
> discussion. See below the summary of the collected data:
> 
> Summary of the experiment results:
> 
> +----------------+----------------+----------------+
> |                |     env_kvm    |   env_no_kvm   |
> +                +----------------+----------------+
> | Test           | Before | After | Before | After |
> +----------------+--------+-------+--------+-------+
> | kms_cursor_crc |   S1   |   S2  |   M1   |   M2  |
> +----------------+--------+-------+--------+-------+
> 
> * Before: before applying this patchset
> * After: after applying this patchset
> 
> -----------------------------------------+----------------+-------------
> S1: Without this patchset and with kvm   | holdtime-total | holdtime-avg
> -----------------------------------------+----------------+-------------
> &(&vkms_out->lock)->rlock:               |  21983.52      |  6.21
> (work_completion)(&vkms_state->crc_wo:   |  20.47         |  20.47
> (wq_completion)vkms_crc_workq:           |  3999507.87    |  3352.48
> &(&vkms_out->state_lock)->rlock:         |  378.47        |  0.30
> (work_completion)(&vkms_state->crc_#2:   |  3999066.30    |  2848.34
> -----------------------------------------+----------------+-------------
> S2: With this patchset and with kvm      |                |
> -----------------------------------------+----------------+-------------
> &(&vkms_out->lock)->rlock:               |  23262.83      |  6.34
> (work_completion)(&vkms_state->crc_wo:   |  8.98          |  8.98
> &(&vkms_out->crc_lock)->rlock:           |  307.28        |  0.32
> (wq_completion)vkms_crc_workq:           |  6567727.05    |  12345.35
> (work_completion)(&vkms_state->crc_#2:   |  6567135.15    |  4488.81
> -----------------------------------------+----------------+-------------
> M1: Without this patchset and without kvm|                |
> -----------------------------------------+----------------+-------------
> &(&vkms_out->state_lock)->rlock:         |  4994.72       |  1.61
> &(&vkms_out->lock)->rlock:               |  247190.04     |  39.39
> (work_completion)(&vkms_state->crc_wo:   |  31.32         |  31.32
> (wq_completion)vkms_crc_workq:           |  20991073.78   |  13525.18
> (work_completion)(&vkms_state->crc_#2:   |  20988347.18   |  11904.90
> -----------------------------------------+----------------+-------------
> M2: With this patchset and without kvm   |                |
> -----------------------------------------+----------------+-------------
> (wq_completion)vkms_crc_workq:           |  42819161.68   |  36597.57
> &(&vkms_out->lock)->rlock:               |  251257.06     |  35.80
> (work_completion)(&vkms_state->crc_wo:   |  69.37         |  69.37
> &(&vkms_out->crc_lock)->rlock:           |  3620.92       |  1.54
> (work_completion)(&vkms_state->crc_#2:   |  42803419.59   |  24306.31
> 
> First, I analyzed the scenarios with KVM (S1 and S2); more
> specifically, I focused on these two classes:
> 
> 1. (work_completion)(&vkms_state->crc_wo
> 2. (work_completion)(&vkms_state->crc_#2
> 
> After taking a look at the data, it looks like this patchset
> greatly reduces the hold time average for crc_wo. On the other hand,
> it increases the hold time average for crc_#2. I did not quite
> understand the reason for this difference. Could you help me here?

So there are two real locks here from our code, the ones you can see as
spinlocks:

&(&vkms_out->state_lock)->rlock:         |  4994.72       |  1.61
&(&vkms_out->lock)->rlock:               |  247190.04     |  39.39

All the others are fake locks that the workqueue code adds, which only
exist in lockdep. They are used to catch special kinds of deadlocks like
the one below:

thread A:
1. mutex_lock(mutex_A)
2. flush_work(work_A)

thread B:
1. runs work_A
2. mutex_lock(mutex_A)

thread B can't continue because mutex_A is already held by thread A.
thread A can't continue because thread B is blocked and the work never
finishes.
-> deadlock
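
In code form, the pattern looks roughly like this (a minimal sketch, not
actual vkms code; mutex_A, work_A and work_A_fn are made-up names, and
work_A is assumed to be set up with INIT_WORK(&work_A, work_A_fn)):

#include <linux/mutex.h>
#include <linux/workqueue.h>

static DEFINE_MUTEX(mutex_A);
static struct work_struct work_A;

/* runs on the workqueue (thread B) */
static void work_A_fn(struct work_struct *work)
{
	mutex_lock(&mutex_A);	/* blocks: thread A already holds it */
	/* ... */
	mutex_unlock(&mutex_A);
}

/* runs in process context (thread A) */
static void thread_A_path(void)
{
	mutex_lock(&mutex_A);
	flush_work(&work_A);	/* waits for work_A_fn, which never finishes */
	mutex_unlock(&mutex_A);
}

lockdep models flush_work() and the work execution as acquisitions of the
work's fake lock, so it can see the mutex_A -> work_A -> mutex_A cycle
and complain even on runs where the deadlock doesn't actually fire.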

I haven't checked which is which, but essentially what you measure with
the hold times of these fake locks is how long a work execution takes on
average.
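
As a rough sanity check (assuming lock_stat's usual microsecond units):
in S1, 3999066.30 / 2848.34 ≈ 1404, i.e. about 1400 executions of the
crc work, each taking ~2.8 ms on average.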

Since my patches are supposed to fix races where the worker can't keep up
with the vblank hrtimer, the average work execution will (probably) do
more, so that number going up is expected. I think.

I'm honestly not sure what's going on here; I've never looked into this
in detail.

> When I looked at the second set of scenarios (M1 and M2, both without
> KVM), the results diverge much more; basically, this patchset
> increased the hold time average. Again, could you help me understand
> this issue a little bit better?
> 
> Finally, after these tests, I have some questions:
> 
> 1. VKMS is a software-only driver; because of this, how about defining
> minimal system requirements for using it?

No idea, in reality it's probably "if it fails too often, buy a faster
cpu". I do think we should make the code robust against a slow cpu, since
atm that's needed even on pretty fast machines (because our blending code
is really, really slow and inefficient).

> 2. During my experiments, I noticed that running tests in a VM that
> uses KVM gives consistent results. For example, kms_flip never fails
> with QEMU+KVM; however, without KVM, two or three tests failed (one of
> them looks random). If we use vkms for testing DRM stuff, should we
> recommend the use of KVM?

What do you mean without kvm? In general running without kvm shouldn't be
slower, so I'm a bit confused ... I'm running vkms directly on the
machine, by booting into new kernels (and controlling the machine over the
network).
-Daniel

> Best Regards
> 
> On Thu, Jun 6, 2019 at 7:28 PM Daniel Vetter <daniel.vetter@ffwll.ch> wrote:
> >
> > Hi all,
> >
> > This here is the first part of a rework for the vkms crc worker. I think
> > this should fix all the locking/races/use-after-free issues I spotted in
> > the code. There's more work we can do in the future as a follow-up:
> >
> > - directly access vkms_plane_state->base in the crc worker; with the
> >   approach in this series that should be safe now. Follow-up patches
> >   could switch over and remove all the separate crc_data infrastructure.
> >
> > - I think some kerneldoc for the vkms structures would be nice.
> >   Documenting the various functions is probably overkill.
> >
> > - Implementing a more generic blending engine, as prep for adding more
> >   pixel formats, more planes, and more features in general.
> >
> > Tested with kms_pipe_crc, kms_flip and kms_cursor_crc. Seems to not make
> > things worse, but I didn't do a full run.
> >
> > Cheers, Daniel
> >
> > Daniel Vetter (10):
> >   drm/vkms: Fix crc worker races
> >   drm/vkms: Use spin_lock_irq in process context
> >   drm/vkms: Rename vkms_output.state_lock to crc_lock
> >   drm/vkms: Move format arrays to vkms_plane.c
> >   drm/vkms: Add our own commit_tail
> >   drm/vkms: flush crc workers earlier in commit flow
> >   drm/vkms: Dont flush crc worker when we change crc status
> >   drm/vkms: No _irqsave within spin_lock_irq needed
> >   drm/vkms: totally reworked crc data tracking
> >   drm/vkms: No need for ->pages_lock in crc work anymore
> >
> >  drivers/gpu/drm/vkms/vkms_crc.c   | 74 +++++++++-------------------
> >  drivers/gpu/drm/vkms/vkms_crtc.c  | 80 ++++++++++++++++++++++++++-----
> >  drivers/gpu/drm/vkms/vkms_drv.c   | 35 ++++++++++++++
> >  drivers/gpu/drm/vkms/vkms_drv.h   | 24 +++++-----
> >  drivers/gpu/drm/vkms/vkms_plane.c |  8 ++++
> >  5 files changed, 146 insertions(+), 75 deletions(-)
> >
> > --
> > 2.20.1
> >
> 
> 
> 
> -- 
> 
> Rodrigo Siqueira
> https://siqueira.tech

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
