From: Rodrigo Siqueira <rodrigosiqueiramelo@gmail.com>
To: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Shayenne Moura <shayenneluzmoura@gmail.com>,
	Haneen Mohammed <hamohammed.sa@gmail.com>,
	DRI Development <dri-devel@lists.freedesktop.org>,
	Daniel Vetter <daniel.vetter@intel.com>
Subject: Re: [PATCH 01/10] drm/vkms: Fix crc worker races
Date: Wed, 12 Jun 2019 10:33:11 -0300
Message-ID: <CADKXj+66m1ESVadraWWWsU+TWeVA=kuE7cia7sjt2zPTRdv11Q@mail.gmail.com>
In-Reply-To: <20190606222751.32567-2-daniel.vetter@ffwll.ch>

On Thu, Jun 6, 2019 at 7:28 PM Daniel Vetter <daniel.vetter@ffwll.ch> wrote:
>
> The issue we have is that the crc worker might fall behind. We've
> tried to handle this by tracking both the earliest frame for which it
> still needs to compute a crc, and the last one. Plus when the
> crtc_state changes, we have a new work item, which are all run in
> order due to the ordered workqueue we allocate for each vkms crtc.
>
> Trouble is there's been a few small issues in the current code:
> - we need to capture frame_end in the vblank hrtimer, not in the
>   worker. The worker might run much later, and then we generate a lot
>   of crc for which there's already a different worker queued up.
> - frame number might be 0, so create a new crc_pending boolean to
>   track this without confusion.
> - we need to atomically grab frame_start/end and clear it, so do that
>   all in one go. This is not going to create a new race, because if we
>   race with the hrtimer then our work will be re-run.
> - only race that can happen is the following:
>   1. worker starts
>   2. hrtimer runs and updates frame_end
>   3. worker grabs frame_start/end, already reading the new frame_end,
>   and clears crc_pending
>   4. hrtimer calls queue_work()
>   5. worker completes
>   6. worker gets  re-run, crc_pending is false
>   Explain this case a bit better by rewording the comment.
>
> v2: Demote warning level output to debug when we fail to requeue, this
> is expected under high load when the crc worker can't quite keep up.
>
> Cc: Shayenne Moura <shayenneluzmoura@gmail.com>
> Cc: Rodrigo Siqueira <rodrigosiqueiramelo@gmail.com>
> Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
> Cc: Haneen Mohammed <hamohammed.sa@gmail.com>
> Cc: Daniel Vetter <daniel@ffwll.ch>
> ---
>  drivers/gpu/drm/vkms/vkms_crc.c  | 27 +++++++++++++--------------
>  drivers/gpu/drm/vkms/vkms_crtc.c |  9 +++++++--
>  drivers/gpu/drm/vkms/vkms_drv.h  |  2 ++
>  3 files changed, 22 insertions(+), 16 deletions(-)
>
> diff --git a/drivers/gpu/drm/vkms/vkms_crc.c b/drivers/gpu/drm/vkms/vkms_crc.c
> index d7b409a3c0f8..66603da634fe 100644
> --- a/drivers/gpu/drm/vkms/vkms_crc.c
> +++ b/drivers/gpu/drm/vkms/vkms_crc.c
> @@ -166,16 +166,24 @@ void vkms_crc_work_handle(struct work_struct *work)
>         struct drm_plane *plane;
>         u32 crc32 = 0;
>         u64 frame_start, frame_end;
> +       bool crc_pending;
>         unsigned long flags;
>
>         spin_lock_irqsave(&out->state_lock, flags);
>         frame_start = crtc_state->frame_start;
>         frame_end = crtc_state->frame_end;
> +       crc_pending = crtc_state->crc_pending;
> +       crtc_state->frame_start = 0;
> +       crtc_state->frame_end = 0;
> +       crtc_state->crc_pending = false;
>         spin_unlock_irqrestore(&out->state_lock, flags);
>
> -       /* _vblank_handle() hasn't updated frame_start yet */
> -       if (!frame_start || frame_start == frame_end)
> -               goto out;
> +       /*
> +        * We raced with the vblank hrtimer and previous work already computed
> +        * the crc, nothing to do.
> +        */
> +       if (!crc_pending)
> +               return;

I think this condition is not reachable in practice, because
crc_pending is set to true in `vkms_vblank_simulate()`, which in turn
schedules `vkms_crc_work_handle()`. Just to check, I tried to hit this
condition by running kms_flip, kms_pipe_crc_basic, and kms_cursor_crc
with two different VM setups[1], but I never reached it; a sketch that
forces the interleaving by hand follows the setup notes below. What do
you think?

[1] Qemu parameters
VM1: -m 1G -smp cores=2,cpus=2
VM2: -enable-kvm -m 2G -smp cores=4,cpus=4
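
Here is the sketch mentioned above: a deterministic userspace
analogue, not the kernel code, where crc_pending, state_lock and
worker() only mirror their vkms counterparts and the interleaving
from the commit message is forced by hand. The real window (between
the worker's grab-and-clear and the hrtimer's queue_work()) is tiny,
which may be why the IGT runs above never hit it.

/* Build with: gcc -pthread race-sketch.c */
#include <pthread.h>
#include <stdbool.h>
#include <stdio.h>

static pthread_mutex_t state_lock = PTHREAD_MUTEX_INITIALIZER;
static bool crc_pending;

static void worker(void)
{
	bool pending;

	pthread_mutex_lock(&state_lock);
	pending = crc_pending;
	crc_pending = false;		/* grab-and-clear, as in the patch */
	pthread_mutex_unlock(&state_lock);

	if (!pending) {
		puts("re-run with nothing pending");  /* the early return */
		return;
	}
	puts("computing crc");
}

int main(void)
{
	pthread_mutex_lock(&state_lock);
	crc_pending = true;	/* 2. hrtimer updates frame_end, sets pending */
	pthread_mutex_unlock(&state_lock);

	worker();		/* 1./3./5. worker grabs, clears, finishes */
	/* 4. the hrtimer called queue_work() while the worker ran, */
	worker();		/* 6. so the re-run sees crc_pending == false */
	return 0;
}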

>         drm_for_each_plane(plane, &vdev->drm) {
>                 struct vkms_plane_state *vplane_state;
> @@ -196,20 +204,11 @@ void vkms_crc_work_handle(struct work_struct *work)
>         if (primary_crc)
>                 crc32 = _vkms_get_crc(primary_crc, cursor_crc);
>
> -       frame_end = drm_crtc_accurate_vblank_count(crtc);
> -
> -       /* queue_work can fail to schedule crc_work; add crc for
> -        * missing frames
> +       /*
> +        * The worker can fall behind the vblank hrtimer, make sure we catch up.
>          */
>         while (frame_start <= frame_end)
>                 drm_crtc_add_crc_entry(crtc, true, frame_start++, &crc32);

I want to take this opportunity to ask about this while loop; it is
not really specific to this patch.

I have to admit that I never fully got the idea behind this 'while':
it looks like we just fill in the missed frames with a repeated value.
FWIU, `drm_crtc_add_crc_entry()` adds an entry with the CRC
information for a single frame, but here we add the same CRC for a
whole range of frames. I agree that nearby frames are likely to have
similar CRCs, but can we rely on that all the time? What happens when
there is a large gap between frame_start and frame_end?
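
To make the question concrete, here is a minimal userspace sketch of
what the catch-up loop reports; add_crc_entry() is a hypothetical
stand-in for drm_crtc_add_crc_entry(), and the frame numbers and CRC
value are made up:

#include <stdint.h>
#include <stdio.h>

/* Stand-in: the real drm_crtc_add_crc_entry() records one CRC entry
 * per frame in the crtc's crc queue. */
static void add_crc_entry(uint64_t frame, uint32_t crc)
{
	printf("frame %llu: crc %08x\n", (unsigned long long)frame, crc);
}

int main(void)
{
	uint64_t frame_start = 100, frame_end = 105;
	uint32_t crc32 = 0xdeadbeef;	/* crc of the latest composition only */

	/* Same catch-up loop as in the patch: */
	while (frame_start <= frame_end)
		add_crc_entry(frame_start++, crc32);
	return 0;
}

All six entries carry the identical CRC, even if the scanout actually
changed somewhere between those frames; that is exactly the case I am
unsure about.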

> -
> -out:
> -       /* to avoid using the same value for frame number again */
> -       spin_lock_irqsave(&out->state_lock, flags);
> -       crtc_state->frame_end = frame_end;
> -       crtc_state->frame_start = 0;
> -       spin_unlock_irqrestore(&out->state_lock, flags);
>  }
>
>  static int vkms_crc_parse_source(const char *src_name, bool *enabled)
> diff --git a/drivers/gpu/drm/vkms/vkms_crtc.c b/drivers/gpu/drm/vkms/vkms_crtc.c
> index 1bbe099b7db8..c727d8486e97 100644
> --- a/drivers/gpu/drm/vkms/vkms_crtc.c
> +++ b/drivers/gpu/drm/vkms/vkms_crtc.c
> @@ -30,13 +30,18 @@ static enum hrtimer_restart vkms_vblank_simulate(struct hrtimer *timer)
>                  * has read the data
>                  */
>                 spin_lock(&output->state_lock);
> -               if (!state->frame_start)
> +               if (!state->crc_pending)
>                         state->frame_start = frame;
> +               else
> +                       DRM_DEBUG_DRIVER("crc worker falling behind, frame_start: %llu, frame_end: %llu\n",
> +                                        state->frame_start, frame);
> +               state->frame_end = frame;
> +               state->crc_pending = true;
>                 spin_unlock(&output->state_lock);
>
>                 ret = queue_work(output->crc_workq, &state->crc_work);
>                 if (!ret)
> -                       DRM_WARN("failed to queue vkms_crc_work_handle");
> +                       DRM_DEBUG_DRIVER("vkms_crc_work_handle already queued\n");
>         }
>
>         spin_unlock(&output->lock);
> diff --git a/drivers/gpu/drm/vkms/vkms_drv.h b/drivers/gpu/drm/vkms/vkms_drv.h
> index 81f1cfbeb936..3c7e06b19efd 100644
> --- a/drivers/gpu/drm/vkms/vkms_drv.h
> +++ b/drivers/gpu/drm/vkms/vkms_drv.h
> @@ -56,6 +56,8 @@ struct vkms_plane_state {
>  struct vkms_crtc_state {
>         struct drm_crtc_state base;
>         struct work_struct crc_work;
> +
> +       bool crc_pending;
>         u64 frame_start;
>         u64 frame_end;
>  };
> --
> 2.20.1
>


-- 
Rodrigo Siqueira
https://siqueira.tech
