All of lore.kernel.org
 help / color / mirror / Atom feed
From: Daniel Vetter <daniel@ffwll.ch>
To: Rodrigo Siqueira <rodrigosiqueiramelo@gmail.com>
Cc: Haneen Mohammed <hamohammed.sa@gmail.com>,
	Daniel Vetter <daniel.vetter@ffwll.ch>,
	DRI Development <dri-devel@lists.freedesktop.org>,
	Shayenne Moura <shayenneluzmoura@gmail.com>,
	Daniel Vetter <daniel.vetter@intel.com>
Subject: Re: [PATCH 01/10] drm/vkms: Fix crc worker races
Date: Wed, 12 Jun 2019 16:48:49 +0200	[thread overview]
Message-ID: <20190612144849.GP2458@phenom.ffwll.local> (raw)
In-Reply-To: <CADKXj+66m1ESVadraWWWsU+TWeVA=kuE7cia7sjt2zPTRdv11Q@mail.gmail.com>

On Wed, Jun 12, 2019 at 10:33:11AM -0300, Rodrigo Siqueira wrote:
> On Thu, Jun 6, 2019 at 7:28 PM Daniel Vetter <daniel.vetter@ffwll.ch> wrote:
> >
> > The issue we have is that the crc worker might fall behind. We've
> > tried to handle this by tracking both the earliest frame for which it
> > still needs to compute a crc, and the last one. Plus when the
> > crtc_state changes, we have a new work item, which are all run in
> > order due to the ordered workqueue we allocate for each vkms crtc.
> >
> > Trouble is there's been a few small issues in the current code:
> > - we need to capture frame_end in the vblank hrtimer, not in the
> >   worker. The worker might run much later, and then we generate a lot
> >   of crc for which there's already a different worker queued up.
> > - frame number might be 0, so create a new crc_pending boolean to
> >   track this without confusion.
> > - we need to atomically grab frame_start/end and clear it, so do that
> >   all in one go. This is not going to create a new race, because if we
> >   race with the hrtimer then our work will be re-run.
> > - only race that can happen is the following:
> >   1. worker starts
> >   2. hrtimer runs and updates frame_end
> >   3. worker grabs frame_start/end, already reading the new frame_end,
> >   and clears crc_pending
> >   4. hrtimer calls queue_work()
> >   5. worker completes
> >   6. worker gets  re-run, crc_pending is false
> >   Explain this case a bit better by rewording the comment.
> >
> > v2: Demote warning level output to debug when we fail to requeue, this
> > is expected under high load when the crc worker can't quite keep up.
> >
> > Cc: Shayenne Moura <shayenneluzmoura@gmail.com>
> > Cc: Rodrigo Siqueira <rodrigosiqueiramelo@gmail.com>
> > Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
> > Cc: Haneen Mohammed <hamohammed.sa@gmail.com>
> > Cc: Daniel Vetter <daniel@ffwll.ch>
> > ---
> >  drivers/gpu/drm/vkms/vkms_crc.c  | 27 +++++++++++++--------------
> >  drivers/gpu/drm/vkms/vkms_crtc.c |  9 +++++++--
> >  drivers/gpu/drm/vkms/vkms_drv.h  |  2 ++
> >  3 files changed, 22 insertions(+), 16 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/vkms/vkms_crc.c b/drivers/gpu/drm/vkms/vkms_crc.c
> > index d7b409a3c0f8..66603da634fe 100644
> > --- a/drivers/gpu/drm/vkms/vkms_crc.c
> > +++ b/drivers/gpu/drm/vkms/vkms_crc.c
> > @@ -166,16 +166,24 @@ void vkms_crc_work_handle(struct work_struct *work)
> >         struct drm_plane *plane;
> >         u32 crc32 = 0;
> >         u64 frame_start, frame_end;
> > +       bool crc_pending;
> >         unsigned long flags;
> >
> >         spin_lock_irqsave(&out->state_lock, flags);
> >         frame_start = crtc_state->frame_start;
> >         frame_end = crtc_state->frame_end;
> > +       crc_pending = crtc_state->crc_pending;
> > +       crtc_state->frame_start = 0;
> > +       crtc_state->frame_end = 0;
> > +       crtc_state->crc_pending = false;
> >         spin_unlock_irqrestore(&out->state_lock, flags);
> >
> > -       /* _vblank_handle() hasn't updated frame_start yet */
> > -       if (!frame_start || frame_start == frame_end)
> > -               goto out;
> > +       /*
> > +        * We raced with the vblank hrtimer and previous work already computed
> > +        * the crc, nothing to do.
> > +        */
> > +       if (!crc_pending)
> > +               return;
> 
> I think this condition is not reachable because crc_pending will be
> filled with true in `vkms_vblank_simulate()` which in turn schedule
> the function `vkms_crc_work_handle()`. Just for checking, I tried to
> reach this condition by running kms_flip, kms_pipe_crc_basic, and
> kms_cursor_crc with two different VM setups[1], but I couldn't reach
> it. What do you think?

thread A			thread B
1. run vblank hrtimer

				2. starts running crc work (from previous
				vblank)

3. spin_lock()			-> gets stalled on the spin_lock() because
				   thread A has it already

4. update frame_end (only in
later patches, atm this is
impossible). crc_pending is set
already.

5. schedule_work: since the work
is running already, this means it
is scheduled to run once more.

6. spin_unlock

				7. compute crc, clear crc_pending
				8. work finishes
				9. work gets run again
				8. crc_pending=false

Since the spin_lock critical section is _very_ short (less than 1 usec I
bet), this race is very hard to hit.

Exercise: Figure out why schedule_work _must_ schedule the work item to
re-run if it's running already. If it doesn't do that there's another
race.

> 
> [1] Qemu parameters
> VM1: -m 1G -smp cores=2,cpus=2
> VM2: -enable-kvm -m 2G -smp cores=4,cpus=4
> 
> >         drm_for_each_plane(plane, &vdev->drm) {
> >                 struct vkms_plane_state *vplane_state;
> > @@ -196,20 +204,11 @@ void vkms_crc_work_handle(struct work_struct *work)
> >         if (primary_crc)
> >                 crc32 = _vkms_get_crc(primary_crc, cursor_crc);
> >
> > -       frame_end = drm_crtc_accurate_vblank_count(crtc);
> > -
> > -       /* queue_work can fail to schedule crc_work; add crc for
> > -        * missing frames
> > +       /*
> > +        * The worker can fall behind the vblank hrtimer, make sure we catch up.
> >          */
> >         while (frame_start <= frame_end)
> >                 drm_crtc_add_crc_entry(crtc, true, frame_start++, &crc32);
> 
> I want to take this opportunity to ask about this while; It's not
> really specific to this patch.
> 
> I have to admit that I never fully got the idea behind this 'while';
> it looks like that we just fill out the missed frames with a repeated
> value. FWIU, `drm_crtc_add_crc_entry()` will add an entry with the CRC
> information for a frame, but in this case, we are adding the same CRC
> for a different set of frames. I agree that near frame has a similar
> CRC value, but could we rely on this all the time? What could happen
> if we have a great difference from the frame_start and frame_end?

It's a cheap trick for slow cpu: If the crc work gets behind the vblank
hrtimer, we need to somehow catch up. With real hw this is not possible,
but with vkms we simulate the hw. The only quick way to catch up is to
fill out the same crc for everything. It's a lie, it will make some
kms_atomic tests fail, but it's the only thing we can really do. Aside
from trying to make the crc computation code as fast as possible.
-Daniel

> 
> > -
> > -out:
> > -       /* to avoid using the same value for frame number again */
> > -       spin_lock_irqsave(&out->state_lock, flags);
> > -       crtc_state->frame_end = frame_end;
> > -       crtc_state->frame_start = 0;
> > -       spin_unlock_irqrestore(&out->state_lock, flags);
> >  }
> >
> >  static int vkms_crc_parse_source(const char *src_name, bool *enabled)
> > diff --git a/drivers/gpu/drm/vkms/vkms_crtc.c b/drivers/gpu/drm/vkms/vkms_crtc.c
> > index 1bbe099b7db8..c727d8486e97 100644
> > --- a/drivers/gpu/drm/vkms/vkms_crtc.c
> > +++ b/drivers/gpu/drm/vkms/vkms_crtc.c
> > @@ -30,13 +30,18 @@ static enum hrtimer_restart vkms_vblank_simulate(struct hrtimer *timer)
> >                  * has read the data
> >                  */
> >                 spin_lock(&output->state_lock);
> > -               if (!state->frame_start)
> > +               if (!state->crc_pending)
> >                         state->frame_start = frame;
> > +               else
> > +                       DRM_DEBUG_DRIVER("crc worker falling behind, frame_start: %llu, frame_end: %llu\n",
> > +                                        state->frame_start, frame);
> > +               state->frame_end = frame;
> > +               state->crc_pending = true;
> >                 spin_unlock(&output->state_lock);
> >
> >                 ret = queue_work(output->crc_workq, &state->crc_work);
> >                 if (!ret)
> > -                       DRM_WARN("failed to queue vkms_crc_work_handle");
> > +                       DRM_DEBUG_DRIVER("vkms_crc_work_handle already queued\n");
> >         }
> >
> >         spin_unlock(&output->lock);
> > diff --git a/drivers/gpu/drm/vkms/vkms_drv.h b/drivers/gpu/drm/vkms/vkms_drv.h
> > index 81f1cfbeb936..3c7e06b19efd 100644
> > --- a/drivers/gpu/drm/vkms/vkms_drv.h
> > +++ b/drivers/gpu/drm/vkms/vkms_drv.h
> > @@ -56,6 +56,8 @@ struct vkms_plane_state {
> >  struct vkms_crtc_state {
> >         struct drm_crtc_state base;
> >         struct work_struct crc_work;
> > +
> > +       bool crc_pending;
> >         u64 frame_start;
> >         u64 frame_end;
> >  };
> > --
> > 2.20.1
> >
> 
> 
> -- 
> 
> Rodrigo Siqueira
> https://siqueira.tech

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

  reply	other threads:[~2019-06-12 14:48 UTC|newest]

Thread overview: 44+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-06-06 22:27 [PATCH 00/10] drm/vkms: rework crc worker Daniel Vetter
2019-06-06 22:27 ` [PATCH 01/10] drm/vkms: Fix crc worker races Daniel Vetter
2019-06-12 13:33   ` Rodrigo Siqueira
2019-06-12 14:48     ` Daniel Vetter [this message]
2019-06-18  2:39       ` Rodrigo Siqueira
2019-06-18  8:49         ` Daniel Vetter
2019-06-06 22:27 ` [PATCH 02/10] drm/vkms: Use spin_lock_irq in process context Daniel Vetter
2019-06-12 13:34   ` Rodrigo Siqueira
2019-06-12 14:54     ` Daniel Vetter
2019-06-06 22:27 ` [PATCH 03/10] drm/vkms: Rename vkms_output.state_lock to crc_lock Daniel Vetter
2019-06-12 13:38   ` Rodrigo Siqueira
2019-06-13  7:48     ` Daniel Vetter
2019-06-06 22:27 ` [PATCH 04/10] drm/vkms: Move format arrays to vkms_plane.c Daniel Vetter
2019-06-12 13:39   ` Rodrigo Siqueira
2019-06-19  2:12   ` Rodrigo Siqueira
2019-06-06 22:27 ` [PATCH 05/10] drm/vkms: Add our own commit_tail Daniel Vetter
2019-06-06 22:27 ` [PATCH 06/10] drm/vkms: flush crc workers earlier in commit flow Daniel Vetter
2019-06-12 13:42   ` Rodrigo Siqueira
2019-06-13  7:53     ` Daniel Vetter
2019-06-13  7:55       ` Daniel Vetter
2019-06-18  2:31       ` Rodrigo Siqueira
2019-06-06 22:27 ` [PATCH 07/10] drm/vkms: Dont flush crc worker when we change crc status Daniel Vetter
2019-06-19  2:17   ` Rodrigo Siqueira
2019-06-19  7:47     ` Daniel Vetter
2019-06-06 22:27 ` [PATCH 08/10] drm/vkms: No _irqsave within spin_lock_irq needed Daniel Vetter
2019-06-12 13:43   ` Rodrigo Siqueira
2019-06-06 22:27 ` [PATCH 09/10] drm/vkms: totally reworked crc data tracking Daniel Vetter
2019-06-12 13:46   ` Rodrigo Siqueira
2019-06-13  7:59     ` Daniel Vetter
2019-06-06 22:27 ` [PATCH 10/10] drm/vkms: No need for ->pages_lock in crc work anymore Daniel Vetter
2019-06-12 13:47   ` Rodrigo Siqueira
2019-06-12 13:28 ` [PATCH 00/10] drm/vkms: rework crc worker Rodrigo Siqueira
2019-06-12 14:42   ` Daniel Vetter
2019-06-18  2:49     ` Rodrigo Siqueira
2019-06-18  8:56       ` Daniel Vetter
2019-06-18 21:54         ` Rodrigo Siqueira
2019-06-18 22:06           ` Daniel Vetter
2019-06-18 22:07             ` Daniel Vetter
2019-06-18 22:25               ` Rodrigo Siqueira
2019-06-18 22:39                 ` Daniel Vetter
2019-06-26  1:44             ` Rodrigo Siqueira
2019-06-26  7:54               ` Daniel Vetter
2019-06-26 13:46                 ` Rodrigo Siqueira
2019-07-01  3:30                 ` Rodrigo Siqueira

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20190612144849.GP2458@phenom.ffwll.local \
    --to=daniel@ffwll.ch \
    --cc=daniel.vetter@ffwll.ch \
    --cc=daniel.vetter@intel.com \
    --cc=dri-devel@lists.freedesktop.org \
    --cc=hamohammed.sa@gmail.com \
    --cc=rodrigosiqueiramelo@gmail.com \
    --cc=shayenneluzmoura@gmail.com \
    --subject='Re: [PATCH 01/10] drm/vkms: Fix crc worker races' \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.