From: Daniel Vetter <daniel@ffwll.ch>
To: "Ville Syrjälä" <ville.syrjala@linux.intel.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>,
	intel-gfx@lists.freedesktop.org,
	Matthew Auld <matthew.auld@intel.com>,
	dri-devel@lists.freedesktop.org
Subject: Re: [PATCH] drm: use seqlocks for vblank time/count
Date: Mon, 9 May 2016 20:11:49 +0200	[thread overview]
Message-ID: <20160509181149.GC27098@phenom.ffwll.local> (raw)
In-Reply-To: <20160509171607.GP4329@intel.com>

On Mon, May 09, 2016 at 08:16:07PM +0300, Ville Syrjälä wrote:
> On Mon, May 09, 2016 at 05:08:43PM +0100, Matthew Auld wrote:
> > This patch aims to replace the roll-your-own seqlock implementation with
> > a full-blown seqlock. We also remove the timestamp ring buffer in favour
> > of a single timestamp/count pair protected by a seqlock. In turn this
> > means we can now increment the vblank counter freely without the need
> > for clamping.
> 
> This will also change the behaviour to block new readers while the
> writer has the lock, whereas the old code would allow readers to
> proceed in parallel. We do the whole hw counter + scanout position
> query while holding the lock, so the critical section isn't exactly
> free, but I'm not sure that's a real problem.
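
For reference, this is roughly the interaction with a full seqlock
(a condensed, illustrative sketch, not the literal code from the patch):

    /* writer side, condensed from drm_update_vblank_count() and its
     * callers: readers are held off for this entire section */
    write_seqlock(&dev->vblank_seqlock);
    cur_vblank = dev->driver->get_vblank_counter(dev, pipe);   /* hw query */
    drm_get_last_vbltimestamp(dev, pipe, &t_vblank, flags);    /* scanoutpos */
    store_vblank(dev, pipe, diff, &t_vblank, cur_vblank);
    write_sequnlock(&dev->vblank_seqlock);

    /* reader side, drm_vblank_count_and_time(): read_seqbegin() spins
     * while the sequence is odd, i.e. while the writer above is still
     * inside its critical section */
    do {
        seq = read_seqbegin(&dev->vblank_seqlock);
        vblank_count = vblank->count;
        *vblanktime = vblank->time;
    } while (read_seqretry(&dev->vblank_seqlock, seq));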
> 
> I guess we could reduce the scope of the seqlock, but then maybe we'd
> need to keep the vblank_time_lock spinlock as well. The details escape
> me now, so I'd have to re-read the code again.
> 
> Ccing Mario too.

Yeah, my idea was to keep the spinlock, and only replace the stuff in
store_vblank and the few do {} while (cur_vblank != get_vblank_counter)
loops. Extending the seqlock to everything indeed seems to run counter to
Mario's locking scheme.

So the goal would be to really just replace the half-baked seqlock we
already have, and leave all other locking unchanged.
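
Something like this, roughly (only a sketch; the seqcount field and its
name are made up here and not part of the patch):

    /* in struct drm_vblank_crtc (hypothetical field): */
    seqcount_t seqcount;    /* latches count/time for lockless readers */
    u32 count;
    struct timeval time;

    /* store_vblank(), still called under dev->vblank_time_lock: */
    assert_spin_locked(&dev->vblank_time_lock);
    write_seqcount_begin(&vblank->seqcount);
    vblank->time = *t_vblank;
    vblank->count += vblank_count_inc;
    write_seqcount_end(&vblank->seqcount);

    /* lockless read side, drm_vblank_count_and_time(): */
    do {
        seq = read_seqcount_begin(&vblank->seqcount);
        vblank_count = vblank->count;
        *vblanktime = vblank->time;
    } while (read_seqcount_retry(&vblank->seqcount, seq));

The write-side section is then just the two stores, so Ville's concern
about readers spinning behind the hw query shouldn't apply, and all the
existing vblank_time_lock/vbl_lock ordering stays as-is.
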
-Daniel

> 
> > 
> > Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
> > Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
> > Signed-off-by: Matthew Auld <matthew.auld@intel.com>
> > ---
> >  drivers/gpu/drm/drm_irq.c | 111 +++++++++-------------------------------------
> >  include/drm/drmP.h        |  14 ++----
> >  2 files changed, 25 insertions(+), 100 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/drm_irq.c b/drivers/gpu/drm/drm_irq.c
> > index 3c1a6f1..bfc6a8d 100644
> > --- a/drivers/gpu/drm/drm_irq.c
> > +++ b/drivers/gpu/drm/drm_irq.c
> > @@ -42,10 +42,6 @@
> >  #include <linux/vgaarb.h>
> >  #include <linux/export.h>
> >  
> > -/* Access macro for slots in vblank timestamp ringbuffer. */
> > -#define vblanktimestamp(dev, pipe, count) \
> > -	((dev)->vblank[pipe].time[(count) % DRM_VBLANKTIME_RBSIZE])
> > -
> >  /* Retry timestamp calculation up to 3 times to satisfy
> >   * drm_timestamp_precision before giving up.
> >   */
> > @@ -82,29 +78,13 @@ static void store_vblank(struct drm_device *dev, unsigned int pipe,
> >  			 struct timeval *t_vblank, u32 last)
> >  {
> >  	struct drm_vblank_crtc *vblank = &dev->vblank[pipe];
> > -	u32 tslot;
> >  
> > -	assert_spin_locked(&dev->vblank_time_lock);
> > +	assert_spin_locked(&dev->vblank_seqlock.lock);
> >  
> >  	vblank->last = last;
> >  
> > -	/* All writers hold the spinlock, but readers are serialized by
> > -	 * the latching of vblank->count below.
> > -	 */
> > -	tslot = vblank->count + vblank_count_inc;
> > -	vblanktimestamp(dev, pipe, tslot) = *t_vblank;
> > -
> > -	/*
> > -	 * vblank timestamp updates are protected on the write side with
> > -	 * vblank_time_lock, but on the read side done locklessly using a
> > -	 * sequence-lock on the vblank counter. Ensure correct ordering using
> > -	 * memory barrriers. We need the barrier both before and also after the
> > -	 * counter update to synchronize with the next timestamp write.
> > -	 * The read-side barriers for this are in drm_vblank_count_and_time.
> > -	 */
> > -	smp_wmb();
> > +	vblank->time = *t_vblank;
> >  	vblank->count += vblank_count_inc;
> > -	smp_wmb();
> >  }
> >  
> >  /**
> > @@ -127,7 +107,7 @@ static void drm_reset_vblank_timestamp(struct drm_device *dev, unsigned int pipe
> >  	struct timeval t_vblank;
> >  	int count = DRM_TIMESTAMP_MAXRETRIES;
> >  
> > -	spin_lock(&dev->vblank_time_lock);
> > +	write_seqlock(&dev->vblank_seqlock);
> >  
> >  	/*
> >  	 * sample the current counter to avoid random jumps
> > @@ -152,7 +132,7 @@ static void drm_reset_vblank_timestamp(struct drm_device *dev, unsigned int pipe
> >  	 */
> >  	store_vblank(dev, pipe, 1, &t_vblank, cur_vblank);
> >  
> > -	spin_unlock(&dev->vblank_time_lock);
> > +	write_sequnlock(&dev->vblank_seqlock);
> >  }
> >  
> >  /**
> > @@ -205,7 +185,7 @@ static void drm_update_vblank_count(struct drm_device *dev, unsigned int pipe,
> >  		const struct timeval *t_old;
> >  		u64 diff_ns;
> >  
> > -		t_old = &vblanktimestamp(dev, pipe, vblank->count);
> > +		t_old = &vblank->time;
> >  		diff_ns = timeval_to_ns(&t_vblank) - timeval_to_ns(t_old);
> >  
> >  		/*
> > @@ -239,49 +219,6 @@ static void drm_update_vblank_count(struct drm_device *dev, unsigned int pipe,
> >  		diff = 1;
> >  	}
> >  
> > -	/*
> > -	 * FIMXE: Need to replace this hack with proper seqlocks.
> > -	 *
> > -	 * Restrict the bump of the software vblank counter to a safe maximum
> > -	 * value of +1 whenever there is the possibility that concurrent readers
> > -	 * of vblank timestamps could be active at the moment, as the current
> > -	 * implementation of the timestamp caching and updating is not safe
> > -	 * against concurrent readers for calls to store_vblank() with a bump
> > -	 * of anything but +1. A bump != 1 would very likely return corrupted
> > -	 * timestamps to userspace, because the same slot in the cache could
> > -	 * be concurrently written by store_vblank() and read by one of those
> > -	 * readers without the read-retry logic detecting the collision.
> > -	 *
> > -	 * Concurrent readers can exist when we are called from the
> > -	 * drm_vblank_off() or drm_vblank_on() functions and other non-vblank-
> > -	 * irq callers. However, all those calls to us are happening with the
> > -	 * vbl_lock locked to prevent drm_vblank_get(), so the vblank refcount
> > -	 * can't increase while we are executing. Therefore a zero refcount at
> > -	 * this point is safe for arbitrary counter bumps if we are called
> > -	 * outside vblank irq, a non-zero count is not 100% safe. Unfortunately
> > -	 * we must also accept a refcount of 1, as whenever we are called from
> > -	 * drm_vblank_get() -> drm_vblank_enable() the refcount will be 1 and
> > -	 * we must let that one pass through in order to not lose vblank counts
> > -	 * during vblank irq off - which would completely defeat the whole
> > -	 * point of this routine.
> > -	 *
> > -	 * Whenever we are called from vblank irq, we have to assume concurrent
> > -	 * readers exist or can show up any time during our execution, even if
> > -	 * the refcount is currently zero, as vblank irqs are usually only
> > -	 * enabled due to the presence of readers, and because when we are called
> > -	 * from vblank irq we can't hold the vbl_lock to protect us from sudden
> > -	 * bumps in vblank refcount. Therefore also restrict bumps to +1 when
> > -	 * called from vblank irq.
> > -	 */
> > -	if ((diff > 1) && (atomic_read(&vblank->refcount) > 1 ||
> > -	    (flags & DRM_CALLED_FROM_VBLIRQ))) {
> > -		DRM_DEBUG_VBL("clamping vblank bump to 1 on crtc %u: diffr=%u "
> > -			      "refcount %u, vblirq %u\n", pipe, diff,
> > -			      atomic_read(&vblank->refcount),
> > -			      (flags & DRM_CALLED_FROM_VBLIRQ) != 0);
> > -		diff = 1;
> > -	}
> > -
> >  	DRM_DEBUG_VBL("updating vblank count on crtc %u:"
> >  		      " current=%u, diff=%u, hw=%u hw_last=%u\n",
> >  		      pipe, vblank->count, diff, cur_vblank, vblank->last);
> > @@ -318,7 +255,7 @@ static void vblank_disable_and_save(struct drm_device *dev, unsigned int pipe)
> >  	 * so no updates of timestamps or count can happen after we've
> >  	 * disabled. Needed to prevent races in case of delayed irq's.
> >  	 */
> > -	spin_lock_irqsave(&dev->vblank_time_lock, irqflags);
> > +	write_seqlock_irqsave(&dev->vblank_seqlock, irqflags);
> >  
> >  	/*
> >  	 * Only disable vblank interrupts if they're enabled. This avoids
> > @@ -338,7 +275,7 @@ static void vblank_disable_and_save(struct drm_device *dev, unsigned int pipe)
> >  	 */
> >  	drm_update_vblank_count(dev, pipe, 0);
> >  
> > -	spin_unlock_irqrestore(&dev->vblank_time_lock, irqflags);
> > +	write_sequnlock_irqrestore(&dev->vblank_seqlock, irqflags);
> >  }
> >  
> >  static void vblank_disable_fn(unsigned long arg)
> > @@ -404,7 +341,7 @@ int drm_vblank_init(struct drm_device *dev, unsigned int num_crtcs)
> >  	unsigned int i;
> >  
> >  	spin_lock_init(&dev->vbl_lock);
> > -	spin_lock_init(&dev->vblank_time_lock);
> > +	seqlock_init(&dev->vblank_seqlock);
> >  
> >  	dev->num_crtcs = num_crtcs;
> >  
> > @@ -991,25 +928,19 @@ u32 drm_vblank_count_and_time(struct drm_device *dev, unsigned int pipe,
> >  			      struct timeval *vblanktime)
> >  {
> >  	struct drm_vblank_crtc *vblank = &dev->vblank[pipe];
> > -	int count = DRM_TIMESTAMP_MAXRETRIES;
> > -	u32 cur_vblank;
> > +	u32 vblank_count;
> > +	unsigned int seq;
> >  
> >  	if (WARN_ON(pipe >= dev->num_crtcs))
> >  		return 0;
> >  
> > -	/*
> > -	 * Vblank timestamps are read lockless. To ensure consistency the vblank
> > -	 * counter is rechecked and ordering is ensured using memory barriers.
> > -	 * This works like a seqlock. The write-side barriers are in store_vblank.
> > -	 */
> >  	do {
> > -		cur_vblank = vblank->count;
> > -		smp_rmb();
> > -		*vblanktime = vblanktimestamp(dev, pipe, cur_vblank);
> > -		smp_rmb();
> > -	} while (cur_vblank != vblank->count && --count > 0);
> > +		seq = read_seqbegin(&dev->vblank_seqlock);
> > +		vblank_count = vblank->count;
> > +		*vblanktime = vblank->time;
> > +	} while (read_seqretry(&dev->vblank_seqlock, seq));
> >  
> > -	return cur_vblank;
> > +	return vblank_count;
> >  }
> >  EXPORT_SYMBOL(drm_vblank_count_and_time);
> >  
> > @@ -1160,11 +1091,11 @@ static int drm_vblank_enable(struct drm_device *dev, unsigned int pipe)
> >  
> >  	assert_spin_locked(&dev->vbl_lock);
> >  
> > -	spin_lock(&dev->vblank_time_lock);
> > +	write_seqlock(&dev->vblank_seqlock);
> >  
> >  	if (!vblank->enabled) {
> >  		/*
> > -		 * Enable vblank irqs under vblank_time_lock protection.
> > +		 * Enable vblank irqs under vblank_seqlock protection.
> >  		 * All vblank count & timestamp updates are held off
> >  		 * until we are done reinitializing master counter and
> >  		 * timestamps. Filtercode in drm_handle_vblank() will
> > @@ -1180,7 +1111,7 @@ static int drm_vblank_enable(struct drm_device *dev, unsigned int pipe)
> >  		}
> >  	}
> >  
> > -	spin_unlock(&dev->vblank_time_lock);
> > +	write_sequnlock(&dev->vblank_seqlock);
> >  
> >  	return ret;
> >  }
> > @@ -1880,18 +1811,18 @@ bool drm_handle_vblank(struct drm_device *dev, unsigned int pipe)
> >  	 * vblank enable/disable, as this would cause inconsistent
> >  	 * or corrupted timestamps and vblank counts.
> >  	 */
> > -	spin_lock(&dev->vblank_time_lock);
> > +	write_seqlock(&dev->vblank_seqlock);
> >  
> >  	/* Vblank irq handling disabled. Nothing to do. */
> >  	if (!vblank->enabled) {
> > -		spin_unlock(&dev->vblank_time_lock);
> > +		write_sequnlock(&dev->vblank_seqlock);
> >  		spin_unlock_irqrestore(&dev->event_lock, irqflags);
> >  		return false;
> >  	}
> >  
> >  	drm_update_vblank_count(dev, pipe, DRM_CALLED_FROM_VBLIRQ);
> >  
> > -	spin_unlock(&dev->vblank_time_lock);
> > +	write_sequnlock(&dev->vblank_seqlock);
> >  
> >  	wake_up(&vblank->queue);
> >  	drm_handle_vblank_events(dev, pipe);
> > diff --git a/include/drm/drmP.h b/include/drm/drmP.h
> > index 360b2a7..8bee424 100644
> > --- a/include/drm/drmP.h
> > +++ b/include/drm/drmP.h
> > @@ -52,6 +52,7 @@
> >  #include <linux/poll.h>
> >  #include <linux/ratelimit.h>
> >  #include <linux/sched.h>
> > +#include <linux/seqlock.h>
> >  #include <linux/slab.h>
> >  #include <linux/types.h>
> >  #include <linux/vmalloc.h>
> > @@ -392,11 +393,6 @@ struct drm_master {
> >  	void *driver_priv;
> >  };
> >  
> > -/* Size of ringbuffer for vblank timestamps. Just double-buffer
> > - * in initial implementation.
> > - */
> > -#define DRM_VBLANKTIME_RBSIZE 2
> > -
> >  /* Flags and return codes for get_vblank_timestamp() driver function. */
> >  #define DRM_CALLED_FROM_VBLIRQ 1
> >  #define DRM_VBLANKTIME_SCANOUTPOS_METHOD (1 << 0)
> > @@ -725,10 +721,8 @@ struct drm_vblank_crtc {
> >  	wait_queue_head_t queue;	/**< VBLANK wait queue */
> >  	struct timer_list disable_timer;		/* delayed disable timer */
> >  
> > -	/* vblank counter, protected by dev->vblank_time_lock for writes */
> > -	u32 count;
> > -	/* vblank timestamps, protected by dev->vblank_time_lock for writes */
> > -	struct timeval time[DRM_VBLANKTIME_RBSIZE];
> > +	u32 count;			/* vblank counter, protected by dev->vblank_seqlock */
> > +	struct timeval time;		/* vblank timestamp, protected by dev->vblank_seqlock */
> >  
> >  	atomic_t refcount;		/* number of users of vblank interruptsper crtc */
> >  	u32 last;			/* protected by dev->vbl_lock, used */
> > @@ -835,7 +829,7 @@ struct drm_device {
> >  	/* array of size num_crtcs */
> >  	struct drm_vblank_crtc *vblank;
> >  
> > -	spinlock_t vblank_time_lock;    /**< Protects vblank count and time updates during vblank enable/disable */
> > +	seqlock_t vblank_seqlock;	/**< Protects vblank count and time updates during vblank enable/disable */
> >  	spinlock_t vbl_lock;
> >  
> >  	u32 max_vblank_count;           /**< size of vblank counter register */
> > -- 
> > 2.4.11
> 
> -- 
> Ville Syrjälä
> Intel OTC

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel
