From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path: 
Received: from mail-wm0-f53.google.com ([74.125.82.53]:36324 "EHLO
	mail-wm0-f53.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S1754907AbcBIO3g (ORCPT );
	Tue, 9 Feb 2016 09:29:36 -0500
Received: by mail-wm0-f53.google.com with SMTP id p63so160633413wmp.1 for ;
	Tue, 09 Feb 2016 06:29:35 -0800 (PST)
Date: Tue, 9 Feb 2016 15:29:57 +0100
From: Daniel Vetter
To: Mario Kleiner
Cc: Daniel Vetter, Ville Syrjälä, dri-devel@lists.freedesktop.org,
	linux@bernd-steinhauser.de, stable@vger.kernel.org, michel@daenzer.net,
	vbabka@suse.cz, daniel.vetter@ffwll.ch, alexander.deucher@amd.com,
	christian.koenig@amd.com
Subject: Re: [PATCH 2/6] drm: Prevent vblank counter bumps > 1 with active
	vblank clients.
Message-ID: <20160209142957.GX11240@phenom.ffwll.local>
References: <1454894009-15466-1-git-send-email-mario.kleiner.de@gmail.com>
	<1454894009-15466-3-git-send-email-mario.kleiner.de@gmail.com>
	<20160209095638.GM11240@phenom.ffwll.local>
	<20160209100727.GG23290@intel.com>
	<20160209102302.GT11240@phenom.ffwll.local>
	<56B9EC20.9050400@gmail.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=iso-8859-1
Content-Disposition: inline
Content-Transfer-Encoding: 8bit
In-Reply-To: <56B9EC20.9050400@gmail.com>
Sender: stable-owner@vger.kernel.org
List-ID: 

On Tue, Feb 09, 2016 at 02:39:44PM +0100, Mario Kleiner wrote:
> On 02/09/2016 11:23 AM, Daniel Vetter wrote:
> >On Tue, Feb 09, 2016 at 12:07:27PM +0200, Ville Syrjälä wrote:
> >>On Tue, Feb 09, 2016 at 10:56:38AM +0100, Daniel Vetter wrote:
> >>>On Mon, Feb 08, 2016 at 02:13:25AM +0100, Mario Kleiner wrote:
> >>>>This fixes a regression introduced by the new drm_update_vblank_count()
> >>>>implementation in Linux 4.4:
> >>>>
> >>>>Restrict the bump of the software vblank counter in drm_update_vblank_count()
> >>>>to a safe maximum value of +1 whenever there is the possibility that
> >>>>concurrent readers of vblank timestamps could be active at the moment,
> >>>>as the current implementation of the timestamp caching and updating is
> >>>>not safe against concurrent readers for calls to store_vblank() with a
> >>>>bump of anything but +1. A bump != 1 would very likely return corrupted
> >>>>timestamps to userspace, because the same slot in the cache could
> >>>>be concurrently written by store_vblank() and read by one of those
> >>>>readers in a non-atomic fashion and without the read-retry logic
> >>>>detecting this collision.
> >>>>
> >>>>Concurrent readers can exist while drm_update_vblank_count() is called
> >>>>from the drm_vblank_off() or drm_vblank_on() functions or other non-vblank-
> >>>>irq callers. However, all those calls are happening with the vbl_lock
> >>>>locked, thereby preventing a drm_vblank_get(), so the vblank refcount
> >>>>can't increase while drm_update_vblank_count() is executing. Therefore
> >>>>a zero vblank refcount during execution of that function signals that it
> >>>>is safe to do arbitrary counter bumps if called from outside vblank irq,
> >>>>whereas a non-zero count is not safe.
> >>>>
> >>>>Whenever the function is called from vblank irq, we have to assume concurrent
> >>>>readers could show up any time during its execution, even if the refcount
> >>>>is currently zero, as vblank irqs are usually only enabled due to the
> >>>>presence of readers, and because when it is called from vblank irq it
> >>>>can't hold the vbl_lock to protect it from sudden bumps in vblank refcount.
> >>>>Therefore also restrict bumps to +1 when the function is called from vblank
> >>>>irq.
> >>>>
> >>>>Such bumps of more than +1 can happen at times other than reenabling
> >>>>vblank irqs, e.g., when regular vblank interrupts get delayed by more
> >>>>than 1 frame due to long-held locks, long irq-off periods, realtime
> >>>>preemption on RT kernels, or system management interrupts.
> >>>>
> >>>>Signed-off-by: Mario Kleiner
> >>>>Cc: # 4.4+
> >>>>Cc: michel@daenzer.net
> >>>>Cc: vbabka@suse.cz
> >>>>Cc: ville.syrjala@linux.intel.com
> >>>>Cc: daniel.vetter@ffwll.ch
> >>>>Cc: dri-devel@lists.freedesktop.org
> >>>>Cc: alexander.deucher@amd.com
> >>>>Cc: christian.koenig@amd.com
> >>>
> >>>Imo this is duct-tape. If we want to fix this up properly I think we
> >>>should just use a full-blown seqlock instead of our hand-rolled one. And
> >>>that could handle any increment at all.
> >>
> >>And I even fixed this [1] almost half a year ago when I sent the
> >>original series, but that part got held hostage to the same seqlock
> >>argument. Perfect is the enemy of good.
> >>
> >>[1] https://lists.freedesktop.org/archives/intel-gfx/2015-September/075879.html
> >
> >Hm yeah, that does suffer from reinventing seqlocks. But I'd prefer your
> >patch over Mario's hack here tbh. Your patch with seqlock would be even
> >more awesome.
> >-Daniel
> 
> I agree that my hack is only duct-tape. That's why I added the long code
> comment, to let people know under which conditions they could remove it.
> 
> Using seqlocks would be the robust long-term solution. But as this is
> supposed to be a fix for both 4.4 and 4.5, I thought that such a rewrite
> would be too intrusive a change compared to this one-liner?
> 
> The original "roll our own" seqlock-lookalike implementation was meant to
> avoid/minimize taking locks, esp. with _irqsave, in paths used by both
> userspace and timing-sensitive vblank irq handling code. There have been
> various locking changes since then, and that advantage may have been lost
> quite a long time ago, so switching to full seqlocks might not pose any
> new performance problems there, but I haven't looked into this.

Last time I checked, we had already reinvented seqlocks completely, except
buggy, since ours can't take an increment > 1. I don't expect you'll be
able to measure anything if we switch.

Agree that it might be better to delay this for 4.6. So if you add a big
"FIXME: Need to replace this hack with proper seqlocks." at the top of your
big comment (or just as a replacement for it), then

Reviewed-by: Daniel Vetter

But currently it looks like this is a proper long-term solution, which it
imo isn't.
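
For reference, a minimal sketch of what the seqlock variant could look
like, so we're talking about the same thing. All struct and function names
below are made up for illustration and don't match the actual drm_irq.c
code, and writers that can run in both irq and process context would need
the _irqsave variants:

#include <linux/seqlock.h>
#include <linux/time.h>

/* Hypothetical per-crtc vblank state, names illustrative only. */
struct vblank_state {
	seqlock_t seqlock;	/* protects count and time */
	u32 count;
	struct timeval time;
};

/* Writer, e.g. from store_vblank(): any increment is safe, since
 * readers racing with the update simply retry. */
static void vblank_store(struct vblank_state *v, u32 diff,
			 const struct timeval *t_vblank)
{
	write_seqlock(&v->seqlock);
	v->time = *t_vblank;
	v->count += diff;
	write_sequnlock(&v->seqlock);
}

/* Reader: loops until it has seen a consistent count/timestamp pair. */
static u32 vblank_read(struct vblank_state *v, struct timeval *t_out)
{
	unsigned int seq;
	u32 count;

	do {
		seq = read_seqbegin(&v->seqlock);
		count = v->count;
		*t_out = v->time;
	} while (read_seqretry(&v->seqlock, seq));

	return count;
}

With something like this, the clamping of diff to +1, and the hand-rolled
retry logic it works around, could go away entirely.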
-Daniel

> 
> -mario
> 
> >>
> >>>-Daniel
> >>>
> >>>>---
> >>>> drivers/gpu/drm/drm_irq.c | 41 +++++++++++++++++++++++++++++++++++++++++
> >>>> 1 file changed, 41 insertions(+)
> >>>>
> >>>>diff --git a/drivers/gpu/drm/drm_irq.c b/drivers/gpu/drm/drm_irq.c
> >>>>index bcb8528..aa2c74b 100644
> >>>>--- a/drivers/gpu/drm/drm_irq.c
> >>>>+++ b/drivers/gpu/drm/drm_irq.c
> >>>>@@ -221,6 +221,47 @@ static void drm_update_vblank_count(struct drm_device *dev, unsigned int pipe,
> >>>> 		diff = (flags & DRM_CALLED_FROM_VBLIRQ) != 0;
> >>>> 	}
> >>>>
> >>>>+	/*
> >>>>+	 * Restrict the bump of the software vblank counter to a safe maximum
> >>>>+	 * value of +1 whenever there is the possibility that concurrent readers
> >>>>+	 * of vblank timestamps could be active at the moment, as the current
> >>>>+	 * implementation of the timestamp caching and updating is not safe
> >>>>+	 * against concurrent readers for calls to store_vblank() with a bump
> >>>>+	 * of anything but +1. A bump != 1 would very likely return corrupted
> >>>>+	 * timestamps to userspace, because the same slot in the cache could
> >>>>+	 * be concurrently written by store_vblank() and read by one of those
> >>>>+	 * readers without the read-retry logic detecting the collision.
> >>>>+	 *
> >>>>+	 * Concurrent readers can exist when we are called from the
> >>>>+	 * drm_vblank_off() or drm_vblank_on() functions and other non-vblank-
> >>>>+	 * irq callers. However, all those calls to us are happening with the
> >>>>+	 * vbl_lock locked to prevent drm_vblank_get(), so the vblank refcount
> >>>>+	 * can't increase while we are executing. Therefore a zero refcount at
> >>>>+	 * this point is safe for arbitrary counter bumps if we are called
> >>>>+	 * outside vblank irq, a non-zero count is not 100% safe. Unfortunately
> >>>>+	 * we must also accept a refcount of 1, as whenever we are called from
> >>>>+	 * drm_vblank_get() -> drm_vblank_enable() the refcount will be 1 and
> >>>>+	 * we must let that one pass through in order to not lose vblank counts
> >>>>+	 * during vblank irq off - which would completely defeat the whole
> >>>>+	 * point of this routine.
> >>>>+	 *
> >>>>+	 * Whenever we are called from vblank irq, we have to assume concurrent
> >>>>+	 * readers exist or can show up any time during our execution, even if
> >>>>+	 * the refcount is currently zero, as vblank irqs are usually only
> >>>>+	 * enabled due to the presence of readers, and because when we are called
> >>>>+	 * from vblank irq we can't hold the vbl_lock to protect us from sudden
> >>>>+	 * bumps in vblank refcount. Therefore also restrict bumps to +1 when
> >>>>+	 * called from vblank irq.
> >>>>+	 */
> >>>>+	if ((diff > 1) && (atomic_read(&vblank->refcount) > 1 ||
> >>>>+	    (flags & DRM_CALLED_FROM_VBLIRQ))) {
> >>>>+		DRM_DEBUG_VBL("clamping vblank bump to 1 on crtc %u: diff=%u "
> >>>>+			      "refcount %u, vblirq %u\n", pipe, diff,
> >>>>+			      atomic_read(&vblank->refcount),
> >>>>+			      (flags & DRM_CALLED_FROM_VBLIRQ) != 0);
> >>>>+		diff = 1;
> >>>>+	}
> >>>>+
> >>>> 	DRM_DEBUG_VBL("updating vblank count on crtc %u:"
> >>>> 		      " current=%u, diff=%u, hw=%u hw_last=%u\n",
> >>>> 		      pipe, vblank->count, diff, cur_vblank, vblank->last);
> >>>>-- 
> >>>>1.9.1
> >>>>
> >>>
> >>>-- 
> >>>Daniel Vetter
> >>>Software Engineer, Intel Corporation
> >>>http://blog.ffwll.ch
> >>
> >>-- 
> >>Ville Syrjälä
> >>Intel OTC
> >

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch