From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path: 
Received: from mail-wm0-f53.google.com ([74.125.82.53]:36324 "EHLO
	mail-wm0-f53.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S1754907AbcBIO3g (ORCPT );
	Tue, 9 Feb 2016 09:29:36 -0500
Received: by mail-wm0-f53.google.com with SMTP id p63so160633413wmp.1 for ;
	Tue, 09 Feb 2016 06:29:35 -0800 (PST)
Date: Tue, 9 Feb 2016 15:29:57 +0100
From: Daniel Vetter
To: Mario Kleiner
Cc: Daniel Vetter, Ville Syrjälä, dri-devel@lists.freedesktop.org,
	linux@bernd-steinhauser.de, stable@vger.kernel.org, michel@daenzer.net,
	vbabka@suse.cz, daniel.vetter@ffwll.ch, alexander.deucher@amd.com,
	christian.koenig@amd.com
Subject: Re: [PATCH 2/6] drm: Prevent vblank counter bumps > 1 with active
	vblank clients.
Message-ID: <20160209142957.GX11240@phenom.ffwll.local>
References: <1454894009-15466-1-git-send-email-mario.kleiner.de@gmail.com>
	<1454894009-15466-3-git-send-email-mario.kleiner.de@gmail.com>
	<20160209095638.GM11240@phenom.ffwll.local>
	<20160209100727.GG23290@intel.com>
	<20160209102302.GT11240@phenom.ffwll.local>
	<56B9EC20.9050400@gmail.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=iso-8859-1
Content-Disposition: inline
Content-Transfer-Encoding: 8bit
In-Reply-To: <56B9EC20.9050400@gmail.com>
Sender: stable-owner@vger.kernel.org
List-ID: 

On Tue, Feb 09, 2016 at 02:39:44PM +0100, Mario Kleiner wrote:
> On 02/09/2016 11:23 AM, Daniel Vetter wrote:
> >On Tue, Feb 09, 2016 at 12:07:27PM +0200, Ville Syrjälä wrote:
> >>On Tue, Feb 09, 2016 at 10:56:38AM +0100, Daniel Vetter wrote:
> >>>On Mon, Feb 08, 2016 at 02:13:25AM +0100, Mario Kleiner wrote:
> >>>>This fixes a regression introduced by the new drm_update_vblank_count()
> >>>>implementation in Linux 4.4:
> >>>>
> >>>>Restrict the bump of the software vblank counter in drm_update_vblank_count()
> >>>>to a safe maximum value of +1 whenever there is the possibility that
> >>>>concurrent readers of vblank timestamps could be active at the moment,
> >>>>as the current implementation of the timestamp caching and updating is
> >>>>not safe against concurrent readers for calls to store_vblank() with a
> >>>>bump of anything but +1. A bump != 1 would very likely return corrupted
> >>>>timestamps to userspace, because the same slot in the cache could
> >>>>be concurrently written by store_vblank() and read by one of those
> >>>>readers in a non-atomic fashion and without the read-retry logic
> >>>>detecting this collision.
> >>>>
> >>>>Concurrent readers can exist while drm_update_vblank_count() is called
> >>>>from the drm_vblank_off() or drm_vblank_on() functions or other non-vblank-
> >>>>irq callers. However, all those calls are happening with the vbl_lock
> >>>>locked, thereby preventing a drm_vblank_get(), so the vblank refcount
> >>>>can't increase while drm_update_vblank_count() is executing. Therefore
> >>>>a zero vblank refcount during execution of that function signals that it
> >>>>is safe to do arbitrary counter bumps if called from outside vblank irq,
> >>>>whereas a non-zero count is not safe.
> >>>>
> >>>>Whenever the function is called from vblank irq, we have to assume concurrent
> >>>>readers could show up any time during its execution, even if the refcount
> >>>>is currently zero, as vblank irqs are usually only enabled due to the
> >>>>presence of readers, and because when it is called from vblank irq it
> >>>>can't hold the vbl_lock to protect it from sudden bumps in vblank refcount.
> >>>>Therefore also restrict bumps to +1 when the function is called from vblank
> >>>>irq.
> >>>>
> >>>>Such bumps of more than +1 can happen at times other than reenabling
> >>>>vblank irqs, e.g., when regular vblank interrupts get delayed by more
> >>>>than 1 frame due to long-held locks, long irq-off periods, realtime
> >>>>preemption on RT kernels, or system management interrupts.
> >>>>
> >>>>Signed-off-by: Mario Kleiner
> >>>>Cc: # 4.4+
> >>>>Cc: michel@daenzer.net
> >>>>Cc: vbabka@suse.cz
> >>>>Cc: ville.syrjala@linux.intel.com
> >>>>Cc: daniel.vetter@ffwll.ch
> >>>>Cc: dri-devel@lists.freedesktop.org
> >>>>Cc: alexander.deucher@amd.com
> >>>>Cc: christian.koenig@amd.com
> >>>
> >>>Imo this is duct-tape. If we want to fix this up properly I think we
> >>>should just use a full-blown seqlock instead of our hand-rolled one. And
> >>>that could handle any increment at all.
> >>
> >>And I even fixed this [1] almost half a year ago when I sent the
> >>original series, but that part got held hostage to the same seqlock
> >>argument. Perfect is the enemy of good.
> >>
> >>[1] https://lists.freedesktop.org/archives/intel-gfx/2015-September/075879.html
> >
> >Hm yeah, that does suffer from reinventing seqlocks. But I'd prefer your
> >patch over Mario's hack here tbh. Your patch with seqlock would be even
> >more awesome.
> >-Daniel
> 
> I agree that my hack is only duct-tape. That's why I added the long code
> comment, to let people know under which conditions they could remove it.
> 
> Using seqlocks would be the robust long-term solution. But as this is
> supposed to be a fix for both 4.4 and 4.5, I thought that such a rewrite
> would be too intrusive a change compared to this one-liner?
> 
> The original "roll our own" seqlock-lookalike implementation was meant to
> avoid/minimize taking locks, esp. with _irqsave, in paths used by both
> userspace and timing-sensitive vblank irq handling code. There have been
> various locking changes since then, and that advantage may have been lost
> quite a long time ago, so switching to full seqlocks might not pose any
> new performance problems there, but I haven't looked into this.

Last time I checked, we had already reinvented seqlocks completely, except
buggy, since ours can't take an increment > 1. I don't expect you'll be
able to measure anything if we switch.

Agree that it might be better to delay this for 4.6. So if you add a big
"FIXME: Need to replace this hack with proper seqlocks." at the top of your
big comment (or just as a replacement for it), then

Reviewed-by: Daniel Vetter

But currently it looks like this is a proper long-term solution, which it
imo isn't.
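
For reference, a minimal sketch of what the seqlock variant could look
like, so we're talking about the same thing. All struct and function names
below are made up for illustration and don't match the actual drm_irq.c
code, and writers that can run in both irq and process context would need
the _irqsave variants:

#include <linux/seqlock.h>
#include <linux/time.h>

/* Hypothetical per-crtc vblank state, names illustrative only. */
struct vblank_state {
	seqlock_t seqlock;	/* protects count and time */
	u32 count;
	struct timeval time;
};

/* Writer, e.g. from store_vblank(): any increment is safe, since
 * readers racing with the update simply retry. */
static void vblank_store(struct vblank_state *v, u32 diff,
			 const struct timeval *t_vblank)
{
	write_seqlock(&v->seqlock);
	v->time = *t_vblank;
	v->count += diff;
	write_sequnlock(&v->seqlock);
}

/* Reader: loops until it has seen a consistent count/timestamp pair. */
static u32 vblank_read(struct vblank_state *v, struct timeval *t_out)
{
	unsigned int seq;
	u32 count;

	do {
		seq = read_seqbegin(&v->seqlock);
		count = v->count;
		*t_out = v->time;
	} while (read_seqretry(&v->seqlock, seq));

	return count;
}

With something like this, the clamping of diff to +1, and the hand-rolled
retry logic it works around, could go away entirely.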
-Daniel

> 
> -mario
> 
> >>
> >>>-Daniel
> >>>
> >>>>---
> >>>> drivers/gpu/drm/drm_irq.c | 41 +++++++++++++++++++++++++++++++++++++++++
> >>>> 1 file changed, 41 insertions(+)
> >>>>
> >>>>diff --git a/drivers/gpu/drm/drm_irq.c b/drivers/gpu/drm/drm_irq.c
> >>>>index bcb8528..aa2c74b 100644
> >>>>--- a/drivers/gpu/drm/drm_irq.c
> >>>>+++ b/drivers/gpu/drm/drm_irq.c
> >>>>@@ -221,6 +221,47 @@ static void drm_update_vblank_count(struct drm_device *dev, unsigned int pipe,
> >>>> 		diff = (flags & DRM_CALLED_FROM_VBLIRQ) != 0;
> >>>> 	}
> >>>>
> >>>>+	/*
> >>>>+	 * Restrict the bump of the software vblank counter to a safe maximum
> >>>>+	 * value of +1 whenever there is the possibility that concurrent readers
> >>>>+	 * of vblank timestamps could be active at the moment, as the current
> >>>>+	 * implementation of the timestamp caching and updating is not safe
> >>>>+	 * against concurrent readers for calls to store_vblank() with a bump
> >>>>+	 * of anything but +1. A bump != 1 would very likely return corrupted
> >>>>+	 * timestamps to userspace, because the same slot in the cache could
> >>>>+	 * be concurrently written by store_vblank() and read by one of those
> >>>>+	 * readers without the read-retry logic detecting the collision.
> >>>>+	 *
> >>>>+	 * Concurrent readers can exist when we are called from the
> >>>>+	 * drm_vblank_off() or drm_vblank_on() functions and other non-vblank-
> >>>>+	 * irq callers. However, all those calls to us are happening with the
> >>>>+	 * vbl_lock locked to prevent drm_vblank_get(), so the vblank refcount
> >>>>+	 * can't increase while we are executing. Therefore a zero refcount at
> >>>>+	 * this point is safe for arbitrary counter bumps if we are called
> >>>>+	 * outside vblank irq, a non-zero count is not 100% safe. Unfortunately
> >>>>+	 * we must also accept a refcount of 1, as whenever we are called from
> >>>>+	 * drm_vblank_get() -> drm_vblank_enable() the refcount will be 1 and
> >>>>+	 * we must let that one pass through in order to not lose vblank counts
> >>>>+	 * during vblank irq off - which would completely defeat the whole
> >>>>+	 * point of this routine.
> >>>>+	 *
> >>>>+	 * Whenever we are called from vblank irq, we have to assume concurrent
> >>>>+	 * readers exist or can show up any time during our execution, even if
> >>>>+	 * the refcount is currently zero, as vblank irqs are usually only
> >>>>+	 * enabled due to the presence of readers, and because when we are called
> >>>>+	 * from vblank irq we can't hold the vbl_lock to protect us from sudden
> >>>>+	 * bumps in vblank refcount. Therefore also restrict bumps to +1 when
> >>>>+	 * called from vblank irq.
> >>>>+	 */
> >>>>+	if ((diff > 1) && (atomic_read(&vblank->refcount) > 1 ||
> >>>>+	    (flags & DRM_CALLED_FROM_VBLIRQ))) {
> >>>>+		DRM_DEBUG_VBL("clamping vblank bump to 1 on crtc %u: diff=%u "
> >>>>+			      "refcount %u, vblirq %u\n", pipe, diff,
> >>>>+			      atomic_read(&vblank->refcount),
> >>>>+			      (flags & DRM_CALLED_FROM_VBLIRQ) != 0);
> >>>>+		diff = 1;
> >>>>+	}
> >>>>+
> >>>> 	DRM_DEBUG_VBL("updating vblank count on crtc %u:"
> >>>> 		      " current=%u, diff=%u, hw=%u hw_last=%u\n",
> >>>> 		      pipe, vblank->count, diff, cur_vblank, vblank->last);
> >>>>-- 
> >>>>1.9.1
> >>>>
> >>>
> >>>-- 
> >>>Daniel Vetter
> >>>Software Engineer, Intel Corporation
> >>>http://blog.ffwll.ch
> >>
> >>-- 
> >>Ville Syrjälä
> >>Intel OTC
> >

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch