All of lore.kernel.org
 help / color / mirror / Atom feed
From: Peter Hurley <peter@hurleysoftware.com>
To: Mario Kleiner <mario.kleiner.de@gmail.com>,
	Daniel Vetter <daniel@ffwll.ch>
Cc: "Daniel Vetter" <daniel.vetter@ffwll.ch>,
	"Intel Graphics Development" <intel-gfx@lists.freedesktop.org>,
	"Michel Dänzer" <michel@daenzer.net>,
	"DRI Development" <dri-devel@lists.freedesktop.org>,
	"Daniel Vetter" <daniel.vetter@intel.com>
Subject: Re: [PATCH] drm/vblank: Fixup and document timestamp update/read barriers
Date: Tue, 05 May 2015 10:36:24 -0400	[thread overview]
Message-ID: <5548D568.3050109@hurleysoftware.com> (raw)
In-Reply-To: <5546FB02.4010508@gmail.com>

On 05/04/2015 12:52 AM, Mario Kleiner wrote:
> On 04/16/2015 03:03 PM, Daniel Vetter wrote:
>> On Thu, Apr 16, 2015 at 08:30:55AM -0400, Peter Hurley wrote:
>>> On 04/15/2015 01:31 PM, Daniel Vetter wrote:
>>>> On Wed, Apr 15, 2015 at 09:00:04AM -0400, Peter Hurley wrote:
>>>>> Hi Daniel,
>>>>>
>>>>> On 04/15/2015 03:17 AM, Daniel Vetter wrote:
>>>>>> This was a bit too much cargo-culted, so lets make it solid:
>>>>>> - vblank->count doesn't need to be an atomic, writes are always done
>>>>>>    under the protection of dev->vblank_time_lock. Switch to an unsigned
>>>>>>    long instead and update comments. Note that atomic_read is just a
>>>>>>    normal read of a volatile variable, so no need to audit all the
>>>>>>    read-side access specifically.
>>>>>>
>>>>>> - The barriers for the vblank counter seqlock weren't complete: The
>>>>>>    read-side was missing the first barrier between the counter read and
>>>>>>    the timestamp read, it only had a barrier between the ts and the
>>>>>>    counter read. We need both.
>>>>>>
>>>>>> - Barriers weren't properly documented. Since barriers only work if
>>>>>>    you have them on boths sides of the transaction it's prudent to
>>>>>>    reference where the other side is. To avoid duplicating the
>>>>>>    write-side comment 3 times extract a little store_vblank() helper.
>>>>>>    In that helper also assert that we do indeed hold
>>>>>>    dev->vblank_time_lock, since in some cases the lock is acquired a
>>>>>>    few functions up in the callchain.
>>>>>>
>>>>>> Spotted while reviewing a patch from Chris Wilson to add a fastpath to
>>>>>> the vblank_wait ioctl.
>>>>>>
>>>>>> Cc: Chris Wilson <chris@chris-wilson.co.uk>
>>>>>> Cc: Mario Kleiner <mario.kleiner.de@gmail.com>
>>>>>> Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
>>>>>> Cc: Michel Dänzer <michel@daenzer.net>
>>>>>> Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
>>>>>> ---
>>>>>>   drivers/gpu/drm/drm_irq.c | 92 ++++++++++++++++++++++++-----------------------
>>>>>>   include/drm/drmP.h        |  8 +++--
>>>>>>   2 files changed, 54 insertions(+), 46 deletions(-)
>>>>>>
>>>>>> diff --git a/drivers/gpu/drm/drm_irq.c b/drivers/gpu/drm/drm_irq.c
>>>>>> index c8a34476570a..23bfbc61a494 100644
>>>>>> --- a/drivers/gpu/drm/drm_irq.c
>>>>>> +++ b/drivers/gpu/drm/drm_irq.c
>>>>>> @@ -74,6 +74,33 @@ module_param_named(vblankoffdelay, drm_vblank_offdelay, int, 0600);
>>>>>>   module_param_named(timestamp_precision_usec, drm_timestamp_precision, int, 0600);
>>>>>>   module_param_named(timestamp_monotonic, drm_timestamp_monotonic, int, 0600);
>>>>>>
>>>>>> +static void store_vblank(struct drm_device *dev, int crtc,
>>>>>> +             unsigned vblank_count_inc,
>>>>>> +             struct timeval *t_vblank)
>>>>>> +{
>>>>>> +    struct drm_vblank_crtc *vblank = &dev->vblank[crtc];
>>>>>> +    u32 tslot;
>>>>>> +
>>>>>> +    assert_spin_locked(&dev->vblank_time_lock);
>>>>>> +
>>>>>> +    if (t_vblank) {
>>>>>> +        tslot = vblank->count + vblank_count_inc;
>>>>>> +        vblanktimestamp(dev, crtc, tslot) = *t_vblank;
>>>>>> +    }
>>>>>> +
>>>>>> +    /*
>>>>>> +     * vblank timestamp updates are protected on the write side with
>>>>>> +     * vblank_time_lock, but on the read side done locklessly using a
>>>>>> +     * sequence-lock on the vblank counter. Ensure correct ordering using
>>>>>> +     * memory barrriers. We need the barrier both before and also after the
>>>>>> +     * counter update to synchronize with the next timestamp write.
>>>>>> +     * The read-side barriers for this are in drm_vblank_count_and_time.
>>>>>> +     */
>>>>>> +    smp_wmb();
>>>>>> +    vblank->count += vblank_count_inc;
>>>>>> +    smp_wmb();
>>>>>
>>>>> The comment and the code are each self-contradictory.
>>>>>
>>>>> If vblank->count writes are always protected by vblank_time_lock (something I
>>>>> did not verify but that the comment above asserts), then the trailing write
>>>>> barrier is not required (and the assertion that it is in the comment is incorrect).
>>>>>
>>>>> A spin unlock operation is always a write barrier.
>>>>
>>>> Hm yeah. Otoh to me that's bordering on "code too clever for my own good".
>>>> That the spinlock is held I can assure. That no one goes around and does
>>>> multiple vblank updates (because somehow that code raced with the hw
>>>> itself) I can't easily assure with a simple assert or something similar.
>>>> It's not the case right now, but that can changes.
>>>
>>> The algorithm would be broken if multiple updates for the same vblank
>>> count were allowed; that's why it checks to see if the vblank count has
>>> not advanced before storing a new timestamp.
>>>
>>> Otherwise, the read side would not be able to determine that the
>>> timestamp is valid by double-checking that the vblank count has not
>>> changed.
>>>
>>> And besides, even if the code looped without dropping the spinlock,
>>> the correct write order would still be observed because it would still
>>> be executing on the same cpu.
>>>
>>> My objection to the write memory barrier is not about optimization;
>>> it's about correct code.
>>
>> Well diff=0 is not allowed, I guess I could enforce this with some
>> WARN_ON. And I still think my point of non-local correctness is solid.
>> With the smp_wmb() removed the following still works correctly:
>>
>> spin_lock(vblank_time_lock);
>> store_vblank(dev, crtc, 1, ts1);
>> spin_unlock(vblank_time_lock);
>>
>> spin_lock(vblank_time_lock);
>> store_vblank(dev, crtc, 1, ts2);
>> spin_unlock(vblank_time_lock);
>>
>> But with the smp_wmb(); removed the following would be broken:
>>
>> spin_lock(vblank_time_lock);
>> store_vblank(dev, crtc, 1, ts1);
>> store_vblank(dev, crtc, 1, ts2);
>> spin_unlock(vblank_time_lock);
>>
>> because the compiler/cpu is free to reorder the store for vblank->count
>> _ahead_ of the store for the timestamp. And that would trick readers into
>> believing that they have a valid timestamp when they potentially raced.
>>
>> Now you're correct that right now there's no such thing going on, and it's
>> unlikely to happen (given the nature of vblank updates). But my point is
>> that if we optimize this then the correctness can't be proven locally
>> anymore by just looking at store_vblank, but instead you must audit all
>> the callers. And leaking locking/barriers like that is too fragile design
>> for my taste.
>>
>> But you insist that my approach is broken somehow and dropping the smp_wmb
>> is needed for correctness. I don't see how that's the case at all.

Daniel,

I've been really busy this last week; my apologies for not replying promptly.

> Fwiw, i spent some time reeducating myself about memory barriers (thanks for your explanations) and thinking about this, and the last version of your patch looks good to me. It also makes sense to me to leave that last smb_wmb() in place to make future use of the helper robust - for non-local correctness, to avoid having to audit all future callers of that helper.

My concern wrt to unnecessary barriers in this algorithm is that the trailing
barrier now appears mandatory, when in fact it is not.

Moreover, this algorithm is, in general, fragile and not designed to handle
random or poorly-researched changes.

For example, if only the read and store operations are considered, it's obviously
unsafe, since a read may unwittingly retrieve an store in progress.


CPU 0                                   | CPU 1
                                        |
                             /* vblank->count == 0 */
                                        |
drm_vblank_count_and_time()             | store_vblank(.., inc = 2, ...)
                                        |
  cur_vblank <= LOAD vblank->count      |
                                        |   tslot = vblank->count + 2
                                        |   /* tslot == 2 */
                                        |   STORE vblanktime[0]
  - - - - - - - - - - - - - - - - - - - - - smp_wmb() - - - - - - - - - -
  /* cur_vblank == 0 */                 |
  local <= LOAD vblanktime[0]           |
  smp_rmb - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
                                        |
 * cpu has loaded the wrong timestamp * |
                                        |
  local <= LOAD vblank->count           |
  cur_vblank == local?                  |
  yes - exit loop                       |
                                        |   vblank->count += 2
  - - - - - - - - - - - - - - - - - - - - - smp_wmb() - - - - - - - - - -

Regards,
Peter Hurley


> I also tested your patch + a slightly modified version of Chris vblank delayed disable / instant query patches + my fixes using my own stress tests and hardware timing test equipment on both intel and nouveau, and everything seems to work fine.
>
> So i'm all for including this patch and it has my
> 
> Reviewed-and-tested-by: Mario Kleiner <mario.kleiner.de@gmail.com>
> 
> I just sent out an updated version of my patches, so they don't conflict with this one and also fix a compile failure of drm/qxl with yours.
> 
> Thanks,
> -mario

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

  reply	other threads:[~2015-05-05 14:36 UTC|newest]

Thread overview: 31+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2015-04-15  7:17 [PATCH] drm/vblank: Fixup and document timestamp update/read barriers Daniel Vetter
2015-04-15  8:17 ` Chris Wilson
2015-04-15  9:25   ` Daniel Vetter
2015-04-15  9:36     ` Chris Wilson
2015-04-15 10:32 ` Daniel Vetter
2015-04-15 17:34   ` Daniel Vetter
2015-04-15 17:49     ` Chris Wilson
2015-04-15 21:26     ` Mario Kleiner
2015-04-16  1:29       ` Peter Hurley
2015-04-16  6:39         ` Mario Kleiner
2015-04-16  9:00           ` Peter Hurley
2015-04-16  9:06             ` Daniel Vetter
2015-04-16 12:52           ` Peter Hurley
2015-04-16  8:54       ` Daniel Vetter
2015-04-16  0:17     ` shuang.he
2015-04-16  8:59     ` Daniel Vetter
2015-04-17  4:27       ` shuang.he
2015-04-16  0:18   ` shuang.he
2015-04-15 13:00 ` Peter Hurley
2015-04-15 17:31   ` Daniel Vetter
2015-04-16 12:30     ` Peter Hurley
2015-04-16 13:03       ` [Intel-gfx] " Daniel Vetter
2015-05-04  4:52         ` Mario Kleiner
2015-05-05 14:36           ` Peter Hurley [this message]
2015-05-05 15:42             ` Daniel Vetter
2015-05-05 15:57               ` Peter Hurley
2015-05-05 19:01                 ` Peter Hurley
2015-05-06  8:56                 ` Daniel Vetter
2015-05-07 11:56                   ` [Intel-gfx] " Peter Hurley
2015-05-07 17:33                     ` Mario Kleiner
2015-04-15 19:40 ` shuang.he

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=5548D568.3050109@hurleysoftware.com \
    --to=peter@hurleysoftware.com \
    --cc=daniel.vetter@ffwll.ch \
    --cc=daniel.vetter@intel.com \
    --cc=daniel@ffwll.ch \
    --cc=dri-devel@lists.freedesktop.org \
    --cc=intel-gfx@lists.freedesktop.org \
    --cc=mario.kleiner.de@gmail.com \
    --cc=michel@daenzer.net \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.