From mboxrd@z Thu Jan 1 00:00:00 1970 From: Daniel Vetter Subject: Re: [PATCH] drm/gem: fix up flink name create race Date: Wed, 24 Jul 2013 11:02:02 +0200 Message-ID: References: <1373958731-4132-11-git-send-email-daniel.vetter@ffwll.ch> <1374645842-18538-1-git-send-email-daniel.vetter@ffwll.ch> Mime-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 7bit Return-path: Received: from mail-ob0-f170.google.com (mail-ob0-f170.google.com [209.85.214.170]) by gabe.freedesktop.org (Postfix) with ESMTP id 19921E619C for ; Wed, 24 Jul 2013 02:02:04 -0700 (PDT) Received: by mail-ob0-f170.google.com with SMTP id ef5so12635596obb.29 for ; Wed, 24 Jul 2013 02:02:03 -0700 (PDT) In-Reply-To: <1374645842-18538-1-git-send-email-daniel.vetter@ffwll.ch> List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Sender: dri-devel-bounces+sf-dri-devel=m.gmane.org@lists.freedesktop.org Errors-To: dri-devel-bounces+sf-dri-devel=m.gmane.org@lists.freedesktop.org To: DRI Development Cc: Daniel Vetter List-Id: dri-devel@lists.freedesktop.org On Wed, Jul 24, 2013 at 8:04 AM, Daniel Vetter wrote: > This is the 2nd attempt, I've always been a bit dissatisified with the > tricky nature of the first one: > > http://lists.freedesktop.org/archives/dri-devel/2012-July/025451.html > > The issue is that the flink ioctl can race with calling gem_close on > the last gem handle. In that case we'll end up with a zero handle > count, but an flink name (and it's corresponding reference). Which > results in a neat space leak. > > In my first attempt I've solved this by rechecking the handle count. > But fundamentally the issue is that ->handle_count isn't your usual > refcount - it can be resurrected from 0 among other things. > > For those special beasts atomic_t often suggest way more ordering that > it actually guarantees. To prevent being tricked by those hairy > semantics take the easy way out and simply protect the handle with the > existing dev->object_name_lock. > > With that change implemented it's dead easy to fix the flink vs. gem > close reace: When we try to create the name we simply have to check > whether there's still officially a gem handle around and if not refuse > to create the flink name. Since the handle count decrement and flink > name destruction is now also protected by that lock the reace is gone > and we can't ever leak the flink reference again. > > Outside of the drm core only the exynos driver looks at the handle > count, and tbh I have no idea why (it's just for debug dmesg output > luckily). > > I've considered inlining the drm_gem_object_handle_free, but I plan to > add more name-like things (like the exported dma_buf) to this scheme, > so it's clearer to leave the handle freeing in its own function. > > This is exercised by the new gem_flink_race i-g-t testcase, which on > my snb leaks gem objects at a rate of roughly 1k objects/s. That's actually incorrect since the leak I've found is just a race in the drm/i915 object tracking. So I need to go back to the drawing board and figure out which are the ghosts and which the dragons here. I've turned that testcase into an exercise for "drm/gem: completely close gem_open vs. gem_close races", but that race only results in userspace seeing different flink names for the same object. And that only happens if userspace is racy already. For this patch here I still think there's an issue, but I seriously need to restart my brain first and flush out the bogons with some coffee before I try again ;-) -Daniel -- Daniel Vetter Software Engineer, Intel Corporation +41 (0) 79 365 57 48 - http://blog.ffwll.ch