From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1754682AbdGJT4b (ORCPT ); Mon, 10 Jul 2017 15:56:31 -0400 Received: from mail-wr0-f193.google.com ([209.85.128.193]:36202 "EHLO mail-wr0-f193.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1754284AbdGJT43 (ORCPT ); Mon, 10 Jul 2017 15:56:29 -0400 Date: Mon, 10 Jul 2017 21:56:20 +0200 From: Alexandru Moise <00moses.alexander00@gmail.com> To: Daniel Vetter , rusty@rustcorp.com.au, jeyu@kernel.org, gregkh@linuxfoundation.org Cc: linux-kernel@vger.kernel.org, daniel.vetter@intel.com, dri-devel@lists.freedesktop.org, airlied@linux.ie, seanpaul@chromium.org, jani.nikula@linux.intel.com Subject: Re: [PATCH] drm: inhibit drm drivers register to uninitialized drm core Message-ID: <20170710195619.GA2113@gmail.com> References: <20170708214352.GA27205@gmail.com> <20170710065246.rn4o3bjje37bktww@phenom.ffwll.local> <20170710071453.GA16847@gmail.com> MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Disposition: inline Content-Transfer-Encoding: 8bit In-Reply-To: User-Agent: Mutt/1.8.3 (2017-05-23) Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Mon, Jul 10, 2017 at 08:00:37PM +0200, Daniel Vetter wrote: > On Mon, Jul 10, 2017 at 9:14 AM, Alexandru Moise > <00moses.alexander00@gmail.com> wrote: > > On Mon, Jul 10, 2017 at 08:52:46AM +0200, Daniel Vetter wrote: > >> On Sat, Jul 08, 2017 at 11:43:52PM +0200, Alexandru Moise wrote: > >> > If the DRM core fails to init for whatever reason, ensure that > >> > no driver ever calls drm_dev_register(). > >> > > >> > This is best done at drm_dev_init() as it covers drivers that call > >> > drm_dev_alloc() as well as drivers that prefer to embed struct > >> > drm_device into their own device struct and call drm_dev_init() > >> > themselves. > >> > > >> > In my case I had so many dynamic device majors used that the major > >> > number for DRM (226) was stolen, causing DRM core init to fail after > >> > failing to register a chrdev, and ultimately calling debugfs_remove() > >> > on drm_debugfs_root in drm_core_exit(). > >> > > >> > After drm core failed to init, VGEM was still calling drm_dev_register(), > >> > ultimately leading to drm_debugfs_init(), with drm_debugfs_root passed > >> > as the root for the new debugfs dir at debugfs_create_dir(). > >> > > >> > This led to a kernel panic once we were either derefencing root->d_inode > >> > while it was NULL or calling root->d_inode->i_op->lookup() while it was > >> > NULL in debugfs at inode_lock() or lookup_*(). > >> > > >> > Signed-off-by: Alexandru Moise <00moses.alexander00@gmail.com> > >> > --- > >> > drivers/gpu/drm/drm_drv.c | 16 ++++++++++++++++ > >> > 1 file changed, 16 insertions(+) > >> > > >> > diff --git a/drivers/gpu/drm/drm_drv.c b/drivers/gpu/drm/drm_drv.c > >> > index 37b8ad3e30d8..2ed2d919beae 100644 > >> > --- a/drivers/gpu/drm/drm_drv.c > >> > +++ b/drivers/gpu/drm/drm_drv.c > >> > @@ -63,6 +63,15 @@ module_param_named(debug, drm_debug, int, 0600); > >> > static DEFINE_SPINLOCK(drm_minor_lock); > >> > static struct idr drm_minors_idr; > >> > > >> > +/* > >> > + * If the drm core fails to init for whatever reason, > >> > + * we should prevent any drivers from registering with it. > >> > + * It's best to check this at drm_dev_init(), as some drivers > >> > + * prefer to embed struct drm_device into their own device > >> > + * structure and call drm_dev_init() themselves. > >> > + */ > >> > +static bool drm_core_init_complete = false; > >> > + > >> > static struct dentry *drm_debugfs_root; > >> > > >> > #define DRM_PRINTK_FMT "[" DRM_NAME ":%s]%s %pV" > >> > @@ -484,6 +493,11 @@ int drm_dev_init(struct drm_device *dev, > >> > { > >> > int ret; > >> > > >> > + if (!drm_core_init_complete) { > >> > + DRM_ERROR("DRM core is not initialized\n"); > >> > + return -ENODEV; > >> > + } > >> > + > >> > kref_init(&dev->ref); > >> > dev->dev = parent; > >> > dev->driver = driver; > >> > @@ -966,6 +980,8 @@ static int __init drm_core_init(void) > >> > if (ret < 0) > >> > goto error; > >> > > >> > + drm_core_init_complete = true; > >> > + > >> > DRM_DEBUG("Initialized\n"); > >> > return 0; > >> > >> Isn't the correct fix to pass down the error value, which iirc should > >> make the kmod stuff unload the module again? Or does this not work' > >> -Daniel > > > > What if everything is built in? > > I feared that would be the answer :-/ Still feels funny that everyone > will need to hand-roll this, or does everyone simply assume that their > subsystem's module_init never fails? > > Adding a pile of kmod and driver folks in the hopes of getting a > better answer. If there's no better answer pls remind me to merge your > patch in 1-2 weeks, I'll likely forget ... > -Daniel I took a look at all the DRM drivers and they all seem pretty sane, they all handle the case of the DRM core failure, the fault is clearly on the DRM core side, it makes absolutely no sense trying to register with a subsystem that failed to init. This is certainly not a subsystem I am intimately acquainted with however, I just stumbled on this crash while working with debugfs. Most of the DRM core code looks pretty solid, and this is a relatively small corner-case, I believe most people out there won't be eating up that many major numbers. But I believe it's still the Right Thing™ to do. It might save some headaches in the future as DRM becomes more complex. ../Alex > -- > Daniel Vetter > Software Engineer, Intel Corporation > +41 (0) 79 365 57 48 - http://blog.ffwll.ch