From mboxrd@z Thu Jan 1 00:00:00 1970
From: Daniel Vetter
Date: Tue, 5 Apr 2022 19:29:22 +0200
Subject: Re: [PATCH v2 18/19] Revert "fbdev: Prevent probing generic drivers if a FB is already registered"
To: Greg KH
Cc: Geert Uytterhoeven, Javier Martinez Canillas, DRI Development,
 Intel Graphics Development, Linux Fbdev development list, LKML,
 Thomas Zimmermann, Zack Rusin, Hans de Goede, Ilya Trukhanov,
 Daniel Vetter, Peter Jones
References: <4ae20b63-f452-fdb4-ced6-d4968a8d69f0@redhat.com> <408ffe9b-f09f-dc7e-7f5e-a93b311a06fa@redhat.com>
MIME-Version: 1.0
Content-Type: text/plain; charset="UTF-8"

On Tue, 5 Apr 2022 at 18:45, Greg KH wrote:
>
> On Tue, Apr 05, 2022 at 06:12:59PM +0200, Daniel Vetter wrote:
> > On Tue, Apr 05, 2022 at 03:33:17PM +0200, Greg KH wrote:
> > > On Tue, Apr 05, 2022 at 03:24:40PM +0200, Geert Uytterhoeven wrote:
> > > > Hi Daniel,
> > > >
> > > > On Tue, Apr 5, 2022 at 1:48 PM Daniel Vetter wrote:
> > > > > On Tue, 5 Apr 2022 at 11:52, Javier Martinez Canillas
> > > > > wrote:
> > > > > > On 4/5/22 11:24, Daniel Vetter wrote:
> > > > > > > On Tue, 5 Apr 2022 at 11:19, Javier Martinez Canillas
> > > > > > >> This is how I think that work, please let me know if you see something
> > > > > > >> wrong in my logic:
> > > > > > >>
> > > > > > >> 1) A PCI device or OF device is registered for the GPU, this attempt to
> > > > > > >> match a registered driver but no driver was registered that match yet.
> > > > > > >>
> > > > > > >> 2) The efifb driver is built-in, will be initialized according to the link
> > > > > > >> order of the objects under drivers/video and the fbdev driver is registered.
> > > > > > >>
> > > > > > >> There is no platform device or PCI/OF device registered that matches.
> > > > > > >>
> > > > > > >> 3) The DRM driver is built-in, will be initialized according to the link
> > > > > > >> order of the objects under drivers/gpu and the DRM driver is registered.
> > > > > > >>
> > > > > > >> This matches the device registered in (1) and the DRM driver probes.
> > > > > > >>
> > > > > > >> 4) The DRM driver .probe kicks out any conflicting DRM drivers and pdev
> > > > > > >> before registering the DRM device.
> > > > > > >>
> > > > > > >> There are no conflicting drivers or platform device at this point.
> > > > > > >>
> > > > > > >> 5) Later at some point the drivers/firmware/sysfb.c init function is
> > > > > > >> executed, and this registers a platform device for the generic fb.
> > > > > > >>
> > > > > > >> This device matches the efifb driver registered in (2) and the fbdev
> > > > > > >> driver probes.
> > > > > > >>
> > > > > > >> Since that happens *after* the DRM driver already matched, probed
> > > > > > >> and registered the DRM device, that is a bug and what the reverted
> > > > > > >> patch worked around.
> > > > > > >>
> > > > > > >> So we need to prevent (5) if (1) and (3) already happened. Having a flag
> > > > > > >> set in the fbdev core somewhere when remove_conflicting_framebuffers()
> > > > > > >> is called could be a solution indeed.
> > > > > > >>
> > > > > > >> That is, the fbdev core needs to know that a DRM driver already probed
> > > > > > >> and make register_framebuffer() fail if info->flags & FBINFO_MISC_FIRMWARE
> > > > > > >>
> > > > > > >> I can attempt to write a patch for that.
> > > > > > >
> > > > > > > Ah yeah that could be an issue. I think the right fix is to replace
> > > > > > > the platform dev unregister with a sysfb_unregister() function in
> > > > > > > sysfb.c, which is synced with a common lock with the sysfb_init
> > > > > > > function and a small boolean. I think I can type that up quickly for
> > > > > > > v3.
> > > > > >
> > > > > > It's more complicated than that since sysfb is just *one* of the several
> > > > > > places where platform devices can be registered for video devices.
> > > > > >
> > > > > > For instance, the vga16fb driver registers its own platform device in
> > > > > > its module_init() function so that can also happen after the conflicting
> > > > > > framebuffers (and associated devices) were removed by a DRM driver probe.
> > > > > >
> > > > > > I tried to minimize the issue for that particular driver with commit:
> > > > > >
> > > > > > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=0499f419b76f
> > > > > >
> > > > > > But the point stands, it all boils down to the fact that you have two
> > > > > > different subsystems registering video drivers and they don't know all
> > > > > > about each other to take a proper decision.
> > > > > >
> > > > > > Right now the drm_aperture_remove_conflicting_framebuffers() call signals
> > > > > > in one direction from DRM to fbdev but there isn't a communication in the
> > > > > > other direction, from fbdev to DRM.
> > > > > >
> > > > > > I believe the correct fix would be for the fbdev core to keep a list of
> > > > > > the apertures struct that are passed to remove_conflicting_framebuffers(),
> > > > > > that way it will know what apertures are not available anymore and prevent
> > > > > > to register any fbdev framebuffer that conflicts with one already present.
> > > > >
> > > > > Hm that still feels like reinventing a driver model, badly.
> > > > >
> > > > > I think there's two cleaner solutions:
> > > > > - move all the firmware driver platform_dev into sysfb.c, and then
> > > > > just bind the special cases against that (e.g. offb, vga16fb and all
> > > > > these). Then we'd have one sysfb_try_unregister(struct device *dev)
> > > > > interface that fbmem.c uses.
> > > > > - let fbmem.c call into each of these firmware device providers, which
> > > > > means some loops most likely (like we can't call into vga16fb), so
> > > > > probably need to move that into fbmem.c and it all gets a bit messy.
> > > > >
> > > > > > Let me know if you think that makes sense and I can attempt to write a fix.
> > > > >
> > > > > I still think unregistering the platform_dev properly makes the most
> > > >
> > > > That doesn't sound very driver-model-aware to me. The device is what
> > > > the driver binds to; it does not cease to exist.
> > >
> > > I agree, that sounds odd.
> > >
> > > The device should always stick around (as the bus creates it), it's up
> > > to the driver to bind to the device as needed.
> >
> > The device actually disappears when the real driver takes over.
> >
> > The firmware fb is a special thing which only really exists as long as the
> > firmware is in charge of the display hardware. As soon as a real driver
> > takes over, it stops being a thing.
> >
> > And since a driver without a device is a bit a funny thing, we have been
> > pushing towards a model where the firmware code sets up a platform_device
> > for this fw interface, and the fw driver (efifb, simplefb and others like
> > that) bind against it. And then we started to throw out that
> > platform_device (which unbinds the fw driver and prevents it from ever
> > rebinding), except in the wrong layer so there's a few races.
> >
> > Should we throw out all that code and replace it with something else? What
> > would that be like?
>
> Ah, no, sorry, I didn't know that at all.
>
> That sounds semi-sane, just fix the races by moving the layer elsewhere?

Yeah essentially move it all into drivers/firmware/sysfb.c, for all
drivers, both the registering and the nuking, and wrap that into a
local mutex. Currently parts are in there, parts are in fbmem.c, parts
in some of the drivers like vga16fb, and some drivers (iirc only offb)
still don't even have any platform_dev underneath their driver.

So ideally the drivers would all just have their platform_driver probe
functions, and that's it. It does mean though that some of that stuff
needs to be moved to sysfb.c or into the relevant fw code that sets
stuff up.

It'll take some, so really just a direction check before we move
further. You should get cc'ed on the patches (like with the sysfb
stuff) anyway.

Sounds roughly right?
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
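
For illustration only, here is a rough userspace model of the direction described in the reply above. It is not kernel code and not the actual patch. It assumes the idea boils down to this: creating and removing the firmware framebuffer device happens in one place, under one lock, with a boolean that records that a native driver has taken over. All sysfb_model_* names are made up, a pthread mutex stands in for a kernel mutex, and a plain flag stands in for the platform_device.

```c
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical model state; in the kernel this would live in one file. */
static pthread_mutex_t sysfb_model_lock = PTHREAD_MUTEX_INITIALIZER;
static bool sysfb_model_disabled;   /* a native driver has taken over */
static bool sysfb_model_registered; /* models "firmware device exists" */

/*
 * Models the firmware side creating the device that efifb, simplefb and
 * similar drivers would bind to. Refuses once a native driver has probed.
 */
int sysfb_model_register(void)
{
	int ret = 0;

	pthread_mutex_lock(&sysfb_model_lock);
	if (sysfb_model_disabled)
		ret = -ENODEV;  /* too late: never create the device */
	else if (sysfb_model_registered)
		ret = -EBUSY;   /* already created */
	else
		sysfb_model_registered = true;
	pthread_mutex_unlock(&sysfb_model_lock);

	return ret;
}

/*
 * Models a native driver's probe path: remove the firmware device if it
 * exists and block any later registration, all under the same lock.
 */
void sysfb_model_disable(void)
{
	pthread_mutex_lock(&sysfb_model_lock);
	sysfb_model_registered = false; /* would unregister the device */
	sysfb_model_disabled = true;
	pthread_mutex_unlock(&sysfb_model_lock);
}

/* Reports whether the modelled firmware device currently exists. */
bool sysfb_model_is_registered(void)
{
	bool registered;

	pthread_mutex_lock(&sysfb_model_lock);
	registered = sysfb_model_registered;
	pthread_mutex_unlock(&sysfb_model_lock);

	return registered;
}
```

With this shape the ordering from step (5) in the quoted analysis stops mattering in the model: if the native driver's probe calls sysfb_model_disable() first, a later sysfb_model_register() returns -ENODEV instead of creating a device that a firmware fb driver could bind to.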