ksummit.lists.linux.dev archive mirror
From: Dmitry Torokhov <dmitry.torokhov@gmail.com>
To: Dan Williams <dan.j.williams@intel.com>
Cc: Shuah Khan <skhan@linuxfoundation.org>,
	Laurent Pinchart <laurent.pinchart@ideasonboard.com>,
	ksummit <ksummit-discuss@lists.linuxfoundation.org>,
	ksummit@lists.linux.dev
Subject: Re: [TECH TOPIC] Driver probe fails and register succeeds
Date: Thu, 23 Jun 2022 18:00:06 -0700
Message-ID: <YrUMlki+cJkJ3R/X@google.com>
In-Reply-To: <62b4fdd7c4183_8070294ad@dwillia2-xfh.notmuch>

On Thu, Jun 23, 2022 at 04:57:11PM -0700, Dan Williams wrote:
> Shuah Khan wrote:
> > On 6/23/22 5:30 PM, Laurent Pinchart wrote:
> > > Hi Shuah,
> > > 
> > > On Thu, Jun 23, 2022 at 05:28:09PM -0600, Shuah Khan wrote:
> > >> On 6/23/22 5:13 PM, Laurent Pinchart wrote:
> > >>> On Thu, Jun 23, 2022 at 05:05:30PM -0600, Shuah Khan wrote:
> > >>>> I have been debugging a driver probe failure and noticed that driver gets
> > >>>> registered even when driver probe fails. This is not a new behavior. The
> > >>>> code in question is the same since 2005.
> > >>>>
> > > >>>> dmesg will say that a driver probe failed with an error code, and the very
> > > >>>> next message, from the interface core, says the driver is registered
> > > >>>> successfully. It will create sysfs interfaces.
> > >>>>
> > > >>>> The probe failure is propagated from the driver probe routine all the way up to
> > > >>>> __driver_attach(). __driver_attach() ignores the error and returns success.
> > >>>>
> > >>>>            __device_driver_lock(dev, dev->parent);
> > >>>>            driver_probe_device(drv, dev);
> > >>>>            __device_driver_unlock(dev, dev->parent);
> > >>>>
> > >>>>            return 0;
> > >>>>
> > >>>> Interface driver register goes on to create sysfs entries as if driver probe
> > >>>> worked. It handles errors from driver_register() and unwinds the register
> > >>>> properly, however in this case it doesn't know about the failure.
> > >>>>
> > >>>> At this point the driver is defunct with sysfs interfaces. User has to run
> > >>>> rmmod to get rid of the defunct driver.
> > >>>>
> > >>>> Simply returning the error from __driver_attach() didn't work as expected.
> > >>>> I figured it would fail since not all interface drivers can handle errors
> > >>>> from driver probe routines.
> > >>>>
> > >>>> I propose that we discuss the scenario to find possible solutions to avoid
> > >>>> defunct drivers.
> > >>>
> > >>> This seems to be the expected behaviour to me. The probe failure doesn't
> > >>> necessarily indicate that the driver is at fault, it means that
> > >>> something went wrong when associating a particular device with the
> > >>> driver. It could be that the device is faulty for instance, and that
> > >>> shouldn't prevent the driver from being registered, especially if
> > >>> multiple instances of the device can be present in the system, as that
> > >>> would then prevent any of those instances from working due to one faulty
> > >>> device.
> > >>
> > >> Agreed. This behavior works well in the cases of hardware/device failures
> > >> that cause probe failure. The case I am seeing is a driver bug that causes
> > >> probe failure.
> > > 
> > > Is there a way for the kernel to determine that the probe failure was
> > > caused by a buggy driver and not a faulty device ?
> > > 
> > 
> > That has to be explored.
> > 
> > >>> What other behaviour would you expect ?
> > >>
> > >> I am looking to see if we can propagate the error to the interface driver to
> > >> handle instead of leaving the defunct driver. This isn't an easy problem to
> > >> solve though. As you mentioned driver probe could fail if device is bad
> > >> and we want the driver to handle the others.
> > >>
> > >> The fact is we will end up with defunct drivers in some cases. If user notices
> > >> the error they could go clean it up. My main concern is the sysfs interfaces
> > >> hanging around. The desired behavior would be not leaving defunct drivers with
> > >> associated sysfs files.
> > > 
> > > I don't think the driver is "defunct". It has been loaded successfully,
> > > and it's fully operational, just not bound to any device.
> > > 
> > 
> > Not in the case I am debugging. It won't be successfully bound to any device.
> > That is what I meant by defunct. Maybe there is a better word to use.
> > 
> > The driver releases all resources in its probe failure path.
> 
> If the driver is bad there is no way for the kernel to know.
> 
> Are you perhaps looking for a technique to unload the driver if another
> driver knows that it is indeed ok to unload the driver if it does not
> attach to its intended device?
> 
> You mention the interface driver getting involved. The interface driver
> could do something like:
> 
>     device_add(dev);
>     device_lock(dev);
>     if (!dev->driver)
>         driver_unregister(drv);
>     device_unlock(dev);
> 
> ...but that would need to know that nothing else needs @drv and that
> @drv has been registered and ->probe() run synchronous with
> device_add(). That does not work with the async and deferred probing
> options.

Nor does this work for drivers for devices that might be hot-plugged,
instantiated via sysfs/new_id, etc, etc.

So if you want to fail driver registration because you know that you are
dealing with a singleton platform device with no additional dependencies
then it has to be done in the driver itself, not by the driver core.
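
To make the driver-side check concrete, here is a minimal userspace sketch,
not kernel code. Every name in it (toy_device, toy_driver, toy_bus_dev,
toy_driver_register() and so on) is hypothetical and exists only for
illustration. It assumes a single, non-hotpluggable device and synchronous
probing, which are exactly the conditions under which this is safe. The toy
register function mirrors the behavior described above: it logs a probe
failure and still returns 0. In the kernel, platform_driver_probe() is, as
far as I know, the existing helper with this shape: it returns -ENODEV when
no device was bound.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

struct toy_driver;

/* Toy stand-ins for illustration; these are NOT the kernel's structures. */
struct toy_device {
	const char *name;
	struct toy_driver *driver;	/* NULL while unbound */
};

struct toy_driver {
	const char *name;
	int (*probe)(struct toy_device *dev);
	bool registered;
};

/* Hypothetical "bus" with a single device slot (the singleton case). */
static struct toy_device *toy_bus_dev;

/*
 * Models the behavior discussed in the thread: a probe failure is
 * logged and dropped, and registration itself still reports success.
 */
static int toy_driver_register(struct toy_driver *drv)
{
	drv->registered = true;
	if (toy_bus_dev && !toy_bus_dev->driver) {
		int ret = drv->probe(toy_bus_dev);

		if (ret)
			fprintf(stderr, "%s: probe of %s failed with error %d\n",
				drv->name, toy_bus_dev->name, ret);
		else
			toy_bus_dev->driver = drv;
	}
	return 0;
}

static void toy_driver_unregister(struct toy_driver *drv)
{
	assert(drv->registered);
	if (toy_bus_dev && toy_bus_dev->driver == drv)
		toy_bus_dev->driver = NULL;
	drv->registered = false;
}

/*
 * Driver-side policy: this driver knows it serves exactly one device,
 * so "registered but not bound" is treated as a failed init.
 */
static int toy_singleton_init(struct toy_driver *drv)
{
	int ret = toy_driver_register(drv);

	if (ret)
		return ret;
	if (!toy_bus_dev || toy_bus_dev->driver != drv) {
		toy_driver_unregister(drv);
		return -ENODEV;
	}
	return 0;
}

/* Two sample probe routines: one that binds, one that always fails. */
static int toy_probe_ok(struct toy_device *dev) { (void)dev; return 0; }
static int toy_probe_broken(struct toy_device *dev) { (void)dev; return -EINVAL; }
```

With toy_bus_dev pointing at an unbound device, toy_singleton_init() on a
driver using toy_probe_broken() returns -ENODEV and leaves nothing
registered, while plain toy_driver_register() on the same driver returns 0.
The policy lives in the driver's init path because only the driver knows it
has no other device to wait for.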

Thanks.

-- 
Dmitry

Thread overview: 11+ messages
2022-06-23 23:05 [TECH TOPIC] Driver probe fails and register succeeds Shuah Khan
2022-06-23 23:13 ` Laurent Pinchart
2022-06-23 23:28   ` Shuah Khan
2022-06-23 23:30     ` Laurent Pinchart
2022-06-23 23:38       ` Shuah Khan
2022-06-23 23:57         ` Dan Williams
2022-06-24  1:00           ` Dmitry Torokhov [this message]
2022-06-24  6:33             ` Greg KH
2022-06-23 23:24 ` Guenter Roeck
2022-06-24  6:31 ` Greg KH
2022-06-24 15:55   ` Shuah Khan
