dri-devel.lists.freedesktop.org archive mirror
 help / color / mirror / Atom feed
From: Hans de Goede <hdegoede@redhat.com>
To: Daniel Vetter <daniel@ffwll.ch>
Cc: Heikki Krogerus <heikki.krogerus@linux.intel.com>,
	dri-devel <dri-devel@lists.freedesktop.org>,
	David Airlie <airlied@linux.ie>,
	Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	intel-gfx <intel-gfx@lists.freedesktop.org>,
	Platform Driver <platform-driver-x86@vger.kernel.org>,
	USB list <linux-usb@vger.kernel.org>,
	Thomas Zimmermann <tzimmermann@suse.de>,
	Rodrigo Vivi <rodrigo.vivi@intel.com>,
	Guenter Roeck <linux@roeck-us.net>
Subject: Re: [PATCH 1/9] drm/connector: Make the drm_sysfs connector->kdev device hold a reference to the connector
Date: Fri, 30 Apr 2021 15:32:39 +0200	[thread overview]
Message-ID: <5a6fc5d6-a218-8566-6b19-b4ae7d763210@redhat.com> (raw)
In-Reply-To: <CAKMK7uFf8n6QfRdSXeB6J+L7NPGbeEyJKhx1Vu7x8env=_7tkA@mail.gmail.com>

Hi,

On 4/30/21 1:38 PM, Daniel Vetter wrote:
> On Fri, Apr 30, 2021 at 1:28 PM Hans de Goede <hdegoede@redhat.com> wrote:
>>
>> Hi,
>>
>> On 4/29/21 9:09 PM, Daniel Vetter wrote:
>>> On Thu, Apr 29, 2021 at 02:33:17PM +0200, Hans de Goede wrote:
>>>> Hi,
>>>>
>>>> On 4/29/21 2:04 PM, Daniel Vetter wrote:
>>>>> On Thu, Apr 29, 2021 at 01:54:46PM +0200, Greg Kroah-Hartman wrote:
>>>>>> On Thu, Apr 29, 2021 at 01:40:28PM +0200, Daniel Vetter wrote:
>>>>>>> On Wed, Apr 28, 2021 at 11:52:49PM +0200, Hans de Goede wrote:
>>>>>>>> Userspace could hold open a reference to the connector->kdev device,
>>>>>>>> through e.g. holding a sysfs-atrtribute open after
>>>>>>>> drm_sysfs_connector_remove() has been called. In this case the connector
>>>>>>>> could be free-ed while the connector->kdev device's drvdata is still
>>>>>>>> pointing to it.
>>>>>>>>
>>>>>>>> Give drm_connector devices there own device type, which allows
>>>>>>>> us to specify our own release function and make drm_sysfs_connector_add()
>>>>>>>> take a reference on the connector object, and have the new release
>>>>>>>> function put the reference when the device is released.
>>>>>>>>
>>>>>>>> Giving drm_connector devices there own device type, will also allow
>>>>>>>> checking if a device is a drm_connector device with a
>>>>>>>> "if (device->type == &drm_sysfs_device_connector)" check.
>>>>>>>>
>>>>>>>> Note that the setting of the name member of the device_type struct will
>>>>>>>> cause udev events for drm_connector-s to now contain DEVTYPE=drm_connector
>>>>>>>> as extra info. So this extends the uevent part of the userspace API.
>>>>>>>>
>>>>>>>> Signed-off-by: Hans de Goede <hdegoede@redhat.com>
>>>>>>>
>>>>>>> Are you sure? I thought sysfs is supposed to flush out any pending
>>>>>>> operations (they complete fast) and handle open fd internally?
>>>>>>
>>>>>> Yes, it "should" :)
>>>>>
>>>>> Thanks for confirming my vague memories :-)
>>>>>
>>>>> Hans, pls drop this one.
>>>>
>>>> Please see my earlier reply to your review of this patch, it is
>>>> still needed but for a different reason:
>>>>
>>>> """
>>>> We still need this change though to make sure that the
>>>> "drm/connector: Add drm_connector_find_by_fwnode() function"
>>>> does not end up following a dangling drvdat pointer from one
>>>> if the drm_connector kdev-s.
>>>>
>>>> The class_dev_iter_init() in drm_connector_find_by_fwnode() gets
>>>> a reference on all devices and between getting that reference
>>>> and it calling drm_connector_get() - drm_connector_unregister()
>>>> may run and drop the possibly last reference to the
>>>> drm_connector object, freeing it and leaving the kdev's
>>>> drvdata as a dangling pointer.
>>>> """
>>>>
>>>> This is actually why I added it initially, and while adding it
>>>> I came up with this wrong theory of why it was necessary independently
>>>> of the drm_connector_find_by_fwnode() addition, sorry about that.
>>>
>>> Generally that's handled by a kref_get_unless_zero under the protection of
>>> the lock which protects the weak reference. Which I think is the right
>>> model here (at a glance at least) since this is a lookup function.
>>
>> I'm afraid that things are a bit more complicated here. The idea here
>> is that we have a subsystem outside of the DRM subsystem which received
>> a hotplug event for a drm-connector.  The only info which this subsystem
>> has is a reference on the fwnode level (either through device-tree or
>> to platform-code instantiating software-fwnode-s + links for this).
>>
>> So in order to deliver the hotplug event to the connector we need
>> to lookup the connector by fwnode.
>>
>> I've chosen to implement this by iterating over all drm_class
>> devices with a dev_type of drm_connector using class_dev_iter_init()
>> and friends. This makes sure that we either get a reference to
>> the device, or that we skip the device if it is being deleted.
>>
>> But this just gives us a reference to the connector->kdev, not
>> to the connector itself. A pointer to the connector itself is stored
>> as drvdata inside the device, but without taking a reference as
>> this patch does, there is no guarantee that that pointer does not
>> point to possibly free-ed mem.
>>
>> We could set drvdata to 0 from drm_sysfs_connector_remove()
>> Before calling device_unregister(connector->kdev) and then do
>> something like this inside drm_connector_find_by_fwnode():
>>
>> /*
>>  * Lock the device to ensure we either see the drvdata == NULL
>>  * set by drm_sysfs_connector_remove(); or we block the removal
>>  * from continuing until we are done with the device.
>>  */
>> device_lock(dev);
>> connector = dev_get_drvdata(dev);
>> if (connector && connector->fwnode == fwnode) {
>>         drm_connector_get(connector);
>>         found = connector;
>> }
>> device_unlock(dev);
> 
> Yes this is what I mean. Except not a drm_connector_get, but a
> kref_get_unless_zero. The connector might already be on it's way out,
> but the drvdata not yet cleared.

The function we race with is drm_sysfs_connector_remove() and either:

1. The lookup wins the race in which case drm_sysfs_connector_remove()
   can only complete after the drm_connector_get(); and the connector
   kref won't drop to 0 before drm_sysfs_connector_remove() completes; or
2. drm_sysfs_connector_remove() wins the race in which case drvdata will
   be 0.

So using kref_get_unless_zero here will not make a difference and
requires poking inside the drm_connector internals.

Note I will probably go with your suggestion below, so whether or
not to use kref_get_unless_zero here is likely no longer relevant.

>> With the device_lock() synchronizing against the device_lock()
>> in device_unregister(connector->kdev). So that we either see
>> drvdata == NULL if we race with unregistering; or we get
>> a reference on the drm_connector obj before its ref-count can
>> drop to 0.
> 
> The trouble is that most connectors aren't full drivers on their kdev.
> So this isn't the right lock. We need another lock which protects the
> drvdata pointer appropriately for drm connectors.
> 
>> There might be places though where we call code take the device_lock
>> while holding a lock necessary for the drm_connector_get() , so
>> this approach might lead to an AB BA deadlock. As such I think
>> my original approach is better (also see below).
>>
>>> Lookup tables holding full references tends to lead to all kinds of bad
>>> side effects.
>>
>> The proposed reference is not part of a lookup list, it is a
>> reference from the kdev on the drm_connector object which gets
>> dropped as soon as the kdev's refcount hits 0, which normally
>> happens directly after drm_connector_unregister() has run.
> 
> Yeah but the way you use it is for lookup purposes. What we're
> implementing is the "get me the drm_connector for this fwnode"
> functionality, and that _is_ a lookup.

Ack.

> How its implemented is an
> internal detail really, and somehow using full references for lookup
> functionality isn't great.

Ok, note that the caller of this only needs the reference for a
short while, what the caller does is:

        connector = drm_connector_find_by_fwnode(dp->connector_fwnode);
        if (connector) {
                drm_connector_oob_hotplug_event(connector, &data);
                drm_connector_put(connector);
        }

As a result of out discussion I have been thinking about enforcing this
short-lifetime of the reference by changing:

void drm_connector_oob_hotplug_event(struct drm_connector *connector,
                                     struct drm_connector_oob_hotplug_event_data *data);

to:

void drm_connector_oob_hotplug_event(struct fwnode_handle connector_fwnode,
                                     struct drm_connector_oob_hotplug_event_data *data);

And making that do the lookup (+ almost immediate put) internally, making
the connector-lookup a purely drm-subsys internal thing and enforcing code
outside of the drm-subsys not holding a long-time reference to the connector
this way.

Please let me know if you prefer the variant where the connector lookup
details are hidden from the callers ?

Then I can change this for for v2 of this patch/series.

> I'm also not sure why we have to use the kdev stuff here. For other
> random objects we need to look up we're building that functionality on
> that object. It means you need to keep another list_head around for
> that lookup, but that's really not a big cost. E.g. drm_bridge/panel
> work like that.

Using class_for_each_dev seemed like a good way to iterate over all
the connectors. But given the discussion this has caused, just adding
a new static list + mutex for this to drivers/gpu/drm/drm_connector.c
sounds like it might be a better approach indeed.

So shall I change thing over to this approach for v2 of this patch/series?

Regards,

Hans

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

  reply	other threads:[~2021-04-30 13:32 UTC|newest]

Thread overview: 27+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-04-28 21:52 [PATCH 0/9] drm + usb-type-c: Add support for out-of-band hotplug notification Hans de Goede
2021-04-28 21:52 ` [PATCH 1/9] drm/connector: Make the drm_sysfs connector->kdev device hold a reference to the connector Hans de Goede
2021-04-29 11:40   ` Daniel Vetter
2021-04-29 11:54     ` Greg Kroah-Hartman
2021-04-29 12:04       ` Daniel Vetter
2021-04-29 12:33         ` Hans de Goede
2021-04-29 19:09           ` Daniel Vetter
2021-04-30 11:28             ` Hans de Goede
2021-04-30 11:38               ` Daniel Vetter
2021-04-30 13:32                 ` Hans de Goede [this message]
2021-04-30 14:30                   ` Daniel Vetter
2021-04-30 13:45                 ` Hans de Goede
2021-04-29 12:30     ` Hans de Goede
2021-04-28 21:52 ` [PATCH 2/9] drm/connector: Add a fwnode pointer to drm_connector and register with ACPI Hans de Goede
2021-04-28 21:52 ` [PATCH 3/9] drm/connector: Add drm_connector_find_by_fwnode() function Hans de Goede
2021-04-28 21:52 ` [PATCH 4/9] drm/connector: Add support for out-of-band hotplug notification Hans de Goede
2021-05-03  8:00   ` Heikki Krogerus
2021-05-03 14:35     ` Hans de Goede
2021-05-04 14:52       ` Heikki Krogerus
2021-05-04 15:34         ` Hans de Goede
2021-05-04 12:53     ` Imre Deak
2021-04-28 21:52 ` [PATCH 5/9] drm/i915: Associate ACPI connector nodes with connector entries Hans de Goede
2021-04-28 21:52 ` [PATCH 6/9] drm/i915/dp: Add support for out-of-bound hotplug events Hans de Goede
2021-04-28 21:52 ` [PATCH 7/9] usb: typec: altmodes/displayport: Make dp_altmode_notify() more generic Hans de Goede
2021-04-28 21:52 ` [PATCH 8/9] usb: typec: altmodes/displayport: Notify drm subsys of hotplug events Hans de Goede
2021-04-28 21:52 ` [PATCH 9/9] platform/x86/intel_cht_int33fe: Correct "displayport" fwnode reference Hans de Goede
2021-04-28 21:57 ` [PATCH 0/9] drm + usb-type-c: Add support for out-of-band hotplug notification Hans de Goede

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=5a6fc5d6-a218-8566-6b19-b4ae7d763210@redhat.com \
    --to=hdegoede@redhat.com \
    --cc=airlied@linux.ie \
    --cc=daniel@ffwll.ch \
    --cc=dri-devel@lists.freedesktop.org \
    --cc=gregkh@linuxfoundation.org \
    --cc=heikki.krogerus@linux.intel.com \
    --cc=intel-gfx@lists.freedesktop.org \
    --cc=linux-usb@vger.kernel.org \
    --cc=linux@roeck-us.net \
    --cc=platform-driver-x86@vger.kernel.org \
    --cc=rodrigo.vivi@intel.com \
    --cc=tzimmermann@suse.de \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).