All of lore.kernel.org
 help / color / mirror / Atom feed
* Armada DRM: bridge with componentized devices
@ 2019-01-03  9:47 Lubomir Rintel
  2019-01-03 13:11 ` Russell King - ARM Linux
  0 siblings, 1 reply; 53+ messages in thread
From: Lubomir Rintel @ 2019-01-03  9:47 UTC (permalink / raw)
  To: DRI Development; +Cc: Russell King, Liviu Dudau, Peter Rosin

Hello,

lately I've been trying to make the Himax HX8837 chip that drives the OLPC
LCD display work with Armada DRM driver. I've been advised to create a
bridge driver and not an encoder driver since the silicon is separate from
the LCDC.

The Armada DRM driver (and, I think, the i.MX one) creates the drm_device
once the component infrastructure sees the necessary sub-devices appear.
The sub-devices being the LCDCs and the encoders (not bridges) that it
expects to be created externally.

Currently, it seems, the only driver that can actually work with this (that
is -- creates a drm_encoder for a drm_device when the component is bound)
is the tda998x. All other similar drivers create a drm_bridge instead and
not use the component infrastructure at all. (In fact, tilcdc driver
contains a  hack to handle tda998x specially.)

I'm wondering how to reconcile the two?

* The tda998x driver has recently been modified to create a bridge on probe
  and eventually encoder on component bind. Is this an okay thing to do in
  a new driver? (this probably means the tilcdc hack can be removed...)

* If a non-componentized bridge were to be used (along with a dummy 
  encoder), at what point would it make sense to look for the bridge? 
  Would it be a good idea to defer the probe of crtc until a bridge can be 
  looked up and the attach it on component bind?  What if the bridge goes 
  away (a module is unloaded, etc.) in between?

I'd be thankful for opintions and advice before I move ahead with this.

Thanks,
Lubo

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: Armada DRM: bridge with componentized devices
  2019-01-03  9:47 Armada DRM: bridge with componentized devices Lubomir Rintel
@ 2019-01-03 13:11 ` Russell King - ARM Linux
  2019-01-07 10:45   ` Daniel Vetter
  0 siblings, 1 reply; 53+ messages in thread
From: Russell King - ARM Linux @ 2019-01-03 13:11 UTC (permalink / raw)
  To: Lubomir Rintel; +Cc: Liviu Dudau, Peter Rosin, DRI Development

On Thu, Jan 03, 2019 at 10:47:27AM +0100, Lubomir Rintel wrote:
> Hello,
> 
> lately I've been trying to make the Himax HX8837 chip that drives the OLPC
> LCD display work with Armada DRM driver. I've been advised to create a
> bridge driver and not an encoder driver since the silicon is separate from
> the LCDC.
> 
> The Armada DRM driver (and, I think, the i.MX one) creates the drm_device
> once the component infrastructure sees the necessary sub-devices appear.
> The sub-devices being the LCDCs and the encoders (not bridges) that it
> expects to be created externally.
> 
> Currently, it seems, the only driver that can actually work with this (that
> is -- creates a drm_encoder for a drm_device when the component is bound)
> is the tda998x. All other similar drivers create a drm_bridge instead and
> not use the component infrastructure at all. (In fact, tilcdc driver
> contains a  hack to handle tda998x specially.)
> 
> I'm wondering how to reconcile the two?
> 
> * The tda998x driver has recently been modified to create a bridge on probe
>   and eventually encoder on component bind. Is this an okay thing to do in
>   a new driver? (this probably means the tilcdc hack can be removed...)
> 
> * If a non-componentized bridge were to be used (along with a dummy 
>   encoder), at what point would it make sense to look for the bridge? 
>   Would it be a good idea to defer the probe of crtc until a bridge can be 
>   looked up and the attach it on component bind?  What if the bridge goes 
>   away (a module is unloaded, etc.) in between?
> 
> I'd be thankful for opintions and advice before I move ahead with this.

This is the long-standing problem with the conflict between bridge
support and component support, and I'm not sure that there is really
any answer to it.

I've gone into the details of the two several times on the list,
particularly about the short-comings of the bridge approach, but it
seems no one cares to fix those short-comings.

You are re-identifying some of the issues that I've already pointed
out - such as what happens to DRM drives when the bridge driver is
unbound (it's really not about modules being unloaded, and the problem
can't be solved by taking a module reference count - all that the
module reference count does is ensure that the module doesn't go
away unexpected, there is no way to ensure that the device isn't
unbound.)

The issue of unbinding is precisely the issue which the component
support was created to solve - but everyone seems to prefer the buggy
bridge approach, and no one seems willing to do anything about the
bugs or even acknowledge that it's a problem.  It's strange - if one
identifies bugs that result in kernel oops in other kernel subsystems,
one is generally taken seriously and the problem is solved.

The issue about the encoders is something that I've tried to discuss,
and I've pointed out that moving it into the DRM driver adds additional
complexity there, and I'd hoped that my patch set I posted would've
generated discussion, but alas not.

What I'm not prepared to do is to introduce _known_ bugs into any
driver that I maintain.

-- 
RMK's Patch system: http://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line in suburbia: sync at 12.1Mbps down 622kbps up
According to speedtest.net: 11.9Mbps down 500kbps up
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: Armada DRM: bridge with componentized devices
  2019-01-03 13:11 ` Russell King - ARM Linux
@ 2019-01-07 10:45   ` Daniel Vetter
  2019-01-07 11:26     ` Andrzej Hajda
  2019-01-07 16:12     ` Russell King - ARM Linux
  0 siblings, 2 replies; 53+ messages in thread
From: Daniel Vetter @ 2019-01-07 10:45 UTC (permalink / raw)
  To: Russell King - ARM Linux
  Cc: Lubomir Rintel, Liviu Dudau, Peter Rosin, DRI Development

On Thu, Jan 03, 2019 at 01:11:47PM +0000, Russell King - ARM Linux wrote:
> On Thu, Jan 03, 2019 at 10:47:27AM +0100, Lubomir Rintel wrote:
> > Hello,
> > 
> > lately I've been trying to make the Himax HX8837 chip that drives the OLPC
> > LCD display work with Armada DRM driver. I've been advised to create a
> > bridge driver and not an encoder driver since the silicon is separate from
> > the LCDC.
> > 
> > The Armada DRM driver (and, I think, the i.MX one) creates the drm_device
> > once the component infrastructure sees the necessary sub-devices appear.
> > The sub-devices being the LCDCs and the encoders (not bridges) that it
> > expects to be created externally.
> > 
> > Currently, it seems, the only driver that can actually work with this (that
> > is -- creates a drm_encoder for a drm_device when the component is bound)
> > is the tda998x. All other similar drivers create a drm_bridge instead and
> > not use the component infrastructure at all. (In fact, tilcdc driver
> > contains a  hack to handle tda998x specially.)
> > 
> > I'm wondering how to reconcile the two?
> > 
> > * The tda998x driver has recently been modified to create a bridge on probe
> >   and eventually encoder on component bind. Is this an okay thing to do in
> >   a new driver? (this probably means the tilcdc hack can be removed...)
> > 
> > * If a non-componentized bridge were to be used (along with a dummy 
> >   encoder), at what point would it make sense to look for the bridge? 
> >   Would it be a good idea to defer the probe of crtc until a bridge can be 
> >   looked up and the attach it on component bind?  What if the bridge goes 
> >   away (a module is unloaded, etc.) in between?
> > 
> > I'd be thankful for opintions and advice before I move ahead with this.
> 
> This is the long-standing problem with the conflict between bridge
> support and component support, and I'm not sure that there is really
> any answer to it.
> 
> I've gone into the details of the two several times on the list,
> particularly about the short-comings of the bridge approach, but it
> seems no one cares to fix those short-comings.
> 
> You are re-identifying some of the issues that I've already pointed
> out - such as what happens to DRM drives when the bridge driver is
> unbound (it's really not about modules being unloaded, and the problem
> can't be solved by taking a module reference count - all that the
> module reference count does is ensure that the module doesn't go
> away unexpected, there is no way to ensure that the device isn't
> unbound.)
> 
> The issue of unbinding is precisely the issue which the component
> support was created to solve - but everyone seems to prefer the buggy
> bridge approach, and no one seems willing to do anything about the
> bugs or even acknowledge that it's a problem.  It's strange - if one
> identifies bugs that result in kernel oops in other kernel subsystems,
> one is generally taken seriously and the problem is solved.

Unbinding is really not the most important feature, especially for SoC. If
you feel different, working together with others, getting some agreement,
getting the patches reviewed and finding someone to get them merged is
very much appreciated. But just complaining won't move this forward.

> The issue about the encoders is something that I've tried to discuss,
> and I've pointed out that moving it into the DRM driver adds additional
> complexity there, and I'd hoped that my patch set I posted would've
> generated discussion, but alas not.
> 
> What I'm not prepared to do is to introduce _known_ bugs into any
> driver that I maintain.

I thought last time around the idea was to use device links (and teach
drm_bridge about them), so that the unloading works correctly.

Wrt tda988x: I think it really shouldn't create a drm_encoder, nor
register as a component. Fixing that is probably a bit more work.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: Armada DRM: bridge with componentized devices
  2019-01-07 10:45   ` Daniel Vetter
@ 2019-01-07 11:26     ` Andrzej Hajda
  2019-01-07 16:08       ` Daniel Vetter
  2019-01-07 16:12     ` Russell King - ARM Linux
  1 sibling, 1 reply; 53+ messages in thread
From: Andrzej Hajda @ 2019-01-07 11:26 UTC (permalink / raw)
  To: Daniel Vetter, Russell King - ARM Linux
  Cc: Lubomir Rintel, Liviu Dudau, Peter Rosin, DRI Development

On 07.01.2019 11:45, Daniel Vetter wrote:
> On Thu, Jan 03, 2019 at 01:11:47PM +0000, Russell King - ARM Linux wrote:
>> On Thu, Jan 03, 2019 at 10:47:27AM +0100, Lubomir Rintel wrote:
>>> Hello,
>>>
>>> lately I've been trying to make the Himax HX8837 chip that drives the OLPC
>>> LCD display work with Armada DRM driver. I've been advised to create a
>>> bridge driver and not an encoder driver since the silicon is separate from
>>> the LCDC.
>>>
>>> The Armada DRM driver (and, I think, the i.MX one) creates the drm_device
>>> once the component infrastructure sees the necessary sub-devices appear.
>>> The sub-devices being the LCDCs and the encoders (not bridges) that it
>>> expects to be created externally.
>>>
>>> Currently, it seems, the only driver that can actually work with this (that
>>> is -- creates a drm_encoder for a drm_device when the component is bound)
>>> is the tda998x. All other similar drivers create a drm_bridge instead and
>>> not use the component infrastructure at all. (In fact, tilcdc driver
>>> contains a  hack to handle tda998x specially.)
>>>
>>> I'm wondering how to reconcile the two?
>>>
>>> * The tda998x driver has recently been modified to create a bridge on probe
>>>   and eventually encoder on component bind. Is this an okay thing to do in
>>>   a new driver? (this probably means the tilcdc hack can be removed...)
>>>
>>> * If a non-componentized bridge were to be used (along with a dummy 
>>>   encoder), at what point would it make sense to look for the bridge? 
>>>   Would it be a good idea to defer the probe of crtc until a bridge can be 
>>>   looked up and the attach it on component bind?  What if the bridge goes 
>>>   away (a module is unloaded, etc.) in between?
>>>
>>> I'd be thankful for opintions and advice before I move ahead with this.
>> This is the long-standing problem with the conflict between bridge
>> support and component support, and I'm not sure that there is really
>> any answer to it.
>>
>> I've gone into the details of the two several times on the list,
>> particularly about the short-comings of the bridge approach, but it
>> seems no one cares to fix those short-comings.
>>
>> You are re-identifying some of the issues that I've already pointed
>> out - such as what happens to DRM drives when the bridge driver is
>> unbound (it's really not about modules being unloaded, and the problem
>> can't be solved by taking a module reference count - all that the
>> module reference count does is ensure that the module doesn't go
>> away unexpected, there is no way to ensure that the device isn't
>> unbound.)
>>
>> The issue of unbinding is precisely the issue which the component
>> support was created to solve - but everyone seems to prefer the buggy
>> bridge approach, and no one seems willing to do anything about the
>> bugs or even acknowledge that it's a problem.  It's strange - if one
>> identifies bugs that result in kernel oops in other kernel subsystems,
>> one is generally taken seriously and the problem is solved.
> Unbinding is really not the most important feature, especially for SoC. If
> you feel different, working together with others, getting some agreement,
> getting the patches reviewed and finding someone to get them merged is
> very much appreciated. But just complaining won't move this forward.
>
>> The issue about the encoders is something that I've tried to discuss,
>> and I've pointed out that moving it into the DRM driver adds additional
>> complexity there, and I'd hoped that my patch set I posted would've
>> generated discussion, but alas not.
>>
>> What I'm not prepared to do is to introduce _known_ bugs into any
>> driver that I maintain.
> I thought last time around the idea was to use device links (and teach
> drm_bridge about them), so that the unloading works correctly.


With current device_links implementation registering links in probe of
the consumer is just incorrect - it can happen that neither consumer
neither provider is fully probed and creating device links in such state
is wrong according to docs, and my experiments. See [1] for discussion
and [2] for docs.


[1]: https://patchwork.freedesktop.org/patch/218878/

[2]:
https://www.kernel.org/doc/html/latest/driver-api/device_link.html#usage


Regards

Andrzej


>
> Wrt tda988x: I think it really shouldn't create a drm_encoder, nor
> register as a component. Fixing that is probably a bit more work.
> -Daniel


_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: Armada DRM: bridge with componentized devices
  2019-01-07 11:26     ` Andrzej Hajda
@ 2019-01-07 16:08       ` Daniel Vetter
  2019-01-07 16:27         ` Andrzej Hajda
  0 siblings, 1 reply; 53+ messages in thread
From: Daniel Vetter @ 2019-01-07 16:08 UTC (permalink / raw)
  To: Andrzej Hajda
  Cc: Liviu Dudau, Russell King - ARM Linux, DRI Development,
	Lubomir Rintel, Peter Rosin

On Mon, Jan 07, 2019 at 12:26:58PM +0100, Andrzej Hajda wrote:
> On 07.01.2019 11:45, Daniel Vetter wrote:
> > On Thu, Jan 03, 2019 at 01:11:47PM +0000, Russell King - ARM Linux wrote:
> >> On Thu, Jan 03, 2019 at 10:47:27AM +0100, Lubomir Rintel wrote:
> >>> Hello,
> >>>
> >>> lately I've been trying to make the Himax HX8837 chip that drives the OLPC
> >>> LCD display work with Armada DRM driver. I've been advised to create a
> >>> bridge driver and not an encoder driver since the silicon is separate from
> >>> the LCDC.
> >>>
> >>> The Armada DRM driver (and, I think, the i.MX one) creates the drm_device
> >>> once the component infrastructure sees the necessary sub-devices appear.
> >>> The sub-devices being the LCDCs and the encoders (not bridges) that it
> >>> expects to be created externally.
> >>>
> >>> Currently, it seems, the only driver that can actually work with this (that
> >>> is -- creates a drm_encoder for a drm_device when the component is bound)
> >>> is the tda998x. All other similar drivers create a drm_bridge instead and
> >>> not use the component infrastructure at all. (In fact, tilcdc driver
> >>> contains a  hack to handle tda998x specially.)
> >>>
> >>> I'm wondering how to reconcile the two?
> >>>
> >>> * The tda998x driver has recently been modified to create a bridge on probe
> >>>   and eventually encoder on component bind. Is this an okay thing to do in
> >>>   a new driver? (this probably means the tilcdc hack can be removed...)
> >>>
> >>> * If a non-componentized bridge were to be used (along with a dummy 
> >>>   encoder), at what point would it make sense to look for the bridge? 
> >>>   Would it be a good idea to defer the probe of crtc until a bridge can be 
> >>>   looked up and the attach it on component bind?  What if the bridge goes 
> >>>   away (a module is unloaded, etc.) in between?
> >>>
> >>> I'd be thankful for opintions and advice before I move ahead with this.
> >> This is the long-standing problem with the conflict between bridge
> >> support and component support, and I'm not sure that there is really
> >> any answer to it.
> >>
> >> I've gone into the details of the two several times on the list,
> >> particularly about the short-comings of the bridge approach, but it
> >> seems no one cares to fix those short-comings.
> >>
> >> You are re-identifying some of the issues that I've already pointed
> >> out - such as what happens to DRM drives when the bridge driver is
> >> unbound (it's really not about modules being unloaded, and the problem
> >> can't be solved by taking a module reference count - all that the
> >> module reference count does is ensure that the module doesn't go
> >> away unexpected, there is no way to ensure that the device isn't
> >> unbound.)
> >>
> >> The issue of unbinding is precisely the issue which the component
> >> support was created to solve - but everyone seems to prefer the buggy
> >> bridge approach, and no one seems willing to do anything about the
> >> bugs or even acknowledge that it's a problem.  It's strange - if one
> >> identifies bugs that result in kernel oops in other kernel subsystems,
> >> one is generally taken seriously and the problem is solved.
> > Unbinding is really not the most important feature, especially for SoC. If
> > you feel different, working together with others, getting some agreement,
> > getting the patches reviewed and finding someone to get them merged is
> > very much appreciated. But just complaining won't move this forward.
> >
> >> The issue about the encoders is something that I've tried to discuss,
> >> and I've pointed out that moving it into the DRM driver adds additional
> >> complexity there, and I'd hoped that my patch set I posted would've
> >> generated discussion, but alas not.
> >>
> >> What I'm not prepared to do is to introduce _known_ bugs into any
> >> driver that I maintain.
> > I thought last time around the idea was to use device links (and teach
> > drm_bridge about them), so that the unloading works correctly.
> 
> 
> With current device_links implementation registering links in probe of
> the consumer is just incorrect - it can happen that neither consumer
> neither provider is fully probed and creating device links in such state
> is wrong according to docs, and my experiments. See [1] for discussion
> and [2] for docs.

We could set up the device link only at drm_dev_register time. At that point
we know that driver loading has fully succeeded. We do have a list of
bridges at hand, so that's not going to be an issue.

The only case where this breaks is if a driver is still using the
deprecated ->load callback. But that ->load callback doesn't mix well with
EDEFER_PROBE/component framework anyway, so I think not going to be a
problem hopefully?
-Daniel
> 
> 
> [1]: https://patchwork.freedesktop.org/patch/218878/
> 
> [2]:
> https://www.kernel.org/doc/html/latest/driver-api/device_link.html#usage
> 
> 
> Regards
> 
> Andrzej
> 
> 
> >
> > Wrt tda988x: I think it really shouldn't create a drm_encoder, nor
> > register as a component. Fixing that is probably a bit more work.
> > -Daniel
> 
> 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: Armada DRM: bridge with componentized devices
  2019-01-07 10:45   ` Daniel Vetter
  2019-01-07 11:26     ` Andrzej Hajda
@ 2019-01-07 16:12     ` Russell King - ARM Linux
  2019-01-07 21:55       ` Daniel Vetter
  1 sibling, 1 reply; 53+ messages in thread
From: Russell King - ARM Linux @ 2019-01-07 16:12 UTC (permalink / raw)
  To: Daniel Vetter; +Cc: Lubomir Rintel, Liviu Dudau, Peter Rosin, DRI Development

On Mon, Jan 07, 2019 at 11:45:32AM +0100, Daniel Vetter wrote:
> On Thu, Jan 03, 2019 at 01:11:47PM +0000, Russell King - ARM Linux wrote:
> > This is the long-standing problem with the conflict between bridge
> > support and component support, and I'm not sure that there is really
> > any answer to it.
> > 
> > I've gone into the details of the two several times on the list,
> > particularly about the short-comings of the bridge approach, but it
> > seems no one cares to fix those short-comings.
> > 
> > You are re-identifying some of the issues that I've already pointed
> > out - such as what happens to DRM drives when the bridge driver is
> > unbound (it's really not about modules being unloaded, and the problem
> > can't be solved by taking a module reference count - all that the
> > module reference count does is ensure that the module doesn't go
> > away unexpected, there is no way to ensure that the device isn't
> > unbound.)
> > 
> > The issue of unbinding is precisely the issue which the component
> > support was created to solve - but everyone seems to prefer the buggy
> > bridge approach, and no one seems willing to do anything about the
> > bugs or even acknowledge that it's a problem.  It's strange - if one
> > identifies bugs that result in kernel oops in other kernel subsystems,
> > one is generally taken seriously and the problem is solved.
> 
> Unbinding is really not the most important feature, especially for SoC. If
> you feel different, working together with others, getting some agreement,
> getting the patches reviewed and finding someone to get them merged is
> very much appreciated. But just complaining won't move this forward.

Sorry, I disagree.  Unbinding is important if the current state results
in crashes and oops - the lack of unbinding support in bridge makes it
harder to develop without constantly rebooting the target machine.

If all you care about is the end user who probably never removes a
module, then yes, it's low priority, but if you care about efficient
development, then the story is rather different.

-- 
RMK's Patch system: http://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line in suburbia: sync at 12.1Mbps down 622kbps up
According to speedtest.net: 11.9Mbps down 500kbps up
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: Armada DRM: bridge with componentized devices
  2019-01-07 16:08       ` Daniel Vetter
@ 2019-01-07 16:27         ` Andrzej Hajda
  2019-01-07 21:56           ` Daniel Vetter
  0 siblings, 1 reply; 53+ messages in thread
From: Andrzej Hajda @ 2019-01-07 16:27 UTC (permalink / raw)
  To: Daniel Vetter
  Cc: Lubomir Rintel, Liviu Dudau, Russell King - ARM Linux,
	DRI Development, Peter Rosin

On 07.01.2019 17:08, Daniel Vetter wrote:
> On Mon, Jan 07, 2019 at 12:26:58PM +0100, Andrzej Hajda wrote:
>> On 07.01.2019 11:45, Daniel Vetter wrote:
>>> On Thu, Jan 03, 2019 at 01:11:47PM +0000, Russell King - ARM Linux wrote:
>>>> On Thu, Jan 03, 2019 at 10:47:27AM +0100, Lubomir Rintel wrote:
>>>>> Hello,
>>>>>
>>>>> lately I've been trying to make the Himax HX8837 chip that drives the OLPC
>>>>> LCD display work with Armada DRM driver. I've been advised to create a
>>>>> bridge driver and not an encoder driver since the silicon is separate from
>>>>> the LCDC.
>>>>>
>>>>> The Armada DRM driver (and, I think, the i.MX one) creates the drm_device
>>>>> once the component infrastructure sees the necessary sub-devices appear.
>>>>> The sub-devices being the LCDCs and the encoders (not bridges) that it
>>>>> expects to be created externally.
>>>>>
>>>>> Currently, it seems, the only driver that can actually work with this (that
>>>>> is -- creates a drm_encoder for a drm_device when the component is bound)
>>>>> is the tda998x. All other similar drivers create a drm_bridge instead and
>>>>> not use the component infrastructure at all. (In fact, tilcdc driver
>>>>> contains a  hack to handle tda998x specially.)
>>>>>
>>>>> I'm wondering how to reconcile the two?
>>>>>
>>>>> * The tda998x driver has recently been modified to create a bridge on probe
>>>>>   and eventually encoder on component bind. Is this an okay thing to do in
>>>>>   a new driver? (this probably means the tilcdc hack can be removed...)
>>>>>
>>>>> * If a non-componentized bridge were to be used (along with a dummy 
>>>>>   encoder), at what point would it make sense to look for the bridge? 
>>>>>   Would it be a good idea to defer the probe of crtc until a bridge can be 
>>>>>   looked up and the attach it on component bind?  What if the bridge goes 
>>>>>   away (a module is unloaded, etc.) in between?
>>>>>
>>>>> I'd be thankful for opintions and advice before I move ahead with this.
>>>> This is the long-standing problem with the conflict between bridge
>>>> support and component support, and I'm not sure that there is really
>>>> any answer to it.
>>>>
>>>> I've gone into the details of the two several times on the list,
>>>> particularly about the short-comings of the bridge approach, but it
>>>> seems no one cares to fix those short-comings.
>>>>
>>>> You are re-identifying some of the issues that I've already pointed
>>>> out - such as what happens to DRM drives when the bridge driver is
>>>> unbound (it's really not about modules being unloaded, and the problem
>>>> can't be solved by taking a module reference count - all that the
>>>> module reference count does is ensure that the module doesn't go
>>>> away unexpected, there is no way to ensure that the device isn't
>>>> unbound.)
>>>>
>>>> The issue of unbinding is precisely the issue which the component
>>>> support was created to solve - but everyone seems to prefer the buggy
>>>> bridge approach, and no one seems willing to do anything about the
>>>> bugs or even acknowledge that it's a problem.  It's strange - if one
>>>> identifies bugs that result in kernel oops in other kernel subsystems,
>>>> one is generally taken seriously and the problem is solved.
>>> Unbinding is really not the most important feature, especially for SoC. If
>>> you feel different, working together with others, getting some agreement,
>>> getting the patches reviewed and finding someone to get them merged is
>>> very much appreciated. But just complaining won't move this forward.
>>>
>>>> The issue about the encoders is something that I've tried to discuss,
>>>> and I've pointed out that moving it into the DRM driver adds additional
>>>> complexity there, and I'd hoped that my patch set I posted would've
>>>> generated discussion, but alas not.
>>>>
>>>> What I'm not prepared to do is to introduce _known_ bugs into any
>>>> driver that I maintain.
>>> I thought last time around the idea was to use device links (and teach
>>> drm_bridge about them), so that the unloading works correctly.
>>
>> With current device_links implementation registering links in probe of
>> the consumer is just incorrect - it can happen that neither consumer
>> neither provider is fully probed and creating device links in such state
>> is wrong according to docs, and my experiments. See [1] for discussion
>> and [2] for docs.
> We could set up the device link only at drm_dev_register time. At that point
> we know that driver loading has fully succeeded. We do have a list of
> bridges at hand, so that's not going to be an issue.
>
> The only case where this breaks is if a driver is still using the
> deprecated ->load callback. But that ->load callback doesn't mix well with
> EDEFER_PROBE/component framework anyway, so I think not going to be a
> problem hopefully?


drm_dev_register usually is called from bind callback, which is called from probe callback of one of the components or master (depending on particular probe order). If you want to register device link in this function it is possible that the bad scenario will happen - there will be attempt to create device link between not-yet-probed consumer and not-yet-probed provider.


Regards

Andrzej


> -Daniel
>>
>> [1]: https://patchwork.freedesktop.org/patch/218878/
>>
>> [2]:
>> https://www.kernel.org/doc/html/latest/driver-api/device_link.html#usage
>>
>>
>> Regards
>>
>> Andrzej
>>
>>
>>> Wrt tda988x: I think it really shouldn't create a drm_encoder, nor
>>> register as a component. Fixing that is probably a bit more work.
>>> -Daniel
>>

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: Armada DRM: bridge with componentized devices
  2019-01-07 16:12     ` Russell King - ARM Linux
@ 2019-01-07 21:55       ` Daniel Vetter
  2019-01-08  0:39         ` Russell King - ARM Linux
  0 siblings, 1 reply; 53+ messages in thread
From: Daniel Vetter @ 2019-01-07 21:55 UTC (permalink / raw)
  To: Russell King - ARM Linux
  Cc: Lubomir Rintel, Liviu Dudau, Peter Rosin, DRI Development

On Mon, Jan 7, 2019 at 5:13 PM Russell King - ARM Linux
<linux@armlinux.org.uk> wrote:
>
> On Mon, Jan 07, 2019 at 11:45:32AM +0100, Daniel Vetter wrote:
> > On Thu, Jan 03, 2019 at 01:11:47PM +0000, Russell King - ARM Linux wrote:
> > > This is the long-standing problem with the conflict between bridge
> > > support and component support, and I'm not sure that there is really
> > > any answer to it.
> > >
> > > I've gone into the details of the two several times on the list,
> > > particularly about the short-comings of the bridge approach, but it
> > > seems no one cares to fix those short-comings.
> > >
> > > You are re-identifying some of the issues that I've already pointed
> > > out - such as what happens to DRM drives when the bridge driver is
> > > unbound (it's really not about modules being unloaded, and the problem
> > > can't be solved by taking a module reference count - all that the
> > > module reference count does is ensure that the module doesn't go
> > > away unexpected, there is no way to ensure that the device isn't
> > > unbound.)
> > >
> > > The issue of unbinding is precisely the issue which the component
> > > support was created to solve - but everyone seems to prefer the buggy
> > > bridge approach, and no one seems willing to do anything about the
> > > bugs or even acknowledge that it's a problem.  It's strange - if one
> > > identifies bugs that result in kernel oops in other kernel subsystems,
> > > one is generally taken seriously and the problem is solved.
> >
> > Unbinding is really not the most important feature, especially for SoC. If
> > you feel different, working together with others, getting some agreement,
> > getting the patches reviewed and finding someone to get them merged is
> > very much appreciated. But just complaining won't move this forward.
>
> Sorry, I disagree.  Unbinding is important if the current state results
> in crashes and oops - the lack of unbinding support in bridge makes it
> harder to develop without constantly rebooting the target machine.
>
> If all you care about is the end user who probably never removes a
> module, then yes, it's low priority, but if you care about efficient
> development, then the story is rather different.

Unloading i915 needs a very careful script, or you'll get a rather
bright fireworks. Afaik all other drm drivers (except maybe udl) are
the same. At least if you do anything fancy, where fancy includes:
fbdev emulation, prime buffer sharing, shared dma fences, or well
anything really that goes beyond a dummy boot splash. The lifetimes of
all these things are flat-out broken. udl tries to at least wrap some
duct-tape around it, and Noralf greatly improved the situation in the
past year at least.

So still not seeing what exactly the massive blocker here is.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: Armada DRM: bridge with componentized devices
  2019-01-07 16:27         ` Andrzej Hajda
@ 2019-01-07 21:56           ` Daniel Vetter
  2019-01-08  8:35             ` Andrzej Hajda
  0 siblings, 1 reply; 53+ messages in thread
From: Daniel Vetter @ 2019-01-07 21:56 UTC (permalink / raw)
  To: Andrzej Hajda
  Cc: Lubomir Rintel, Liviu Dudau, Russell King - ARM Linux,
	DRI Development, Peter Rosin

On Mon, Jan 7, 2019 at 5:27 PM Andrzej Hajda <a.hajda@samsung.com> wrote:
>
> On 07.01.2019 17:08, Daniel Vetter wrote:
> > On Mon, Jan 07, 2019 at 12:26:58PM +0100, Andrzej Hajda wrote:
> >> On 07.01.2019 11:45, Daniel Vetter wrote:
> >>> On Thu, Jan 03, 2019 at 01:11:47PM +0000, Russell King - ARM Linux wrote:
> >>>> On Thu, Jan 03, 2019 at 10:47:27AM +0100, Lubomir Rintel wrote:
> >>>>> Hello,
> >>>>>
> >>>>> lately I've been trying to make the Himax HX8837 chip that drives the OLPC
> >>>>> LCD display work with Armada DRM driver. I've been advised to create a
> >>>>> bridge driver and not an encoder driver since the silicon is separate from
> >>>>> the LCDC.
> >>>>>
> >>>>> The Armada DRM driver (and, I think, the i.MX one) creates the drm_device
> >>>>> once the component infrastructure sees the necessary sub-devices appear.
> >>>>> The sub-devices being the LCDCs and the encoders (not bridges) that it
> >>>>> expects to be created externally.
> >>>>>
> >>>>> Currently, it seems, the only driver that can actually work with this (that
> >>>>> is -- creates a drm_encoder for a drm_device when the component is bound)
> >>>>> is the tda998x. All other similar drivers create a drm_bridge instead and
> >>>>> not use the component infrastructure at all. (In fact, tilcdc driver
> >>>>> contains a  hack to handle tda998x specially.)
> >>>>>
> >>>>> I'm wondering how to reconcile the two?
> >>>>>
> >>>>> * The tda998x driver has recently been modified to create a bridge on probe
> >>>>>   and eventually encoder on component bind. Is this an okay thing to do in
> >>>>>   a new driver? (this probably means the tilcdc hack can be removed...)
> >>>>>
> >>>>> * If a non-componentized bridge were to be used (along with a dummy
> >>>>>   encoder), at what point would it make sense to look for the bridge?
> >>>>>   Would it be a good idea to defer the probe of crtc until a bridge can be
> >>>>>   looked up and the attach it on component bind?  What if the bridge goes
> >>>>>   away (a module is unloaded, etc.) in between?
> >>>>>
> >>>>> I'd be thankful for opintions and advice before I move ahead with this.
> >>>> This is the long-standing problem with the conflict between bridge
> >>>> support and component support, and I'm not sure that there is really
> >>>> any answer to it.
> >>>>
> >>>> I've gone into the details of the two several times on the list,
> >>>> particularly about the short-comings of the bridge approach, but it
> >>>> seems no one cares to fix those short-comings.
> >>>>
> >>>> You are re-identifying some of the issues that I've already pointed
> >>>> out - such as what happens to DRM drives when the bridge driver is
> >>>> unbound (it's really not about modules being unloaded, and the problem
> >>>> can't be solved by taking a module reference count - all that the
> >>>> module reference count does is ensure that the module doesn't go
> >>>> away unexpected, there is no way to ensure that the device isn't
> >>>> unbound.)
> >>>>
> >>>> The issue of unbinding is precisely the issue which the component
> >>>> support was created to solve - but everyone seems to prefer the buggy
> >>>> bridge approach, and no one seems willing to do anything about the
> >>>> bugs or even acknowledge that it's a problem.  It's strange - if one
> >>>> identifies bugs that result in kernel oops in other kernel subsystems,
> >>>> one is generally taken seriously and the problem is solved.
> >>> Unbinding is really not the most important feature, especially for SoC. If
> >>> you feel different, working together with others, getting some agreement,
> >>> getting the patches reviewed and finding someone to get them merged is
> >>> very much appreciated. But just complaining won't move this forward.
> >>>
> >>>> The issue about the encoders is something that I've tried to discuss,
> >>>> and I've pointed out that moving it into the DRM driver adds additional
> >>>> complexity there, and I'd hoped that my patch set I posted would've
> >>>> generated discussion, but alas not.
> >>>>
> >>>> What I'm not prepared to do is to introduce _known_ bugs into any
> >>>> driver that I maintain.
> >>> I thought last time around the idea was to use device links (and teach
> >>> drm_bridge about them), so that the unloading works correctly.
> >>
> >> With current device_links implementation registering links in probe of
> >> the consumer is just incorrect - it can happen that neither consumer
> >> neither provider is fully probed and creating device links in such state
> >> is wrong according to docs, and my experiments. See [1] for discussion
> >> and [2] for docs.
> > We could set up the device link only at drm_dev_register time. At that point
> > we know that driver loading has fully succeeded. We do have a list of
> > bridges at hand, so that's not going to be an issue.
> >
> > The only case where this breaks is if a driver is still using the
> > deprecated ->load callback. But that ->load callback doesn't mix well with
> > EDEFER_PROBE/component framework anyway, so I think not going to be a
> > problem hopefully?
>
>
> drm_dev_register usually is called from bind callback, which is called from probe callback of one of the components or master (depending on particular probe order). If you want to register device link in this function it is possible that the bad scenario will happen - there will be attempt to create device link between not-yet-probed consumer and not-yet-probed provider.

If you call drm_dev_register before you have all the components
assembled (i.e. anywhere else than in master bind) that sounds like a
driver bug. drm_dev_register publishes the drm device instance to the
world, if you're not ready to handle userspace requests at that point
(because not everything is loaded yet) then things will go boom in
very colorful ways. And from my (admittedly very rough) understanding
we should be able to register the the device links as the very last
step in the master bind function (and drm_dev_register should be about
the last thing you do in the master bind).

So not clear on why this won't work?
-Daniel
--
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: Armada DRM: bridge with componentized devices
  2019-01-07 21:55       ` Daniel Vetter
@ 2019-01-08  0:39         ` Russell King - ARM Linux
  2019-01-08 14:29           ` Liviu Dudau
  0 siblings, 1 reply; 53+ messages in thread
From: Russell King - ARM Linux @ 2019-01-08  0:39 UTC (permalink / raw)
  To: Daniel Vetter; +Cc: Lubomir Rintel, Liviu Dudau, Peter Rosin, DRI Development

On Mon, Jan 07, 2019 at 10:55:15PM +0100, Daniel Vetter wrote:
> On Mon, Jan 7, 2019 at 5:13 PM Russell King - ARM Linux
> <linux@armlinux.org.uk> wrote:
> >
> > On Mon, Jan 07, 2019 at 11:45:32AM +0100, Daniel Vetter wrote:
> > > On Thu, Jan 03, 2019 at 01:11:47PM +0000, Russell King - ARM Linux wrote:
> > > > This is the long-standing problem with the conflict between bridge
> > > > support and component support, and I'm not sure that there is really
> > > > any answer to it.
> > > >
> > > > I've gone into the details of the two several times on the list,
> > > > particularly about the short-comings of the bridge approach, but it
> > > > seems no one cares to fix those short-comings.
> > > >
> > > > You are re-identifying some of the issues that I've already pointed
> > > > out - such as what happens to DRM drives when the bridge driver is
> > > > unbound (it's really not about modules being unloaded, and the problem
> > > > can't be solved by taking a module reference count - all that the
> > > > module reference count does is ensure that the module doesn't go
> > > > away unexpected, there is no way to ensure that the device isn't
> > > > unbound.)
> > > >
> > > > The issue of unbinding is precisely the issue which the component
> > > > support was created to solve - but everyone seems to prefer the buggy
> > > > bridge approach, and no one seems willing to do anything about the
> > > > bugs or even acknowledge that it's a problem.  It's strange - if one
> > > > identifies bugs that result in kernel oops in other kernel subsystems,
> > > > one is generally taken seriously and the problem is solved.
> > >
> > > Unbinding is really not the most important feature, especially for SoC. If
> > > you feel different, working together with others, getting some agreement,
> > > getting the patches reviewed and finding someone to get them merged is
> > > very much appreciated. But just complaining won't move this forward.
> >
> > Sorry, I disagree.  Unbinding is important if the current state results
> > in crashes and oops - the lack of unbinding support in bridge makes it
> > harder to develop without constantly rebooting the target machine.
> >
> > If all you care about is the end user who probably never removes a
> > module, then yes, it's low priority, but if you care about efficient
> > development, then the story is rather different.
> 
> Unloading i915 needs a very careful script, or you'll get a rather
> bright fireworks. Afaik all other drm drivers (except maybe udl) are
> the same. At least if you do anything fancy, where fancy includes:
> fbdev emulation, prime buffer sharing, shared dma fences, or well
> anything really that goes beyond a dummy boot splash. The lifetimes of
> all these things are flat-out broken. udl tries to at least wrap some
> duct-tape around it, and Noralf greatly improved the situation in the
> past year at least.
> 
> So still not seeing what exactly the massive blocker here is.

The fact that I can unload armada drm/tda998x modules without incident
today, and have done many times through development, and I don't wish
to regress from that position.  As far as I'm concerned, this is a
solved problem, but the pressure I'm under to convert tda998x to a
bridge driver is causing bugs that I've already solved by _not_ using
that to be introduced.

You've mentioned in your previous mail about me not trying to solve
the situation - I have tried, I've proposed a way around the component
vs bridge issue but I don't think it went anywhere.

If I can't get traction on issues, then I can only state what the
current state of affairs is to folk asking about it.  That is _not_
"whinging" about it, that is informing people and being helpful.

-- 
RMK's Patch system: http://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line in suburbia: sync at 12.1Mbps down 622kbps up
According to speedtest.net: 11.9Mbps down 500kbps up
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: Armada DRM: bridge with componentized devices
  2019-01-07 21:56           ` Daniel Vetter
@ 2019-01-08  8:35             ` Andrzej Hajda
  2019-01-08  8:47               ` Daniel Vetter
  2019-01-08 10:16               ` Russell King - ARM Linux
  0 siblings, 2 replies; 53+ messages in thread
From: Andrzej Hajda @ 2019-01-08  8:35 UTC (permalink / raw)
  To: Daniel Vetter
  Cc: Lubomir Rintel, Liviu Dudau, Russell King - ARM Linux,
	DRI Development, Peter Rosin

On 07.01.2019 22:56, Daniel Vetter wrote:
> On Mon, Jan 7, 2019 at 5:27 PM Andrzej Hajda <a.hajda@samsung.com> wrote:
>> On 07.01.2019 17:08, Daniel Vetter wrote:
>>> On Mon, Jan 07, 2019 at 12:26:58PM +0100, Andrzej Hajda wrote:
>>>> On 07.01.2019 11:45, Daniel Vetter wrote:
>>>>> On Thu, Jan 03, 2019 at 01:11:47PM +0000, Russell King - ARM Linux wrote:
>>>>>> On Thu, Jan 03, 2019 at 10:47:27AM +0100, Lubomir Rintel wrote:
>>>>>>> Hello,
>>>>>>>
>>>>>>> lately I've been trying to make the Himax HX8837 chip that drives the OLPC
>>>>>>> LCD display work with Armada DRM driver. I've been advised to create a
>>>>>>> bridge driver and not an encoder driver since the silicon is separate from
>>>>>>> the LCDC.
>>>>>>>
>>>>>>> The Armada DRM driver (and, I think, the i.MX one) creates the drm_device
>>>>>>> once the component infrastructure sees the necessary sub-devices appear.
>>>>>>> The sub-devices being the LCDCs and the encoders (not bridges) that it
>>>>>>> expects to be created externally.
>>>>>>>
>>>>>>> Currently, it seems, the only driver that can actually work with this (that
>>>>>>> is -- creates a drm_encoder for a drm_device when the component is bound)
>>>>>>> is the tda998x. All other similar drivers create a drm_bridge instead and
>>>>>>> not use the component infrastructure at all. (In fact, tilcdc driver
>>>>>>> contains a  hack to handle tda998x specially.)
>>>>>>>
>>>>>>> I'm wondering how to reconcile the two?
>>>>>>>
>>>>>>> * The tda998x driver has recently been modified to create a bridge on probe
>>>>>>>   and eventually encoder on component bind. Is this an okay thing to do in
>>>>>>>   a new driver? (this probably means the tilcdc hack can be removed...)
>>>>>>>
>>>>>>> * If a non-componentized bridge were to be used (along with a dummy
>>>>>>>   encoder), at what point would it make sense to look for the bridge?
>>>>>>>   Would it be a good idea to defer the probe of crtc until a bridge can be
>>>>>>>   looked up and the attach it on component bind?  What if the bridge goes
>>>>>>>   away (a module is unloaded, etc.) in between?
>>>>>>>
>>>>>>> I'd be thankful for opintions and advice before I move ahead with this.
>>>>>> This is the long-standing problem with the conflict between bridge
>>>>>> support and component support, and I'm not sure that there is really
>>>>>> any answer to it.
>>>>>>
>>>>>> I've gone into the details of the two several times on the list,
>>>>>> particularly about the short-comings of the bridge approach, but it
>>>>>> seems no one cares to fix those short-comings.
>>>>>>
>>>>>> You are re-identifying some of the issues that I've already pointed
>>>>>> out - such as what happens to DRM drives when the bridge driver is
>>>>>> unbound (it's really not about modules being unloaded, and the problem
>>>>>> can't be solved by taking a module reference count - all that the
>>>>>> module reference count does is ensure that the module doesn't go
>>>>>> away unexpected, there is no way to ensure that the device isn't
>>>>>> unbound.)
>>>>>>
>>>>>> The issue of unbinding is precisely the issue which the component
>>>>>> support was created to solve - but everyone seems to prefer the buggy
>>>>>> bridge approach, and no one seems willing to do anything about the
>>>>>> bugs or even acknowledge that it's a problem.  It's strange - if one
>>>>>> identifies bugs that result in kernel oops in other kernel subsystems,
>>>>>> one is generally taken seriously and the problem is solved.
>>>>> Unbinding is really not the most important feature, especially for SoC. If
>>>>> you feel different, working together with others, getting some agreement,
>>>>> getting the patches reviewed and finding someone to get them merged is
>>>>> very much appreciated. But just complaining won't move this forward.
>>>>>
>>>>>> The issue about the encoders is something that I've tried to discuss,
>>>>>> and I've pointed out that moving it into the DRM driver adds additional
>>>>>> complexity there, and I'd hoped that my patch set I posted would've
>>>>>> generated discussion, but alas not.
>>>>>>
>>>>>> What I'm not prepared to do is to introduce _known_ bugs into any
>>>>>> driver that I maintain.
>>>>> I thought last time around the idea was to use device links (and teach
>>>>> drm_bridge about them), so that the unloading works correctly.
>>>> With current device_links implementation registering links in probe of
>>>> the consumer is just incorrect - it can happen that neither consumer
>>>> neither provider is fully probed and creating device links in such state
>>>> is wrong according to docs, and my experiments. See [1] for discussion
>>>> and [2] for docs.
>>> We could set up the device link only at drm_dev_register time. At that point
>>> we know that driver loading has fully succeeded. We do have a list of
>>> bridges at hand, so that's not going to be an issue.
>>>
>>> The only case where this breaks is if a driver is still using the
>>> deprecated ->load callback. But that ->load callback doesn't mix well with
>>> EDEFER_PROBE/component framework anyway, so I think not going to be a
>>> problem hopefully?
>>
>> drm_dev_register usually is called from bind callback, which is called from probe callback of one of the components or master (depending on particular probe order). If you want to register device link in this function it is possible that the bad scenario will happen - there will be attempt to create device link between not-yet-probed consumer and not-yet-probed provider.
> If you call drm_dev_register before you have all the components
> assembled (i.e. anywhere else than in master bind) that sounds like a
> driver bug.


This is how components work, every component calls component_add usually
in probe, and this function checks if all components are present, if yes
(ie. during probe of the last component) master bind is called and it
creates drm device and then registers it. If you add device link at this
moment, it is still before end of probe of the last component.

On the other side provider is registered (drm_bridge_add in case of
bridges) during its probe so it becomes available to the consumers
BEFORE end of its probe and again it can happen that device link will be
added to early.


Regards

Andrzej


> drm_dev_register publishes the drm device instance to the
> world, if you're not ready to handle userspace requests at that point
> (because not everything is loaded yet) then things will go boom in
> very colorful ways. And from my (admittedly very rough) understanding
> we should be able to register the the device links as the very last
> step in the master bind function (and drm_dev_register should be about
> the last thing you do in the master bind).
>
> So not clear on why this won't work?




> -Daniel
> --
> Daniel Vetter
> Software Engineer, Intel Corporation
> +41 (0) 79 365 57 48 - http://blog.ffwll.ch
>
>

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: Armada DRM: bridge with componentized devices
  2019-01-08  8:35             ` Andrzej Hajda
@ 2019-01-08  8:47               ` Daniel Vetter
  2019-01-08  9:22                 ` Andrzej Hajda
  2019-01-08 10:16               ` Russell King - ARM Linux
  1 sibling, 1 reply; 53+ messages in thread
From: Daniel Vetter @ 2019-01-08  8:47 UTC (permalink / raw)
  To: Andrzej Hajda
  Cc: Lubomir Rintel, Liviu Dudau, Russell King - ARM Linux,
	DRI Development, Peter Rosin

On Tue, Jan 8, 2019 at 9:35 AM Andrzej Hajda <a.hajda@samsung.com> wrote:
> On 07.01.2019 22:56, Daniel Vetter wrote:
> > On Mon, Jan 7, 2019 at 5:27 PM Andrzej Hajda <a.hajda@samsung.com> wrote:
> >> On 07.01.2019 17:08, Daniel Vetter wrote:
> >>> On Mon, Jan 07, 2019 at 12:26:58PM +0100, Andrzej Hajda wrote:
> >>>> On 07.01.2019 11:45, Daniel Vetter wrote:
> >>>>> On Thu, Jan 03, 2019 at 01:11:47PM +0000, Russell King - ARM Linux wrote:
> >>>>>> On Thu, Jan 03, 2019 at 10:47:27AM +0100, Lubomir Rintel wrote:
> >>>>>>> Hello,
> >>>>>>>
> >>>>>>> lately I've been trying to make the Himax HX8837 chip that drives the OLPC
> >>>>>>> LCD display work with Armada DRM driver. I've been advised to create a
> >>>>>>> bridge driver and not an encoder driver since the silicon is separate from
> >>>>>>> the LCDC.
> >>>>>>>
> >>>>>>> The Armada DRM driver (and, I think, the i.MX one) creates the drm_device
> >>>>>>> once the component infrastructure sees the necessary sub-devices appear.
> >>>>>>> The sub-devices being the LCDCs and the encoders (not bridges) that it
> >>>>>>> expects to be created externally.
> >>>>>>>
> >>>>>>> Currently, it seems, the only driver that can actually work with this (that
> >>>>>>> is -- creates a drm_encoder for a drm_device when the component is bound)
> >>>>>>> is the tda998x. All other similar drivers create a drm_bridge instead and
> >>>>>>> not use the component infrastructure at all. (In fact, tilcdc driver
> >>>>>>> contains a  hack to handle tda998x specially.)
> >>>>>>>
> >>>>>>> I'm wondering how to reconcile the two?
> >>>>>>>
> >>>>>>> * The tda998x driver has recently been modified to create a bridge on probe
> >>>>>>>   and eventually encoder on component bind. Is this an okay thing to do in
> >>>>>>>   a new driver? (this probably means the tilcdc hack can be removed...)
> >>>>>>>
> >>>>>>> * If a non-componentized bridge were to be used (along with a dummy
> >>>>>>>   encoder), at what point would it make sense to look for the bridge?
> >>>>>>>   Would it be a good idea to defer the probe of crtc until a bridge can be
> >>>>>>>   looked up and the attach it on component bind?  What if the bridge goes
> >>>>>>>   away (a module is unloaded, etc.) in between?
> >>>>>>>
> >>>>>>> I'd be thankful for opintions and advice before I move ahead with this.
> >>>>>> This is the long-standing problem with the conflict between bridge
> >>>>>> support and component support, and I'm not sure that there is really
> >>>>>> any answer to it.
> >>>>>>
> >>>>>> I've gone into the details of the two several times on the list,
> >>>>>> particularly about the short-comings of the bridge approach, but it
> >>>>>> seems no one cares to fix those short-comings.
> >>>>>>
> >>>>>> You are re-identifying some of the issues that I've already pointed
> >>>>>> out - such as what happens to DRM drives when the bridge driver is
> >>>>>> unbound (it's really not about modules being unloaded, and the problem
> >>>>>> can't be solved by taking a module reference count - all that the
> >>>>>> module reference count does is ensure that the module doesn't go
> >>>>>> away unexpected, there is no way to ensure that the device isn't
> >>>>>> unbound.)
> >>>>>>
> >>>>>> The issue of unbinding is precisely the issue which the component
> >>>>>> support was created to solve - but everyone seems to prefer the buggy
> >>>>>> bridge approach, and no one seems willing to do anything about the
> >>>>>> bugs or even acknowledge that it's a problem.  It's strange - if one
> >>>>>> identifies bugs that result in kernel oops in other kernel subsystems,
> >>>>>> one is generally taken seriously and the problem is solved.
> >>>>> Unbinding is really not the most important feature, especially for SoC. If
> >>>>> you feel different, working together with others, getting some agreement,
> >>>>> getting the patches reviewed and finding someone to get them merged is
> >>>>> very much appreciated. But just complaining won't move this forward.
> >>>>>
> >>>>>> The issue about the encoders is something that I've tried to discuss,
> >>>>>> and I've pointed out that moving it into the DRM driver adds additional
> >>>>>> complexity there, and I'd hoped that my patch set I posted would've
> >>>>>> generated discussion, but alas not.
> >>>>>>
> >>>>>> What I'm not prepared to do is to introduce _known_ bugs into any
> >>>>>> driver that I maintain.
> >>>>> I thought last time around the idea was to use device links (and teach
> >>>>> drm_bridge about them), so that the unloading works correctly.
> >>>> With current device_links implementation registering links in probe of
> >>>> the consumer is just incorrect - it can happen that neither consumer
> >>>> neither provider is fully probed and creating device links in such state
> >>>> is wrong according to docs, and my experiments. See [1] for discussion
> >>>> and [2] for docs.
> >>> We could set up the device link only at drm_dev_register time. At that point
> >>> we know that driver loading has fully succeeded. We do have a list of
> >>> bridges at hand, so that's not going to be an issue.
> >>>
> >>> The only case where this breaks is if a driver is still using the
> >>> deprecated ->load callback. But that ->load callback doesn't mix well with
> >>> EDEFER_PROBE/component framework anyway, so I think not going to be a
> >>> problem hopefully?
> >>
> >> drm_dev_register usually is called from bind callback, which is called from probe callback of one of the components or master (depending on particular probe order). If you want to register device link in this function it is possible that the bad scenario will happen - there will be attempt to create device link between not-yet-probed consumer and not-yet-probed provider.
> > If you call drm_dev_register before you have all the components
> > assembled (i.e. anywhere else than in master bind) that sounds like a
> > driver bug.
>
>
> This is how components work, every component calls component_add usually
> in probe, and this function checks if all components are present, if yes
> (ie. during probe of the last component) master bind is called and it
> creates drm device and then registers it. If you add device link at this
> moment, it is still before end of probe of the last component.
>
> On the other side provider is registered (drm_bridge_add in case of
> bridges) during its probe so it becomes available to the consumers
> BEFORE end of its probe and again it can happen that device link will be
> added to early.

Hm I read that thread again. Is the reason why we can't add the device
link only the exynos behaviour to allow drm_panel to be
hot-added/removed?

Note that for bridge allowing this is 100% buggy: drm core does not
allow you to hotplug/hotunplug bridges.

In theory drm_panel can be hotplugged (because we can hotplug
drm_connector), but I'm pretty sure exynos gets it all wrong since
hotplugging panels is no easy thing to do. I don't think this is
something we want to support, forcing the entire driver to be unload
sounds like what we want to do here.
-Daniel

>
>
> Regards
>
> Andrzej
>
>
> > drm_dev_register publishes the drm device instance to the
> > world, if you're not ready to handle userspace requests at that point
> > (because not everything is loaded yet) then things will go boom in
> > very colorful ways. And from my (admittedly very rough) understanding
> > we should be able to register the the device links as the very last
> > step in the master bind function (and drm_dev_register should be about
> > the last thing you do in the master bind).
> >
> > So not clear on why this won't work?
>
>
>
>
> > -Daniel
> > --
> > Daniel Vetter
> > Software Engineer, Intel Corporation
> > +41 (0) 79 365 57 48 - http://blog.ffwll.ch
> >
> >
>


-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: Armada DRM: bridge with componentized devices
  2019-01-08  8:47               ` Daniel Vetter
@ 2019-01-08  9:22                 ` Andrzej Hajda
  2019-01-08 10:23                   ` Russell King - ARM Linux
                                     ` (2 more replies)
  0 siblings, 3 replies; 53+ messages in thread
From: Andrzej Hajda @ 2019-01-08  9:22 UTC (permalink / raw)
  To: Daniel Vetter
  Cc: Lubomir Rintel, Liviu Dudau, Russell King - ARM Linux,
	DRI Development, Peter Rosin

On 08.01.2019 09:47, Daniel Vetter wrote:
> On Tue, Jan 8, 2019 at 9:35 AM Andrzej Hajda <a.hajda@samsung.com> wrote:
>> On 07.01.2019 22:56, Daniel Vetter wrote:
>>> On Mon, Jan 7, 2019 at 5:27 PM Andrzej Hajda <a.hajda@samsung.com> wrote:
>>>> On 07.01.2019 17:08, Daniel Vetter wrote:
>>>>> On Mon, Jan 07, 2019 at 12:26:58PM +0100, Andrzej Hajda wrote:
>>>>>> On 07.01.2019 11:45, Daniel Vetter wrote:
>>>>>>> On Thu, Jan 03, 2019 at 01:11:47PM +0000, Russell King - ARM Linux wrote:
>>>>>>>> On Thu, Jan 03, 2019 at 10:47:27AM +0100, Lubomir Rintel wrote:
>>>>>>>>> Hello,
>>>>>>>>>
>>>>>>>>> lately I've been trying to make the Himax HX8837 chip that drives the OLPC
>>>>>>>>> LCD display work with Armada DRM driver. I've been advised to create a
>>>>>>>>> bridge driver and not an encoder driver since the silicon is separate from
>>>>>>>>> the LCDC.
>>>>>>>>>
>>>>>>>>> The Armada DRM driver (and, I think, the i.MX one) creates the drm_device
>>>>>>>>> once the component infrastructure sees the necessary sub-devices appear.
>>>>>>>>> The sub-devices being the LCDCs and the encoders (not bridges) that it
>>>>>>>>> expects to be created externally.
>>>>>>>>>
>>>>>>>>> Currently, it seems, the only driver that can actually work with this (that
>>>>>>>>> is -- creates a drm_encoder for a drm_device when the component is bound)
>>>>>>>>> is the tda998x. All other similar drivers create a drm_bridge instead and
>>>>>>>>> not use the component infrastructure at all. (In fact, tilcdc driver
>>>>>>>>> contains a  hack to handle tda998x specially.)
>>>>>>>>>
>>>>>>>>> I'm wondering how to reconcile the two?
>>>>>>>>>
>>>>>>>>> * The tda998x driver has recently been modified to create a bridge on probe
>>>>>>>>>   and eventually encoder on component bind. Is this an okay thing to do in
>>>>>>>>>   a new driver? (this probably means the tilcdc hack can be removed...)
>>>>>>>>>
>>>>>>>>> * If a non-componentized bridge were to be used (along with a dummy
>>>>>>>>>   encoder), at what point would it make sense to look for the bridge?
>>>>>>>>>   Would it be a good idea to defer the probe of crtc until a bridge can be
>>>>>>>>>   looked up and the attach it on component bind?  What if the bridge goes
>>>>>>>>>   away (a module is unloaded, etc.) in between?
>>>>>>>>>
>>>>>>>>> I'd be thankful for opintions and advice before I move ahead with this.
>>>>>>>> This is the long-standing problem with the conflict between bridge
>>>>>>>> support and component support, and I'm not sure that there is really
>>>>>>>> any answer to it.
>>>>>>>>
>>>>>>>> I've gone into the details of the two several times on the list,
>>>>>>>> particularly about the short-comings of the bridge approach, but it
>>>>>>>> seems no one cares to fix those short-comings.
>>>>>>>>
>>>>>>>> You are re-identifying some of the issues that I've already pointed
>>>>>>>> out - such as what happens to DRM drives when the bridge driver is
>>>>>>>> unbound (it's really not about modules being unloaded, and the problem
>>>>>>>> can't be solved by taking a module reference count - all that the
>>>>>>>> module reference count does is ensure that the module doesn't go
>>>>>>>> away unexpected, there is no way to ensure that the device isn't
>>>>>>>> unbound.)
>>>>>>>>
>>>>>>>> The issue of unbinding is precisely the issue which the component
>>>>>>>> support was created to solve - but everyone seems to prefer the buggy
>>>>>>>> bridge approach, and no one seems willing to do anything about the
>>>>>>>> bugs or even acknowledge that it's a problem.  It's strange - if one
>>>>>>>> identifies bugs that result in kernel oops in other kernel subsystems,
>>>>>>>> one is generally taken seriously and the problem is solved.
>>>>>>> Unbinding is really not the most important feature, especially for SoC. If
>>>>>>> you feel different, working together with others, getting some agreement,
>>>>>>> getting the patches reviewed and finding someone to get them merged is
>>>>>>> very much appreciated. But just complaining won't move this forward.
>>>>>>>
>>>>>>>> The issue about the encoders is something that I've tried to discuss,
>>>>>>>> and I've pointed out that moving it into the DRM driver adds additional
>>>>>>>> complexity there, and I'd hoped that my patch set I posted would've
>>>>>>>> generated discussion, but alas not.
>>>>>>>>
>>>>>>>> What I'm not prepared to do is to introduce _known_ bugs into any
>>>>>>>> driver that I maintain.
>>>>>>> I thought last time around the idea was to use device links (and teach
>>>>>>> drm_bridge about them), so that the unloading works correctly.
>>>>>> With current device_links implementation registering links in probe of
>>>>>> the consumer is just incorrect - it can happen that neither consumer
>>>>>> neither provider is fully probed and creating device links in such state
>>>>>> is wrong according to docs, and my experiments. See [1] for discussion
>>>>>> and [2] for docs.
>>>>> We could set up the device link only at drm_dev_register time. At that point
>>>>> we know that driver loading has fully succeeded. We do have a list of
>>>>> bridges at hand, so that's not going to be an issue.
>>>>>
>>>>> The only case where this breaks is if a driver is still using the
>>>>> deprecated ->load callback. But that ->load callback doesn't mix well with
>>>>> EDEFER_PROBE/component framework anyway, so I think not going to be a
>>>>> problem hopefully?
>>>> drm_dev_register usually is called from bind callback, which is called from probe callback of one of the components or master (depending on particular probe order). If you want to register device link in this function it is possible that the bad scenario will happen - there will be attempt to create device link between not-yet-probed consumer and not-yet-probed provider.
>>> If you call drm_dev_register before you have all the components
>>> assembled (i.e. anywhere else than in master bind) that sounds like a
>>> driver bug.
>>
>> This is how components work, every component calls component_add usually
>> in probe, and this function checks if all components are present, if yes
>> (ie. during probe of the last component) master bind is called and it
>> creates drm device and then registers it. If you add device link at this
>> moment, it is still before end of probe of the last component.
>>
>> On the other side provider is registered (drm_bridge_add in case of
>> bridges) during its probe so it becomes available to the consumers
>> BEFORE end of its probe and again it can happen that device link will be
>> added to early.
> Hm I read that thread again. Is the reason why we can't add the device
> link only the exynos behaviour to allow drm_panel to be
> hot-added/removed?


Not only. What I have described above is common to all drivers it has
nothing specific to Exynos.

Of course it was tested on Exynos as this is HW I work on. Linus Walleij
has also reported in this thread that device links have different issue
- circular dependencies (last few emails in this lengthy thread). My
response explains it in detail.


>
> Note that for bridge allowing this is 100% buggy: drm core does not
> allow you to hotplug/hotunplug bridges.


What part of drm core disallows it? As I remember discussions about
drm_bridge design there were voices that they should be
hot(un)pluggable, and they are IMO, of course if they are not active.


>
> In theory drm_panel can be hotplugged (because we can hotplug
> drm_connector), but I'm pretty sure exynos gets it all wrong since
> hotplugging panels is no easy thing to do. I don't think this is
> something we want to support, forcing the entire driver to be unload
> sounds like what we want to do here.


I do not know if it is 'all wrong', but tests shows it is
hot(un)pluggable. In both cases of panel and bridge :)


Regards

Andrzej



> -Daniel
>
>>
>> Regards
>>
>> Andrzej
>>
>>
>>> drm_dev_register publishes the drm device instance to the
>>> world, if you're not ready to handle userspace requests at that point
>>> (because not everything is loaded yet) then things will go boom in
>>> very colorful ways. And from my (admittedly very rough) understanding
>>> we should be able to register the the device links as the very last
>>> step in the master bind function (and drm_dev_register should be about
>>> the last thing you do in the master bind).
>>>
>>> So not clear on why this won't work?
>>
>>
>>
>>> -Daniel
>>> --
>>> Daniel Vetter
>>> Software Engineer, Intel Corporation
>>> +41 (0) 79 365 57 48 - http://blog.ffwll.ch
>>>
>>>
>

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: Armada DRM: bridge with componentized devices
  2019-01-08  8:35             ` Andrzej Hajda
  2019-01-08  8:47               ` Daniel Vetter
@ 2019-01-08 10:16               ` Russell King - ARM Linux
  2019-01-08 10:31                 ` Daniel Vetter
  1 sibling, 1 reply; 53+ messages in thread
From: Russell King - ARM Linux @ 2019-01-08 10:16 UTC (permalink / raw)
  To: Andrzej Hajda; +Cc: Lubomir Rintel, DRI Development, Liviu Dudau, Peter Rosin

On Tue, Jan 08, 2019 at 09:35:21AM +0100, Andrzej Hajda wrote:
> On 07.01.2019 22:56, Daniel Vetter wrote:
> > If you call drm_dev_register before you have all the components
> > assembled (i.e. anywhere else than in master bind) that sounds like a
> > driver bug.
> 
> 
> This is how components work, every component calls component_add usually
> in probe, and this function checks if all components are present, if yes
> (ie. during probe of the last component) master bind is called and it
> creates drm device and then registers it. If you add device link at this
> moment, it is still before end of probe of the last component.

I wonder if device links will create more problems with the component
stuff, because we'll end up with two different systems fighting to
achieve the same aim.

1) a component device is probed, and is added to the component helper
   as a component.  (Repeat for as many component devices as are
   required.)
2) the main device is probed, added as a master.  The system is
   complete, so the master bind is called, and device links are
   created - the component device is a provider, and the main
   device is a consumer.
3) the component device is unbound, which triggers the device links
   to unbind the provider device.  This undoes everything, removing
   the device links and removing the main device from the component
   helper.
4) the component device is re-probed, and added to the component
   helper again.

At this point, there is nothing that causes the main device to be
re-probed.  As far as userspace is concerned, that's confusing because
userspace never asked for any change of status of the provider device.

Without device links, the main device would not be unbound, and there
would be an additional stage:

5) the component helper notices the re-appearance of the component
   device, notices that the system is again complete, and re-binds
   the master device bring the system back up.

This seems to be a disadvantage with device links, but there is an
advantage that device links offer - the ability to deal with other
resources (such as clocks, regulators, etc) going away more
gracefully than is possible today - but folk have to understand
that a consumer device will only ever be unbound, it will never be
re-bound automatically when the resources re-appear.

I suppose that could be fixed by moving the consumer devices onto
the deferred probe list, so when a device is re-bound, they are
retried.  At that point, the component helper becomes entirely
redundant, since its functionality can be implemented completely
using device links.

-- 
RMK's Patch system: http://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line in suburbia: sync at 12.1Mbps down 622kbps up
According to speedtest.net: 11.9Mbps down 500kbps up
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: Armada DRM: bridge with componentized devices
  2019-01-08  9:22                 ` Andrzej Hajda
@ 2019-01-08 10:23                   ` Russell King - ARM Linux
  2019-01-08 10:32                     ` Andrzej Hajda
  2019-01-08 10:24                   ` Daniel Vetter
  2019-01-18 11:07                   ` Linus Walleij
  2 siblings, 1 reply; 53+ messages in thread
From: Russell King - ARM Linux @ 2019-01-08 10:23 UTC (permalink / raw)
  To: Andrzej Hajda; +Cc: Lubomir Rintel, DRI Development, Liviu Dudau, Peter Rosin

On Tue, Jan 08, 2019 at 10:22:06AM +0100, Andrzej Hajda wrote:
> What part of drm core disallows it? As I remember discussions about
> drm_bridge design there were voices that they should be
> hot(un)pluggable, and they are IMO, of course if they are not active.

Even if they are not active, once the DRM master device has used
of_drm_find_bridge(), it has a reference on struct drm_bridge.
Normally, that is allocated using something like devm_kzalloc()
in the bridge drivers probe function.

When the bridge driver is unbound, that memory will be freed, but
there is no notification to the DRM master that this structure is
no longer valid (unless you have something implemented in exynos
between the exynos core and the bridge driver(s) that specifically
deals with that.)  Note that there is nothing in drm_bridge_remove()
that does anything beyond removing the bridge from the global list
of bridges.

Any further accesses by the DRM master to that struct drm_bridge
will be a use-after-free of that memory.

-- 
RMK's Patch system: http://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line in suburbia: sync at 12.1Mbps down 622kbps up
According to speedtest.net: 11.9Mbps down 500kbps up
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: Armada DRM: bridge with componentized devices
  2019-01-08  9:22                 ` Andrzej Hajda
  2019-01-08 10:23                   ` Russell King - ARM Linux
@ 2019-01-08 10:24                   ` Daniel Vetter
  2019-01-08 11:25                     ` Andrzej Hajda
  2019-01-18 11:07                   ` Linus Walleij
  2 siblings, 1 reply; 53+ messages in thread
From: Daniel Vetter @ 2019-01-08 10:24 UTC (permalink / raw)
  To: Andrzej Hajda
  Cc: Lubomir Rintel, Liviu Dudau, Russell King - ARM Linux,
	DRI Development, Peter Rosin

On Tue, Jan 8, 2019 at 10:22 AM Andrzej Hajda <a.hajda@samsung.com> wrote:
>
> On 08.01.2019 09:47, Daniel Vetter wrote:
> > On Tue, Jan 8, 2019 at 9:35 AM Andrzej Hajda <a.hajda@samsung.com> wrote:
> >> On 07.01.2019 22:56, Daniel Vetter wrote:
> >>> On Mon, Jan 7, 2019 at 5:27 PM Andrzej Hajda <a.hajda@samsung.com> wrote:
> >>>> On 07.01.2019 17:08, Daniel Vetter wrote:
> >>>>> On Mon, Jan 07, 2019 at 12:26:58PM +0100, Andrzej Hajda wrote:
> >>>>>> On 07.01.2019 11:45, Daniel Vetter wrote:
> >>>>>>> On Thu, Jan 03, 2019 at 01:11:47PM +0000, Russell King - ARM Linux wrote:
> >>>>>>>> On Thu, Jan 03, 2019 at 10:47:27AM +0100, Lubomir Rintel wrote:
> >>>>>>>>> Hello,
> >>>>>>>>>
> >>>>>>>>> lately I've been trying to make the Himax HX8837 chip that drives the OLPC
> >>>>>>>>> LCD display work with Armada DRM driver. I've been advised to create a
> >>>>>>>>> bridge driver and not an encoder driver since the silicon is separate from
> >>>>>>>>> the LCDC.
> >>>>>>>>>
> >>>>>>>>> The Armada DRM driver (and, I think, the i.MX one) creates the drm_device
> >>>>>>>>> once the component infrastructure sees the necessary sub-devices appear.
> >>>>>>>>> The sub-devices being the LCDCs and the encoders (not bridges) that it
> >>>>>>>>> expects to be created externally.
> >>>>>>>>>
> >>>>>>>>> Currently, it seems, the only driver that can actually work with this (that
> >>>>>>>>> is -- creates a drm_encoder for a drm_device when the component is bound)
> >>>>>>>>> is the tda998x. All other similar drivers create a drm_bridge instead and
> >>>>>>>>> not use the component infrastructure at all. (In fact, tilcdc driver
> >>>>>>>>> contains a  hack to handle tda998x specially.)
> >>>>>>>>>
> >>>>>>>>> I'm wondering how to reconcile the two?
> >>>>>>>>>
> >>>>>>>>> * The tda998x driver has recently been modified to create a bridge on probe
> >>>>>>>>>   and eventually encoder on component bind. Is this an okay thing to do in
> >>>>>>>>>   a new driver? (this probably means the tilcdc hack can be removed...)
> >>>>>>>>>
> >>>>>>>>> * If a non-componentized bridge were to be used (along with a dummy
> >>>>>>>>>   encoder), at what point would it make sense to look for the bridge?
> >>>>>>>>>   Would it be a good idea to defer the probe of crtc until a bridge can be
> >>>>>>>>>   looked up and the attach it on component bind?  What if the bridge goes
> >>>>>>>>>   away (a module is unloaded, etc.) in between?
> >>>>>>>>>
> >>>>>>>>> I'd be thankful for opintions and advice before I move ahead with this.
> >>>>>>>> This is the long-standing problem with the conflict between bridge
> >>>>>>>> support and component support, and I'm not sure that there is really
> >>>>>>>> any answer to it.
> >>>>>>>>
> >>>>>>>> I've gone into the details of the two several times on the list,
> >>>>>>>> particularly about the short-comings of the bridge approach, but it
> >>>>>>>> seems no one cares to fix those short-comings.
> >>>>>>>>
> >>>>>>>> You are re-identifying some of the issues that I've already pointed
> >>>>>>>> out - such as what happens to DRM drives when the bridge driver is
> >>>>>>>> unbound (it's really not about modules being unloaded, and the problem
> >>>>>>>> can't be solved by taking a module reference count - all that the
> >>>>>>>> module reference count does is ensure that the module doesn't go
> >>>>>>>> away unexpected, there is no way to ensure that the device isn't
> >>>>>>>> unbound.)
> >>>>>>>>
> >>>>>>>> The issue of unbinding is precisely the issue which the component
> >>>>>>>> support was created to solve - but everyone seems to prefer the buggy
> >>>>>>>> bridge approach, and no one seems willing to do anything about the
> >>>>>>>> bugs or even acknowledge that it's a problem.  It's strange - if one
> >>>>>>>> identifies bugs that result in kernel oops in other kernel subsystems,
> >>>>>>>> one is generally taken seriously and the problem is solved.
> >>>>>>> Unbinding is really not the most important feature, especially for SoC. If
> >>>>>>> you feel different, working together with others, getting some agreement,
> >>>>>>> getting the patches reviewed and finding someone to get them merged is
> >>>>>>> very much appreciated. But just complaining won't move this forward.
> >>>>>>>
> >>>>>>>> The issue about the encoders is something that I've tried to discuss,
> >>>>>>>> and I've pointed out that moving it into the DRM driver adds additional
> >>>>>>>> complexity there, and I'd hoped that my patch set I posted would've
> >>>>>>>> generated discussion, but alas not.
> >>>>>>>>
> >>>>>>>> What I'm not prepared to do is to introduce _known_ bugs into any
> >>>>>>>> driver that I maintain.
> >>>>>>> I thought last time around the idea was to use device links (and teach
> >>>>>>> drm_bridge about them), so that the unloading works correctly.
> >>>>>> With current device_links implementation registering links in probe of
> >>>>>> the consumer is just incorrect - it can happen that neither consumer
> >>>>>> neither provider is fully probed and creating device links in such state
> >>>>>> is wrong according to docs, and my experiments. See [1] for discussion
> >>>>>> and [2] for docs.
> >>>>> We could set up the device link only at drm_dev_register time. At that point
> >>>>> we know that driver loading has fully succeeded. We do have a list of
> >>>>> bridges at hand, so that's not going to be an issue.
> >>>>>
> >>>>> The only case where this breaks is if a driver is still using the
> >>>>> deprecated ->load callback. But that ->load callback doesn't mix well with
> >>>>> EDEFER_PROBE/component framework anyway, so I think not going to be a
> >>>>> problem hopefully?
> >>>> drm_dev_register usually is called from bind callback, which is called from probe callback of one of the components or master (depending on particular probe order). If you want to register device link in this function it is possible that the bad scenario will happen - there will be attempt to create device link between not-yet-probed consumer and not-yet-probed provider.
> >>> If you call drm_dev_register before you have all the components
> >>> assembled (i.e. anywhere else than in master bind) that sounds like a
> >>> driver bug.
> >>
> >> This is how components work, every component calls component_add usually
> >> in probe, and this function checks if all components are present, if yes
> >> (ie. during probe of the last component) master bind is called and it
> >> creates drm device and then registers it. If you add device link at this
> >> moment, it is still before end of probe of the last component.
> >>
> >> On the other side provider is registered (drm_bridge_add in case of
> >> bridges) during its probe so it becomes available to the consumers
> >> BEFORE end of its probe and again it can happen that device link will be
> >> added to early.
> > Hm I read that thread again. Is the reason why we can't add the device
> > link only the exynos behaviour to allow drm_panel to be
> > hot-added/removed?
>
>
> Not only. What I have described above is common to all drivers it has
> nothing specific to Exynos.

I'm pretty sure that most drivers do not hot-add/remove drm things
after drm_dev_register (except drm_connector for dp mst support). It
would be buggy. Do you have more examples? I haven't reviewed them all
in detail, and things might have changed, so could very well be
there's exceptions.

> Of course it was tested on Exynos as this is HW I work on. Linus Walleij
> has also reported in this thread that device links have different issue
> - circular dependencies (last few emails in this lengthy thread). My
> response explains it in detail.
>
>
> >
> > Note that for bridge allowing this is 100% buggy: drm core does not
> > allow you to hotplug/hotunplug bridges.
>
>
> What part of drm core disallows it? As I remember discussions about
> drm_bridge design there were voices that they should be
> hot(un)pluggable, and they are IMO, of course if they are not active.

There's no locking at all to handle bridges appearing/disappearing
while the drm_device can be accessed. There's also no lifetime
management/refcounting. So it "works" but it's racy like mad.

> > In theory drm_panel can be hotplugged (because we can hotplug
> > drm_connector), but I'm pretty sure exynos gets it all wrong since
> > hotplugging panels is no easy thing to do. I don't think this is
> > something we want to support, forcing the entire driver to be unload
> > sounds like what we want to do here.
>
>
> I do not know if it is 'all wrong', but tests shows it is
> hot(un)pluggable. In both cases of panel and bridge :)

There's no drm_connector_get/put in exynos (except the one in
hdmi_mode_fixup, which looks rather fishy tbh). And drm_connector are
the only things in drm you can hotplug/unplug (except the overall
drm_device ofc). So again it maybe "works", but you're guaranteed to
have lots of races all over the place.
-Daniel

>
>
> Regards
>
> Andrzej
>
>
>
> > -Daniel
> >
> >>
> >> Regards
> >>
> >> Andrzej
> >>
> >>
> >>> drm_dev_register publishes the drm device instance to the
> >>> world, if you're not ready to handle userspace requests at that point
> >>> (because not everything is loaded yet) then things will go boom in
> >>> very colorful ways. And from my (admittedly very rough) understanding
> >>> we should be able to register the the device links as the very last
> >>> step in the master bind function (and drm_dev_register should be about
> >>> the last thing you do in the master bind).
> >>>
> >>> So not clear on why this won't work?
> >>
> >>
> >>
> >>> -Daniel
> >>> --
> >>> Daniel Vetter
> >>> Software Engineer, Intel Corporation
> >>> +41 (0) 79 365 57 48 - http://blog.ffwll.ch
> >>>
> >>>
> >
>


-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: Armada DRM: bridge with componentized devices
  2019-01-08 10:16               ` Russell King - ARM Linux
@ 2019-01-08 10:31                 ` Daniel Vetter
  0 siblings, 0 replies; 53+ messages in thread
From: Daniel Vetter @ 2019-01-08 10:31 UTC (permalink / raw)
  To: Russell King - ARM Linux
  Cc: Liviu Dudau, Peter Rosin, DRI Development, Lubomir Rintel

On Tue, Jan 8, 2019 at 11:16 AM Russell King - ARM Linux
<linux@armlinux.org.uk> wrote:
>
> On Tue, Jan 08, 2019 at 09:35:21AM +0100, Andrzej Hajda wrote:
> > On 07.01.2019 22:56, Daniel Vetter wrote:
> > > If you call drm_dev_register before you have all the components
> > > assembled (i.e. anywhere else than in master bind) that sounds like a
> > > driver bug.
> >
> >
> > This is how components work, every component calls component_add usually
> > in probe, and this function checks if all components are present, if yes
> > (ie. during probe of the last component) master bind is called and it
> > creates drm device and then registers it. If you add device link at this
> > moment, it is still before end of probe of the last component.
>
> I wonder if device links will create more problems with the component
> stuff, because we'll end up with two different systems fighting to
> achieve the same aim.
>
> 1) a component device is probed, and is added to the component helper
>    as a component.  (Repeat for as many component devices as are
>    required.)
> 2) the main device is probed, added as a master.  The system is
>    complete, so the master bind is called, and device links are
>    created - the component device is a provider, and the main
>    device is a consumer.
> 3) the component device is unbound, which triggers the device links
>    to unbind the provider device.  This undoes everything, removing
>    the device links and removing the main device from the component
>    helper.
> 4) the component device is re-probed, and added to the component
>    helper again.
>
> At this point, there is nothing that causes the main device to be
> re-probed.  As far as userspace is concerned, that's confusing because
> userspace never asked for any change of status of the provider device.
>
> Without device links, the main device would not be unbound, and there
> would be an additional stage:
>
> 5) the component helper notices the re-appearance of the component
>    device, notices that the system is again complete, and re-binds
>    the master device bring the system back up.
>
> This seems to be a disadvantage with device links, but there is an
> advantage that device links offer - the ability to deal with other
> resources (such as clocks, regulators, etc) going away more
> gracefully than is possible today - but folk have to understand
> that a consumer device will only ever be unbound, it will never be
> re-bound automatically when the resources re-appear.

Hm, I thought there was patches floating around to remedy that, I
guess they never landed? Without that device links aren't really
useful beyond fixing suspend/resume issues.

> I suppose that could be fixed by moving the consumer devices onto
> the deferred probe list, so when a device is re-bound, they are
> retried.  At that point, the component helper becomes entirely
> redundant, since its functionality can be implemented completely
> using device links.

There's still the benefit that component doesn't have opinions about
what exactly the things are you're collecting together - device links
only work if you have a struct device. That's generally ok for DT
platforms, but a notch more work on x86 (since you need to create
platform devices, which then creates lifetime issues since the
platform device is managed by the driver bound to the pci device, so
if you unload that again the platform device itself has to disappear).

I guess if device links really don't work plan B would be to have a
standard wrapper for bridg-as-component: drm_bridge_add would also
register a component, and a new of_drm_bridge_component_add would add
it to the component_match. We could probably even do standard
bind/unbind functions for that bridge component, which links it up to
the right encoder (or just to a given bridge pointer, whether that's
in drm_encoder of in a drm_bridge for chaining).
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: Armada DRM: bridge with componentized devices
  2019-01-08 10:23                   ` Russell King - ARM Linux
@ 2019-01-08 10:32                     ` Andrzej Hajda
  0 siblings, 0 replies; 53+ messages in thread
From: Andrzej Hajda @ 2019-01-08 10:32 UTC (permalink / raw)
  To: Russell King - ARM Linux
  Cc: Lubomir Rintel, DRI Development, Liviu Dudau, Peter Rosin

On 08.01.2019 11:23, Russell King - ARM Linux wrote:
> On Tue, Jan 08, 2019 at 10:22:06AM +0100, Andrzej Hajda wrote:
>> What part of drm core disallows it? As I remember discussions about
>> drm_bridge design there were voices that they should be
>> hot(un)pluggable, and they are IMO, of course if they are not active.
> Even if they are not active, once the DRM master device has used
> of_drm_find_bridge(), it has a reference on struct drm_bridge.
> Normally, that is allocated using something like devm_kzalloc()
> in the bridge drivers probe function.
>
> When the bridge driver is unbound, that memory will be freed, but
> there is no notification to the DRM master that this structure is
> no longer valid (unless you have something implemented in exynos
> between the exynos core and the bridge driver(s) that specifically
> deals with that.)  Note that there is nothing in drm_bridge_remove()
> that does anything beyond removing the bridge from the global list
> of bridges.
>
> Any further accesses by the DRM master to that struct drm_bridge
> will be a use-after-free of that memory.
>

This is fortunate case of mipi-dsi bus, where master is notified upon
child removal (mipi_dsi_host_ops::detach), so it can perform proper cleanup.


Regards

Andrzej

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: Armada DRM: bridge with componentized devices
  2019-01-08 10:24                   ` Daniel Vetter
@ 2019-01-08 11:25                     ` Andrzej Hajda
  2019-01-08 11:38                       ` Russell King - ARM Linux
  0 siblings, 1 reply; 53+ messages in thread
From: Andrzej Hajda @ 2019-01-08 11:25 UTC (permalink / raw)
  To: Daniel Vetter
  Cc: Lubomir Rintel, Liviu Dudau, Russell King - ARM Linux,
	DRI Development, Peter Rosin

On 08.01.2019 11:24, Daniel Vetter wrote:
> On Tue, Jan 8, 2019 at 10:22 AM Andrzej Hajda <a.hajda@samsung.com> wrote:
>> On 08.01.2019 09:47, Daniel Vetter wrote:
>>> On Tue, Jan 8, 2019 at 9:35 AM Andrzej Hajda <a.hajda@samsung.com> wrote:
>>>> On 07.01.2019 22:56, Daniel Vetter wrote:
>>>>> On Mon, Jan 7, 2019 at 5:27 PM Andrzej Hajda <a.hajda@samsung.com> wrote:
>>>>>> On 07.01.2019 17:08, Daniel Vetter wrote:
>>>>>>> On Mon, Jan 07, 2019 at 12:26:58PM +0100, Andrzej Hajda wrote:
>>>>>>>> On 07.01.2019 11:45, Daniel Vetter wrote:
>>>>>>>>> On Thu, Jan 03, 2019 at 01:11:47PM +0000, Russell King - ARM Linux wrote:
>>>>>>>>>> On Thu, Jan 03, 2019 at 10:47:27AM +0100, Lubomir Rintel wrote:
>>>>>>>>>>> Hello,
>>>>>>>>>>>
>>>>>>>>>>> lately I've been trying to make the Himax HX8837 chip that drives the OLPC
>>>>>>>>>>> LCD display work with Armada DRM driver. I've been advised to create a
>>>>>>>>>>> bridge driver and not an encoder driver since the silicon is separate from
>>>>>>>>>>> the LCDC.
>>>>>>>>>>>
>>>>>>>>>>> The Armada DRM driver (and, I think, the i.MX one) creates the drm_device
>>>>>>>>>>> once the component infrastructure sees the necessary sub-devices appear.
>>>>>>>>>>> The sub-devices being the LCDCs and the encoders (not bridges) that it
>>>>>>>>>>> expects to be created externally.
>>>>>>>>>>>
>>>>>>>>>>> Currently, it seems, the only driver that can actually work with this (that
>>>>>>>>>>> is -- creates a drm_encoder for a drm_device when the component is bound)
>>>>>>>>>>> is the tda998x. All other similar drivers create a drm_bridge instead and
>>>>>>>>>>> not use the component infrastructure at all. (In fact, tilcdc driver
>>>>>>>>>>> contains a  hack to handle tda998x specially.)
>>>>>>>>>>>
>>>>>>>>>>> I'm wondering how to reconcile the two?
>>>>>>>>>>>
>>>>>>>>>>> * The tda998x driver has recently been modified to create a bridge on probe
>>>>>>>>>>>   and eventually encoder on component bind. Is this an okay thing to do in
>>>>>>>>>>>   a new driver? (this probably means the tilcdc hack can be removed...)
>>>>>>>>>>>
>>>>>>>>>>> * If a non-componentized bridge were to be used (along with a dummy
>>>>>>>>>>>   encoder), at what point would it make sense to look for the bridge?
>>>>>>>>>>>   Would it be a good idea to defer the probe of crtc until a bridge can be
>>>>>>>>>>>   looked up and the attach it on component bind?  What if the bridge goes
>>>>>>>>>>>   away (a module is unloaded, etc.) in between?
>>>>>>>>>>>
>>>>>>>>>>> I'd be thankful for opintions and advice before I move ahead with this.
>>>>>>>>>> This is the long-standing problem with the conflict between bridge
>>>>>>>>>> support and component support, and I'm not sure that there is really
>>>>>>>>>> any answer to it.
>>>>>>>>>>
>>>>>>>>>> I've gone into the details of the two several times on the list,
>>>>>>>>>> particularly about the short-comings of the bridge approach, but it
>>>>>>>>>> seems no one cares to fix those short-comings.
>>>>>>>>>>
>>>>>>>>>> You are re-identifying some of the issues that I've already pointed
>>>>>>>>>> out - such as what happens to DRM drives when the bridge driver is
>>>>>>>>>> unbound (it's really not about modules being unloaded, and the problem
>>>>>>>>>> can't be solved by taking a module reference count - all that the
>>>>>>>>>> module reference count does is ensure that the module doesn't go
>>>>>>>>>> away unexpected, there is no way to ensure that the device isn't
>>>>>>>>>> unbound.)
>>>>>>>>>>
>>>>>>>>>> The issue of unbinding is precisely the issue which the component
>>>>>>>>>> support was created to solve - but everyone seems to prefer the buggy
>>>>>>>>>> bridge approach, and no one seems willing to do anything about the
>>>>>>>>>> bugs or even acknowledge that it's a problem.  It's strange - if one
>>>>>>>>>> identifies bugs that result in kernel oops in other kernel subsystems,
>>>>>>>>>> one is generally taken seriously and the problem is solved.
>>>>>>>>> Unbinding is really not the most important feature, especially for SoC. If
>>>>>>>>> you feel different, working together with others, getting some agreement,
>>>>>>>>> getting the patches reviewed and finding someone to get them merged is
>>>>>>>>> very much appreciated. But just complaining won't move this forward.
>>>>>>>>>
>>>>>>>>>> The issue about the encoders is something that I've tried to discuss,
>>>>>>>>>> and I've pointed out that moving it into the DRM driver adds additional
>>>>>>>>>> complexity there, and I'd hoped that my patch set I posted would've
>>>>>>>>>> generated discussion, but alas not.
>>>>>>>>>>
>>>>>>>>>> What I'm not prepared to do is to introduce _known_ bugs into any
>>>>>>>>>> driver that I maintain.
>>>>>>>>> I thought last time around the idea was to use device links (and teach
>>>>>>>>> drm_bridge about them), so that the unloading works correctly.
>>>>>>>> With current device_links implementation registering links in probe of
>>>>>>>> the consumer is just incorrect - it can happen that neither consumer
>>>>>>>> neither provider is fully probed and creating device links in such state
>>>>>>>> is wrong according to docs, and my experiments. See [1] for discussion
>>>>>>>> and [2] for docs.
>>>>>>> We could set up the device link only at drm_dev_register time. At that point
>>>>>>> we know that driver loading has fully succeeded. We do have a list of
>>>>>>> bridges at hand, so that's not going to be an issue.
>>>>>>>
>>>>>>> The only case where this breaks is if a driver is still using the
>>>>>>> deprecated ->load callback. But that ->load callback doesn't mix well with
>>>>>>> EDEFER_PROBE/component framework anyway, so I think not going to be a
>>>>>>> problem hopefully?
>>>>>> drm_dev_register usually is called from bind callback, which is called from probe callback of one of the components or master (depending on particular probe order). If you want to register device link in this function it is possible that the bad scenario will happen - there will be attempt to create device link between not-yet-probed consumer and not-yet-probed provider.
>>>>> If you call drm_dev_register before you have all the components
>>>>> assembled (i.e. anywhere else than in master bind) that sounds like a
>>>>> driver bug.
>>>> This is how components work, every component calls component_add usually
>>>> in probe, and this function checks if all components are present, if yes
>>>> (ie. during probe of the last component) master bind is called and it
>>>> creates drm device and then registers it. If you add device link at this
>>>> moment, it is still before end of probe of the last component.
>>>>
>>>> On the other side provider is registered (drm_bridge_add in case of
>>>> bridges) during its probe so it becomes available to the consumers
>>>> BEFORE end of its probe and again it can happen that device link will be
>>>> added to early.
>>> Hm I read that thread again. Is the reason why we can't add the device
>>> link only the exynos behaviour to allow drm_panel to be
>>> hot-added/removed?
>>
>> Not only. What I have described above is common to all drivers it has
>> nothing specific to Exynos.
> I'm pretty sure that most drivers do not hot-add/remove drm things
> after drm_dev_register (except drm_connector for dp mst support). It
> would be buggy. Do you have more examples? I haven't reviewed them all
> in detail, and things might have changed, so could very well be
> there's exceptions.


Issues with device links have nothing to do with hotplugging, they are
generic - lifetime of the objects (drm_bridge, drm_panel) is just
slightly different of lifetime of device links, and this is racy even if
you do not want hotplugging. Moreover since drm_dev is not device (has
no associated struct device) assuming we can reuse its parent to create
device link results in circular dependencies.


>
>> Of course it was tested on Exynos as this is HW I work on. Linus Walleij
>> has also reported in this thread that device links have different issue
>> - circular dependencies (last few emails in this lengthy thread). My
>> response explains it in detail.
>>
>>
>>> Note that for bridge allowing this is 100% buggy: drm core does not
>>> allow you to hotplug/hotunplug bridges.
>>
>> What part of drm core disallows it? As I remember discussions about
>> drm_bridge design there were voices that they should be
>> hot(un)pluggable, and they are IMO, of course if they are not active.
> There's no locking at all to handle bridges appearing/disappearing
> while the drm_device can be accessed. There's also no lifetime
> management/refcounting. So it "works" but it's racy like mad.


I have not recently examined the code, but as I remember core uses the
bridge only if it is attached to encoder which is attached to pipeline
and the connector is in connected state (or in transition phase to/from
this state).


>
>>> In theory drm_panel can be hotplugged (because we can hotplug
>>> drm_connector), but I'm pretty sure exynos gets it all wrong since
>>> hotplugging panels is no easy thing to do. I don't think this is
>>> something we want to support, forcing the entire driver to be unload
>>> sounds like what we want to do here.
>>
>> I do not know if it is 'all wrong', but tests shows it is
>> hot(un)pluggable. In both cases of panel and bridge :)
> There's no drm_connector_get/put in exynos (except the one in
> hdmi_mode_fixup, which looks rather fishy tbh). And drm_connector are
> the only things in drm you can hotplug/unplug (except the overall
> drm_device ofc). So again it maybe "works", but you're guaranteed to
> have lots of races all over the place.


Personally I have implemented only panel hotplug support and I have it
tested/analyzed quite thoroughly - it does not play with connector
removal it just makes connector disconnected in case panel is removed,
much simpler case. In case of bridge I have helped in design but I have
not tested it thoroughly, and as you pointed it out there are fishy
places, but I guess the design should be OK.

The design is as follows (just bridge removal scenario and tc358764 bridge):

1. User requests bridge unbind - tc358764_remove is called.

2. tc358764_remove calls mipi_dsi_detach - encoder can perform pipeline
cleanup, including connector removal.

3. Now the bridge is detached, so it can be removed - tc358764_remove
calls drm_bridge_remove.


As I see in the code it looks like there are missing pieces/bugs but it
is just something which can be fixed.

BTW the approach that last element of the pipeline should create
connector complicates things a lot, as Laurent (and me) argued multiple
times moving connector creation out of bridges would simplify things.


Regards

Andrzej


> -Daniel
>
>>
>> Regards
>>
>> Andrzej
>>
>>
>>
>>> -Daniel
>>>
>>>> Regards
>>>>
>>>> Andrzej
>>>>
>>>>
>>>>> drm_dev_register publishes the drm device instance to the
>>>>> world, if you're not ready to handle userspace requests at that point
>>>>> (because not everything is loaded yet) then things will go boom in
>>>>> very colorful ways. And from my (admittedly very rough) understanding
>>>>> we should be able to register the the device links as the very last
>>>>> step in the master bind function (and drm_dev_register should be about
>>>>> the last thing you do in the master bind).
>>>>>
>>>>> So not clear on why this won't work?
>>>>
>>>>
>>>>> -Daniel
>>>>> --
>>>>> Daniel Vetter
>>>>> Software Engineer, Intel Corporation
>>>>> +41 (0) 79 365 57 48 - http://blog.ffwll.ch
>>>>>
>>>>>
>

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: Armada DRM: bridge with componentized devices
  2019-01-08 11:25                     ` Andrzej Hajda
@ 2019-01-08 11:38                       ` Russell King - ARM Linux
  2019-01-08 12:27                         ` Andrzej Hajda
  0 siblings, 1 reply; 53+ messages in thread
From: Russell King - ARM Linux @ 2019-01-08 11:38 UTC (permalink / raw)
  To: Andrzej Hajda; +Cc: Lubomir Rintel, DRI Development, Liviu Dudau, Peter Rosin

On Tue, Jan 08, 2019 at 12:25:34PM +0100, Andrzej Hajda wrote:
> Issues with device links have nothing to do with hotplugging, they are
> generic - lifetime of the objects (drm_bridge, drm_panel) is just
> slightly different of lifetime of device links, and this is racy even if
> you do not want hotplugging. Moreover since drm_dev is not device (has
> no associated struct device) assuming we can reuse its parent to create
> device link results in circular dependencies.

How about having the device links created depending on whether the
main drm driver wants them or not - that would mean that Exynos
could continue avoiding them, but others that want them can have
the links?

I don't think the links should be created when the bridge is attached,
they should be created when the main drm driver gains its first
reference to the drm_bridge (via of_drm_find_bridge()) - since that's
the point where things may explode if the drm_bridge goes away.
Calling drm_bridge_attach() for a bridge that has been unbound will
be a use-after-free bug, so using device links at that point is way
too late.

-- 
RMK's Patch system: http://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line in suburbia: sync at 12.1Mbps down 622kbps up
According to speedtest.net: 11.9Mbps down 500kbps up
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: Armada DRM: bridge with componentized devices
  2019-01-08 11:38                       ` Russell King - ARM Linux
@ 2019-01-08 12:27                         ` Andrzej Hajda
  2019-01-08 13:21                           ` Russell King - ARM Linux
  0 siblings, 1 reply; 53+ messages in thread
From: Andrzej Hajda @ 2019-01-08 12:27 UTC (permalink / raw)
  To: Russell King - ARM Linux
  Cc: Lubomir Rintel, DRI Development, Liviu Dudau, Peter Rosin

On 08.01.2019 12:38, Russell King - ARM Linux wrote:
> On Tue, Jan 08, 2019 at 12:25:34PM +0100, Andrzej Hajda wrote:
>> Issues with device links have nothing to do with hotplugging, they are
>> generic - lifetime of the objects (drm_bridge, drm_panel) is just
>> slightly different of lifetime of device links, and this is racy even if
>> you do not want hotplugging. Moreover since drm_dev is not device (has
>> no associated struct device) assuming we can reuse its parent to create
>> device link results in circular dependencies.
> How about having the device links created depending on whether the
> main drm driver wants them or not - that would mean that Exynos
> could continue avoiding them, but others that want them can have
> the links?


That should be OK for Exynos. But regardless of Exynos device_links at
the current state will not work correctly with bridges/panels as I
described earlier.


>
> I don't think the links should be created when the bridge is attached,
> they should be created when the main drm driver gains its first
> reference to the drm_bridge (via of_drm_find_bridge()) - since that's
> the point where things may explode if the drm_bridge goes away.
> Calling drm_bridge_attach() for a bridge that has been unbound will
> be a use-after-free bug, so using device links at that point is way
> too late.
>
If you want to create device_link you should access bridge->dev, if you
do it outside bridge_lock it can explode as well. So link should be
created under bridge_lock then (inside of_drm_find_bridge? via another
getter? ).


Regards

Andrzej

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: Armada DRM: bridge with componentized devices
  2019-01-08 12:27                         ` Andrzej Hajda
@ 2019-01-08 13:21                           ` Russell King - ARM Linux
  2019-01-08 13:34                             ` Daniel Vetter
  2019-01-08 14:33                             ` Andrzej Hajda
  0 siblings, 2 replies; 53+ messages in thread
From: Russell King - ARM Linux @ 2019-01-08 13:21 UTC (permalink / raw)
  To: Andrzej Hajda; +Cc: Lubomir Rintel, DRI Development, Liviu Dudau, Peter Rosin

On Tue, Jan 08, 2019 at 01:27:56PM +0100, Andrzej Hajda wrote:
> On 08.01.2019 12:38, Russell King - ARM Linux wrote:
> > On Tue, Jan 08, 2019 at 12:25:34PM +0100, Andrzej Hajda wrote:
> >> Issues with device links have nothing to do with hotplugging, they are
> >> generic - lifetime of the objects (drm_bridge, drm_panel) is just
> >> slightly different of lifetime of device links, and this is racy even if
> >> you do not want hotplugging. Moreover since drm_dev is not device (has
> >> no associated struct device) assuming we can reuse its parent to create
> >> device link results in circular dependencies.
> > How about having the device links created depending on whether the
> > main drm driver wants them or not - that would mean that Exynos
> > could continue avoiding them, but others that want them can have
> > the links?
> 
> 
> That should be OK for Exynos. But regardless of Exynos device_links at
> the current state will not work correctly with bridges/panels as I
> described earlier.

However, I'm not sure you're correct with your interpretation of the
documentation.  Firstly, the documentation says:

   Another example for an inconsistent state would be a device link that
   represents a driver presence dependency, yet is added from the consumer's
   ->probe callback while the supplier hasn't probed yet: Had the driver core
   known about the device link earlier, it wouldn't have probed the consumer
   in the first place. The onus is thus on the consumer to check presence of
   the supplier after adding the link, and defer probing on non-presence.

This is fine - if we add the device link from of_drm_find_bridge(), we
will be in the consumer's ->probe function, and the supplier must have
been probed for us to find the struct drm_bridge.

Secondly, device links are created by the regulator support whenever a
regulator is "got" - between the supplier and the consumer.  I'm not
sure that regulator uses this in a safe way since it looks to me like
there could be a race between the point where the regulator has been
found and the point that the device link is created, and the regulator
supplier being unbound.  Regulator uses stateless device links to
order PM, but not to remove consumers when the supplier goes away.

Finally, I believe that CCF is looking at using device links as well,
which will mean a link is established by a clk_get() type operation
and released in a clk_put() operation.  There hasn't been any code
merged for this, but I have seen it discussed.

These all have in common one thing - a device link is created at the
point that the resource is obtained and removed when the resource is
released.

I don't see how DRM bridges are any different from any other resource
in the system, and why you think that device links wouldn't work for
DRM bridges.

-- 
RMK's Patch system: http://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line in suburbia: sync at 12.1Mbps down 622kbps up
According to speedtest.net: 11.9Mbps down 500kbps up
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: Armada DRM: bridge with componentized devices
  2019-01-08 13:21                           ` Russell King - ARM Linux
@ 2019-01-08 13:34                             ` Daniel Vetter
  2019-01-08 14:33                             ` Andrzej Hajda
  1 sibling, 0 replies; 53+ messages in thread
From: Daniel Vetter @ 2019-01-08 13:34 UTC (permalink / raw)
  To: Russell King - ARM Linux
  Cc: Liviu Dudau, Peter Rosin, DRI Development, Lubomir Rintel

On Tue, Jan 8, 2019 at 2:22 PM Russell King - ARM Linux
<linux@armlinux.org.uk> wrote:
> On Tue, Jan 08, 2019 at 01:27:56PM +0100, Andrzej Hajda wrote:
> > On 08.01.2019 12:38, Russell King - ARM Linux wrote:
> > > On Tue, Jan 08, 2019 at 12:25:34PM +0100, Andrzej Hajda wrote:
> > >> Issues with device links have nothing to do with hotplugging, they are
> > >> generic - lifetime of the objects (drm_bridge, drm_panel) is just
> > >> slightly different of lifetime of device links, and this is racy even if
> > >> you do not want hotplugging. Moreover since drm_dev is not device (has
> > >> no associated struct device) assuming we can reuse its parent to create
> > >> device link results in circular dependencies.
> > > How about having the device links created depending on whether the
> > > main drm driver wants them or not - that would mean that Exynos
> > > could continue avoiding them, but others that want them can have
> > > the links?
> >
> >
> > That should be OK for Exynos. But regardless of Exynos device_links at
> > the current state will not work correctly with bridges/panels as I
> > described earlier.
>
> However, I'm not sure you're correct with your interpretation of the
> documentation.  Firstly, the documentation says:
>
>    Another example for an inconsistent state would be a device link that
>    represents a driver presence dependency, yet is added from the consumer's
>    ->probe callback while the supplier hasn't probed yet: Had the driver core
>    known about the device link earlier, it wouldn't have probed the consumer
>    in the first place. The onus is thus on the consumer to check presence of
>    the supplier after adding the link, and defer probing on non-presence.
>
> This is fine - if we add the device link from of_drm_find_bridge(), we
> will be in the consumer's ->probe function, and the supplier must have
> been probed for us to find the struct drm_bridge.
>
> Secondly, device links are created by the regulator support whenever a
> regulator is "got" - between the supplier and the consumer.  I'm not
> sure that regulator uses this in a safe way since it looks to me like
> there could be a race between the point where the regulator has been
> found and the point that the device link is created, and the regulator
> supplier being unbound.  Regulator uses stateless device links to
> order PM, but not to remove consumers when the supplier goes away.
>
> Finally, I believe that CCF is looking at using device links as well,
> which will mean a link is established by a clk_get() type operation
> and released in a clk_put() operation.  There hasn't been any code
> merged for this, but I have seen it discussed.
>
> These all have in common one thing - a device link is created at the
> point that the resource is obtained and removed when the resource is
> released.
>
> I don't see how DRM bridges are any different from any other resource
> in the system, and why you think that device links wouldn't work for
> DRM bridges.

Yeah, for PM ordering device links should work for bridges and panels.
I think all the questions/confusion is whether they also work for
load/unload ordering, which they atm don't anyway because if you
unload a provider and then reload it, the consumer isn't added to the
reprobe list (unlikely what component achieves, which does call master
bind again in this case).

I also think the entire discussion around whether you can hotadd
bridges/panels after drm_dev_register is orthogonal, or at least
should be. If you want to support hot-adding, they neither can you use
device links nor component, because they both create a more static
connection between producer and consumer. Aside from hotadding
anything aside from drm_connector isn't really supported, and you need
to roll your own locking and everything (which exynos really doesn't
seem to do - the fact it works if you're careful doesn't really change
that).
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: Armada DRM: bridge with componentized devices
  2019-01-08  0:39         ` Russell King - ARM Linux
@ 2019-01-08 14:29           ` Liviu Dudau
  0 siblings, 0 replies; 53+ messages in thread
From: Liviu Dudau @ 2019-01-08 14:29 UTC (permalink / raw)
  To: Russell King - ARM Linux; +Cc: Lubomir Rintel, DRI Development, Peter Rosin

On Tue, Jan 08, 2019 at 12:39:36AM +0000, Russell King - ARM Linux wrote:
> On Mon, Jan 07, 2019 at 10:55:15PM +0100, Daniel Vetter wrote:
> > On Mon, Jan 7, 2019 at 5:13 PM Russell King - ARM Linux
> > <linux@armlinux.org.uk> wrote:
> > >
> > > On Mon, Jan 07, 2019 at 11:45:32AM +0100, Daniel Vetter wrote:
> > > > On Thu, Jan 03, 2019 at 01:11:47PM +0000, Russell King - ARM Linux wrote:
> > > > > This is the long-standing problem with the conflict between bridge
> > > > > support and component support, and I'm not sure that there is really
> > > > > any answer to it.
> > > > >
> > > > > I've gone into the details of the two several times on the list,
> > > > > particularly about the short-comings of the bridge approach, but it
> > > > > seems no one cares to fix those short-comings.
> > > > >
> > > > > You are re-identifying some of the issues that I've already pointed
> > > > > out - such as what happens to DRM drives when the bridge driver is
> > > > > unbound (it's really not about modules being unloaded, and the problem
> > > > > can't be solved by taking a module reference count - all that the
> > > > > module reference count does is ensure that the module doesn't go
> > > > > away unexpected, there is no way to ensure that the device isn't
> > > > > unbound.)
> > > > >
> > > > > The issue of unbinding is precisely the issue which the component
> > > > > support was created to solve - but everyone seems to prefer the buggy
> > > > > bridge approach, and no one seems willing to do anything about the
> > > > > bugs or even acknowledge that it's a problem.  It's strange - if one
> > > > > identifies bugs that result in kernel oops in other kernel subsystems,
> > > > > one is generally taken seriously and the problem is solved.
> > > >
> > > > Unbinding is really not the most important feature, especially for SoC. If
> > > > you feel different, working together with others, getting some agreement,
> > > > getting the patches reviewed and finding someone to get them merged is
> > > > very much appreciated. But just complaining won't move this forward.
> > >
> > > Sorry, I disagree.  Unbinding is important if the current state results
> > > in crashes and oops - the lack of unbinding support in bridge makes it
> > > harder to develop without constantly rebooting the target machine.
> > >
> > > If all you care about is the end user who probably never removes a
> > > module, then yes, it's low priority, but if you care about efficient
> > > development, then the story is rather different.
> > 
> > Unloading i915 needs a very careful script, or you'll get a rather
> > bright fireworks. Afaik all other drm drivers (except maybe udl) are
> > the same. At least if you do anything fancy, where fancy includes:
> > fbdev emulation, prime buffer sharing, shared dma fences, or well
> > anything really that goes beyond a dummy boot splash. The lifetimes of
> > all these things are flat-out broken. udl tries to at least wrap some
> > duct-tape around it, and Noralf greatly improved the situation in the
> > past year at least.
> > 
> > So still not seeing what exactly the massive blocker here is.
> 
> The fact that I can unload armada drm/tda998x modules without incident
> today, and have done many times through development, and I don't wish
> to regress from that position.  As far as I'm concerned, this is a
> solved problem, but the pressure I'm under to convert tda998x to a
> bridge driver is causing bugs that I've already solved by _not_ using
> that to be introduced.

And hdlcd and mali-dp have been unloaded and loaded back multiple times
on my dev boards. I can't comment on other Arm SoC drm drivers, but I'll
be prepared to bet that at least another one is capable of doing the
same thing.

Now, the fact that you might have an fbcon that binds to the framebuffer
and makes life a bit harder ... that's another thing.

Best regards,
Liviu

> 
> You've mentioned in your previous mail about me not trying to solve
> the situation - I have tried, I've proposed a way around the component
> vs bridge issue but I don't think it went anywhere.
> 
> If I can't get traction on issues, then I can only state what the
> current state of affairs is to folk asking about it.  That is _not_
> "whinging" about it, that is informing people and being helpful.
> 
> -- 
> RMK's Patch system: http://www.armlinux.org.uk/developer/patches/
> FTTC broadband for 0.8mile line in suburbia: sync at 12.1Mbps down 622kbps up
> According to speedtest.net: 11.9Mbps down 500kbps up

-- 
====================
| I would like to |
| fix the world,  |
| but they're not |
| giving me the   |
 \ source code!  /
  ---------------
    ¯\_(ツ)_/¯
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: Armada DRM: bridge with componentized devices
  2019-01-08 13:21                           ` Russell King - ARM Linux
  2019-01-08 13:34                             ` Daniel Vetter
@ 2019-01-08 14:33                             ` Andrzej Hajda
  2019-01-08 15:07                               ` Russell King - ARM Linux
  1 sibling, 1 reply; 53+ messages in thread
From: Andrzej Hajda @ 2019-01-08 14:33 UTC (permalink / raw)
  To: Russell King - ARM Linux
  Cc: Lubomir Rintel, DRI Development, Liviu Dudau, Peter Rosin

On 08.01.2019 14:21, Russell King - ARM Linux wrote:
> On Tue, Jan 08, 2019 at 01:27:56PM +0100, Andrzej Hajda wrote:
>> On 08.01.2019 12:38, Russell King - ARM Linux wrote:
>>> On Tue, Jan 08, 2019 at 12:25:34PM +0100, Andrzej Hajda wrote:
>>>> Issues with device links have nothing to do with hotplugging, they are
>>>> generic - lifetime of the objects (drm_bridge, drm_panel) is just
>>>> slightly different of lifetime of device links, and this is racy even if
>>>> you do not want hotplugging. Moreover since drm_dev is not device (has
>>>> no associated struct device) assuming we can reuse its parent to create
>>>> device link results in circular dependencies.
>>> How about having the device links created depending on whether the
>>> main drm driver wants them or not - that would mean that Exynos
>>> could continue avoiding them, but others that want them can have
>>> the links?
>>
>> That should be OK for Exynos. But regardless of Exynos device_links at
>> the current state will not work correctly with bridges/panels as I
>> described earlier.
> However, I'm not sure you're correct with your interpretation of the
> documentation.  Firstly, the documentation says:
>
>    Another example for an inconsistent state would be a device link that
>    represents a driver presence dependency, yet is added from the consumer's
>    ->probe callback while the supplier hasn't probed yet: Had the driver core
>    known about the device link earlier, it wouldn't have probed the consumer
>    in the first place. The onus is thus on the consumer to check presence of
>    the supplier after adding the link, and defer probing on non-presence.
>
> This is fine - if we add the device link from of_drm_find_bridge(), we
> will be in the consumer's ->probe function, and the supplier must have
> been probed for us to find the struct drm_bridge.


Supplier usually is registered in it's probe time, so there is short
period of time when supplier is available, but the probe is not yet
finished - quite unsafe, but not impossible, especially if there exists

some kind of notifications about resource appearance (MIPI-DSI case).


Regards

Andrzej


>
> Secondly, device links are created by the regulator support whenever a
> regulator is "got" - between the supplier and the consumer.  I'm not
> sure that regulator uses this in a safe way since it looks to me like
> there could be a race between the point where the regulator has been
> found and the point that the device link is created, and the regulator
> supplier being unbound.  Regulator uses stateless device links to
> order PM, but not to remove consumers when the supplier goes away.
>
> Finally, I believe that CCF is looking at using device links as well,
> which will mean a link is established by a clk_get() type operation
> and released in a clk_put() operation.  There hasn't been any code
> merged for this, but I have seen it discussed.
>
> These all have in common one thing - a device link is created at the
> point that the resource is obtained and removed when the resource is
> released.
>
> I don't see how DRM bridges are any different from any other resource
> in the system, and why you think that device links wouldn't work for
> DRM bridges.
>


_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: Armada DRM: bridge with componentized devices
  2019-01-08 14:33                             ` Andrzej Hajda
@ 2019-01-08 15:07                               ` Russell King - ARM Linux
  2019-01-08 18:07                                 ` Daniel Vetter
  2019-01-09  9:12                                 ` Andrzej Hajda
  0 siblings, 2 replies; 53+ messages in thread
From: Russell King - ARM Linux @ 2019-01-08 15:07 UTC (permalink / raw)
  To: Andrzej Hajda; +Cc: Lubomir Rintel, DRI Development, Liviu Dudau, Peter Rosin

On Tue, Jan 08, 2019 at 03:33:54PM +0100, Andrzej Hajda wrote:
> On 08.01.2019 14:21, Russell King - ARM Linux wrote:
> > On Tue, Jan 08, 2019 at 01:27:56PM +0100, Andrzej Hajda wrote:
> >> On 08.01.2019 12:38, Russell King - ARM Linux wrote:
> >>> On Tue, Jan 08, 2019 at 12:25:34PM +0100, Andrzej Hajda wrote:
> >>>> Issues with device links have nothing to do with hotplugging, they are
> >>>> generic - lifetime of the objects (drm_bridge, drm_panel) is just
> >>>> slightly different of lifetime of device links, and this is racy even if
> >>>> you do not want hotplugging. Moreover since drm_dev is not device (has
> >>>> no associated struct device) assuming we can reuse its parent to create
> >>>> device link results in circular dependencies.
> >>> How about having the device links created depending on whether the
> >>> main drm driver wants them or not - that would mean that Exynos
> >>> could continue avoiding them, but others that want them can have
> >>> the links?
> >>
> >> That should be OK for Exynos. But regardless of Exynos device_links at
> >> the current state will not work correctly with bridges/panels as I
> >> described earlier.
> > However, I'm not sure you're correct with your interpretation of the
> > documentation.  Firstly, the documentation says:
> >
> >    Another example for an inconsistent state would be a device link that
> >    represents a driver presence dependency, yet is added from the consumer's
> >    ->probe callback while the supplier hasn't probed yet: Had the driver core
> >    known about the device link earlier, it wouldn't have probed the consumer
> >    in the first place. The onus is thus on the consumer to check presence of
> >    the supplier after adding the link, and defer probing on non-presence.
> >
> > This is fine - if we add the device link from of_drm_find_bridge(), we
> > will be in the consumer's ->probe function, and the supplier must have
> > been probed for us to find the struct drm_bridge.
> 
> 
> Supplier usually is registered in it's probe time, so there is short
> period of time when supplier is available, but the probe is not yet
> finished - quite unsafe, but not impossible, especially if there exists
> some kind of notifications about resource appearance (MIPI-DSI case).

At some point during the supplier probe, the resource becomes available
to consumers.  At that point, device links can be setup before the
supplier has finished probing.  So any driver that provides resources
to another driver will, at some point during the provider's probe,
make resources available, and therefore be a candidate for device links
to be created _before_ the probe function has returned.

What is a problem is if the provider publishes resources, and then fails
its probe function, causing the resource to disappear.

Taking DRM bridge, drm_bridge_add() returns void - it never fails.
Most bridge drivers do drm_bridge_add() as the very last step before
returning zero from their probe function.  There are a few exceptions.

megachips-stdpxxxx-ge-b850v3-fw.c is already buggy - it calls
devm_request_threaded_irq(), which if it fails, the bridge is left added
to the global list of bridges.  Since this structure was allocated with
devm_kzalloc(), it will be freed when the probe function fails.  So,
any failure here (eg a deferred probe because the IRQ controller is not
available) already creates a latent bug just waiting to bite.

tc358764.c is another case where the bridge is published prior to
initialisation completion, but it looks like that's under the control
of the "host" (consumer) driver.

mtk_hdmi.c enables clocks for audio after registering the bridge, which
may fail, but are unlikely to.  However, moving them prior to
drm_bridge_add() probably won't hurt and avoids the "published
interfaces which then vanish" problem.

Then we have exynos_drm_mic.c and tda998x_drv.c, but I think the
latter's error path after drm_bridge_add() can be eliminated if we
transitioned to device links instead of the component helper.

Outside DRM, take regulators - at some point during a regulator
supplier's probe function, the resource will be published, and as soon
as it is, it's available for regulator_get() to find it and setup a
device link before the probe function has finished.

From what I can tell, in the situation where a supplier makes resources
available, the consumer binds to the resource, and then the supplier
goes away, the device link will remain, and the consumer will not be
unbound, which leads to an unexpected state.  The solution to this is
the age old kernel rule of "don't publish your interfaces until you've
completed initialisation".  IOW, publish at a point where you aren't
going to fail.

-- 
RMK's Patch system: http://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line in suburbia: sync at 12.1Mbps down 622kbps up
According to speedtest.net: 11.9Mbps down 500kbps up
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: Armada DRM: bridge with componentized devices
  2019-01-08 15:07                               ` Russell King - ARM Linux
@ 2019-01-08 18:07                                 ` Daniel Vetter
  2019-01-09  9:12                                 ` Andrzej Hajda
  1 sibling, 0 replies; 53+ messages in thread
From: Daniel Vetter @ 2019-01-08 18:07 UTC (permalink / raw)
  To: Russell King - ARM Linux
  Cc: Liviu Dudau, Peter Rosin, DRI Development, Lubomir Rintel

On Tue, Jan 8, 2019 at 4:07 PM Russell King - ARM Linux
<linux@armlinux.org.uk> wrote:
>
> On Tue, Jan 08, 2019 at 03:33:54PM +0100, Andrzej Hajda wrote:
> > On 08.01.2019 14:21, Russell King - ARM Linux wrote:
> > > On Tue, Jan 08, 2019 at 01:27:56PM +0100, Andrzej Hajda wrote:
> > >> On 08.01.2019 12:38, Russell King - ARM Linux wrote:
> > >>> On Tue, Jan 08, 2019 at 12:25:34PM +0100, Andrzej Hajda wrote:
> > >>>> Issues with device links have nothing to do with hotplugging, they are
> > >>>> generic - lifetime of the objects (drm_bridge, drm_panel) is just
> > >>>> slightly different of lifetime of device links, and this is racy even if
> > >>>> you do not want hotplugging. Moreover since drm_dev is not device (has
> > >>>> no associated struct device) assuming we can reuse its parent to create
> > >>>> device link results in circular dependencies.
> > >>> How about having the device links created depending on whether the
> > >>> main drm driver wants them or not - that would mean that Exynos
> > >>> could continue avoiding them, but others that want them can have
> > >>> the links?
> > >>
> > >> That should be OK for Exynos. But regardless of Exynos device_links at
> > >> the current state will not work correctly with bridges/panels as I
> > >> described earlier.
> > > However, I'm not sure you're correct with your interpretation of the
> > > documentation.  Firstly, the documentation says:
> > >
> > >    Another example for an inconsistent state would be a device link that
> > >    represents a driver presence dependency, yet is added from the consumer's
> > >    ->probe callback while the supplier hasn't probed yet: Had the driver core
> > >    known about the device link earlier, it wouldn't have probed the consumer
> > >    in the first place. The onus is thus on the consumer to check presence of
> > >    the supplier after adding the link, and defer probing on non-presence.
> > >
> > > This is fine - if we add the device link from of_drm_find_bridge(), we
> > > will be in the consumer's ->probe function, and the supplier must have
> > > been probed for us to find the struct drm_bridge.
> >
> >
> > Supplier usually is registered in it's probe time, so there is short
> > period of time when supplier is available, but the probe is not yet
> > finished - quite unsafe, but not impossible, especially if there exists
> > some kind of notifications about resource appearance (MIPI-DSI case).
>
> At some point during the supplier probe, the resource becomes available
> to consumers.  At that point, device links can be setup before the
> supplier has finished probing.  So any driver that provides resources
> to another driver will, at some point during the provider's probe,
> make resources available, and therefore be a candidate for device links
> to be created _before_ the probe function has returned.
>
> What is a problem is if the provider publishes resources, and then fails
> its probe function, causing the resource to disappear.
>
> Taking DRM bridge, drm_bridge_add() returns void - it never fails.
> Most bridge drivers do drm_bridge_add() as the very last step before
> returning zero from their probe function.  There are a few exceptions.

Yup, that's how it should be. And this is a general rule: Only ever
publish your interface (whether userspace dev node, or some kernel
internal thing) once setup is 100% done and nothing can fail anymore.
If you call drm_bridge_add before your bridge is fully set up, that's
a bug. Same way that calling drm_dev_register before all the sub-parts
(except drm_connector, which can be hotplugged) is a bug. That
ordering issue is why the ->load callback is deprecated, since it's
fundamentally unsafe (but we did paper over the worst races with
opening the chardev using some global locking).

If you want more examples from within drm look at gem bo or
drm_framebuffer: These also only get registered once fully set up, and
the registration is the very last thing addfb or dumb_create do. Heck
even for hotplugged drm_connector the same rule holds: First set it up
(including panel behind it and everything), then register it.

> megachips-stdpxxxx-ge-b850v3-fw.c is already buggy - it calls
> devm_request_threaded_irq(), which if it fails, the bridge is left added
> to the global list of bridges.  Since this structure was allocated with
> devm_kzalloc(), it will be freed when the probe function fails.  So,
> any failure here (eg a deferred probe because the IRQ controller is not
> available) already creates a latent bug just waiting to bite.
>
> tc358764.c is another case where the bridge is published prior to
> initialisation completion, but it looks like that's under the control
> of the "host" (consumer) driver.
>
> mtk_hdmi.c enables clocks for audio after registering the bridge, which
> may fail, but are unlikely to.  However, moving them prior to
> drm_bridge_add() probably won't hurt and avoids the "published
> interfaces which then vanish" problem.
>
> Then we have exynos_drm_mic.c and tda998x_drv.c, but I think the
> latter's error path after drm_bridge_add() can be eliminated if we
> transitioned to device links instead of the component helper.
>
> Outside DRM, take regulators - at some point during a regulator
> supplier's probe function, the resource will be published, and as soon
> as it is, it's available for regulator_get() to find it and setup a
> device link before the probe function has finished.
>
> From what I can tell, in the situation where a supplier makes resources
> available, the consumer binds to the resource, and then the supplier
> goes away, the device link will remain, and the consumer will not be
> unbound, which leads to an unexpected state.  The solution to this is
> the age old kernel rule of "don't publish your interfaces until you've
> completed initialisation".  IOW, publish at a point where you aren't
> going to fail.

Fully agreed. Any driver that does something else is suspect, and
definitely shouldn't be held up as an example of what we should
support in generic/shared code. There's just too many surprises
lurking otherwise (and so if you insist on doing things like that, you
get to hand-roll a lot more of your driver code).
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: Armada DRM: bridge with componentized devices
  2019-01-08 15:07                               ` Russell King - ARM Linux
  2019-01-08 18:07                                 ` Daniel Vetter
@ 2019-01-09  9:12                                 ` Andrzej Hajda
  2019-01-09  9:24                                   ` Rafael J. Wysocki
  1 sibling, 1 reply; 53+ messages in thread
From: Andrzej Hajda @ 2019-01-09  9:12 UTC (permalink / raw)
  To: Russell King - ARM Linux
  Cc: Rafael J. Wysocki, Liviu Dudau, DRI Development, Lubomir Rintel,
	Peter Rosin

+CC: Rafael J. Wysocki <rafael@kernel.org>

On 08.01.2019 16:07, Russell King - ARM Linux wrote:
> On Tue, Jan 08, 2019 at 03:33:54PM +0100, Andrzej Hajda wrote:
>> On 08.01.2019 14:21, Russell King - ARM Linux wrote:
>>> On Tue, Jan 08, 2019 at 01:27:56PM +0100, Andrzej Hajda wrote:
>>>> On 08.01.2019 12:38, Russell King - ARM Linux wrote:
>>>>> On Tue, Jan 08, 2019 at 12:25:34PM +0100, Andrzej Hajda wrote:
>>>>>> Issues with device links have nothing to do with hotplugging, they are
>>>>>> generic - lifetime of the objects (drm_bridge, drm_panel) is just
>>>>>> slightly different of lifetime of device links, and this is racy even if
>>>>>> you do not want hotplugging. Moreover since drm_dev is not device (has
>>>>>> no associated struct device) assuming we can reuse its parent to create
>>>>>> device link results in circular dependencies.
>>>>> How about having the device links created depending on whether the
>>>>> main drm driver wants them or not - that would mean that Exynos
>>>>> could continue avoiding them, but others that want them can have
>>>>> the links?
>>>> That should be OK for Exynos. But regardless of Exynos device_links at
>>>> the current state will not work correctly with bridges/panels as I
>>>> described earlier.
>>> However, I'm not sure you're correct with your interpretation of the
>>> documentation.  Firstly, the documentation says:
>>>
>>>    Another example for an inconsistent state would be a device link that
>>>    represents a driver presence dependency, yet is added from the consumer's
>>>    ->probe callback while the supplier hasn't probed yet: Had the driver core
>>>    known about the device link earlier, it wouldn't have probed the consumer
>>>    in the first place. The onus is thus on the consumer to check presence of
>>>    the supplier after adding the link, and defer probing on non-presence.
>>>
>>> This is fine - if we add the device link from of_drm_find_bridge(), we
>>> will be in the consumer's ->probe function, and the supplier must have
>>> been probed for us to find the struct drm_bridge.
>>
>> Supplier usually is registered in it's probe time, so there is short
>> period of time when supplier is available, but the probe is not yet
>> finished - quite unsafe, but not impossible, especially if there exists
>> some kind of notifications about resource appearance (MIPI-DSI case).
> At some point during the supplier probe, the resource becomes available
> to consumers.  At that point, device links can be setup before the
> supplier has finished probing.  So any driver that provides resources
> to another driver will, at some point during the provider's probe,
> make resources available, and therefore be a candidate for device links
> to be created _before_ the probe function has returned.
>
> What is a problem is if the provider publishes resources, and then fails
> its probe function, causing the resource to disappear.


But creating link to not-probed provider is still incorrect usage from
device_links framework PoV, and my tests showed indeed that device link
created before end of provider's probe does not work at all - and since
it is stated in the documentation I guess it is by design.


>
> Taking DRM bridge, drm_bridge_add() returns void - it never fails.
> Most bridge drivers do drm_bridge_add() as the very last step before
> returning zero from their probe function.  There are a few exceptions.
>
> megachips-stdpxxxx-ge-b850v3-fw.c is already buggy - it calls
> devm_request_threaded_irq(), which if it fails, the bridge is left added
> to the global list of bridges.  Since this structure was allocated with
> devm_kzalloc(), it will be freed when the probe function fails.  So,
> any failure here (eg a deferred probe because the IRQ controller is not
> available) already creates a latent bug just waiting to bite.
>
> tc358764.c is another case where the bridge is published prior to
> initialisation completion, but it looks like that's under the control
> of the "host" (consumer) driver.
>
> mtk_hdmi.c enables clocks for audio after registering the bridge, which
> may fail, but are unlikely to.  However, moving them prior to
> drm_bridge_add() probably won't hurt and avoids the "published
> interfaces which then vanish" problem.
>
> Then we have exynos_drm_mic.c and tda998x_drv.c, but I think the
> latter's error path after drm_bridge_add() can be eliminated if we
> transitioned to device links instead of the component helper.
>
> Outside DRM, take regulators - at some point during a regulator
> supplier's probe function, the resource will be published, and as soon
> as it is, it's available for regulator_get() to find it and setup a
> device link before the probe function has finished.
>
> >From what I can tell, in the situation where a supplier makes resources
> available, the consumer binds to the resource, and then the supplier
> goes away, the device link will remain, and the consumer will not be
> unbound, which leads to an unexpected state.  The solution to this is
> the age old kernel rule of "don't publish your interfaces until you've
> completed initialisation".  IOW, publish at a point where you aren't
> going to fail.
>

What about complex devices which wants to publish multiple interfaces? I
can imagine in some cases it could be impossible to stick to this rule.


Regards

Andrzej



_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: Armada DRM: bridge with componentized devices
  2019-01-09  9:12                                 ` Andrzej Hajda
@ 2019-01-09  9:24                                   ` Rafael J. Wysocki
  2019-01-09  9:30                                     ` Russell King - ARM Linux
  0 siblings, 1 reply; 53+ messages in thread
From: Rafael J. Wysocki @ 2019-01-09  9:24 UTC (permalink / raw)
  To: Andrzej Hajda
  Cc: Rafael J. Wysocki, Liviu Dudau, Russell King - ARM Linux,
	DRI Development, Lubomir Rintel, Peter Rosin

On Wed, Jan 9, 2019 at 10:13 AM Andrzej Hajda <a.hajda@samsung.com> wrote:
>
> +CC: Rafael J. Wysocki <rafael@kernel.org>
>
> On 08.01.2019 16:07, Russell King - ARM Linux wrote:
> > On Tue, Jan 08, 2019 at 03:33:54PM +0100, Andrzej Hajda wrote:
> >> On 08.01.2019 14:21, Russell King - ARM Linux wrote:
> >>> On Tue, Jan 08, 2019 at 01:27:56PM +0100, Andrzej Hajda wrote:
> >>>> On 08.01.2019 12:38, Russell King - ARM Linux wrote:
> >>>>> On Tue, Jan 08, 2019 at 12:25:34PM +0100, Andrzej Hajda wrote:
> >>>>>> Issues with device links have nothing to do with hotplugging, they are
> >>>>>> generic - lifetime of the objects (drm_bridge, drm_panel) is just
> >>>>>> slightly different of lifetime of device links, and this is racy even if
> >>>>>> you do not want hotplugging. Moreover since drm_dev is not device (has
> >>>>>> no associated struct device) assuming we can reuse its parent to create
> >>>>>> device link results in circular dependencies.
> >>>>> How about having the device links created depending on whether the
> >>>>> main drm driver wants them or not - that would mean that Exynos
> >>>>> could continue avoiding them, but others that want them can have
> >>>>> the links?
> >>>> That should be OK for Exynos. But regardless of Exynos device_links at
> >>>> the current state will not work correctly with bridges/panels as I
> >>>> described earlier.
> >>> However, I'm not sure you're correct with your interpretation of the
> >>> documentation.  Firstly, the documentation says:
> >>>
> >>>    Another example for an inconsistent state would be a device link that
> >>>    represents a driver presence dependency, yet is added from the consumer's
> >>>    ->probe callback while the supplier hasn't probed yet: Had the driver core
> >>>    known about the device link earlier, it wouldn't have probed the consumer
> >>>    in the first place. The onus is thus on the consumer to check presence of
> >>>    the supplier after adding the link, and defer probing on non-presence.
> >>>
> >>> This is fine - if we add the device link from of_drm_find_bridge(), we
> >>> will be in the consumer's ->probe function, and the supplier must have
> >>> been probed for us to find the struct drm_bridge.
> >>
> >> Supplier usually is registered in it's probe time, so there is short
> >> period of time when supplier is available, but the probe is not yet
> >> finished - quite unsafe, but not impossible, especially if there exists
> >> some kind of notifications about resource appearance (MIPI-DSI case).
> > At some point during the supplier probe, the resource becomes available
> > to consumers.  At that point, device links can be setup before the
> > supplier has finished probing.  So any driver that provides resources
> > to another driver will, at some point during the provider's probe,
> > make resources available, and therefore be a candidate for device links
> > to be created _before_ the probe function has returned.
> >
> > What is a problem is if the provider publishes resources, and then fails
> > its probe function, causing the resource to disappear.
>
>
> But creating link to not-probed provider is still incorrect usage from
> device_links framework PoV, and my tests showed indeed that device link
> created before end of provider's probe does not work at all - and since
> it is stated in the documentation I guess it is by design.

Yes, it is.
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: Armada DRM: bridge with componentized devices
  2019-01-09  9:24                                   ` Rafael J. Wysocki
@ 2019-01-09  9:30                                     ` Russell King - ARM Linux
  2019-01-11 14:20                                       ` Daniel Vetter
  0 siblings, 1 reply; 53+ messages in thread
From: Russell King - ARM Linux @ 2019-01-09  9:30 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Lubomir Rintel, Liviu Dudau, DRI Development, Peter Rosin

On Wed, Jan 09, 2019 at 10:24:01AM +0100, Rafael J. Wysocki wrote:
> On Wed, Jan 9, 2019 at 10:13 AM Andrzej Hajda <a.hajda@samsung.com> wrote:
> >
> > +CC: Rafael J. Wysocki <rafael@kernel.org>
> >
> > On 08.01.2019 16:07, Russell King - ARM Linux wrote:
> > > On Tue, Jan 08, 2019 at 03:33:54PM +0100, Andrzej Hajda wrote:
> > >> On 08.01.2019 14:21, Russell King - ARM Linux wrote:
> > >>> On Tue, Jan 08, 2019 at 01:27:56PM +0100, Andrzej Hajda wrote:
> > >>>> On 08.01.2019 12:38, Russell King - ARM Linux wrote:
> > >>>>> On Tue, Jan 08, 2019 at 12:25:34PM +0100, Andrzej Hajda wrote:
> > >>>>>> Issues with device links have nothing to do with hotplugging, they are
> > >>>>>> generic - lifetime of the objects (drm_bridge, drm_panel) is just
> > >>>>>> slightly different of lifetime of device links, and this is racy even if
> > >>>>>> you do not want hotplugging. Moreover since drm_dev is not device (has
> > >>>>>> no associated struct device) assuming we can reuse its parent to create
> > >>>>>> device link results in circular dependencies.
> > >>>>> How about having the device links created depending on whether the
> > >>>>> main drm driver wants them or not - that would mean that Exynos
> > >>>>> could continue avoiding them, but others that want them can have
> > >>>>> the links?
> > >>>> That should be OK for Exynos. But regardless of Exynos device_links at
> > >>>> the current state will not work correctly with bridges/panels as I
> > >>>> described earlier.
> > >>> However, I'm not sure you're correct with your interpretation of the
> > >>> documentation.  Firstly, the documentation says:
> > >>>
> > >>>    Another example for an inconsistent state would be a device link that
> > >>>    represents a driver presence dependency, yet is added from the consumer's
> > >>>    ->probe callback while the supplier hasn't probed yet: Had the driver core
> > >>>    known about the device link earlier, it wouldn't have probed the consumer
> > >>>    in the first place. The onus is thus on the consumer to check presence of
> > >>>    the supplier after adding the link, and defer probing on non-presence.
> > >>>
> > >>> This is fine - if we add the device link from of_drm_find_bridge(), we
> > >>> will be in the consumer's ->probe function, and the supplier must have
> > >>> been probed for us to find the struct drm_bridge.
> > >>
> > >> Supplier usually is registered in it's probe time, so there is short
> > >> period of time when supplier is available, but the probe is not yet
> > >> finished - quite unsafe, but not impossible, especially if there exists
> > >> some kind of notifications about resource appearance (MIPI-DSI case).
> > > At some point during the supplier probe, the resource becomes available
> > > to consumers.  At that point, device links can be setup before the
> > > supplier has finished probing.  So any driver that provides resources
> > > to another driver will, at some point during the provider's probe,
> > > make resources available, and therefore be a candidate for device links
> > > to be created _before_ the probe function has returned.
> > >
> > > What is a problem is if the provider publishes resources, and then fails
> > > its probe function, causing the resource to disappear.
> >
> >
> > But creating link to not-probed provider is still incorrect usage from
> > device_links framework PoV, and my tests showed indeed that device link
> > created before end of provider's probe does not work at all - and since
> > it is stated in the documentation I guess it is by design.
> 
> Yes, it is.

So is the regulator support and the use of it being proposed for the CCF
all going against the design of device links?  In both those cases,
device links _can_ be created while the supplier is still in the probe
function by a consumer finding the resource.

This seems fragile by design.

-- 
RMK's Patch system: http://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line in suburbia: sync at 12.1Mbps down 622kbps up
According to speedtest.net: 11.9Mbps down 500kbps up
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: Armada DRM: bridge with componentized devices
  2019-01-09  9:30                                     ` Russell King - ARM Linux
@ 2019-01-11 14:20                                       ` Daniel Vetter
  2019-01-11 14:26                                         ` Rafael J. Wysocki
  0 siblings, 1 reply; 53+ messages in thread
From: Daniel Vetter @ 2019-01-11 14:20 UTC (permalink / raw)
  To: Russell King - ARM Linux
  Cc: Lubomir Rintel, Rafael J. Wysocki, Liviu Dudau, DRI Development,
	Peter Rosin

On Wed, Jan 9, 2019 at 10:31 AM Russell King - ARM Linux
<linux@armlinux.org.uk> wrote:
> On Wed, Jan 09, 2019 at 10:24:01AM +0100, Rafael J. Wysocki wrote:
> > On Wed, Jan 9, 2019 at 10:13 AM Andrzej Hajda <a.hajda@samsung.com> wrote:
> > >
> > > +CC: Rafael J. Wysocki <rafael@kernel.org>
> > >
> > > On 08.01.2019 16:07, Russell King - ARM Linux wrote:
> > > > On Tue, Jan 08, 2019 at 03:33:54PM +0100, Andrzej Hajda wrote:
> > > >> On 08.01.2019 14:21, Russell King - ARM Linux wrote:
> > > >>> On Tue, Jan 08, 2019 at 01:27:56PM +0100, Andrzej Hajda wrote:
> > > >>>> On 08.01.2019 12:38, Russell King - ARM Linux wrote:
> > > >>>>> On Tue, Jan 08, 2019 at 12:25:34PM +0100, Andrzej Hajda wrote:
> > > >>>>>> Issues with device links have nothing to do with hotplugging, they are
> > > >>>>>> generic - lifetime of the objects (drm_bridge, drm_panel) is just
> > > >>>>>> slightly different of lifetime of device links, and this is racy even if
> > > >>>>>> you do not want hotplugging. Moreover since drm_dev is not device (has
> > > >>>>>> no associated struct device) assuming we can reuse its parent to create
> > > >>>>>> device link results in circular dependencies.
> > > >>>>> How about having the device links created depending on whether the
> > > >>>>> main drm driver wants them or not - that would mean that Exynos
> > > >>>>> could continue avoiding them, but others that want them can have
> > > >>>>> the links?
> > > >>>> That should be OK for Exynos. But regardless of Exynos device_links at
> > > >>>> the current state will not work correctly with bridges/panels as I
> > > >>>> described earlier.
> > > >>> However, I'm not sure you're correct with your interpretation of the
> > > >>> documentation.  Firstly, the documentation says:
> > > >>>
> > > >>>    Another example for an inconsistent state would be a device link that
> > > >>>    represents a driver presence dependency, yet is added from the consumer's
> > > >>>    ->probe callback while the supplier hasn't probed yet: Had the driver core
> > > >>>    known about the device link earlier, it wouldn't have probed the consumer
> > > >>>    in the first place. The onus is thus on the consumer to check presence of
> > > >>>    the supplier after adding the link, and defer probing on non-presence.
> > > >>>
> > > >>> This is fine - if we add the device link from of_drm_find_bridge(), we
> > > >>> will be in the consumer's ->probe function, and the supplier must have
> > > >>> been probed for us to find the struct drm_bridge.
> > > >>
> > > >> Supplier usually is registered in it's probe time, so there is short
> > > >> period of time when supplier is available, but the probe is not yet
> > > >> finished - quite unsafe, but not impossible, especially if there exists
> > > >> some kind of notifications about resource appearance (MIPI-DSI case).
> > > > At some point during the supplier probe, the resource becomes available
> > > > to consumers.  At that point, device links can be setup before the
> > > > supplier has finished probing.  So any driver that provides resources
> > > > to another driver will, at some point during the provider's probe,
> > > > make resources available, and therefore be a candidate for device links
> > > > to be created _before_ the probe function has returned.
> > > >
> > > > What is a problem is if the provider publishes resources, and then fails
> > > > its probe function, causing the resource to disappear.
> > >
> > >
> > > But creating link to not-probed provider is still incorrect usage from
> > > device_links framework PoV, and my tests showed indeed that device link
> > > created before end of provider's probe does not work at all - and since
> > > it is stated in the documentation I guess it is by design.
> >
> > Yes, it is.
>
> So is the regulator support and the use of it being proposed for the CCF
> all going against the design of device links?  In both those cases,
> device links _can_ be created while the supplier is still in the probe
> function by a consumer finding the resource.
>
> This seems fragile by design.

Rafael, can you confirm?

This would mean device links aren't really useful for componentized
drivers (whether they use the component framework or something
subsystem specific like drm_bridge, regulator, ccf or whatever), which
seems to defeat the purpose of them. Or is there some other way to set
up device links, which doesn't hit these cases?

Note that there's some other issues with device links (discussed
earlier in this thread round reprobing), but that issue should be easy
to fix and not something fundamental like the above one.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: Armada DRM: bridge with componentized devices
  2019-01-11 14:20                                       ` Daniel Vetter
@ 2019-01-11 14:26                                         ` Rafael J. Wysocki
  2019-01-11 14:32                                           ` Russell King - ARM Linux
  0 siblings, 1 reply; 53+ messages in thread
From: Rafael J. Wysocki @ 2019-01-11 14:26 UTC (permalink / raw)
  To: Daniel Vetter
  Cc: Lubomir Rintel, Rafael J. Wysocki, Liviu Dudau,
	Russell King - ARM Linux, DRI Development, Peter Rosin

On Fri, Jan 11, 2019 at 3:20 PM Daniel Vetter <daniel@ffwll.ch> wrote:
>
> On Wed, Jan 9, 2019 at 10:31 AM Russell King - ARM Linux
> <linux@armlinux.org.uk> wrote:
> > On Wed, Jan 09, 2019 at 10:24:01AM +0100, Rafael J. Wysocki wrote:
> > > On Wed, Jan 9, 2019 at 10:13 AM Andrzej Hajda <a.hajda@samsung.com> wrote:
> > > >
> > > > +CC: Rafael J. Wysocki <rafael@kernel.org>
> > > >
> > > > On 08.01.2019 16:07, Russell King - ARM Linux wrote:
> > > > > On Tue, Jan 08, 2019 at 03:33:54PM +0100, Andrzej Hajda wrote:
> > > > >> On 08.01.2019 14:21, Russell King - ARM Linux wrote:
> > > > >>> On Tue, Jan 08, 2019 at 01:27:56PM +0100, Andrzej Hajda wrote:
> > > > >>>> On 08.01.2019 12:38, Russell King - ARM Linux wrote:
> > > > >>>>> On Tue, Jan 08, 2019 at 12:25:34PM +0100, Andrzej Hajda wrote:
> > > > >>>>>> Issues with device links have nothing to do with hotplugging, they are
> > > > >>>>>> generic - lifetime of the objects (drm_bridge, drm_panel) is just
> > > > >>>>>> slightly different of lifetime of device links, and this is racy even if
> > > > >>>>>> you do not want hotplugging. Moreover since drm_dev is not device (has
> > > > >>>>>> no associated struct device) assuming we can reuse its parent to create
> > > > >>>>>> device link results in circular dependencies.
> > > > >>>>> How about having the device links created depending on whether the
> > > > >>>>> main drm driver wants them or not - that would mean that Exynos
> > > > >>>>> could continue avoiding them, but others that want them can have
> > > > >>>>> the links?
> > > > >>>> That should be OK for Exynos. But regardless of Exynos device_links at
> > > > >>>> the current state will not work correctly with bridges/panels as I
> > > > >>>> described earlier.
> > > > >>> However, I'm not sure you're correct with your interpretation of the
> > > > >>> documentation.  Firstly, the documentation says:
> > > > >>>
> > > > >>>    Another example for an inconsistent state would be a device link that
> > > > >>>    represents a driver presence dependency, yet is added from the consumer's
> > > > >>>    ->probe callback while the supplier hasn't probed yet: Had the driver core
> > > > >>>    known about the device link earlier, it wouldn't have probed the consumer
> > > > >>>    in the first place. The onus is thus on the consumer to check presence of
> > > > >>>    the supplier after adding the link, and defer probing on non-presence.
> > > > >>>
> > > > >>> This is fine - if we add the device link from of_drm_find_bridge(), we
> > > > >>> will be in the consumer's ->probe function, and the supplier must have
> > > > >>> been probed for us to find the struct drm_bridge.
> > > > >>
> > > > >> Supplier usually is registered in it's probe time, so there is short
> > > > >> period of time when supplier is available, but the probe is not yet
> > > > >> finished - quite unsafe, but not impossible, especially if there exists
> > > > >> some kind of notifications about resource appearance (MIPI-DSI case).
> > > > > At some point during the supplier probe, the resource becomes available
> > > > > to consumers.  At that point, device links can be setup before the
> > > > > supplier has finished probing.  So any driver that provides resources
> > > > > to another driver will, at some point during the provider's probe,
> > > > > make resources available, and therefore be a candidate for device links
> > > > > to be created _before_ the probe function has returned.
> > > > >
> > > > > What is a problem is if the provider publishes resources, and then fails
> > > > > its probe function, causing the resource to disappear.
> > > >
> > > >
> > > > But creating link to not-probed provider is still incorrect usage from
> > > > device_links framework PoV, and my tests showed indeed that device link
> > > > created before end of provider's probe does not work at all - and since
> > > > it is stated in the documentation I guess it is by design.
> > >
> > > Yes, it is.
> >
> > So is the regulator support and the use of it being proposed for the CCF
> > all going against the design of device links?  In both those cases,
> > device links _can_ be created while the supplier is still in the probe
> > function by a consumer finding the resource.
> >
> > This seems fragile by design.
>
> Rafael, can you confirm?

Let me quote from
https://www.kernel.org/doc/html/latest/driver-api/device_link.html :

"When driver presence on the supplier is irrelevant and only correct
suspend/resume and shutdown ordering is needed, the device link may
simply be set up with the DL_FLAG_STATELESS flag. In other words,
enforcing driver presence on the supplier is optional."

Which is exactly the case at hand here AFAICS.
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: Armada DRM: bridge with componentized devices
  2019-01-11 14:26                                         ` Rafael J. Wysocki
@ 2019-01-11 14:32                                           ` Russell King - ARM Linux
  2019-01-11 14:36                                             ` Daniel Vetter
  2019-01-11 14:36                                             ` Rafael J. Wysocki
  0 siblings, 2 replies; 53+ messages in thread
From: Russell King - ARM Linux @ 2019-01-11 14:32 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Lubomir Rintel, Liviu Dudau, DRI Development, Peter Rosin

On Fri, Jan 11, 2019 at 03:26:44PM +0100, Rafael J. Wysocki wrote:
> On Fri, Jan 11, 2019 at 3:20 PM Daniel Vetter <daniel@ffwll.ch> wrote:
> >
> > On Wed, Jan 9, 2019 at 10:31 AM Russell King - ARM Linux
> > <linux@armlinux.org.uk> wrote:
> > > On Wed, Jan 09, 2019 at 10:24:01AM +0100, Rafael J. Wysocki wrote:
> > > > On Wed, Jan 9, 2019 at 10:13 AM Andrzej Hajda <a.hajda@samsung.com> wrote:
> > > > >
> > > > > +CC: Rafael J. Wysocki <rafael@kernel.org>
> > > > >
> > > > > On 08.01.2019 16:07, Russell King - ARM Linux wrote:
> > > > > > On Tue, Jan 08, 2019 at 03:33:54PM +0100, Andrzej Hajda wrote:
> > > > > >> On 08.01.2019 14:21, Russell King - ARM Linux wrote:
> > > > > >>> On Tue, Jan 08, 2019 at 01:27:56PM +0100, Andrzej Hajda wrote:
> > > > > >>>> On 08.01.2019 12:38, Russell King - ARM Linux wrote:
> > > > > >>>>> On Tue, Jan 08, 2019 at 12:25:34PM +0100, Andrzej Hajda wrote:
> > > > > >>>>>> Issues with device links have nothing to do with hotplugging, they are
> > > > > >>>>>> generic - lifetime of the objects (drm_bridge, drm_panel) is just
> > > > > >>>>>> slightly different of lifetime of device links, and this is racy even if
> > > > > >>>>>> you do not want hotplugging. Moreover since drm_dev is not device (has
> > > > > >>>>>> no associated struct device) assuming we can reuse its parent to create
> > > > > >>>>>> device link results in circular dependencies.
> > > > > >>>>> How about having the device links created depending on whether the
> > > > > >>>>> main drm driver wants them or not - that would mean that Exynos
> > > > > >>>>> could continue avoiding them, but others that want them can have
> > > > > >>>>> the links?
> > > > > >>>> That should be OK for Exynos. But regardless of Exynos device_links at
> > > > > >>>> the current state will not work correctly with bridges/panels as I
> > > > > >>>> described earlier.
> > > > > >>> However, I'm not sure you're correct with your interpretation of the
> > > > > >>> documentation.  Firstly, the documentation says:
> > > > > >>>
> > > > > >>>    Another example for an inconsistent state would be a device link that
> > > > > >>>    represents a driver presence dependency, yet is added from the consumer's
> > > > > >>>    ->probe callback while the supplier hasn't probed yet: Had the driver core
> > > > > >>>    known about the device link earlier, it wouldn't have probed the consumer
> > > > > >>>    in the first place. The onus is thus on the consumer to check presence of
> > > > > >>>    the supplier after adding the link, and defer probing on non-presence.
> > > > > >>>
> > > > > >>> This is fine - if we add the device link from of_drm_find_bridge(), we
> > > > > >>> will be in the consumer's ->probe function, and the supplier must have
> > > > > >>> been probed for us to find the struct drm_bridge.
> > > > > >>
> > > > > >> Supplier usually is registered in it's probe time, so there is short
> > > > > >> period of time when supplier is available, but the probe is not yet
> > > > > >> finished - quite unsafe, but not impossible, especially if there exists
> > > > > >> some kind of notifications about resource appearance (MIPI-DSI case).
> > > > > > At some point during the supplier probe, the resource becomes available
> > > > > > to consumers.  At that point, device links can be setup before the
> > > > > > supplier has finished probing.  So any driver that provides resources
> > > > > > to another driver will, at some point during the provider's probe,
> > > > > > make resources available, and therefore be a candidate for device links
> > > > > > to be created _before_ the probe function has returned.
> > > > > >
> > > > > > What is a problem is if the provider publishes resources, and then fails
> > > > > > its probe function, causing the resource to disappear.
> > > > >
> > > > >
> > > > > But creating link to not-probed provider is still incorrect usage from
> > > > > device_links framework PoV, and my tests showed indeed that device link
> > > > > created before end of provider's probe does not work at all - and since
> > > > > it is stated in the documentation I guess it is by design.
> > > >
> > > > Yes, it is.
> > >
> > > So is the regulator support and the use of it being proposed for the CCF
> > > all going against the design of device links?  In both those cases,
> > > device links _can_ be created while the supplier is still in the probe
> > > function by a consumer finding the resource.
> > >
> > > This seems fragile by design.
> >
> > Rafael, can you confirm?
> 
> Let me quote from
> https://www.kernel.org/doc/html/latest/driver-api/device_link.html :
> 
> "When driver presence on the supplier is irrelevant and only correct
> suspend/resume and shutdown ordering is needed, the device link may
> simply be set up with the DL_FLAG_STATELESS flag. In other words,
> enforcing driver presence on the supplier is optional."
> 
> Which is exactly the case at hand here AFAICS.

That is not what we're discussing.  We are discussing using device
links to solve the problem with DRM bridges which may be removed
today from DRM without the rest of the DRM system being aware.

-- 
RMK's Patch system: http://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line in suburbia: sync at 12.1Mbps down 622kbps up
According to speedtest.net: 11.9Mbps down 500kbps up
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: Armada DRM: bridge with componentized devices
  2019-01-11 14:32                                           ` Russell King - ARM Linux
@ 2019-01-11 14:36                                             ` Daniel Vetter
  2019-01-11 14:40                                               ` Rafael J. Wysocki
  2019-01-11 14:36                                             ` Rafael J. Wysocki
  1 sibling, 1 reply; 53+ messages in thread
From: Daniel Vetter @ 2019-01-11 14:36 UTC (permalink / raw)
  To: Russell King - ARM Linux
  Cc: Lubomir Rintel, Rafael J. Wysocki, Liviu Dudau, DRI Development,
	Peter Rosin

On Fri, Jan 11, 2019 at 3:32 PM Russell King - ARM Linux
<linux@armlinux.org.uk> wrote:
> On Fri, Jan 11, 2019 at 03:26:44PM +0100, Rafael J. Wysocki wrote:
> > On Fri, Jan 11, 2019 at 3:20 PM Daniel Vetter <daniel@ffwll.ch> wrote:
> > >
> > > On Wed, Jan 9, 2019 at 10:31 AM Russell King - ARM Linux
> > > <linux@armlinux.org.uk> wrote:
> > > > On Wed, Jan 09, 2019 at 10:24:01AM +0100, Rafael J. Wysocki wrote:
> > > > > On Wed, Jan 9, 2019 at 10:13 AM Andrzej Hajda <a.hajda@samsung.com> wrote:
> > > > > >
> > > > > > +CC: Rafael J. Wysocki <rafael@kernel.org>
> > > > > >
> > > > > > On 08.01.2019 16:07, Russell King - ARM Linux wrote:
> > > > > > > On Tue, Jan 08, 2019 at 03:33:54PM +0100, Andrzej Hajda wrote:
> > > > > > >> On 08.01.2019 14:21, Russell King - ARM Linux wrote:
> > > > > > >>> On Tue, Jan 08, 2019 at 01:27:56PM +0100, Andrzej Hajda wrote:
> > > > > > >>>> On 08.01.2019 12:38, Russell King - ARM Linux wrote:
> > > > > > >>>>> On Tue, Jan 08, 2019 at 12:25:34PM +0100, Andrzej Hajda wrote:
> > > > > > >>>>>> Issues with device links have nothing to do with hotplugging, they are
> > > > > > >>>>>> generic - lifetime of the objects (drm_bridge, drm_panel) is just
> > > > > > >>>>>> slightly different of lifetime of device links, and this is racy even if
> > > > > > >>>>>> you do not want hotplugging. Moreover since drm_dev is not device (has
> > > > > > >>>>>> no associated struct device) assuming we can reuse its parent to create
> > > > > > >>>>>> device link results in circular dependencies.
> > > > > > >>>>> How about having the device links created depending on whether the
> > > > > > >>>>> main drm driver wants them or not - that would mean that Exynos
> > > > > > >>>>> could continue avoiding them, but others that want them can have
> > > > > > >>>>> the links?
> > > > > > >>>> That should be OK for Exynos. But regardless of Exynos device_links at
> > > > > > >>>> the current state will not work correctly with bridges/panels as I
> > > > > > >>>> described earlier.
> > > > > > >>> However, I'm not sure you're correct with your interpretation of the
> > > > > > >>> documentation.  Firstly, the documentation says:
> > > > > > >>>
> > > > > > >>>    Another example for an inconsistent state would be a device link that
> > > > > > >>>    represents a driver presence dependency, yet is added from the consumer's
> > > > > > >>>    ->probe callback while the supplier hasn't probed yet: Had the driver core
> > > > > > >>>    known about the device link earlier, it wouldn't have probed the consumer
> > > > > > >>>    in the first place. The onus is thus on the consumer to check presence of
> > > > > > >>>    the supplier after adding the link, and defer probing on non-presence.
> > > > > > >>>
> > > > > > >>> This is fine - if we add the device link from of_drm_find_bridge(), we
> > > > > > >>> will be in the consumer's ->probe function, and the supplier must have
> > > > > > >>> been probed for us to find the struct drm_bridge.
> > > > > > >>
> > > > > > >> Supplier usually is registered in it's probe time, so there is short
> > > > > > >> period of time when supplier is available, but the probe is not yet
> > > > > > >> finished - quite unsafe, but not impossible, especially if there exists
> > > > > > >> some kind of notifications about resource appearance (MIPI-DSI case).
> > > > > > > At some point during the supplier probe, the resource becomes available
> > > > > > > to consumers.  At that point, device links can be setup before the
> > > > > > > supplier has finished probing.  So any driver that provides resources
> > > > > > > to another driver will, at some point during the provider's probe,
> > > > > > > make resources available, and therefore be a candidate for device links
> > > > > > > to be created _before_ the probe function has returned.
> > > > > > >
> > > > > > > What is a problem is if the provider publishes resources, and then fails
> > > > > > > its probe function, causing the resource to disappear.
> > > > > >
> > > > > >
> > > > > > But creating link to not-probed provider is still incorrect usage from
> > > > > > device_links framework PoV, and my tests showed indeed that device link
> > > > > > created before end of provider's probe does not work at all - and since
> > > > > > it is stated in the documentation I guess it is by design.
> > > > >
> > > > > Yes, it is.
> > > >
> > > > So is the regulator support and the use of it being proposed for the CCF
> > > > all going against the design of device links?  In both those cases,
> > > > device links _can_ be created while the supplier is still in the probe
> > > > function by a consumer finding the resource.
> > > >
> > > > This seems fragile by design.
> > >
> > > Rafael, can you confirm?
> >
> > Let me quote from
> > https://www.kernel.org/doc/html/latest/driver-api/device_link.html :
> >
> > "When driver presence on the supplier is irrelevant and only correct
> > suspend/resume and shutdown ordering is needed, the device link may
> > simply be set up with the DL_FLAG_STATELESS flag. In other words,
> > enforcing driver presence on the supplier is optional."
> >
> > Which is exactly the case at hand here AFAICS.
>
> That is not what we're discussing.  We are discussing using device
> links to solve the problem with DRM bridges which may be removed
> today from DRM without the rest of the DRM system being aware.

Yeah, we want device links not just for the suspend/resume ordering,
but also as full dependencies. Atm that's solved with component (for
the full generic case), but that needs more hand-rolled code.
device_links would be much easier to integrate into all these
subsystems (and you really want to unload the drm driver if
clocks/transcoders/whatever disappears). For most drivers there's
really no difference between the runtime dependencies for
suspend/resume and the load/unload dependencies. Having two use 2
completely different solutions for the common case where they match
exactly seems suboptimal.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: Armada DRM: bridge with componentized devices
  2019-01-11 14:32                                           ` Russell King - ARM Linux
  2019-01-11 14:36                                             ` Daniel Vetter
@ 2019-01-11 14:36                                             ` Rafael J. Wysocki
  2019-01-11 14:49                                               ` Russell King - ARM Linux
  1 sibling, 1 reply; 53+ messages in thread
From: Rafael J. Wysocki @ 2019-01-11 14:36 UTC (permalink / raw)
  To: Russell King - ARM Linux
  Cc: Lubomir Rintel, Rafael J. Wysocki, Liviu Dudau, DRI Development,
	Peter Rosin

On Fri, Jan 11, 2019 at 3:32 PM Russell King - ARM Linux
<linux@armlinux.org.uk> wrote:
>
> On Fri, Jan 11, 2019 at 03:26:44PM +0100, Rafael J. Wysocki wrote:
> > On Fri, Jan 11, 2019 at 3:20 PM Daniel Vetter <daniel@ffwll.ch> wrote:
> > >
> > > On Wed, Jan 9, 2019 at 10:31 AM Russell King - ARM Linux
> > > <linux@armlinux.org.uk> wrote:
> > > > On Wed, Jan 09, 2019 at 10:24:01AM +0100, Rafael J. Wysocki wrote:
> > > > > On Wed, Jan 9, 2019 at 10:13 AM Andrzej Hajda <a.hajda@samsung.com> wrote:
> > > > > >
> > > > > > +CC: Rafael J. Wysocki <rafael@kernel.org>
> > > > > >
> > > > > > On 08.01.2019 16:07, Russell King - ARM Linux wrote:
> > > > > > > On Tue, Jan 08, 2019 at 03:33:54PM +0100, Andrzej Hajda wrote:
> > > > > > >> On 08.01.2019 14:21, Russell King - ARM Linux wrote:
> > > > > > >>> On Tue, Jan 08, 2019 at 01:27:56PM +0100, Andrzej Hajda wrote:
> > > > > > >>>> On 08.01.2019 12:38, Russell King - ARM Linux wrote:
> > > > > > >>>>> On Tue, Jan 08, 2019 at 12:25:34PM +0100, Andrzej Hajda wrote:
> > > > > > >>>>>> Issues with device links have nothing to do with hotplugging, they are
> > > > > > >>>>>> generic - lifetime of the objects (drm_bridge, drm_panel) is just
> > > > > > >>>>>> slightly different of lifetime of device links, and this is racy even if
> > > > > > >>>>>> you do not want hotplugging. Moreover since drm_dev is not device (has
> > > > > > >>>>>> no associated struct device) assuming we can reuse its parent to create
> > > > > > >>>>>> device link results in circular dependencies.
> > > > > > >>>>> How about having the device links created depending on whether the
> > > > > > >>>>> main drm driver wants them or not - that would mean that Exynos
> > > > > > >>>>> could continue avoiding them, but others that want them can have
> > > > > > >>>>> the links?
> > > > > > >>>> That should be OK for Exynos. But regardless of Exynos device_links at
> > > > > > >>>> the current state will not work correctly with bridges/panels as I
> > > > > > >>>> described earlier.
> > > > > > >>> However, I'm not sure you're correct with your interpretation of the
> > > > > > >>> documentation.  Firstly, the documentation says:
> > > > > > >>>
> > > > > > >>>    Another example for an inconsistent state would be a device link that
> > > > > > >>>    represents a driver presence dependency, yet is added from the consumer's
> > > > > > >>>    ->probe callback while the supplier hasn't probed yet: Had the driver core
> > > > > > >>>    known about the device link earlier, it wouldn't have probed the consumer
> > > > > > >>>    in the first place. The onus is thus on the consumer to check presence of
> > > > > > >>>    the supplier after adding the link, and defer probing on non-presence.
> > > > > > >>>
> > > > > > >>> This is fine - if we add the device link from of_drm_find_bridge(), we
> > > > > > >>> will be in the consumer's ->probe function, and the supplier must have
> > > > > > >>> been probed for us to find the struct drm_bridge.
> > > > > > >>
> > > > > > >> Supplier usually is registered in it's probe time, so there is short
> > > > > > >> period of time when supplier is available, but the probe is not yet
> > > > > > >> finished - quite unsafe, but not impossible, especially if there exists
> > > > > > >> some kind of notifications about resource appearance (MIPI-DSI case).
> > > > > > > At some point during the supplier probe, the resource becomes available
> > > > > > > to consumers.  At that point, device links can be setup before the
> > > > > > > supplier has finished probing.  So any driver that provides resources
> > > > > > > to another driver will, at some point during the provider's probe,
> > > > > > > make resources available, and therefore be a candidate for device links
> > > > > > > to be created _before_ the probe function has returned.
> > > > > > >
> > > > > > > What is a problem is if the provider publishes resources, and then fails
> > > > > > > its probe function, causing the resource to disappear.
> > > > > >
> > > > > >
> > > > > > But creating link to not-probed provider is still incorrect usage from
> > > > > > device_links framework PoV, and my tests showed indeed that device link
> > > > > > created before end of provider's probe does not work at all - and since
> > > > > > it is stated in the documentation I guess it is by design.
> > > > >
> > > > > Yes, it is.
> > > >
> > > > So is the regulator support and the use of it being proposed for the CCF
> > > > all going against the design of device links?  In both those cases,
> > > > device links _can_ be created while the supplier is still in the probe
> > > > function by a consumer finding the resource.
> > > >
> > > > This seems fragile by design.
> > >
> > > Rafael, can you confirm?
> >
> > Let me quote from
> > https://www.kernel.org/doc/html/latest/driver-api/device_link.html :
> >
> > "When driver presence on the supplier is irrelevant and only correct
> > suspend/resume and shutdown ordering is needed, the device link may
> > simply be set up with the DL_FLAG_STATELESS flag. In other words,
> > enforcing driver presence on the supplier is optional."
> >
> > Which is exactly the case at hand here AFAICS.
>
> That is not what we're discussing.  We are discussing using device
> links to solve the problem with DRM bridges which may be removed
> today from DRM without the rest of the DRM system being aware.

Still, I think, the DL_FLAG_STATELESS flag should work for you as it
turns the state tracking off entirely.
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: Armada DRM: bridge with componentized devices
  2019-01-11 14:36                                             ` Daniel Vetter
@ 2019-01-11 14:40                                               ` Rafael J. Wysocki
  0 siblings, 0 replies; 53+ messages in thread
From: Rafael J. Wysocki @ 2019-01-11 14:40 UTC (permalink / raw)
  To: Daniel Vetter
  Cc: Lubomir Rintel, Rafael J. Wysocki, Liviu Dudau,
	Russell King - ARM Linux, DRI Development, Peter Rosin

On Fri, Jan 11, 2019 at 3:36 PM Daniel Vetter <daniel@ffwll.ch> wrote:
>
> On Fri, Jan 11, 2019 at 3:32 PM Russell King - ARM Linux
> <linux@armlinux.org.uk> wrote:
> > On Fri, Jan 11, 2019 at 03:26:44PM +0100, Rafael J. Wysocki wrote:
> > > On Fri, Jan 11, 2019 at 3:20 PM Daniel Vetter <daniel@ffwll.ch> wrote:
> > > >
> > > > On Wed, Jan 9, 2019 at 10:31 AM Russell King - ARM Linux
> > > > <linux@armlinux.org.uk> wrote:
> > > > > On Wed, Jan 09, 2019 at 10:24:01AM +0100, Rafael J. Wysocki wrote:
> > > > > > On Wed, Jan 9, 2019 at 10:13 AM Andrzej Hajda <a.hajda@samsung.com> wrote:
> > > > > > >
> > > > > > > +CC: Rafael J. Wysocki <rafael@kernel.org>
> > > > > > >
> > > > > > > On 08.01.2019 16:07, Russell King - ARM Linux wrote:
> > > > > > > > On Tue, Jan 08, 2019 at 03:33:54PM +0100, Andrzej Hajda wrote:
> > > > > > > >> On 08.01.2019 14:21, Russell King - ARM Linux wrote:
> > > > > > > >>> On Tue, Jan 08, 2019 at 01:27:56PM +0100, Andrzej Hajda wrote:
> > > > > > > >>>> On 08.01.2019 12:38, Russell King - ARM Linux wrote:
> > > > > > > >>>>> On Tue, Jan 08, 2019 at 12:25:34PM +0100, Andrzej Hajda wrote:
> > > > > > > >>>>>> Issues with device links have nothing to do with hotplugging, they are
> > > > > > > >>>>>> generic - lifetime of the objects (drm_bridge, drm_panel) is just
> > > > > > > >>>>>> slightly different of lifetime of device links, and this is racy even if
> > > > > > > >>>>>> you do not want hotplugging. Moreover since drm_dev is not device (has
> > > > > > > >>>>>> no associated struct device) assuming we can reuse its parent to create
> > > > > > > >>>>>> device link results in circular dependencies.
> > > > > > > >>>>> How about having the device links created depending on whether the
> > > > > > > >>>>> main drm driver wants them or not - that would mean that Exynos
> > > > > > > >>>>> could continue avoiding them, but others that want them can have
> > > > > > > >>>>> the links?
> > > > > > > >>>> That should be OK for Exynos. But regardless of Exynos device_links at
> > > > > > > >>>> the current state will not work correctly with bridges/panels as I
> > > > > > > >>>> described earlier.
> > > > > > > >>> However, I'm not sure you're correct with your interpretation of the
> > > > > > > >>> documentation.  Firstly, the documentation says:
> > > > > > > >>>
> > > > > > > >>>    Another example for an inconsistent state would be a device link that
> > > > > > > >>>    represents a driver presence dependency, yet is added from the consumer's
> > > > > > > >>>    ->probe callback while the supplier hasn't probed yet: Had the driver core
> > > > > > > >>>    known about the device link earlier, it wouldn't have probed the consumer
> > > > > > > >>>    in the first place. The onus is thus on the consumer to check presence of
> > > > > > > >>>    the supplier after adding the link, and defer probing on non-presence.
> > > > > > > >>>
> > > > > > > >>> This is fine - if we add the device link from of_drm_find_bridge(), we
> > > > > > > >>> will be in the consumer's ->probe function, and the supplier must have
> > > > > > > >>> been probed for us to find the struct drm_bridge.
> > > > > > > >>
> > > > > > > >> Supplier usually is registered in it's probe time, so there is short
> > > > > > > >> period of time when supplier is available, but the probe is not yet
> > > > > > > >> finished - quite unsafe, but not impossible, especially if there exists
> > > > > > > >> some kind of notifications about resource appearance (MIPI-DSI case).
> > > > > > > > At some point during the supplier probe, the resource becomes available
> > > > > > > > to consumers.  At that point, device links can be setup before the
> > > > > > > > supplier has finished probing.  So any driver that provides resources
> > > > > > > > to another driver will, at some point during the provider's probe,
> > > > > > > > make resources available, and therefore be a candidate for device links
> > > > > > > > to be created _before_ the probe function has returned.
> > > > > > > >
> > > > > > > > What is a problem is if the provider publishes resources, and then fails
> > > > > > > > its probe function, causing the resource to disappear.
> > > > > > >
> > > > > > >
> > > > > > > But creating link to not-probed provider is still incorrect usage from
> > > > > > > device_links framework PoV, and my tests showed indeed that device link
> > > > > > > created before end of provider's probe does not work at all - and since
> > > > > > > it is stated in the documentation I guess it is by design.
> > > > > >
> > > > > > Yes, it is.
> > > > >
> > > > > So is the regulator support and the use of it being proposed for the CCF
> > > > > all going against the design of device links?  In both those cases,
> > > > > device links _can_ be created while the supplier is still in the probe
> > > > > function by a consumer finding the resource.
> > > > >
> > > > > This seems fragile by design.
> > > >
> > > > Rafael, can you confirm?
> > >
> > > Let me quote from
> > > https://www.kernel.org/doc/html/latest/driver-api/device_link.html :
> > >
> > > "When driver presence on the supplier is irrelevant and only correct
> > > suspend/resume and shutdown ordering is needed, the device link may
> > > simply be set up with the DL_FLAG_STATELESS flag. In other words,
> > > enforcing driver presence on the supplier is optional."
> > >
> > > Which is exactly the case at hand here AFAICS.
> >
> > That is not what we're discussing.  We are discussing using device
> > links to solve the problem with DRM bridges which may be removed
> > today from DRM without the rest of the DRM system being aware.
>
> Yeah, we want device links not just for the suspend/resume ordering,
> but also as full dependencies. Atm that's solved with component (for
> the full generic case), but that needs more hand-rolled code.
> device_links would be much easier to integrate into all these
> subsystems (and you really want to unload the drm driver if
> clocks/transcoders/whatever disappears). For most drivers there's
> really no difference between the runtime dependencies for
> suspend/resume and the load/unload dependencies. Having two use 2
> completely different solutions for the common case where they match
> exactly seems suboptimal.

Well, the state tracking in device links needs the initial state to be
supplier driver present and functional.  If you need full dependencies
without that, device links are probably not for you.
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: Armada DRM: bridge with componentized devices
  2019-01-11 14:36                                             ` Rafael J. Wysocki
@ 2019-01-11 14:49                                               ` Russell King - ARM Linux
  2019-01-14 12:32                                                 ` Rafael J. Wysocki
  0 siblings, 1 reply; 53+ messages in thread
From: Russell King - ARM Linux @ 2019-01-11 14:49 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Lubomir Rintel, Liviu Dudau, DRI Development, Peter Rosin

On Fri, Jan 11, 2019 at 03:36:28PM +0100, Rafael J. Wysocki wrote:
> On Fri, Jan 11, 2019 at 3:32 PM Russell King - ARM Linux
> <linux@armlinux.org.uk> wrote:
> >
> > On Fri, Jan 11, 2019 at 03:26:44PM +0100, Rafael J. Wysocki wrote:
> > > On Fri, Jan 11, 2019 at 3:20 PM Daniel Vetter <daniel@ffwll.ch> wrote:
> > > >
> > > > On Wed, Jan 9, 2019 at 10:31 AM Russell King - ARM Linux
> > > > <linux@armlinux.org.uk> wrote:
> > > > > On Wed, Jan 09, 2019 at 10:24:01AM +0100, Rafael J. Wysocki wrote:
> > > > > > On Wed, Jan 9, 2019 at 10:13 AM Andrzej Hajda <a.hajda@samsung.com> wrote:
> > > > > > >
> > > > > > > +CC: Rafael J. Wysocki <rafael@kernel.org>
> > > > > > >
> > > > > > > On 08.01.2019 16:07, Russell King - ARM Linux wrote:
> > > > > > > > On Tue, Jan 08, 2019 at 03:33:54PM +0100, Andrzej Hajda wrote:
> > > > > > > >> On 08.01.2019 14:21, Russell King - ARM Linux wrote:
> > > > > > > >>> On Tue, Jan 08, 2019 at 01:27:56PM +0100, Andrzej Hajda wrote:
> > > > > > > >>>> On 08.01.2019 12:38, Russell King - ARM Linux wrote:
> > > > > > > >>>>> On Tue, Jan 08, 2019 at 12:25:34PM +0100, Andrzej Hajda wrote:
> > > > > > > >>>>>> Issues with device links have nothing to do with hotplugging, they are
> > > > > > > >>>>>> generic - lifetime of the objects (drm_bridge, drm_panel) is just
> > > > > > > >>>>>> slightly different of lifetime of device links, and this is racy even if
> > > > > > > >>>>>> you do not want hotplugging. Moreover since drm_dev is not device (has
> > > > > > > >>>>>> no associated struct device) assuming we can reuse its parent to create
> > > > > > > >>>>>> device link results in circular dependencies.
> > > > > > > >>>>> How about having the device links created depending on whether the
> > > > > > > >>>>> main drm driver wants them or not - that would mean that Exynos
> > > > > > > >>>>> could continue avoiding them, but others that want them can have
> > > > > > > >>>>> the links?
> > > > > > > >>>> That should be OK for Exynos. But regardless of Exynos device_links at
> > > > > > > >>>> the current state will not work correctly with bridges/panels as I
> > > > > > > >>>> described earlier.
> > > > > > > >>> However, I'm not sure you're correct with your interpretation of the
> > > > > > > >>> documentation.  Firstly, the documentation says:
> > > > > > > >>>
> > > > > > > >>>    Another example for an inconsistent state would be a device link that
> > > > > > > >>>    represents a driver presence dependency, yet is added from the consumer's
> > > > > > > >>>    ->probe callback while the supplier hasn't probed yet: Had the driver core
> > > > > > > >>>    known about the device link earlier, it wouldn't have probed the consumer
> > > > > > > >>>    in the first place. The onus is thus on the consumer to check presence of
> > > > > > > >>>    the supplier after adding the link, and defer probing on non-presence.
> > > > > > > >>>
> > > > > > > >>> This is fine - if we add the device link from of_drm_find_bridge(), we
> > > > > > > >>> will be in the consumer's ->probe function, and the supplier must have
> > > > > > > >>> been probed for us to find the struct drm_bridge.
> > > > > > > >>
> > > > > > > >> Supplier usually is registered in it's probe time, so there is short
> > > > > > > >> period of time when supplier is available, but the probe is not yet
> > > > > > > >> finished - quite unsafe, but not impossible, especially if there exists
> > > > > > > >> some kind of notifications about resource appearance (MIPI-DSI case).
> > > > > > > > At some point during the supplier probe, the resource becomes available
> > > > > > > > to consumers.  At that point, device links can be setup before the
> > > > > > > > supplier has finished probing.  So any driver that provides resources
> > > > > > > > to another driver will, at some point during the provider's probe,
> > > > > > > > make resources available, and therefore be a candidate for device links
> > > > > > > > to be created _before_ the probe function has returned.
> > > > > > > >
> > > > > > > > What is a problem is if the provider publishes resources, and then fails
> > > > > > > > its probe function, causing the resource to disappear.
> > > > > > >
> > > > > > >
> > > > > > > But creating link to not-probed provider is still incorrect usage from
> > > > > > > device_links framework PoV, and my tests showed indeed that device link
> > > > > > > created before end of provider's probe does not work at all - and since
> > > > > > > it is stated in the documentation I guess it is by design.
> > > > > >
> > > > > > Yes, it is.
> > > > >
> > > > > So is the regulator support and the use of it being proposed for the CCF
> > > > > all going against the design of device links?  In both those cases,
> > > > > device links _can_ be created while the supplier is still in the probe
> > > > > function by a consumer finding the resource.
> > > > >
> > > > > This seems fragile by design.
> > > >
> > > > Rafael, can you confirm?
> > >
> > > Let me quote from
> > > https://www.kernel.org/doc/html/latest/driver-api/device_link.html :
> > >
> > > "When driver presence on the supplier is irrelevant and only correct
> > > suspend/resume and shutdown ordering is needed, the device link may
> > > simply be set up with the DL_FLAG_STATELESS flag. In other words,
> > > enforcing driver presence on the supplier is optional."
> > >
> > > Which is exactly the case at hand here AFAICS.
> >
> > That is not what we're discussing.  We are discussing using device
> > links to solve the problem with DRM bridges which may be removed
> > today from DRM without the rest of the DRM system being aware.
> 
> Still, I think, the DL_FLAG_STATELESS flag should work for you as it
> turns the state tracking off entirely.

Let me try again.

This thread is discussing how to deal with Armada DRM, its use of the
component helper, TDA998x's hybrid use of the component helper and
DRM bridges.

Currently, DRM bridges register into the DRM system and are added to
a global list of bridges.  When a DRM driver binds, it looks up the
DRM bridge and attaches to it.  There is no way in generic DRM code
for the DRM driver to know if the DRM bridge is unbound from DRM,
consequently a DRM driver may continue trying to call functions in
the DRM bridge driver using memory that has already been freed.

We had similar issues with imx-drm, and a number of years ago at a
kernel summit, this was discussed, and I proposed a system which is
now known as the component helper.  This handles the problem of a
multi-component system.

However, DRM bridge was already established, and there appears to be
no desire to convert DRM bridges and DRM drivers to use the component
helper.

We are presently in the situation where Armada DRM is incompatible
with the DRM bridges way of doing things, and making it compatible
essentially means introducing potential use-after-free bugs into the
code.

Device links in their stateful form appear to provide an alternative
to the component helper, in that a stateful device link will remove
consumers of a resource when the supplier is going away - which is
exactly the problem which the component helper is solving.  The
difference is that device links look like being a cleaner solution.

Just like the component helper, a stateful link would unbind the
consumer of a resource when the supplier goes away - which is exactly
the behaviour we're wanting.

The problem is that the connection between various drivers is only
really known when the drivers obtain their resources, and the
following can happen:

supplier		consumer

probe()
 alloc
			probe()
 publish
			obtain supplier's resource
 return

Where things go wrong is if a stateful link is created when the
consumer obtains the suppliers resource before the supplier has
finished probing - which from what's been said is illegal.

It seems that stateful device links can only be created by bus layer
code, which limits their usefulness - in the case we have here, DT
doesn't always know ahead of time about these links.

It sounds from what you're saying that you don't want to entertain
any changes to device links - in which case, DRM can't use them to
solve this problem, even though they would be an elegant solution.
So, I think we're going to have to find a way to retrofit component
stuff into DRM in a way that makes it optional for any DRM bridge
consumer.

I hope this makes the discussion and what we're trying to achieve
a little clearer.

-- 
RMK's Patch system: http://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line in suburbia: sync at 12.1Mbps down 622kbps up
According to speedtest.net: 11.9Mbps down 500kbps up
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: Armada DRM: bridge with componentized devices
  2019-01-11 14:49                                               ` Russell King - ARM Linux
@ 2019-01-14 12:32                                                 ` Rafael J. Wysocki
  2019-01-15  0:04                                                   ` Rafael J. Wysocki
  0 siblings, 1 reply; 53+ messages in thread
From: Rafael J. Wysocki @ 2019-01-14 12:32 UTC (permalink / raw)
  To: Russell King - ARM Linux
  Cc: Lubomir Rintel, Rafael J. Wysocki, Liviu Dudau, DRI Development,
	Peter Rosin

On Fri, Jan 11, 2019 at 3:49 PM Russell King - ARM Linux
<linux@armlinux.org.uk> wrote:
>
> On Fri, Jan 11, 2019 at 03:36:28PM +0100, Rafael J. Wysocki wrote:
> > On Fri, Jan 11, 2019 at 3:32 PM Russell King - ARM Linux
> > <linux@armlinux.org.uk> wrote:
> > >
> > > On Fri, Jan 11, 2019 at 03:26:44PM +0100, Rafael J. Wysocki wrote:
> > > > On Fri, Jan 11, 2019 at 3:20 PM Daniel Vetter <daniel@ffwll.ch> wrote:
> > > > >
> > > > > On Wed, Jan 9, 2019 at 10:31 AM Russell King - ARM Linux
> > > > > <linux@armlinux.org.uk> wrote:
> > > > > > On Wed, Jan 09, 2019 at 10:24:01AM +0100, Rafael J. Wysocki wrote:
> > > > > > > On Wed, Jan 9, 2019 at 10:13 AM Andrzej Hajda <a.hajda@samsung.com> wrote:
> > > > > > > >
> > > > > > > > +CC: Rafael J. Wysocki <rafael@kernel.org>
> > > > > > > >
> > > > > > > > On 08.01.2019 16:07, Russell King - ARM Linux wrote:
> > > > > > > > > On Tue, Jan 08, 2019 at 03:33:54PM +0100, Andrzej Hajda wrote:
> > > > > > > > >> On 08.01.2019 14:21, Russell King - ARM Linux wrote:
> > > > > > > > >>> On Tue, Jan 08, 2019 at 01:27:56PM +0100, Andrzej Hajda wrote:
> > > > > > > > >>>> On 08.01.2019 12:38, Russell King - ARM Linux wrote:
> > > > > > > > >>>>> On Tue, Jan 08, 2019 at 12:25:34PM +0100, Andrzej Hajda wrote:
> > > > > > > > >>>>>> Issues with device links have nothing to do with hotplugging, they are
> > > > > > > > >>>>>> generic - lifetime of the objects (drm_bridge, drm_panel) is just
> > > > > > > > >>>>>> slightly different of lifetime of device links, and this is racy even if
> > > > > > > > >>>>>> you do not want hotplugging. Moreover since drm_dev is not device (has
> > > > > > > > >>>>>> no associated struct device) assuming we can reuse its parent to create
> > > > > > > > >>>>>> device link results in circular dependencies.
> > > > > > > > >>>>> How about having the device links created depending on whether the
> > > > > > > > >>>>> main drm driver wants them or not - that would mean that Exynos
> > > > > > > > >>>>> could continue avoiding them, but others that want them can have
> > > > > > > > >>>>> the links?
> > > > > > > > >>>> That should be OK for Exynos. But regardless of Exynos device_links at
> > > > > > > > >>>> the current state will not work correctly with bridges/panels as I
> > > > > > > > >>>> described earlier.
> > > > > > > > >>> However, I'm not sure you're correct with your interpretation of the
> > > > > > > > >>> documentation.  Firstly, the documentation says:
> > > > > > > > >>>
> > > > > > > > >>>    Another example for an inconsistent state would be a device link that
> > > > > > > > >>>    represents a driver presence dependency, yet is added from the consumer's
> > > > > > > > >>>    ->probe callback while the supplier hasn't probed yet: Had the driver core
> > > > > > > > >>>    known about the device link earlier, it wouldn't have probed the consumer
> > > > > > > > >>>    in the first place. The onus is thus on the consumer to check presence of
> > > > > > > > >>>    the supplier after adding the link, and defer probing on non-presence.
> > > > > > > > >>>
> > > > > > > > >>> This is fine - if we add the device link from of_drm_find_bridge(), we
> > > > > > > > >>> will be in the consumer's ->probe function, and the supplier must have
> > > > > > > > >>> been probed for us to find the struct drm_bridge.
> > > > > > > > >>
> > > > > > > > >> Supplier usually is registered in it's probe time, so there is short
> > > > > > > > >> period of time when supplier is available, but the probe is not yet
> > > > > > > > >> finished - quite unsafe, but not impossible, especially if there exists
> > > > > > > > >> some kind of notifications about resource appearance (MIPI-DSI case).
> > > > > > > > > At some point during the supplier probe, the resource becomes available
> > > > > > > > > to consumers.  At that point, device links can be setup before the
> > > > > > > > > supplier has finished probing.  So any driver that provides resources
> > > > > > > > > to another driver will, at some point during the provider's probe,
> > > > > > > > > make resources available, and therefore be a candidate for device links
> > > > > > > > > to be created _before_ the probe function has returned.
> > > > > > > > >
> > > > > > > > > What is a problem is if the provider publishes resources, and then fails
> > > > > > > > > its probe function, causing the resource to disappear.
> > > > > > > >
> > > > > > > >
> > > > > > > > But creating link to not-probed provider is still incorrect usage from
> > > > > > > > device_links framework PoV, and my tests showed indeed that device link
> > > > > > > > created before end of provider's probe does not work at all - and since
> > > > > > > > it is stated in the documentation I guess it is by design.
> > > > > > >
> > > > > > > Yes, it is.
> > > > > >
> > > > > > So is the regulator support and the use of it being proposed for the CCF
> > > > > > all going against the design of device links?  In both those cases,
> > > > > > device links _can_ be created while the supplier is still in the probe
> > > > > > function by a consumer finding the resource.
> > > > > >
> > > > > > This seems fragile by design.
> > > > >
> > > > > Rafael, can you confirm?
> > > >
> > > > Let me quote from
> > > > https://www.kernel.org/doc/html/latest/driver-api/device_link.html :
> > > >
> > > > "When driver presence on the supplier is irrelevant and only correct
> > > > suspend/resume and shutdown ordering is needed, the device link may
> > > > simply be set up with the DL_FLAG_STATELESS flag. In other words,
> > > > enforcing driver presence on the supplier is optional."
> > > >
> > > > Which is exactly the case at hand here AFAICS.
> > >
> > > That is not what we're discussing.  We are discussing using device
> > > links to solve the problem with DRM bridges which may be removed
> > > today from DRM without the rest of the DRM system being aware.
> >
> > Still, I think, the DL_FLAG_STATELESS flag should work for you as it
> > turns the state tracking off entirely.
>
> Let me try again.
>
> This thread is discussing how to deal with Armada DRM, its use of the
> component helper, TDA998x's hybrid use of the component helper and
> DRM bridges.
>
> Currently, DRM bridges register into the DRM system and are added to
> a global list of bridges.  When a DRM driver binds, it looks up the
> DRM bridge and attaches to it.  There is no way in generic DRM code
> for the DRM driver to know if the DRM bridge is unbound from DRM,
> consequently a DRM driver may continue trying to call functions in
> the DRM bridge driver using memory that has already been freed.
>
> We had similar issues with imx-drm, and a number of years ago at a
> kernel summit, this was discussed, and I proposed a system which is
> now known as the component helper.  This handles the problem of a
> multi-component system.
>
> However, DRM bridge was already established, and there appears to be
> no desire to convert DRM bridges and DRM drivers to use the component
> helper.
>
> We are presently in the situation where Armada DRM is incompatible
> with the DRM bridges way of doing things, and making it compatible
> essentially means introducing potential use-after-free bugs into the
> code.
>
> Device links in their stateful form appear to provide an alternative
> to the component helper, in that a stateful device link will remove
> consumers of a resource when the supplier is going away - which is
> exactly the problem which the component helper is solving.  The
> difference is that device links look like being a cleaner solution.
>
> Just like the component helper, a stateful link would unbind the
> consumer of a resource when the supplier goes away - which is exactly
> the behaviour we're wanting.
>
> The problem is that the connection between various drivers is only
> really known when the drivers obtain their resources, and the
> following can happen:
>
> supplier                consumer
>
> probe()
>  alloc
>                         probe()
>  publish
>                         obtain supplier's resource
>  return
>
> Where things go wrong is if a stateful link is created when the
> consumer obtains the suppliers resource before the supplier has
> finished probing - which from what's been said is illegal.

It just doesn't work (which means "invalid" rather than "illegal" I
guess, but whatever :-)).

Admittedly, the original design didn't take this particular case into
account and I'm not actually sure how it can be taken into account
either.

If the link is created by the consumer in the scenario above, its
status will be updated twice in a row after the consumer probe returns
and after the supplier probe returns.  It looks like this update would
need to work regardless of the order it which this happened which
sounds somewhat challenging.  I would need to think about that.

Moreover, there seems to be is an additional complication here, which
is that the supplier probe may finish and the status update for its
links may run before the extra link is actually created by the
consumer probe.  Or is there any reason why that cannot happen in this
particular case?

> It seems that stateful device links can only be created by bus layer
> code, which limits their usefulness - in the case we have here, DT
> doesn't always know ahead of time about these links.

That's fair enough.

> It sounds from what you're saying that you don't want to entertain
> any changes to device links

Why not?

If they can be made work for this use case, they would become more
useful in general I suppose, but I'm just not sure how to do that ATM.

> - in which case, DRM can't use them to
> solve this problem, even though they would be an elegant solution.
> So, I think we're going to have to find a way to retrofit component
> stuff into DRM in a way that makes it optional for any DRM bridge
> consumer.
>
> I hope this makes the discussion and what we're trying to achieve
> a little clearer.

Yes, it does, thank you!

Cheers,
Rafael
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: Armada DRM: bridge with componentized devices
  2019-01-14 12:32                                                 ` Rafael J. Wysocki
@ 2019-01-15  0:04                                                   ` Rafael J. Wysocki
  2019-01-15 22:47                                                     ` Rafael J. Wysocki
  0 siblings, 1 reply; 53+ messages in thread
From: Rafael J. Wysocki @ 2019-01-15  0:04 UTC (permalink / raw)
  To: Russell King - ARM Linux
  Cc: Lubomir Rintel, Linux PM, Liviu Dudau, DRI Development, Peter Rosin

[-- Attachment #1: Type: text/plain, Size: 3707 bytes --]

[CC Lukas and linux-pm]

On Mon, Jan 14, 2019 at 1:32 PM Rafael J. Wysocki <rafael@kernel.org> wrote:
>
> On Fri, Jan 11, 2019 at 3:49 PM Russell King - ARM Linux
> <linux@armlinux.org.uk> wrote:

[cut]

> >
> > This thread is discussing how to deal with Armada DRM, its use of the
> > component helper, TDA998x's hybrid use of the component helper and
> > DRM bridges.
> >
> > Currently, DRM bridges register into the DRM system and are added to
> > a global list of bridges.  When a DRM driver binds, it looks up the
> > DRM bridge and attaches to it.  There is no way in generic DRM code
> > for the DRM driver to know if the DRM bridge is unbound from DRM,
> > consequently a DRM driver may continue trying to call functions in
> > the DRM bridge driver using memory that has already been freed.
> >
> > We had similar issues with imx-drm, and a number of years ago at a
> > kernel summit, this was discussed, and I proposed a system which is
> > now known as the component helper.  This handles the problem of a
> > multi-component system.
> >
> > However, DRM bridge was already established, and there appears to be
> > no desire to convert DRM bridges and DRM drivers to use the component
> > helper.
> >
> > We are presently in the situation where Armada DRM is incompatible
> > with the DRM bridges way of doing things, and making it compatible
> > essentially means introducing potential use-after-free bugs into the
> > code.
> >
> > Device links in their stateful form appear to provide an alternative
> > to the component helper, in that a stateful device link will remove
> > consumers of a resource when the supplier is going away - which is
> > exactly the problem which the component helper is solving.  The
> > difference is that device links look like being a cleaner solution.
> >
> > Just like the component helper, a stateful link would unbind the
> > consumer of a resource when the supplier goes away - which is exactly
> > the behaviour we're wanting.
> >
> > The problem is that the connection between various drivers is only
> > really known when the drivers obtain their resources, and the
> > following can happen:
> >
> > supplier                consumer
> >
> > probe()
> >  alloc
> >                         probe()
> >  publish
> >                         obtain supplier's resource
> >  return
> >
> > Where things go wrong is if a stateful link is created when the
> > consumer obtains the suppliers resource before the supplier has
> > finished probing - which from what's been said is illegal.
>
> It just doesn't work (which means "invalid" rather than "illegal" I
> guess, but whatever :-)).
>
> Admittedly, the original design didn't take this particular case into
> account and I'm not actually sure how it can be taken into account
> either.
>
> If the link is created by the consumer in the scenario above, its
> status will be updated twice in a row after the consumer probe returns
> and after the supplier probe returns.  It looks like this update would
> need to work regardless of the order it which this happened which
> sounds somewhat challenging.  I would need to think about that.

So if I'm not mistaken it can be made work with the help of the
(completely untested) attached patch (of course, the documentation
would need to be updated too).

The key observation here is that it should be fine to create a link
from the consumer driver's probe while the supplier is still probing
if the consumer has some way to find out that the supplier is
functional at that point (like when it has published itself already in
your example above).  The initial state of the link can be "consumer
probe" in that case and I don't see a reason why that might not work.

[-- Attachment #2: device-links-extension.patch --]
[-- Type: text/x-patch, Size: 1366 bytes --]

---
 drivers/base/core.c |   19 +++++++++++++++++++
 1 file changed, 19 insertions(+)

Index: linux-pm/drivers/base/core.c
===================================================================
--- linux-pm.orig/drivers/base/core.c
+++ linux-pm/drivers/base/core.c
@@ -260,6 +260,17 @@ struct device_link *device_link_add(stru
 		link->status = DL_STATE_NONE;
 	} else {
 		switch (supplier->links.status) {
+		case DL_DEV_PROBING:
+			/*
+			 * A consumer driver can create a link to a supplier
+			 * that also is probing as long as it knows that the
+			 * supplier is already functional (for example, it has
+			 * just acquired some resources from the supplier).
+			 */
+			link->status = consumer->links.status == DL_DEV_PROBING ?
+					DL_STATE_CONSUMER_PROBE :
+					DL_STATE_DORMANT;
+			break;
 		case DL_DEV_DRIVER_BOUND:
 			switch (consumer->links.status) {
 			case DL_DEV_PROBING:
@@ -474,6 +485,14 @@ void device_links_driver_bound(struct de
 		if (link->flags & DL_FLAG_STATELESS)
 			continue;
 
+		/*
+		 * Links created during consumer probe may be in the "consumer
+		 * probe" state to start with if the supplier is still probing
+		 * when they are created.  Skip them here.
+		 */
+		if (link->status == DL_STATE_CONSUMER_PROBE)
+			continue;
+
 		WARN_ON(link->status != DL_STATE_DORMANT);
 		WRITE_ONCE(link->status, DL_STATE_AVAILABLE);
 	}

[-- Attachment #3: Type: text/plain, Size: 160 bytes --]

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: Armada DRM: bridge with componentized devices
  2019-01-15  0:04                                                   ` Rafael J. Wysocki
@ 2019-01-15 22:47                                                     ` Rafael J. Wysocki
  2019-01-16 18:42                                                       ` Daniel Vetter
  2019-01-17 17:26                                                       ` Russell King - ARM Linux admin
  0 siblings, 2 replies; 53+ messages in thread
From: Rafael J. Wysocki @ 2019-01-15 22:47 UTC (permalink / raw)
  To: Russell King - ARM Linux, Andrzej Hajda
  Cc: Linux PM, Greg Kroah-Hartman, Liviu Dudau, DRI Development,
	Lubomir Rintel, Peter Rosin

[CC Greg]

On Tuesday, January 15, 2019 1:04:02 AM CET Rafael J. Wysocki wrote:
> [CC Lukas and linux-pm]
> 
> On Mon, Jan 14, 2019 at 1:32 PM Rafael J. Wysocki <rafael@kernel.org> wrote:
> >
> > On Fri, Jan 11, 2019 at 3:49 PM Russell King - ARM Linux
> > <linux@armlinux.org.uk> wrote:
> 
> [cut]
> 
> > >
> > > This thread is discussing how to deal with Armada DRM, its use of the
> > > component helper, TDA998x's hybrid use of the component helper and
> > > DRM bridges.
> > >
> > > Currently, DRM bridges register into the DRM system and are added to
> > > a global list of bridges.  When a DRM driver binds, it looks up the
> > > DRM bridge and attaches to it.  There is no way in generic DRM code
> > > for the DRM driver to know if the DRM bridge is unbound from DRM,
> > > consequently a DRM driver may continue trying to call functions in
> > > the DRM bridge driver using memory that has already been freed.
> > >
> > > We had similar issues with imx-drm, and a number of years ago at a
> > > kernel summit, this was discussed, and I proposed a system which is
> > > now known as the component helper.  This handles the problem of a
> > > multi-component system.
> > >
> > > However, DRM bridge was already established, and there appears to be
> > > no desire to convert DRM bridges and DRM drivers to use the component
> > > helper.
> > >
> > > We are presently in the situation where Armada DRM is incompatible
> > > with the DRM bridges way of doing things, and making it compatible
> > > essentially means introducing potential use-after-free bugs into the
> > > code.
> > >
> > > Device links in their stateful form appear to provide an alternative
> > > to the component helper, in that a stateful device link will remove
> > > consumers of a resource when the supplier is going away - which is
> > > exactly the problem which the component helper is solving.  The
> > > difference is that device links look like being a cleaner solution.
> > >
> > > Just like the component helper, a stateful link would unbind the
> > > consumer of a resource when the supplier goes away - which is exactly
> > > the behaviour we're wanting.
> > >
> > > The problem is that the connection between various drivers is only
> > > really known when the drivers obtain their resources, and the
> > > following can happen:
> > >
> > > supplier                consumer
> > >
> > > probe()
> > >  alloc
> > >                         probe()
> > >  publish
> > >                         obtain supplier's resource
> > >  return
> > >
> > > Where things go wrong is if a stateful link is created when the
> > > consumer obtains the suppliers resource before the supplier has
> > > finished probing - which from what's been said is illegal.
> >
> > It just doesn't work (which means "invalid" rather than "illegal" I
> > guess, but whatever :-)).
> >
> > Admittedly, the original design didn't take this particular case into
> > account and I'm not actually sure how it can be taken into account
> > either.
> >
> > If the link is created by the consumer in the scenario above, its
> > status will be updated twice in a row after the consumer probe returns
> > and after the supplier probe returns.  It looks like this update would
> > need to work regardless of the order it which this happened which
> > sounds somewhat challenging.  I would need to think about that.
> 
> So if I'm not mistaken it can be made work with the help of the
> (completely untested) attached patch (of course, the documentation
> would need to be updated too).
> 
> The key observation here is that it should be fine to create a link
> from the consumer driver's probe while the supplier is still probing
> if the consumer has some way to find out that the supplier is
> functional at that point (like when it has published itself already in
> your example above).  The initial state of the link can be "consumer
> probe" in that case and I don't see a reason why that might not work.
> 

Below is a more complete patch that should take all of the corner cases
into account unless I have missed anything.  Testing it would be
appreciated. :-)

---
From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Subject: [PATCH] driver core: Fix adding device links to probing suppliers

Currently, it is not valid to add a device link from a consumer
driver ->probe() callback to a supplier that is still probing too, but
generally this is a valid use case.  For example, if the consumer has
just acquired a resource that can only be available when the supplier
is functional, adding a device link to that supplier right away
should be safe (and even desirable arguably), but device_link_add()
doesn't handle that case correctly and the initial state of the link
created by it is wrong then.

To address this problem, change the initial state of device links
added between a probing supplier and a probing consumer to
DL_STATE_CONSUMER_PROBE and update device_links_driver_bound() to
skip such links on the supplier side.

With this change, if the supplier probe completes first,
device_links_driver_bound() called for it will skip the link state
update and when it is called for the consumer, the link state will
be updated to "active".  In turn, if the consumer probe completes
first, device_links_driver_bound() called for it will change the
state of the link to "active" and when it is called for the
supplier, the link status update will be skipped.

However, in principle the supplier or consumer probe may still fail
after the link has been added, so modify device_links_no_driver() to
change device links in the "active" or "consumer probe" state to
"dormant" on the supplier side and update __device_links_no_driver()
to change the link state to "available" only if it is "consumer
probe" or "active".

Then, if the supplier probe fails first, the leftover link to the
probing consumer will become "dormant" and device_links_no_driver()
called for the consumer (when its probe fails) will clean it up.
In turn, if the consumer probe fails first, it will either drop the
link, or change its state to "available" and, in the latter case,
when device_links_no_driver() is called for the supplier, it will
update the link state to "dormant".  [If the supplier probe fails,
but the consumer probe succeeds, which should not happen as long as
the consumer driver is correct, the link still will be around, but
it will be "dormant" until the supplier is probed again.]

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
---
 Documentation/driver-api/device_link.rst |   10 ++--
 drivers/base/core.c                      |   74 +++++++++++++++++++++++++++----
 2 files changed, 73 insertions(+), 11 deletions(-)

Index: linux-pm/drivers/base/core.c
===================================================================
--- linux-pm.orig/drivers/base/core.c
+++ linux-pm/drivers/base/core.c
@@ -260,17 +260,26 @@ struct device_link *device_link_add(stru
 		link->status = DL_STATE_NONE;
 	} else {
 		switch (supplier->links.status) {
-		case DL_DEV_DRIVER_BOUND:
+		case DL_DEV_PROBING:
 			switch (consumer->links.status) {
 			case DL_DEV_PROBING:
 				/*
-				 * Some callers expect the link creation during
-				 * consumer driver probe to resume the supplier
-				 * even without DL_FLAG_RPM_ACTIVE.
+				 * A consumer driver can create a link to a
+				 * supplier that has not completed its probing
+				 * yet as long as it knows that the supplier is
+				 * already functional (for example, it has just
+				 * acquired some resources from the supplier).
 				 */
-				if (flags & DL_FLAG_PM_RUNTIME)
-					pm_runtime_resume(supplier);
-
+				link->status = DL_STATE_CONSUMER_PROBE;
+				break;
+			default:
+				link->status = DL_STATE_DORMANT;
+				break;
+			}
+			break;
+		case DL_DEV_DRIVER_BOUND:
+			switch (consumer->links.status) {
+			case DL_DEV_PROBING:
 				link->status = DL_STATE_CONSUMER_PROBE;
 				break;
 			case DL_DEV_DRIVER_BOUND:
@@ -291,6 +300,14 @@ struct device_link *device_link_add(stru
 	}
 
 	/*
+	 * Some callers expect the link creation during consumer driver probe to
+	 * resume the supplier even without DL_FLAG_RPM_ACTIVE.
+	 */
+	if (link->status == DL_STATE_CONSUMER_PROBE &&
+	    flags & DL_FLAG_PM_RUNTIME)
+		pm_runtime_resume(supplier);
+
+	/*
 	 * Move the consumer and all of the devices depending on it to the end
 	 * of dpm_list and the devices_kset list.
 	 *
@@ -474,6 +491,16 @@ void device_links_driver_bound(struct de
 		if (link->flags & DL_FLAG_STATELESS)
 			continue;
 
+		/*
+		 * Links created during consumer probe may be in the "consumer
+		 * probe" state to start with if the supplier is still probing
+		 * when they are created and they may become "active" if the
+		 * consumer probe returns first.  Skip them here.
+		 */
+		if (link->status == DL_STATE_CONSUMER_PROBE ||
+		    link->status == DL_STATE_ACTIVE)
+			continue;
+
 		WARN_ON(link->status != DL_STATE_DORMANT);
 		WRITE_ONCE(link->status, DL_STATE_AVAILABLE);
 	}
@@ -513,17 +540,48 @@ static void __device_links_no_driver(str
 
 		if (link->flags & DL_FLAG_AUTOREMOVE_CONSUMER)
 			kref_put(&link->kref, __device_link_del);
-		else if (link->status != DL_STATE_SUPPLIER_UNBIND)
+		else if (link->status == DL_STATE_CONSUMER_PROBE ||
+			 link->status == DL_STATE_ACTIVE)
 			WRITE_ONCE(link->status, DL_STATE_AVAILABLE);
 	}
 
 	dev->links.status = DL_DEV_NO_DRIVER;
 }
 
+/**
+ * device_links_no_driver - Update links after failing driver probe.
+ * @dev: Device whose driver has just failed to probe.
+ *
+ * Clean up leftover links to consumers for @dev and invoke
+ * %__device_links_no_driver() to update links to suppliers for it as
+ * appropriate.
+ *
+ * Links with the DL_FLAG_STATELESS flag set are ignored.
+ */
 void device_links_no_driver(struct device *dev)
 {
+	struct device_link *link;
+
 	device_links_write_lock();
+
+	list_for_each_entry(link, &dev->links.consumers, s_node) {
+		if (link->flags & DL_FLAG_STATELESS)
+			continue;
+
+		/*
+		 * The probe has failed, so if the status of the link is
+		 * "consumer probe" or "active", it must have been added by
+		 * a probing consumer while this device was still probing.
+		 * Change its state to "dormant", as it represents a valid
+		 * relationship, but it is not functionally meaningful.
+		 */
+		if (link->status == DL_STATE_CONSUMER_PROBE ||
+		    link->status == DL_STATE_ACTIVE)
+			WRITE_ONCE(link->status, DL_STATE_DORMANT);
+	}
+
 	__device_links_no_driver(dev);
+
 	device_links_write_unlock();
 }
 
Index: linux-pm/Documentation/driver-api/device_link.rst
===================================================================
--- linux-pm.orig/Documentation/driver-api/device_link.rst
+++ linux-pm/Documentation/driver-api/device_link.rst
@@ -59,11 +59,15 @@ device ``->probe`` callback or a boot-ti
 
 Another example for an inconsistent state would be a device link that
 represents a driver presence dependency, yet is added from the consumer's
-``->probe`` callback while the supplier hasn't probed yet:  Had the driver
-core known about the device link earlier, it wouldn't have probed the
+``->probe`` callback while the supplier hasn't started to probe yet:  Had the
+driver core known about the device link earlier, it wouldn't have probed the
 consumer in the first place.  The onus is thus on the consumer to check
 presence of the supplier after adding the link, and defer probing on
-non-presence.
+non-presence.  [Note that it is valid to create a link from the consumer's
+``->probe`` callback while the supplier is still probing, but the consumer must
+know that the supplier is functional already at the link creation time (that is
+the case, for instance, if the consumer has just acquired some resources that
+would not have been available had the supplier not been functional then).]
 
 If a device link is added in the ``->probe`` callback of the supplier or
 consumer driver, it is typically deleted in its ``->remove`` callback for

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: Armada DRM: bridge with componentized devices
  2019-01-15 22:47                                                     ` Rafael J. Wysocki
@ 2019-01-16 18:42                                                       ` Daniel Vetter
  2019-01-16 22:43                                                         ` Rafael J. Wysocki
  2019-01-17 17:26                                                       ` Russell King - ARM Linux admin
  1 sibling, 1 reply; 53+ messages in thread
From: Daniel Vetter @ 2019-01-16 18:42 UTC (permalink / raw)
  To: Rafael J. Wysocki, Linus Walleij, Thierry Reding,
	Laurent Pinchart, Sean Paul, Sarvela, Tomi P
  Cc: Lubomir Rintel, Greg Kroah-Hartman, Linux PM, Liviu Dudau,
	Russell King - ARM Linux, DRI Development, Peter Rosin

On Tue, Jan 15, 2019 at 11:47 PM Rafael J. Wysocki <rjw@rjwysocki.net> wrote:
>
> [CC Greg]
>
> On Tuesday, January 15, 2019 1:04:02 AM CET Rafael J. Wysocki wrote:
> > [CC Lukas and linux-pm]
> >
> > On Mon, Jan 14, 2019 at 1:32 PM Rafael J. Wysocki <rafael@kernel.org> wrote:
> > >
> > > On Fri, Jan 11, 2019 at 3:49 PM Russell King - ARM Linux
> > > <linux@armlinux.org.uk> wrote:
> >
> > [cut]
> >
> > > >
> > > > This thread is discussing how to deal with Armada DRM, its use of the
> > > > component helper, TDA998x's hybrid use of the component helper and
> > > > DRM bridges.
> > > >
> > > > Currently, DRM bridges register into the DRM system and are added to
> > > > a global list of bridges.  When a DRM driver binds, it looks up the
> > > > DRM bridge and attaches to it.  There is no way in generic DRM code
> > > > for the DRM driver to know if the DRM bridge is unbound from DRM,
> > > > consequently a DRM driver may continue trying to call functions in
> > > > the DRM bridge driver using memory that has already been freed.
> > > >
> > > > We had similar issues with imx-drm, and a number of years ago at a
> > > > kernel summit, this was discussed, and I proposed a system which is
> > > > now known as the component helper.  This handles the problem of a
> > > > multi-component system.
> > > >
> > > > However, DRM bridge was already established, and there appears to be
> > > > no desire to convert DRM bridges and DRM drivers to use the component
> > > > helper.
> > > >
> > > > We are presently in the situation where Armada DRM is incompatible
> > > > with the DRM bridges way of doing things, and making it compatible
> > > > essentially means introducing potential use-after-free bugs into the
> > > > code.
> > > >
> > > > Device links in their stateful form appear to provide an alternative
> > > > to the component helper, in that a stateful device link will remove
> > > > consumers of a resource when the supplier is going away - which is
> > > > exactly the problem which the component helper is solving.  The
> > > > difference is that device links look like being a cleaner solution.
> > > >
> > > > Just like the component helper, a stateful link would unbind the
> > > > consumer of a resource when the supplier goes away - which is exactly
> > > > the behaviour we're wanting.
> > > >
> > > > The problem is that the connection between various drivers is only
> > > > really known when the drivers obtain their resources, and the
> > > > following can happen:
> > > >
> > > > supplier                consumer
> > > >
> > > > probe()
> > > >  alloc
> > > >                         probe()
> > > >  publish
> > > >                         obtain supplier's resource
> > > >  return
> > > >
> > > > Where things go wrong is if a stateful link is created when the
> > > > consumer obtains the suppliers resource before the supplier has
> > > > finished probing - which from what's been said is illegal.
> > >
> > > It just doesn't work (which means "invalid" rather than "illegal" I
> > > guess, but whatever :-)).
> > >
> > > Admittedly, the original design didn't take this particular case into
> > > account and I'm not actually sure how it can be taken into account
> > > either.
> > >
> > > If the link is created by the consumer in the scenario above, its
> > > status will be updated twice in a row after the consumer probe returns
> > > and after the supplier probe returns.  It looks like this update would
> > > need to work regardless of the order it which this happened which
> > > sounds somewhat challenging.  I would need to think about that.
> >
> > So if I'm not mistaken it can be made work with the help of the
> > (completely untested) attached patch (of course, the documentation
> > would need to be updated too).
> >
> > The key observation here is that it should be fine to create a link
> > from the consumer driver's probe while the supplier is still probing
> > if the consumer has some way to find out that the supplier is
> > functional at that point (like when it has published itself already in
> > your example above).  The initial state of the link can be "consumer
> > probe" in that case and I don't see a reason why that might not work.
> >
>
> Below is a more complete patch that should take all of the corner cases
> into account unless I have missed anything.  Testing it would be
> appreciated. :-)
>
> ---
> From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> Subject: [PATCH] driver core: Fix adding device links to probing suppliers
>
> Currently, it is not valid to add a device link from a consumer
> driver ->probe() callback to a supplier that is still probing too, but
> generally this is a valid use case.  For example, if the consumer has
> just acquired a resource that can only be available when the supplier
> is functional, adding a device link to that supplier right away
> should be safe (and even desirable arguably), but device_link_add()
> doesn't handle that case correctly and the initial state of the link
> created by it is wrong then.
>
> To address this problem, change the initial state of device links
> added between a probing supplier and a probing consumer to
> DL_STATE_CONSUMER_PROBE and update device_links_driver_bound() to
> skip such links on the supplier side.
>
> With this change, if the supplier probe completes first,
> device_links_driver_bound() called for it will skip the link state
> update and when it is called for the consumer, the link state will
> be updated to "active".  In turn, if the consumer probe completes
> first, device_links_driver_bound() called for it will change the
> state of the link to "active" and when it is called for the
> supplier, the link status update will be skipped.
>
> However, in principle the supplier or consumer probe may still fail
> after the link has been added, so modify device_links_no_driver() to
> change device links in the "active" or "consumer probe" state to
> "dormant" on the supplier side and update __device_links_no_driver()
> to change the link state to "available" only if it is "consumer
> probe" or "active".
>
> Then, if the supplier probe fails first, the leftover link to the
> probing consumer will become "dormant" and device_links_no_driver()
> called for the consumer (when its probe fails) will clean it up.
> In turn, if the consumer probe fails first, it will either drop the
> link, or change its state to "available" and, in the latter case,
> when device_links_no_driver() is called for the supplier, it will
> update the link state to "dormant".  [If the supplier probe fails,
> but the consumer probe succeeds, which should not happen as long as
> the consumer driver is correct, the link still will be around, but
> it will be "dormant" until the supplier is probed again.]
>
> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

Pulling in a bunch of drm_bridge and drm_panel people. See huge
discussion upthread, this here is Rafael's idea for making device
links more useful. Would be great if someone could test this out with
panel/bridge dependencies.

I guess this here isn't yet solving the reprobe issue where the
provider was unloaded and reloaded (which should cause the consumer to
reprobe too, with the EPROBE_DEFERRED logic). I think that was the
other issue we've hit.
-Daniel

> ---
>  Documentation/driver-api/device_link.rst |   10 ++--
>  drivers/base/core.c                      |   74 +++++++++++++++++++++++++++----
>  2 files changed, 73 insertions(+), 11 deletions(-)
>
> Index: linux-pm/drivers/base/core.c
> ===================================================================
> --- linux-pm.orig/drivers/base/core.c
> +++ linux-pm/drivers/base/core.c
> @@ -260,17 +260,26 @@ struct device_link *device_link_add(stru
>                 link->status = DL_STATE_NONE;
>         } else {
>                 switch (supplier->links.status) {
> -               case DL_DEV_DRIVER_BOUND:
> +               case DL_DEV_PROBING:
>                         switch (consumer->links.status) {
>                         case DL_DEV_PROBING:
>                                 /*
> -                                * Some callers expect the link creation during
> -                                * consumer driver probe to resume the supplier
> -                                * even without DL_FLAG_RPM_ACTIVE.
> +                                * A consumer driver can create a link to a
> +                                * supplier that has not completed its probing
> +                                * yet as long as it knows that the supplier is
> +                                * already functional (for example, it has just
> +                                * acquired some resources from the supplier).
>                                  */
> -                               if (flags & DL_FLAG_PM_RUNTIME)
> -                                       pm_runtime_resume(supplier);
> -
> +                               link->status = DL_STATE_CONSUMER_PROBE;
> +                               break;
> +                       default:
> +                               link->status = DL_STATE_DORMANT;
> +                               break;
> +                       }
> +                       break;
> +               case DL_DEV_DRIVER_BOUND:
> +                       switch (consumer->links.status) {
> +                       case DL_DEV_PROBING:
>                                 link->status = DL_STATE_CONSUMER_PROBE;
>                                 break;
>                         case DL_DEV_DRIVER_BOUND:
> @@ -291,6 +300,14 @@ struct device_link *device_link_add(stru
>         }
>
>         /*
> +        * Some callers expect the link creation during consumer driver probe to
> +        * resume the supplier even without DL_FLAG_RPM_ACTIVE.
> +        */
> +       if (link->status == DL_STATE_CONSUMER_PROBE &&
> +           flags & DL_FLAG_PM_RUNTIME)
> +               pm_runtime_resume(supplier);
> +
> +       /*
>          * Move the consumer and all of the devices depending on it to the end
>          * of dpm_list and the devices_kset list.
>          *
> @@ -474,6 +491,16 @@ void device_links_driver_bound(struct de
>                 if (link->flags & DL_FLAG_STATELESS)
>                         continue;
>
> +               /*
> +                * Links created during consumer probe may be in the "consumer
> +                * probe" state to start with if the supplier is still probing
> +                * when they are created and they may become "active" if the
> +                * consumer probe returns first.  Skip them here.
> +                */
> +               if (link->status == DL_STATE_CONSUMER_PROBE ||
> +                   link->status == DL_STATE_ACTIVE)
> +                       continue;
> +
>                 WARN_ON(link->status != DL_STATE_DORMANT);
>                 WRITE_ONCE(link->status, DL_STATE_AVAILABLE);
>         }
> @@ -513,17 +540,48 @@ static void __device_links_no_driver(str
>
>                 if (link->flags & DL_FLAG_AUTOREMOVE_CONSUMER)
>                         kref_put(&link->kref, __device_link_del);
> -               else if (link->status != DL_STATE_SUPPLIER_UNBIND)
> +               else if (link->status == DL_STATE_CONSUMER_PROBE ||
> +                        link->status == DL_STATE_ACTIVE)
>                         WRITE_ONCE(link->status, DL_STATE_AVAILABLE);
>         }
>
>         dev->links.status = DL_DEV_NO_DRIVER;
>  }
>
> +/**
> + * device_links_no_driver - Update links after failing driver probe.
> + * @dev: Device whose driver has just failed to probe.
> + *
> + * Clean up leftover links to consumers for @dev and invoke
> + * %__device_links_no_driver() to update links to suppliers for it as
> + * appropriate.
> + *
> + * Links with the DL_FLAG_STATELESS flag set are ignored.
> + */
>  void device_links_no_driver(struct device *dev)
>  {
> +       struct device_link *link;
> +
>         device_links_write_lock();
> +
> +       list_for_each_entry(link, &dev->links.consumers, s_node) {
> +               if (link->flags & DL_FLAG_STATELESS)
> +                       continue;
> +
> +               /*
> +                * The probe has failed, so if the status of the link is
> +                * "consumer probe" or "active", it must have been added by
> +                * a probing consumer while this device was still probing.
> +                * Change its state to "dormant", as it represents a valid
> +                * relationship, but it is not functionally meaningful.
> +                */
> +               if (link->status == DL_STATE_CONSUMER_PROBE ||
> +                   link->status == DL_STATE_ACTIVE)
> +                       WRITE_ONCE(link->status, DL_STATE_DORMANT);
> +       }
> +
>         __device_links_no_driver(dev);
> +
>         device_links_write_unlock();
>  }
>
> Index: linux-pm/Documentation/driver-api/device_link.rst
> ===================================================================
> --- linux-pm.orig/Documentation/driver-api/device_link.rst
> +++ linux-pm/Documentation/driver-api/device_link.rst
> @@ -59,11 +59,15 @@ device ``->probe`` callback or a boot-ti
>
>  Another example for an inconsistent state would be a device link that
>  represents a driver presence dependency, yet is added from the consumer's
> -``->probe`` callback while the supplier hasn't probed yet:  Had the driver
> -core known about the device link earlier, it wouldn't have probed the
> +``->probe`` callback while the supplier hasn't started to probe yet:  Had the
> +driver core known about the device link earlier, it wouldn't have probed the
>  consumer in the first place.  The onus is thus on the consumer to check
>  presence of the supplier after adding the link, and defer probing on
> -non-presence.
> +non-presence.  [Note that it is valid to create a link from the consumer's
> +``->probe`` callback while the supplier is still probing, but the consumer must
> +know that the supplier is functional already at the link creation time (that is
> +the case, for instance, if the consumer has just acquired some resources that
> +would not have been available had the supplier not been functional then).]
>
>  If a device link is added in the ``->probe`` callback of the supplier or
>  consumer driver, it is typically deleted in its ``->remove`` callback for
>


-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: Armada DRM: bridge with componentized devices
  2019-01-16 18:42                                                       ` Daniel Vetter
@ 2019-01-16 22:43                                                         ` Rafael J. Wysocki
  2019-01-17 12:20                                                           ` Daniel Vetter
  0 siblings, 1 reply; 53+ messages in thread
From: Rafael J. Wysocki @ 2019-01-16 22:43 UTC (permalink / raw)
  To: Daniel Vetter
  Cc: Lubomir Rintel, Sarvela, Tomi P, Greg Kroah-Hartman, Linux PM,
	Liviu Dudau, Russell King - ARM Linux, DRI Development,
	Thierry Reding, Laurent Pinchart, Sean Paul, Peter Rosin

On Wednesday, January 16, 2019 7:42:45 PM CET Daniel Vetter wrote:
> On Tue, Jan 15, 2019 at 11:47 PM Rafael J. Wysocki <rjw@rjwysocki.net> wrote:
> >
> > [CC Greg]
> >
> > On Tuesday, January 15, 2019 1:04:02 AM CET Rafael J. Wysocki wrote:
> > > [CC Lukas and linux-pm]
> > >
> > > On Mon, Jan 14, 2019 at 1:32 PM Rafael J. Wysocki <rafael@kernel.org> wrote:
> > > >
> > > > On Fri, Jan 11, 2019 at 3:49 PM Russell King - ARM Linux
> > > > <linux@armlinux.org.uk> wrote:
> > >
> > > [cut]
> > >
> > > > >
> > > > > This thread is discussing how to deal with Armada DRM, its use of the
> > > > > component helper, TDA998x's hybrid use of the component helper and
> > > > > DRM bridges.
> > > > >
> > > > > Currently, DRM bridges register into the DRM system and are added to
> > > > > a global list of bridges.  When a DRM driver binds, it looks up the
> > > > > DRM bridge and attaches to it.  There is no way in generic DRM code
> > > > > for the DRM driver to know if the DRM bridge is unbound from DRM,
> > > > > consequently a DRM driver may continue trying to call functions in
> > > > > the DRM bridge driver using memory that has already been freed.
> > > > >
> > > > > We had similar issues with imx-drm, and a number of years ago at a
> > > > > kernel summit, this was discussed, and I proposed a system which is
> > > > > now known as the component helper.  This handles the problem of a
> > > > > multi-component system.
> > > > >
> > > > > However, DRM bridge was already established, and there appears to be
> > > > > no desire to convert DRM bridges and DRM drivers to use the component
> > > > > helper.
> > > > >
> > > > > We are presently in the situation where Armada DRM is incompatible
> > > > > with the DRM bridges way of doing things, and making it compatible
> > > > > essentially means introducing potential use-after-free bugs into the
> > > > > code.
> > > > >
> > > > > Device links in their stateful form appear to provide an alternative
> > > > > to the component helper, in that a stateful device link will remove
> > > > > consumers of a resource when the supplier is going away - which is
> > > > > exactly the problem which the component helper is solving.  The
> > > > > difference is that device links look like being a cleaner solution.
> > > > >
> > > > > Just like the component helper, a stateful link would unbind the
> > > > > consumer of a resource when the supplier goes away - which is exactly
> > > > > the behaviour we're wanting.
> > > > >
> > > > > The problem is that the connection between various drivers is only
> > > > > really known when the drivers obtain their resources, and the
> > > > > following can happen:
> > > > >
> > > > > supplier                consumer
> > > > >
> > > > > probe()
> > > > >  alloc
> > > > >                         probe()
> > > > >  publish
> > > > >                         obtain supplier's resource
> > > > >  return
> > > > >
> > > > > Where things go wrong is if a stateful link is created when the
> > > > > consumer obtains the suppliers resource before the supplier has
> > > > > finished probing - which from what's been said is illegal.
> > > >
> > > > It just doesn't work (which means "invalid" rather than "illegal" I
> > > > guess, but whatever :-)).
> > > >
> > > > Admittedly, the original design didn't take this particular case into
> > > > account and I'm not actually sure how it can be taken into account
> > > > either.
> > > >
> > > > If the link is created by the consumer in the scenario above, its
> > > > status will be updated twice in a row after the consumer probe returns
> > > > and after the supplier probe returns.  It looks like this update would
> > > > need to work regardless of the order it which this happened which
> > > > sounds somewhat challenging.  I would need to think about that.
> > >
> > > So if I'm not mistaken it can be made work with the help of the
> > > (completely untested) attached patch (of course, the documentation
> > > would need to be updated too).
> > >
> > > The key observation here is that it should be fine to create a link
> > > from the consumer driver's probe while the supplier is still probing
> > > if the consumer has some way to find out that the supplier is
> > > functional at that point (like when it has published itself already in
> > > your example above).  The initial state of the link can be "consumer
> > > probe" in that case and I don't see a reason why that might not work.
> > >
> >
> > Below is a more complete patch that should take all of the corner cases
> > into account unless I have missed anything.  Testing it would be
> > appreciated. :-)
> >
> > ---
> > From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> > Subject: [PATCH] driver core: Fix adding device links to probing suppliers
> >
> > Currently, it is not valid to add a device link from a consumer
> > driver ->probe() callback to a supplier that is still probing too, but
> > generally this is a valid use case.  For example, if the consumer has
> > just acquired a resource that can only be available when the supplier
> > is functional, adding a device link to that supplier right away
> > should be safe (and even desirable arguably), but device_link_add()
> > doesn't handle that case correctly and the initial state of the link
> > created by it is wrong then.
> >
> > To address this problem, change the initial state of device links
> > added between a probing supplier and a probing consumer to
> > DL_STATE_CONSUMER_PROBE and update device_links_driver_bound() to
> > skip such links on the supplier side.
> >
> > With this change, if the supplier probe completes first,
> > device_links_driver_bound() called for it will skip the link state
> > update and when it is called for the consumer, the link state will
> > be updated to "active".  In turn, if the consumer probe completes
> > first, device_links_driver_bound() called for it will change the
> > state of the link to "active" and when it is called for the
> > supplier, the link status update will be skipped.
> >
> > However, in principle the supplier or consumer probe may still fail
> > after the link has been added, so modify device_links_no_driver() to
> > change device links in the "active" or "consumer probe" state to
> > "dormant" on the supplier side and update __device_links_no_driver()
> > to change the link state to "available" only if it is "consumer
> > probe" or "active".
> >
> > Then, if the supplier probe fails first, the leftover link to the
> > probing consumer will become "dormant" and device_links_no_driver()
> > called for the consumer (when its probe fails) will clean it up.
> > In turn, if the consumer probe fails first, it will either drop the
> > link, or change its state to "available" and, in the latter case,
> > when device_links_no_driver() is called for the supplier, it will
> > update the link state to "dormant".  [If the supplier probe fails,
> > but the consumer probe succeeds, which should not happen as long as
> > the consumer driver is correct, the link still will be around, but
> > it will be "dormant" until the supplier is probed again.]
> >
> > Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> 
> Pulling in a bunch of drm_bridge and drm_panel people. See huge
> discussion upthread, this here is Rafael's idea for making device
> links more useful. Would be great if someone could test this out with
> panel/bridge dependencies.
> 
> I guess this here isn't yet solving the reprobe issue where the
> provider was unloaded and reloaded (which should cause the consumer to
> reprobe too, with the EPROBE_DEFERRED logic). I think that was the
> other issue we've hit.

I don't think it is addressed here.

Can anyone please explain to me what happens in more detail?

Cheers,
Rafael

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: Armada DRM: bridge with componentized devices
  2019-01-16 22:43                                                         ` Rafael J. Wysocki
@ 2019-01-17 12:20                                                           ` Daniel Vetter
  2019-01-18  9:36                                                             ` Lucas Stach
  0 siblings, 1 reply; 53+ messages in thread
From: Daniel Vetter @ 2019-01-17 12:20 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Lubomir Rintel, Sarvela, Tomi P, Greg Kroah-Hartman, Linux PM,
	Liviu Dudau, Russell King - ARM Linux, DRI Development,
	Thierry Reding, Laurent Pinchart, Sean Paul, Peter Rosin

On Wed, Jan 16, 2019 at 11:44 PM Rafael J. Wysocki <rjw@rjwysocki.net> wrote:
>
> On Wednesday, January 16, 2019 7:42:45 PM CET Daniel Vetter wrote:
> > On Tue, Jan 15, 2019 at 11:47 PM Rafael J. Wysocki <rjw@rjwysocki.net> wrote:
> > >
> > > [CC Greg]
> > >
> > > On Tuesday, January 15, 2019 1:04:02 AM CET Rafael J. Wysocki wrote:
> > > > [CC Lukas and linux-pm]
> > > >
> > > > On Mon, Jan 14, 2019 at 1:32 PM Rafael J. Wysocki <rafael@kernel.org> wrote:
> > > > >
> > > > > On Fri, Jan 11, 2019 at 3:49 PM Russell King - ARM Linux
> > > > > <linux@armlinux.org.uk> wrote:
> > > >
> > > > [cut]
> > > >
> > > > > >
> > > > > > This thread is discussing how to deal with Armada DRM, its use of the
> > > > > > component helper, TDA998x's hybrid use of the component helper and
> > > > > > DRM bridges.
> > > > > >
> > > > > > Currently, DRM bridges register into the DRM system and are added to
> > > > > > a global list of bridges.  When a DRM driver binds, it looks up the
> > > > > > DRM bridge and attaches to it.  There is no way in generic DRM code
> > > > > > for the DRM driver to know if the DRM bridge is unbound from DRM,
> > > > > > consequently a DRM driver may continue trying to call functions in
> > > > > > the DRM bridge driver using memory that has already been freed.
> > > > > >
> > > > > > We had similar issues with imx-drm, and a number of years ago at a
> > > > > > kernel summit, this was discussed, and I proposed a system which is
> > > > > > now known as the component helper.  This handles the problem of a
> > > > > > multi-component system.
> > > > > >
> > > > > > However, DRM bridge was already established, and there appears to be
> > > > > > no desire to convert DRM bridges and DRM drivers to use the component
> > > > > > helper.
> > > > > >
> > > > > > We are presently in the situation where Armada DRM is incompatible
> > > > > > with the DRM bridges way of doing things, and making it compatible
> > > > > > essentially means introducing potential use-after-free bugs into the
> > > > > > code.
> > > > > >
> > > > > > Device links in their stateful form appear to provide an alternative
> > > > > > to the component helper, in that a stateful device link will remove
> > > > > > consumers of a resource when the supplier is going away - which is
> > > > > > exactly the problem which the component helper is solving.  The
> > > > > > difference is that device links look like being a cleaner solution.
> > > > > >
> > > > > > Just like the component helper, a stateful link would unbind the
> > > > > > consumer of a resource when the supplier goes away - which is exactly
> > > > > > the behaviour we're wanting.
> > > > > >
> > > > > > The problem is that the connection between various drivers is only
> > > > > > really known when the drivers obtain their resources, and the
> > > > > > following can happen:
> > > > > >
> > > > > > supplier                consumer
> > > > > >
> > > > > > probe()
> > > > > >  alloc
> > > > > >                         probe()
> > > > > >  publish
> > > > > >                         obtain supplier's resource
> > > > > >  return
> > > > > >
> > > > > > Where things go wrong is if a stateful link is created when the
> > > > > > consumer obtains the suppliers resource before the supplier has
> > > > > > finished probing - which from what's been said is illegal.
> > > > >
> > > > > It just doesn't work (which means "invalid" rather than "illegal" I
> > > > > guess, but whatever :-)).
> > > > >
> > > > > Admittedly, the original design didn't take this particular case into
> > > > > account and I'm not actually sure how it can be taken into account
> > > > > either.
> > > > >
> > > > > If the link is created by the consumer in the scenario above, its
> > > > > status will be updated twice in a row after the consumer probe returns
> > > > > and after the supplier probe returns.  It looks like this update would
> > > > > need to work regardless of the order it which this happened which
> > > > > sounds somewhat challenging.  I would need to think about that.
> > > >
> > > > So if I'm not mistaken it can be made work with the help of the
> > > > (completely untested) attached patch (of course, the documentation
> > > > would need to be updated too).
> > > >
> > > > The key observation here is that it should be fine to create a link
> > > > from the consumer driver's probe while the supplier is still probing
> > > > if the consumer has some way to find out that the supplier is
> > > > functional at that point (like when it has published itself already in
> > > > your example above).  The initial state of the link can be "consumer
> > > > probe" in that case and I don't see a reason why that might not work.
> > > >
> > >
> > > Below is a more complete patch that should take all of the corner cases
> > > into account unless I have missed anything.  Testing it would be
> > > appreciated. :-)
> > >
> > > ---
> > > From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> > > Subject: [PATCH] driver core: Fix adding device links to probing suppliers
> > >
> > > Currently, it is not valid to add a device link from a consumer
> > > driver ->probe() callback to a supplier that is still probing too, but
> > > generally this is a valid use case.  For example, if the consumer has
> > > just acquired a resource that can only be available when the supplier
> > > is functional, adding a device link to that supplier right away
> > > should be safe (and even desirable arguably), but device_link_add()
> > > doesn't handle that case correctly and the initial state of the link
> > > created by it is wrong then.
> > >
> > > To address this problem, change the initial state of device links
> > > added between a probing supplier and a probing consumer to
> > > DL_STATE_CONSUMER_PROBE and update device_links_driver_bound() to
> > > skip such links on the supplier side.
> > >
> > > With this change, if the supplier probe completes first,
> > > device_links_driver_bound() called for it will skip the link state
> > > update and when it is called for the consumer, the link state will
> > > be updated to "active".  In turn, if the consumer probe completes
> > > first, device_links_driver_bound() called for it will change the
> > > state of the link to "active" and when it is called for the
> > > supplier, the link status update will be skipped.
> > >
> > > However, in principle the supplier or consumer probe may still fail
> > > after the link has been added, so modify device_links_no_driver() to
> > > change device links in the "active" or "consumer probe" state to
> > > "dormant" on the supplier side and update __device_links_no_driver()
> > > to change the link state to "available" only if it is "consumer
> > > probe" or "active".
> > >
> > > Then, if the supplier probe fails first, the leftover link to the
> > > probing consumer will become "dormant" and device_links_no_driver()
> > > called for the consumer (when its probe fails) will clean it up.
> > > In turn, if the consumer probe fails first, it will either drop the
> > > link, or change its state to "available" and, in the latter case,
> > > when device_links_no_driver() is called for the supplier, it will
> > > update the link state to "dormant".  [If the supplier probe fails,
> > > but the consumer probe succeeds, which should not happen as long as
> > > the consumer driver is correct, the link still will be around, but
> > > it will be "dormant" until the supplier is probed again.]
> > >
> > > Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> >
> > Pulling in a bunch of drm_bridge and drm_panel people. See huge
> > discussion upthread, this here is Rafael's idea for making device
> > links more useful. Would be great if someone could test this out with
> > panel/bridge dependencies.
> >
> > I guess this here isn't yet solving the reprobe issue where the
> > provider was unloaded and reloaded (which should cause the consumer to
> > reprobe too, with the EPROBE_DEFERRED logic). I think that was the
> > other issue we've hit.
>
> I don't think it is addressed here.
>
> Can anyone please explain to me what happens in more detail?

My understanding (and this might be wrong, Russell, Andrzej, ... pls
correct) is that the following sequence goes wrong:

- componentized driver (both producer and consumer) fully loaded&working
- you unbind the producer/one of the components
- component framework or driver core through device_link also unbinds
the master/consumer
- you reload/rebind the component
- with the component framework this results in the master being
rebound, and the overall driver working again. With device_links
nothing happens.

I think there was a patch floating around once to put drivers unbound
due to device_links onto the deferred probe list, so that the next
time something is bound they have a chance to rebind. But I can't find
that patch anymore, maybe someone else has the link still?

Thanks, Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: Armada DRM: bridge with componentized devices
  2019-01-15 22:47                                                     ` Rafael J. Wysocki
  2019-01-16 18:42                                                       ` Daniel Vetter
@ 2019-01-17 17:26                                                       ` Russell King - ARM Linux admin
  2019-01-17 22:43                                                         ` Rafael J. Wysocki
  1 sibling, 1 reply; 53+ messages in thread
From: Russell King - ARM Linux admin @ 2019-01-17 17:26 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Lubomir Rintel, Greg Kroah-Hartman, Linux PM, Liviu Dudau,
	DRI Development, Peter Rosin

On Tue, Jan 15, 2019 at 11:47:00PM +0100, Rafael J. Wysocki wrote:
> From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> Subject: [PATCH] driver core: Fix adding device links to probing suppliers
> 
> Currently, it is not valid to add a device link from a consumer
> driver ->probe() callback to a supplier that is still probing too, but
> generally this is a valid use case.  For example, if the consumer has
> just acquired a resource that can only be available when the supplier
> is functional, adding a device link to that supplier right away
> should be safe (and even desirable arguably), but device_link_add()
> doesn't handle that case correctly and the initial state of the link
> created by it is wrong then.
> 
> To address this problem, change the initial state of device links
> added between a probing supplier and a probing consumer to
> DL_STATE_CONSUMER_PROBE and update device_links_driver_bound() to
> skip such links on the supplier side.
> 
> With this change, if the supplier probe completes first,
> device_links_driver_bound() called for it will skip the link state
> update and when it is called for the consumer, the link state will
> be updated to "active".  In turn, if the consumer probe completes
> first, device_links_driver_bound() called for it will change the
> state of the link to "active" and when it is called for the
> supplier, the link status update will be skipped.
> 
> However, in principle the supplier or consumer probe may still fail
> after the link has been added, so modify device_links_no_driver() to
> change device links in the "active" or "consumer probe" state to
> "dormant" on the supplier side and update __device_links_no_driver()
> to change the link state to "available" only if it is "consumer
> probe" or "active".
> 
> Then, if the supplier probe fails first, the leftover link to the
> probing consumer will become "dormant" and device_links_no_driver()
> called for the consumer (when its probe fails) will clean it up.
> In turn, if the consumer probe fails first, it will either drop the
> link, or change its state to "available" and, in the latter case,
> when device_links_no_driver() is called for the supplier, it will
> update the link state to "dormant".  [If the supplier probe fails,
> but the consumer probe succeeds, which should not happen as long as
> the consumer driver is correct, the link still will be around, but
> it will be "dormant" until the supplier is probed again.]
> 
> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

Hi Rafael,

I've tried this with Armada DRM without using the component helper for
bridges, and it seems to work up to a point (I have a vga bridge here
to allow further testing):

[    2.323441] armada-drm display-subsystem: assigned reserved memory node framebuffer
[    2.332001] armada-drm display-subsystem: bound f1820000.lcd-controller (ops armada_lcd_ops)
[    2.340790] armada-drm display-subsystem: bound f1810000.lcd-controller (ops armada_lcd_ops)
[    2.349345] armada-drm display-subsystem: node=/i2c-mux/i2c@0/hdmi-encoder@70
[    2.356719] armada-drm display-subsystem: panel=fffffdfb bridge=  (null) ret=0
[    2.364038] armada-drm display-subsystem: Linked as a consumer to 1-0070
[    2.370818] armada-drm display-subsystem: bridge=ee8cda2c ret=0
[    2.376894] armada-drm display-subsystem: node=/vga-bridge
[    2.382453] armada-drm display-subsystem: panel=fffffdfb bridge=(null) ret=0
[    2.389762] armada-drm display-subsystem: Linked as a consumer to vga-bridge
[    2.396883] armada-drm display-subsystem: bridge=ef3bec40 ret=0

When I remove the HDMI encoder:

root@cubox:/sys/bus/i2c/drivers/tda998x# echo 1-0070 > unbind

then I get:

[ 1013.824860] Console: switching to colour dummy device 80x30
[ 1013.866785] armada-drm display-subsystem: Dropping the link to
vga-bridge
[ 1013.867126] armada-drm display-subsystem: Dropping the link to 1-0070

which looks like it did what was expected - the DRM device is indeed
unbound, the nodes in /dev/dri are gone.  When rebinding the HDMI
encoder:

[ 1015.864703] tda998x 1-0070: found TDA19988
[ 1015.898078] tda9950 1-0034: TDA9950 CEC interface, hardware version 3.3
[ 1015.941374] Registered IR keymap rc-cec
[ 1015.941684] rc rc0: tda9950 as /devices/platform/mbus/mbus:internal-regs/f1011000.i2c/i2c-0/i2c-1/1-0070/rc/rc0
[ 1015.942439] input: tda9950 as /devices/platform/mbus/mbus:internal-regs/f1011000.i2c/i2c-0/i2c-1/1-0070/rc/rc0/input2
[ 1016.982921] alloc_contig_range: [10390, 10490) PFNs busy
[ 1016.987400] kirkwood-spdif-audio audio-subsystem: snd-soc-dummy-dai <-> kirkwood-fe mapping ok
[ 1016.987591] kirkwood-spdif-audio audio-subsystem: multicodec <-> kirkwood-spdif mapping ok

but the DRM stuff doesn't come back - this is what Daniel was talking
about further down this thread.

I guess the kernel can't know that it should come back because the
device links were dropped at the unbind stage, which means the kernel
has lost the information necessary to know that the display subsystem
is dependent on the presence of the HDMI encoder.  I don't see an easy
way around that.

If we keep the device links after an unbind event, then, because we
create them during probe, we'll be attempting to recreate the link
when we reattach the supplier to the consumer.  If we only allow one
instance, then what does device_link_add() return.

Maybe it is going to be less painful to push everything bridge-related
to use the component helper after all.  Dunno.  Problems either way.

> ---
>  Documentation/driver-api/device_link.rst |   10 ++--
>  drivers/base/core.c                      |   74 +++++++++++++++++++++++++++----
>  2 files changed, 73 insertions(+), 11 deletions(-)
> 
> Index: linux-pm/drivers/base/core.c
> ===================================================================
> --- linux-pm.orig/drivers/base/core.c
> +++ linux-pm/drivers/base/core.c
> @@ -260,17 +260,26 @@ struct device_link *device_link_add(stru
>  		link->status = DL_STATE_NONE;
>  	} else {
>  		switch (supplier->links.status) {
> -		case DL_DEV_DRIVER_BOUND:
> +		case DL_DEV_PROBING:
>  			switch (consumer->links.status) {
>  			case DL_DEV_PROBING:
>  				/*
> -				 * Some callers expect the link creation during
> -				 * consumer driver probe to resume the supplier
> -				 * even without DL_FLAG_RPM_ACTIVE.
> +				 * A consumer driver can create a link to a
> +				 * supplier that has not completed its probing
> +				 * yet as long as it knows that the supplier is
> +				 * already functional (for example, it has just
> +				 * acquired some resources from the supplier).
>  				 */
> -				if (flags & DL_FLAG_PM_RUNTIME)
> -					pm_runtime_resume(supplier);
> -
> +				link->status = DL_STATE_CONSUMER_PROBE;
> +				break;
> +			default:
> +				link->status = DL_STATE_DORMANT;
> +				break;
> +			}
> +			break;
> +		case DL_DEV_DRIVER_BOUND:
> +			switch (consumer->links.status) {
> +			case DL_DEV_PROBING:
>  				link->status = DL_STATE_CONSUMER_PROBE;
>  				break;
>  			case DL_DEV_DRIVER_BOUND:
> @@ -291,6 +300,14 @@ struct device_link *device_link_add(stru
>  	}
>  
>  	/*
> +	 * Some callers expect the link creation during consumer driver probe to
> +	 * resume the supplier even without DL_FLAG_RPM_ACTIVE.
> +	 */
> +	if (link->status == DL_STATE_CONSUMER_PROBE &&
> +	    flags & DL_FLAG_PM_RUNTIME)
> +		pm_runtime_resume(supplier);
> +
> +	/*
>  	 * Move the consumer and all of the devices depending on it to the end
>  	 * of dpm_list and the devices_kset list.
>  	 *
> @@ -474,6 +491,16 @@ void device_links_driver_bound(struct de
>  		if (link->flags & DL_FLAG_STATELESS)
>  			continue;
>  
> +		/*
> +		 * Links created during consumer probe may be in the "consumer
> +		 * probe" state to start with if the supplier is still probing
> +		 * when they are created and they may become "active" if the
> +		 * consumer probe returns first.  Skip them here.
> +		 */
> +		if (link->status == DL_STATE_CONSUMER_PROBE ||
> +		    link->status == DL_STATE_ACTIVE)
> +			continue;
> +
>  		WARN_ON(link->status != DL_STATE_DORMANT);
>  		WRITE_ONCE(link->status, DL_STATE_AVAILABLE);
>  	}
> @@ -513,17 +540,48 @@ static void __device_links_no_driver(str
>  
>  		if (link->flags & DL_FLAG_AUTOREMOVE_CONSUMER)
>  			kref_put(&link->kref, __device_link_del);
> -		else if (link->status != DL_STATE_SUPPLIER_UNBIND)
> +		else if (link->status == DL_STATE_CONSUMER_PROBE ||
> +			 link->status == DL_STATE_ACTIVE)
>  			WRITE_ONCE(link->status, DL_STATE_AVAILABLE);
>  	}
>  
>  	dev->links.status = DL_DEV_NO_DRIVER;
>  }
>  
> +/**
> + * device_links_no_driver - Update links after failing driver probe.
> + * @dev: Device whose driver has just failed to probe.
> + *
> + * Clean up leftover links to consumers for @dev and invoke
> + * %__device_links_no_driver() to update links to suppliers for it as
> + * appropriate.
> + *
> + * Links with the DL_FLAG_STATELESS flag set are ignored.
> + */
>  void device_links_no_driver(struct device *dev)
>  {
> +	struct device_link *link;
> +
>  	device_links_write_lock();
> +
> +	list_for_each_entry(link, &dev->links.consumers, s_node) {
> +		if (link->flags & DL_FLAG_STATELESS)
> +			continue;
> +
> +		/*
> +		 * The probe has failed, so if the status of the link is
> +		 * "consumer probe" or "active", it must have been added by
> +		 * a probing consumer while this device was still probing.
> +		 * Change its state to "dormant", as it represents a valid
> +		 * relationship, but it is not functionally meaningful.
> +		 */
> +		if (link->status == DL_STATE_CONSUMER_PROBE ||
> +		    link->status == DL_STATE_ACTIVE)
> +			WRITE_ONCE(link->status, DL_STATE_DORMANT);
> +	}
> +
>  	__device_links_no_driver(dev);
> +
>  	device_links_write_unlock();
>  }
>  
> Index: linux-pm/Documentation/driver-api/device_link.rst
> ===================================================================
> --- linux-pm.orig/Documentation/driver-api/device_link.rst
> +++ linux-pm/Documentation/driver-api/device_link.rst
> @@ -59,11 +59,15 @@ device ``->probe`` callback or a boot-ti
>  
>  Another example for an inconsistent state would be a device link that
>  represents a driver presence dependency, yet is added from the consumer's
> -``->probe`` callback while the supplier hasn't probed yet:  Had the driver
> -core known about the device link earlier, it wouldn't have probed the
> +``->probe`` callback while the supplier hasn't started to probe yet:  Had the
> +driver core known about the device link earlier, it wouldn't have probed the
>  consumer in the first place.  The onus is thus on the consumer to check
>  presence of the supplier after adding the link, and defer probing on
> -non-presence.
> +non-presence.  [Note that it is valid to create a link from the consumer's
> +``->probe`` callback while the supplier is still probing, but the consumer must
> +know that the supplier is functional already at the link creation time (that is
> +the case, for instance, if the consumer has just acquired some resources that
> +would not have been available had the supplier not been functional then).]
>  
>  If a device link is added in the ``->probe`` callback of the supplier or
>  consumer driver, it is typically deleted in its ``->remove`` callback for
> 
> 

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line in suburbia: sync at 12.1Mbps down 622kbps up
According to speedtest.net: 11.9Mbps down 500kbps up
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: Armada DRM: bridge with componentized devices
  2019-01-17 17:26                                                       ` Russell King - ARM Linux admin
@ 2019-01-17 22:43                                                         ` Rafael J. Wysocki
  0 siblings, 0 replies; 53+ messages in thread
From: Rafael J. Wysocki @ 2019-01-17 22:43 UTC (permalink / raw)
  To: Russell King - ARM Linux admin
  Cc: Lubomir Rintel, Greg Kroah-Hartman, Linux PM, Liviu Dudau,
	DRI Development, Peter Rosin

On Thursday, January 17, 2019 6:26:25 PM CET Russell King - ARM Linux admin wrote:
> On Tue, Jan 15, 2019 at 11:47:00PM +0100, Rafael J. Wysocki wrote:
> > From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> > Subject: [PATCH] driver core: Fix adding device links to probing suppliers
> > 
> > Currently, it is not valid to add a device link from a consumer
> > driver ->probe() callback to a supplier that is still probing too, but
> > generally this is a valid use case.  For example, if the consumer has
> > just acquired a resource that can only be available when the supplier
> > is functional, adding a device link to that supplier right away
> > should be safe (and even desirable arguably), but device_link_add()
> > doesn't handle that case correctly and the initial state of the link
> > created by it is wrong then.
> > 
> > To address this problem, change the initial state of device links
> > added between a probing supplier and a probing consumer to
> > DL_STATE_CONSUMER_PROBE and update device_links_driver_bound() to
> > skip such links on the supplier side.
> > 
> > With this change, if the supplier probe completes first,
> > device_links_driver_bound() called for it will skip the link state
> > update and when it is called for the consumer, the link state will
> > be updated to "active".  In turn, if the consumer probe completes
> > first, device_links_driver_bound() called for it will change the
> > state of the link to "active" and when it is called for the
> > supplier, the link status update will be skipped.
> > 
> > However, in principle the supplier or consumer probe may still fail
> > after the link has been added, so modify device_links_no_driver() to
> > change device links in the "active" or "consumer probe" state to
> > "dormant" on the supplier side and update __device_links_no_driver()
> > to change the link state to "available" only if it is "consumer
> > probe" or "active".
> > 
> > Then, if the supplier probe fails first, the leftover link to the
> > probing consumer will become "dormant" and device_links_no_driver()
> > called for the consumer (when its probe fails) will clean it up.
> > In turn, if the consumer probe fails first, it will either drop the
> > link, or change its state to "available" and, in the latter case,
> > when device_links_no_driver() is called for the supplier, it will
> > update the link state to "dormant".  [If the supplier probe fails,
> > but the consumer probe succeeds, which should not happen as long as
> > the consumer driver is correct, the link still will be around, but
> > it will be "dormant" until the supplier is probed again.]
> > 
> > Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> 
> Hi Rafael,

Hi,

> I've tried this with Armada DRM without using the component helper for
> bridges, and it seems to work up to a point (I have a vga bridge here
> to allow further testing):
> 
> [    2.323441] armada-drm display-subsystem: assigned reserved memory node framebuffer
> [    2.332001] armada-drm display-subsystem: bound f1820000.lcd-controller (ops armada_lcd_ops)
> [    2.340790] armada-drm display-subsystem: bound f1810000.lcd-controller (ops armada_lcd_ops)
> [    2.349345] armada-drm display-subsystem: node=/i2c-mux/i2c@0/hdmi-encoder@70
> [    2.356719] armada-drm display-subsystem: panel=fffffdfb bridge=  (null) ret=0
> [    2.364038] armada-drm display-subsystem: Linked as a consumer to 1-0070
> [    2.370818] armada-drm display-subsystem: bridge=ee8cda2c ret=0
> [    2.376894] armada-drm display-subsystem: node=/vga-bridge
> [    2.382453] armada-drm display-subsystem: panel=fffffdfb bridge=(null) ret=0
> [    2.389762] armada-drm display-subsystem: Linked as a consumer to vga-bridge
> [    2.396883] armada-drm display-subsystem: bridge=ef3bec40 ret=0
> 
> When I remove the HDMI encoder:
> 
> root@cubox:/sys/bus/i2c/drivers/tda998x# echo 1-0070 > unbind
> 
> then I get:
> 
> [ 1013.824860] Console: switching to colour dummy device 80x30
> [ 1013.866785] armada-drm display-subsystem: Dropping the link to
> vga-bridge
> [ 1013.867126] armada-drm display-subsystem: Dropping the link to 1-0070
> 
> which looks like it did what was expected - the DRM device is indeed
> unbound, the nodes in /dev/dri are gone.

OK, thanks!

This tells me that the patch is an improvement, so I'm going to move on to
integrate it.

> When rebinding the HDMI encoder:
> 
> [ 1015.864703] tda998x 1-0070: found TDA19988
> [ 1015.898078] tda9950 1-0034: TDA9950 CEC interface, hardware version 3.3
> [ 1015.941374] Registered IR keymap rc-cec
> [ 1015.941684] rc rc0: tda9950 as /devices/platform/mbus/mbus:internal-regs/f1011000.i2c/i2c-0/i2c-1/1-0070/rc/rc0
> [ 1015.942439] input: tda9950 as /devices/platform/mbus/mbus:internal-regs/f1011000.i2c/i2c-0/i2c-1/1-0070/rc/rc0/input2
> [ 1016.982921] alloc_contig_range: [10390, 10490) PFNs busy
> [ 1016.987400] kirkwood-spdif-audio audio-subsystem: snd-soc-dummy-dai <-> kirkwood-fe mapping ok
> [ 1016.987591] kirkwood-spdif-audio audio-subsystem: multicodec <-> kirkwood-spdif mapping ok
> 
> but the DRM stuff doesn't come back - this is what Daniel was talking
> about further down this thread.

Right, it's that case AFAICS.

> I guess the kernel can't know that it should come back because the
> device links were dropped at the unbind stage, which means the kernel
> has lost the information necessary to know that the display subsystem
> is dependent on the presence of the HDMI encoder.  I don't see an easy
> way around that.

If the link is defined as "persistent", it will survive the removal
of drivers, but it will become "dormant".

Currently, if such a link is present, the consumer probe fill fail
with -EPROBE_DEFER, but it might also trigger device_attach() on
the supplier.  It would make sense to do that anyway IMO.

> If we keep the device links after an unbind event, then, because we
> create them during probe, we'll be attempting to recreate the link
> when we reattach the supplier to the consumer.  If we only allow one
> instance, then what does device_link_add() return.

It returns the existing link, but it will reference-count it too.

However, that arguably is an overlooked use case too, because "persistent"
links created during consumer or supplier probe should not be reference
counted if the same probe runs again.  Extra flags may be needed to
handle that, though.

> Maybe it is going to be less painful to push everything bridge-related
> to use the component helper after all.  Dunno.  Problems either way.

Well, this is a fairly complicated use case and I'm glad that we have it,
because it shows what's missing.

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: Armada DRM: bridge with componentized devices
  2019-01-17 12:20                                                           ` Daniel Vetter
@ 2019-01-18  9:36                                                             ` Lucas Stach
  2019-01-18 10:03                                                               ` Rafael J. Wysocki
  0 siblings, 1 reply; 53+ messages in thread
From: Lucas Stach @ 2019-01-18  9:36 UTC (permalink / raw)
  To: Daniel Vetter, Rafael J. Wysocki
  Cc: Lubomir Rintel, Sarvela, Tomi P, Greg Kroah-Hartman, Linux PM,
	Liviu Dudau, Russell King - ARM Linux, DRI Development,
	Thierry Reding, Laurent Pinchart, Sean Paul, Peter Rosin

Am Donnerstag, den 17.01.2019, 13:20 +0100 schrieb Daniel Vetter:
[...]
> 
> > I don't think it is addressed here.
> > 
> > Can anyone please explain to me what happens in more detail?
> 
> My understanding (and this might be wrong, Russell, Andrzej, ... pls
> correct) is that the following sequence goes wrong:
> 
> - componentized driver (both producer and consumer) fully loaded&working
> - you unbind the producer/one of the components
> - component framework or driver core through device_link also unbinds
> the master/consumer
> - you reload/rebind the component
> - with the component framework this results in the master being
> rebound, and the overall driver working again. With device_links
> nothing happens.
> 
> I think there was a patch floating around once to put drivers unbound
> due to device_links onto the deferred probe list, so that the next
> time something is bound they have a chance to rebind. But I can't find
> that patch anymore, maybe someone else has the link still?

https://www.mail-archive.com/linux-kernel@vger.kernel.org/msg1348023.html
https://www.mail-archive.com/linux-kernel@vger.kernel.org/msg1355243.html

I can repost if this is finally deemed a good idea.

Regards,
Lucas

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: Armada DRM: bridge with componentized devices
  2019-01-18  9:36                                                             ` Lucas Stach
@ 2019-01-18 10:03                                                               ` Rafael J. Wysocki
  2019-01-18 11:06                                                                 ` Daniel Vetter
  0 siblings, 1 reply; 53+ messages in thread
From: Rafael J. Wysocki @ 2019-01-18 10:03 UTC (permalink / raw)
  To: Lucas Stach
  Cc: Lubomir Rintel, Sarvela, Tomi P, Liviu Dudau, Linux PM,
	Rafael J. Wysocki, Russell King - ARM Linux, DRI Development,
	Thierry Reding, Laurent Pinchart, Greg Kroah-Hartman, Sean Paul,
	Peter Rosin

On Fri, Jan 18, 2019 at 10:37 AM Lucas Stach <l.stach@pengutronix.de> wrote:
>
> Am Donnerstag, den 17.01.2019, 13:20 +0100 schrieb Daniel Vetter:
> [...]
> >
> > > I don't think it is addressed here.
> > >
> > > Can anyone please explain to me what happens in more detail?
> >
> > My understanding (and this might be wrong, Russell, Andrzej, ... pls
> > correct) is that the following sequence goes wrong:
> >
> > - componentized driver (both producer and consumer) fully loaded&working
> > - you unbind the producer/one of the components
> > - component framework or driver core through device_link also unbinds
> > the master/consumer
> > - you reload/rebind the component
> > - with the component framework this results in the master being
> > rebound, and the overall driver working again. With device_links
> > nothing happens.
> >
> > I think there was a patch floating around once to put drivers unbound
> > due to device_links onto the deferred probe list, so that the next
> > time something is bound they have a chance to rebind. But I can't find
> > that patch anymore, maybe someone else has the link still?
>
> https://www.mail-archive.com/linux-kernel@vger.kernel.org/msg1348023.html
> https://www.mail-archive.com/linux-kernel@vger.kernel.org/msg1355243.html
>
> I can repost if this is finally deemed a good idea.

I don't think this will help.

Two things appear to be needed here: missing suppliers need to be
probed automatically at a consumer probe time and "persistent" links
created by ->probe() callbacks should not be reference-counted when
those callbacks run again.  I'm going to cut a patch to do that (early
next week if all goes well).
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: Armada DRM: bridge with componentized devices
  2019-01-18 10:03                                                               ` Rafael J. Wysocki
@ 2019-01-18 11:06                                                                 ` Daniel Vetter
  2019-01-18 11:17                                                                   ` Rafael J. Wysocki
  0 siblings, 1 reply; 53+ messages in thread
From: Daniel Vetter @ 2019-01-18 11:06 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Lubomir Rintel, Sarvela, Tomi P, Liviu Dudau, Linux PM,
	Rafael J. Wysocki, Russell King - ARM Linux, DRI Development,
	Thierry Reding, Laurent Pinchart, Greg Kroah-Hartman, Sean Paul,
	Peter Rosin

On Fri, Jan 18, 2019 at 11:03 AM Rafael J. Wysocki <rafael@kernel.org> wrote:
>
> On Fri, Jan 18, 2019 at 10:37 AM Lucas Stach <l.stach@pengutronix.de> wrote:
> >
> > Am Donnerstag, den 17.01.2019, 13:20 +0100 schrieb Daniel Vetter:
> > [...]
> > >
> > > > I don't think it is addressed here.
> > > >
> > > > Can anyone please explain to me what happens in more detail?
> > >
> > > My understanding (and this might be wrong, Russell, Andrzej, ... pls
> > > correct) is that the following sequence goes wrong:
> > >
> > > - componentized driver (both producer and consumer) fully loaded&working
> > > - you unbind the producer/one of the components
> > > - component framework or driver core through device_link also unbinds
> > > the master/consumer
> > > - you reload/rebind the component
> > > - with the component framework this results in the master being
> > > rebound, and the overall driver working again. With device_links
> > > nothing happens.
> > >
> > > I think there was a patch floating around once to put drivers unbound
> > > due to device_links onto the deferred probe list, so that the next
> > > time something is bound they have a chance to rebind. But I can't find
> > > that patch anymore, maybe someone else has the link still?
> >
> > https://www.mail-archive.com/linux-kernel@vger.kernel.org/msg1348023.html
> > https://www.mail-archive.com/linux-kernel@vger.kernel.org/msg1355243.html

Thanks for digging it out, those are the patches I had in mind.

> > I can repost if this is finally deemed a good idea.
>
> I don't think this will help.
>
> Two things appear to be needed here: missing suppliers need to be
> probed automatically at a consumer probe time and "persistent" links
> created by ->probe() callbacks should not be reference-counted when
> those callbacks run again.  I'm going to cut a patch to do that (early
> next week if all goes well).

I thought this should work, since it makes an unbind of a consumer act
the same way as if that consumer's bind function has hit an
EPROBE_DEFER. Example:

1. consumer's bind is called, realizes (through a subsystem function
like of_drm_find_panel) that the producer isn't there yet, fails with
EPROBE DEFER.

2. driver core puts the consumer onto the deferred probe list.

3. producer gets loaded, registers the panel

4. driver core re-runs the consumer's bind function

5. consumer's bind function finds the produced and sets up the device link.

Now for the unbind case:

1. producer is unbound, driver core unbinds consumer due to the device_link

2. (Only with Lucas' patch) driver core puts the consumer onto the
deferred probe list.

3. Developer (this really is useful for development) rebinds/reloads
producer driver, which re-registers the panel

4 & 5 work exactly the same as with the initial load sequence.

With this initial loading and reloading would be very similar, and I
think that's a good thing. It also solves the device_link
lifetime/refcounting issue, since the device_link gets torn down
in/restored completely, and reloading isn't a special case like in
your proposed solution - there's no device_link left over from a
previous loading of the driver, the dependency is only handled by
putting the consumer (back)  onto the deferred probe list.

Cheers, Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: Armada DRM: bridge with componentized devices
  2019-01-08  9:22                 ` Andrzej Hajda
  2019-01-08 10:23                   ` Russell King - ARM Linux
  2019-01-08 10:24                   ` Daniel Vetter
@ 2019-01-18 11:07                   ` Linus Walleij
  2 siblings, 0 replies; 53+ messages in thread
From: Linus Walleij @ 2019-01-18 11:07 UTC (permalink / raw)
  To: Andrzej Hajda
  Cc: Liviu Dudau, Russell King - ARM Linux, DRI Development,
	Lubomir Rintel, Peter Rosin

On Tue, Jan 8, 2019 at 10:22 AM Andrzej Hajda <a.hajda@samsung.com> wrote:

> Of course it was tested on Exynos as this is HW I work on. Linus Walleij
> has also reported in this thread that device links have different issue
> - circular dependencies (last few emails in this lengthy thread). My
> response explains it in detail.

Indeed, it was detailed in
commit d6a77ba0eb92d8ffa4b05a442fc20d0a9b11c4c4

AFAICT the problem was that struct drm_device does not
contain a struct device itself, just a pointer
struct device * ->dev to the bus device which is unorthodox:
most other device type have a full struct device inside
them and the bus device would rather be that device's
parent.

I had this situation with struct gpio_chip for a long time
but eventually fixed it, which proved immensely helpful.

Yours,
Linus Walleij
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: Armada DRM: bridge with componentized devices
  2019-01-18 11:06                                                                 ` Daniel Vetter
@ 2019-01-18 11:17                                                                   ` Rafael J. Wysocki
  2019-01-18 11:37                                                                     ` Rafael J. Wysocki
  0 siblings, 1 reply; 53+ messages in thread
From: Rafael J. Wysocki @ 2019-01-18 11:17 UTC (permalink / raw)
  To: Daniel Vetter
  Cc: Rafael J. Wysocki, Sarvela, Tomi P, Liviu Dudau, Linux PM,
	Rafael J. Wysocki, Russell King - ARM Linux, DRI Development,
	Lubomir Rintel, Thierry Reding, Laurent Pinchart,
	Greg Kroah-Hartman, Sean Paul, Peter Rosin

On Fri, Jan 18, 2019 at 12:06 PM Daniel Vetter <daniel@ffwll.ch> wrote:
>
> On Fri, Jan 18, 2019 at 11:03 AM Rafael J. Wysocki <rafael@kernel.org> wrote:
> >
> > On Fri, Jan 18, 2019 at 10:37 AM Lucas Stach <l.stach@pengutronix.de> wrote:
> > >
> > > Am Donnerstag, den 17.01.2019, 13:20 +0100 schrieb Daniel Vetter:
> > > [...]
> > > >
> > > > > I don't think it is addressed here.
> > > > >
> > > > > Can anyone please explain to me what happens in more detail?
> > > >
> > > > My understanding (and this might be wrong, Russell, Andrzej, ... pls
> > > > correct) is that the following sequence goes wrong:
> > > >
> > > > - componentized driver (both producer and consumer) fully loaded&working
> > > > - you unbind the producer/one of the components
> > > > - component framework or driver core through device_link also unbinds
> > > > the master/consumer
> > > > - you reload/rebind the component
> > > > - with the component framework this results in the master being
> > > > rebound, and the overall driver working again. With device_links
> > > > nothing happens.
> > > >
> > > > I think there was a patch floating around once to put drivers unbound
> > > > due to device_links onto the deferred probe list, so that the next
> > > > time something is bound they have a chance to rebind. But I can't find
> > > > that patch anymore, maybe someone else has the link still?
> > >
> > > https://www.mail-archive.com/linux-kernel@vger.kernel.org/msg1348023.html
> > > https://www.mail-archive.com/linux-kernel@vger.kernel.org/msg1355243.html
>
> Thanks for digging it out, those are the patches I had in mind.
>
> > > I can repost if this is finally deemed a good idea.
> >
> > I don't think this will help.
> >
> > Two things appear to be needed here: missing suppliers need to be
> > probed automatically at a consumer probe time and "persistent" links
> > created by ->probe() callbacks should not be reference-counted when
> > those callbacks run again.  I'm going to cut a patch to do that (early
> > next week if all goes well).
>
> I thought this should work, since it makes an unbind of a consumer act
> the same way as if that consumer's bind function has hit an
> EPROBE_DEFER. Example:
>
> 1. consumer's bind is called, realizes (through a subsystem function
> like of_drm_find_panel) that the producer isn't there yet, fails with
> EPROBE DEFER.
>
> 2. driver core puts the consumer onto the deferred probe list.
>
> 3. producer gets loaded, registers the panel
>
> 4. driver core re-runs the consumer's bind function
>
> 5. consumer's bind function finds the produced and sets up the device link.
>
> Now for the unbind case:
>
> 1. producer is unbound, driver core unbinds consumer due to the device_link
>
> 2. (Only with Lucas' patch) driver core puts the consumer onto the
> deferred probe list.
>
> 3. Developer (this really is useful for development) rebinds/reloads
> producer driver, which re-registers the panel
>
> 4 & 5 work exactly the same as with the initial load sequence.

I misunderstood the Russell's description, sorry.

My understanding was that the consumer driver would be rebound at this
point and that the (missing) producer was expected to rebind as well
in response.

> With this initial loading and reloading would be very similar, and I
> think that's a good thing. It also solves the device_link
> lifetime/refcounting issue, since the device_link gets torn down
> in/restored completely, and reloading isn't a special case like in
> your proposed solution - there's no device_link left over from a
> previous loading of the driver, the dependency is only handled by
> putting the consumer (back)  onto the deferred probe list.

My idea was based on incorrect understanding of the problem, so scratch it. :-)

The Lukas' patch would indeed work here, but I'm not sure if doing
that for all links unconditionally is a good idea.

I'd rather have a link flag to indicate that this behavior is desirable.
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: Armada DRM: bridge with componentized devices
  2019-01-18 11:17                                                                   ` Rafael J. Wysocki
@ 2019-01-18 11:37                                                                     ` Rafael J. Wysocki
  2019-01-18 12:57                                                                       ` Daniel Vetter
  0 siblings, 1 reply; 53+ messages in thread
From: Rafael J. Wysocki @ 2019-01-18 11:37 UTC (permalink / raw)
  To: Daniel Vetter
  Cc: Rafael J. Wysocki, Sarvela, Tomi P, Liviu Dudau, Linux PM,
	Rafael J. Wysocki, Russell King - ARM Linux, DRI Development,
	Lubomir Rintel, Thierry Reding, Laurent Pinchart,
	Greg Kroah-Hartman, Sean Paul, Peter Rosin

On Fri, Jan 18, 2019 at 12:17 PM Rafael J. Wysocki <rafael@kernel.org> wrote:
>
> On Fri, Jan 18, 2019 at 12:06 PM Daniel Vetter <daniel@ffwll.ch> wrote:
> >
> > On Fri, Jan 18, 2019 at 11:03 AM Rafael J. Wysocki <rafael@kernel.org> wrote:
> > >
> > > On Fri, Jan 18, 2019 at 10:37 AM Lucas Stach <l.stach@pengutronix.de> wrote:
> > > >
> > > > Am Donnerstag, den 17.01.2019, 13:20 +0100 schrieb Daniel Vetter:
> > > > [...]
> > > > >
> > > > > > I don't think it is addressed here.
> > > > > >
> > > > > > Can anyone please explain to me what happens in more detail?
> > > > >
> > > > > My understanding (and this might be wrong, Russell, Andrzej, ... pls
> > > > > correct) is that the following sequence goes wrong:
> > > > >
> > > > > - componentized driver (both producer and consumer) fully loaded&working
> > > > > - you unbind the producer/one of the components
> > > > > - component framework or driver core through device_link also unbinds
> > > > > the master/consumer
> > > > > - you reload/rebind the component
> > > > > - with the component framework this results in the master being
> > > > > rebound, and the overall driver working again. With device_links
> > > > > nothing happens.
> > > > >
> > > > > I think there was a patch floating around once to put drivers unbound
> > > > > due to device_links onto the deferred probe list, so that the next
> > > > > time something is bound they have a chance to rebind. But I can't find
> > > > > that patch anymore, maybe someone else has the link still?
> > > >
> > > > https://www.mail-archive.com/linux-kernel@vger.kernel.org/msg1348023.html
> > > > https://www.mail-archive.com/linux-kernel@vger.kernel.org/msg1355243.html
> >
> > Thanks for digging it out, those are the patches I had in mind.
> >
> > > > I can repost if this is finally deemed a good idea.
> > >
> > > I don't think this will help.
> > >
> > > Two things appear to be needed here: missing suppliers need to be
> > > probed automatically at a consumer probe time and "persistent" links
> > > created by ->probe() callbacks should not be reference-counted when
> > > those callbacks run again.  I'm going to cut a patch to do that (early
> > > next week if all goes well).
> >
> > I thought this should work, since it makes an unbind of a consumer act
> > the same way as if that consumer's bind function has hit an
> > EPROBE_DEFER. Example:
> >
> > 1. consumer's bind is called, realizes (through a subsystem function
> > like of_drm_find_panel) that the producer isn't there yet, fails with
> > EPROBE DEFER.
> >
> > 2. driver core puts the consumer onto the deferred probe list.
> >
> > 3. producer gets loaded, registers the panel
> >
> > 4. driver core re-runs the consumer's bind function
> >
> > 5. consumer's bind function finds the produced and sets up the device link.
> >
> > Now for the unbind case:
> >
> > 1. producer is unbound, driver core unbinds consumer due to the device_link
> >
> > 2. (Only with Lucas' patch) driver core puts the consumer onto the
> > deferred probe list.
> >
> > 3. Developer (this really is useful for development) rebinds/reloads
> > producer driver, which re-registers the panel
> >
> > 4 & 5 work exactly the same as with the initial load sequence.
>
> I misunderstood the Russell's description, sorry.
>
> My understanding was that the consumer driver would be rebound at this
> point and that the (missing) producer was expected to rebind as well
> in response.
>
> > With this initial loading and reloading would be very similar, and I
> > think that's a good thing. It also solves the device_link
> > lifetime/refcounting issue, since the device_link gets torn down
> > in/restored completely, and reloading isn't a special case like in
> > your proposed solution - there's no device_link left over from a
> > previous loading of the driver, the dependency is only handled by
> > putting the consumer (back)  onto the deferred probe list.
>
> My idea was based on incorrect understanding of the problem, so scratch it. :-)
>
> The Lukas' patch would indeed work here, but I'm not sure if doing
> that for all links unconditionally is a good idea.
>
> I'd rather have a link flag to indicate that this behavior is desirable.

That said, the creation of a device link is a heavy-weight operation
in general as it triggers a list reordering that may turn out to be
recursive etc., so I'd rather avoid doing too much of that.

Moreover, creating a link between two devices once should be
sufficient, so I'd prefer "persistent" links to be used in that case,
but they need to be fixed to avoid the extra reference counting.  That
and a new link flag to indicate that the consumer should be probed
automatically after binding the supplier driver.
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: Armada DRM: bridge with componentized devices
  2019-01-18 11:37                                                                     ` Rafael J. Wysocki
@ 2019-01-18 12:57                                                                       ` Daniel Vetter
  2019-01-24 11:00                                                                         ` Rafael J. Wysocki
  0 siblings, 1 reply; 53+ messages in thread
From: Daniel Vetter @ 2019-01-18 12:57 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Lubomir Rintel, Sarvela, Tomi P, Liviu Dudau, Linux PM,
	Rafael J. Wysocki, Russell King - ARM Linux, DRI Development,
	Thierry Reding, Laurent Pinchart, Greg Kroah-Hartman, Sean Paul,
	Peter Rosin

On Fri, Jan 18, 2019 at 12:38 PM Rafael J. Wysocki <rafael@kernel.org> wrote:
>
> On Fri, Jan 18, 2019 at 12:17 PM Rafael J. Wysocki <rafael@kernel.org> wrote:
> >
> > On Fri, Jan 18, 2019 at 12:06 PM Daniel Vetter <daniel@ffwll.ch> wrote:
> > >
> > > On Fri, Jan 18, 2019 at 11:03 AM Rafael J. Wysocki <rafael@kernel.org> wrote:
> > > >
> > > > On Fri, Jan 18, 2019 at 10:37 AM Lucas Stach <l.stach@pengutronix.de> wrote:
> > > > >
> > > > > Am Donnerstag, den 17.01.2019, 13:20 +0100 schrieb Daniel Vetter:
> > > > > [...]
> > > > > >
> > > > > > > I don't think it is addressed here.
> > > > > > >
> > > > > > > Can anyone please explain to me what happens in more detail?
> > > > > >
> > > > > > My understanding (and this might be wrong, Russell, Andrzej, ... pls
> > > > > > correct) is that the following sequence goes wrong:
> > > > > >
> > > > > > - componentized driver (both producer and consumer) fully loaded&working
> > > > > > - you unbind the producer/one of the components
> > > > > > - component framework or driver core through device_link also unbinds
> > > > > > the master/consumer
> > > > > > - you reload/rebind the component
> > > > > > - with the component framework this results in the master being
> > > > > > rebound, and the overall driver working again. With device_links
> > > > > > nothing happens.
> > > > > >
> > > > > > I think there was a patch floating around once to put drivers unbound
> > > > > > due to device_links onto the deferred probe list, so that the next
> > > > > > time something is bound they have a chance to rebind. But I can't find
> > > > > > that patch anymore, maybe someone else has the link still?
> > > > >
> > > > > https://www.mail-archive.com/linux-kernel@vger.kernel.org/msg1348023.html
> > > > > https://www.mail-archive.com/linux-kernel@vger.kernel.org/msg1355243.html
> > >
> > > Thanks for digging it out, those are the patches I had in mind.
> > >
> > > > > I can repost if this is finally deemed a good idea.
> > > >
> > > > I don't think this will help.
> > > >
> > > > Two things appear to be needed here: missing suppliers need to be
> > > > probed automatically at a consumer probe time and "persistent" links
> > > > created by ->probe() callbacks should not be reference-counted when
> > > > those callbacks run again.  I'm going to cut a patch to do that (early
> > > > next week if all goes well).
> > >
> > > I thought this should work, since it makes an unbind of a consumer act
> > > the same way as if that consumer's bind function has hit an
> > > EPROBE_DEFER. Example:
> > >
> > > 1. consumer's bind is called, realizes (through a subsystem function
> > > like of_drm_find_panel) that the producer isn't there yet, fails with
> > > EPROBE DEFER.
> > >
> > > 2. driver core puts the consumer onto the deferred probe list.
> > >
> > > 3. producer gets loaded, registers the panel
> > >
> > > 4. driver core re-runs the consumer's bind function
> > >
> > > 5. consumer's bind function finds the produced and sets up the device link.
> > >
> > > Now for the unbind case:
> > >
> > > 1. producer is unbound, driver core unbinds consumer due to the device_link
> > >
> > > 2. (Only with Lucas' patch) driver core puts the consumer onto the
> > > deferred probe list.
> > >
> > > 3. Developer (this really is useful for development) rebinds/reloads
> > > producer driver, which re-registers the panel
> > >
> > > 4 & 5 work exactly the same as with the initial load sequence.
> >
> > I misunderstood the Russell's description, sorry.
> >
> > My understanding was that the consumer driver would be rebound at this
> > point and that the (missing) producer was expected to rebind as well
> > in response.
> >
> > > With this initial loading and reloading would be very similar, and I
> > > think that's a good thing. It also solves the device_link
> > > lifetime/refcounting issue, since the device_link gets torn down
> > > in/restored completely, and reloading isn't a special case like in
> > > your proposed solution - there's no device_link left over from a
> > > previous loading of the driver, the dependency is only handled by
> > > putting the consumer (back)  onto the deferred probe list.
> >
> > My idea was based on incorrect understanding of the problem, so scratch it. :-)
> >
> > The Lukas' patch would indeed work here, but I'm not sure if doing
> > that for all links unconditionally is a good idea.
> >
> > I'd rather have a link flag to indicate that this behavior is desirable.
>
> That said, the creation of a device link is a heavy-weight operation
> in general as it triggers a list reordering that may turn out to be
> recursive etc., so I'd rather avoid doing too much of that.
>
> Moreover, creating a link between two devices once should be
> sufficient, so I'd prefer "persistent" links to be used in that case,
> but they need to be fixed to avoid the extra reference counting.  That
> and a new link flag to indicate that the consumer should be probed
> automatically after binding the supplier driver.

Yeah if persistent links make more sense from an implementation pov
then that's all fine, as long as it results in the same behaviour for
from the pov of all the involved bind functions. So if the core
handles all the refcount trickery, that's good.

Also an explicit flag sounds like a good idea, since for some other
problems we want to make the device_link opt-in. There's otherwise
some loops in some cases apparently.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: Armada DRM: bridge with componentized devices
  2019-01-18 12:57                                                                       ` Daniel Vetter
@ 2019-01-24 11:00                                                                         ` Rafael J. Wysocki
  0 siblings, 0 replies; 53+ messages in thread
From: Rafael J. Wysocki @ 2019-01-24 11:00 UTC (permalink / raw)
  To: Daniel Vetter
  Cc: Rafael J. Wysocki, Sarvela, Tomi P, Greg Kroah-Hartman, Linux PM,
	Liviu Dudau, Russell King - ARM Linux, DRI Development,
	Lubomir Rintel, Thierry Reding, Laurent Pinchart, Sean Paul,
	Peter Rosin

On Friday, January 18, 2019 1:57:47 PM CET Daniel Vetter wrote:
> On Fri, Jan 18, 2019 at 12:38 PM Rafael J. Wysocki <rafael@kernel.org> wrote:
> >
> > On Fri, Jan 18, 2019 at 12:17 PM Rafael J. Wysocki <rafael@kernel.org> wrote:
> > >
> > > On Fri, Jan 18, 2019 at 12:06 PM Daniel Vetter <daniel@ffwll.ch> wrote:
> > > >
> > > > On Fri, Jan 18, 2019 at 11:03 AM Rafael J. Wysocki <rafael@kernel.org> wrote:
> > > > >
> > > > > On Fri, Jan 18, 2019 at 10:37 AM Lucas Stach <l.stach@pengutronix.de> wrote:
> > > > > >
> > > > > > Am Donnerstag, den 17.01.2019, 13:20 +0100 schrieb Daniel Vetter:
> > > > > > [...]
> > > > > > >
> > > > > > > > I don't think it is addressed here.
> > > > > > > >
> > > > > > > > Can anyone please explain to me what happens in more detail?
> > > > > > >
> > > > > > > My understanding (and this might be wrong, Russell, Andrzej, ... pls
> > > > > > > correct) is that the following sequence goes wrong:
> > > > > > >
> > > > > > > - componentized driver (both producer and consumer) fully loaded&working
> > > > > > > - you unbind the producer/one of the components
> > > > > > > - component framework or driver core through device_link also unbinds
> > > > > > > the master/consumer
> > > > > > > - you reload/rebind the component
> > > > > > > - with the component framework this results in the master being
> > > > > > > rebound, and the overall driver working again. With device_links
> > > > > > > nothing happens.
> > > > > > >
> > > > > > > I think there was a patch floating around once to put drivers unbound
> > > > > > > due to device_links onto the deferred probe list, so that the next
> > > > > > > time something is bound they have a chance to rebind. But I can't find
> > > > > > > that patch anymore, maybe someone else has the link still?
> > > > > >
> > > > > > https://www.mail-archive.com/linux-kernel@vger.kernel.org/msg1348023.html
> > > > > > https://www.mail-archive.com/linux-kernel@vger.kernel.org/msg1355243.html
> > > >
> > > > Thanks for digging it out, those are the patches I had in mind.
> > > >
> > > > > > I can repost if this is finally deemed a good idea.
> > > > >
> > > > > I don't think this will help.
> > > > >
> > > > > Two things appear to be needed here: missing suppliers need to be
> > > > > probed automatically at a consumer probe time and "persistent" links
> > > > > created by ->probe() callbacks should not be reference-counted when
> > > > > those callbacks run again.  I'm going to cut a patch to do that (early
> > > > > next week if all goes well).
> > > >
> > > > I thought this should work, since it makes an unbind of a consumer act
> > > > the same way as if that consumer's bind function has hit an
> > > > EPROBE_DEFER. Example:
> > > >
> > > > 1. consumer's bind is called, realizes (through a subsystem function
> > > > like of_drm_find_panel) that the producer isn't there yet, fails with
> > > > EPROBE DEFER.
> > > >
> > > > 2. driver core puts the consumer onto the deferred probe list.
> > > >
> > > > 3. producer gets loaded, registers the panel
> > > >
> > > > 4. driver core re-runs the consumer's bind function
> > > >
> > > > 5. consumer's bind function finds the produced and sets up the device link.
> > > >
> > > > Now for the unbind case:
> > > >
> > > > 1. producer is unbound, driver core unbinds consumer due to the device_link
> > > >
> > > > 2. (Only with Lucas' patch) driver core puts the consumer onto the
> > > > deferred probe list.
> > > >
> > > > 3. Developer (this really is useful for development) rebinds/reloads
> > > > producer driver, which re-registers the panel
> > > >
> > > > 4 & 5 work exactly the same as with the initial load sequence.
> > >
> > > I misunderstood the Russell's description, sorry.
> > >
> > > My understanding was that the consumer driver would be rebound at this
> > > point and that the (missing) producer was expected to rebind as well
> > > in response.
> > >
> > > > With this initial loading and reloading would be very similar, and I
> > > > think that's a good thing. It also solves the device_link
> > > > lifetime/refcounting issue, since the device_link gets torn down
> > > > in/restored completely, and reloading isn't a special case like in
> > > > your proposed solution - there's no device_link left over from a
> > > > previous loading of the driver, the dependency is only handled by
> > > > putting the consumer (back)  onto the deferred probe list.
> > >
> > > My idea was based on incorrect understanding of the problem, so scratch it. :-)
> > >
> > > The Lukas' patch would indeed work here, but I'm not sure if doing
> > > that for all links unconditionally is a good idea.
> > >
> > > I'd rather have a link flag to indicate that this behavior is desirable.
> >
> > That said, the creation of a device link is a heavy-weight operation
> > in general as it triggers a list reordering that may turn out to be
> > recursive etc., so I'd rather avoid doing too much of that.
> >
> > Moreover, creating a link between two devices once should be
> > sufficient, so I'd prefer "persistent" links to be used in that case,
> > but they need to be fixed to avoid the extra reference counting.  That
> > and a new link flag to indicate that the consumer should be probed
> > automatically after binding the supplier driver.
> 
> Yeah if persistent links make more sense from an implementation pov
> then that's all fine, as long as it results in the same behaviour for
> from the pov of all the involved bind functions. So if the core
> handles all the refcount trickery, that's good.
> 
> Also an explicit flag sounds like a good idea, since for some other
> problems we want to make the device_link opt-in. There's otherwise
> some loops in some cases apparently.

OK, I think I know how to make device links support this use case, but
that will require some rework of the framework, mostly consisting of
fixes and cleanups.

I'll send the first part of it in a minute.

Cheers,
Rafael

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 53+ messages in thread

end of thread, other threads:[~2019-01-24 11:00 UTC | newest]

Thread overview: 53+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-01-03  9:47 Armada DRM: bridge with componentized devices Lubomir Rintel
2019-01-03 13:11 ` Russell King - ARM Linux
2019-01-07 10:45   ` Daniel Vetter
2019-01-07 11:26     ` Andrzej Hajda
2019-01-07 16:08       ` Daniel Vetter
2019-01-07 16:27         ` Andrzej Hajda
2019-01-07 21:56           ` Daniel Vetter
2019-01-08  8:35             ` Andrzej Hajda
2019-01-08  8:47               ` Daniel Vetter
2019-01-08  9:22                 ` Andrzej Hajda
2019-01-08 10:23                   ` Russell King - ARM Linux
2019-01-08 10:32                     ` Andrzej Hajda
2019-01-08 10:24                   ` Daniel Vetter
2019-01-08 11:25                     ` Andrzej Hajda
2019-01-08 11:38                       ` Russell King - ARM Linux
2019-01-08 12:27                         ` Andrzej Hajda
2019-01-08 13:21                           ` Russell King - ARM Linux
2019-01-08 13:34                             ` Daniel Vetter
2019-01-08 14:33                             ` Andrzej Hajda
2019-01-08 15:07                               ` Russell King - ARM Linux
2019-01-08 18:07                                 ` Daniel Vetter
2019-01-09  9:12                                 ` Andrzej Hajda
2019-01-09  9:24                                   ` Rafael J. Wysocki
2019-01-09  9:30                                     ` Russell King - ARM Linux
2019-01-11 14:20                                       ` Daniel Vetter
2019-01-11 14:26                                         ` Rafael J. Wysocki
2019-01-11 14:32                                           ` Russell King - ARM Linux
2019-01-11 14:36                                             ` Daniel Vetter
2019-01-11 14:40                                               ` Rafael J. Wysocki
2019-01-11 14:36                                             ` Rafael J. Wysocki
2019-01-11 14:49                                               ` Russell King - ARM Linux
2019-01-14 12:32                                                 ` Rafael J. Wysocki
2019-01-15  0:04                                                   ` Rafael J. Wysocki
2019-01-15 22:47                                                     ` Rafael J. Wysocki
2019-01-16 18:42                                                       ` Daniel Vetter
2019-01-16 22:43                                                         ` Rafael J. Wysocki
2019-01-17 12:20                                                           ` Daniel Vetter
2019-01-18  9:36                                                             ` Lucas Stach
2019-01-18 10:03                                                               ` Rafael J. Wysocki
2019-01-18 11:06                                                                 ` Daniel Vetter
2019-01-18 11:17                                                                   ` Rafael J. Wysocki
2019-01-18 11:37                                                                     ` Rafael J. Wysocki
2019-01-18 12:57                                                                       ` Daniel Vetter
2019-01-24 11:00                                                                         ` Rafael J. Wysocki
2019-01-17 17:26                                                       ` Russell King - ARM Linux admin
2019-01-17 22:43                                                         ` Rafael J. Wysocki
2019-01-18 11:07                   ` Linus Walleij
2019-01-08 10:16               ` Russell King - ARM Linux
2019-01-08 10:31                 ` Daniel Vetter
2019-01-07 16:12     ` Russell King - ARM Linux
2019-01-07 21:55       ` Daniel Vetter
2019-01-08  0:39         ` Russell King - ARM Linux
2019-01-08 14:29           ` Liviu Dudau

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.