All of lore.kernel.org
 help / color / mirror / Atom feed
* Re: [PATCH 5/5] arm: omap: Proper cleanups for omap_device
       [not found] ` <1375775624-12250-6-git-send-email-panto@antoniou-consulting.com>
@ 2013-08-06  9:33   ` Greg Kroah-Hartman
  2013-08-06  9:37     ` Pantelis Antoniou
  0 siblings, 1 reply; 26+ messages in thread
From: Greg Kroah-Hartman @ 2013-08-06  9:33 UTC (permalink / raw)
  To: Pantelis Antoniou
  Cc: Tony Lindgren, Russell King, Benoît Coussno, Paul Walmsley,
	Sourav Poddar, Russ Dill, Felipe Balbi, Koen Kooi, linux-omap,
	linux-kernel

On Tue, Aug 06, 2013 at 10:53:44AM +0300, Pantelis Antoniou wrote:
> Removing any omap device always resulted in a crash; turns out
> BUS_NOTIFY_DEL_DEVICE is not the last notifier event sent in the
> course of removing the device, the correct event is
> BUS_NOTIFY_UNBOUND_DRIVER, which still is not the right place to
> perform the cleanup. A device callback handles that properly, as
> well as making sure the hwmods of the device are shutdown.
> 
> Signed-off-by: Pantelis Antoniou <panto@antoniou-consulting.com>
> ---
>  arch/arm/mach-omap2/omap_device.c | 34 ++++++++++++++++++++++++++++++++--
>  1 file changed, 32 insertions(+), 2 deletions(-)
> 
> diff --git a/arch/arm/mach-omap2/omap_device.c b/arch/arm/mach-omap2/omap_device.c
> index f33b40c..6dec521 100644
> --- a/arch/arm/mach-omap2/omap_device.c
> +++ b/arch/arm/mach-omap2/omap_device.c
> @@ -178,6 +178,32 @@ odbfd_exit:
>  	return ret;
>  }
>  
> +static void _omap_device_cleanup(struct device *dev)
> +{
> +	struct platform_device *pdev = to_platform_device(dev);
> +	struct omap_device *od;
> +	struct omap_hwmod *oh;
> +	int i;
> +
> +	od = pdev->archdata.od;
> +	if (!od)
> +		return;
> +
> +	for (i = 0; i < od->hwmods_cnt; i++) {
> +
> +		oh = od->hwmods[i];
> +
> +		/* shutdown hwmods */
> +		omap_hwmod_shutdown(oh);
> +
> +		/* we don't remove clocks cause there's no API to do so */
> +		/* no harm done, since they will not be created next time */
> +	}
> +
> +	/* cleanup the structure now */
> +	omap_device_delete(od);
> +}
> +
>  static int _omap_device_notifier_call(struct notifier_block *nb,
>  				      unsigned long event, void *dev)
>  {
> @@ -185,9 +211,13 @@ static int _omap_device_notifier_call(struct notifier_block *nb,
>  	struct omap_device *od;
>  
>  	switch (event) {
> -	case BUS_NOTIFY_DEL_DEVICE:
> +	case BUS_NOTIFY_UNBOUND_DRIVER:
> +		/* NOTIFY_DEL_DEVICE is not the right call...
> +		 * we use a callback here, to make sure no-one is going to
> +		 * try to use the omap_device data after they're deleted
> +		 */
>  		if (pdev->archdata.od)
> -			omap_device_delete(pdev->archdata.od);
> +			device_schedule_callback(dev, _omap_device_cleanup);

Really?  This is one sign that you are totally using the driver core
incorrectly.  You shouldn't have to rely on notifier callbacks to handle
device removals, your bus code should do that for you directly.

I don't like this at all, sorry.

And I was waiting for the day when people started to finally remove
platform devices from the system, I always thought it would never work
properly.  Good luck with this, I think you have a lot of work ahead of
yourself...

greg k-h

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 1/5] pdev: Fix platform device resource linking
       [not found] ` <1375775624-12250-2-git-send-email-panto@antoniou-consulting.com>
@ 2013-08-06  9:36   ` Greg Kroah-Hartman
  2013-08-06  9:45     ` Pantelis Antoniou
  0 siblings, 1 reply; 26+ messages in thread
From: Greg Kroah-Hartman @ 2013-08-06  9:36 UTC (permalink / raw)
  To: Pantelis Antoniou
  Cc: Tony Lindgren, Russell King, Benoît Coussno, Paul Walmsley,
	Sourav Poddar, Russ Dill, Felipe Balbi, Koen Kooi, linux-omap,
	linux-kernel

On Tue, Aug 06, 2013 at 10:53:40AM +0300, Pantelis Antoniou wrote:
> Platform device removal uncovered a number of problems with
> the way resources are handled in the core platform code.
> 
> Resources now form child/parent linkages and this requires
> proper linking of the resources. On top of that the OF core
> directly creates it's own platform devices. Simplify things
> by providing helper functions that manage the linking properly.

Ugh, the OF core shouldn't be creating platform devices.  Well, yes, I
know it does that today, but ick, ick, ick.

> Two functions are provided:
> 
> platform_device_link_resources(), which links all the
> linkable resources (if not already linked).
> 
> and platform_device_unlink_resources(), which unlinks all the
> resources.

Why would anyone need to call this?  I'm getting the feeling that OF
should just have it's own bus of devices to handle this type of mess.
ACPI is going through the same rewrite for this same type of problem
(they did things differently.)  I suggest you work with the ACPI
developers to so the same thing they are, to solve it correctly for
everyone.

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 5/5] arm: omap: Proper cleanups for omap_device
  2013-08-06  9:33   ` [PATCH 5/5] arm: omap: Proper cleanups for omap_device Greg Kroah-Hartman
@ 2013-08-06  9:37     ` Pantelis Antoniou
  2013-08-06 10:14       ` Greg Kroah-Hartman
  2013-08-07 16:15       ` Tony Lindgren
  0 siblings, 2 replies; 26+ messages in thread
From: Pantelis Antoniou @ 2013-08-06  9:37 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: Tony Lindgren, Russell King, Benoît Coussno, Paul Walmsley,
	Sourav Poddar, Russ Dill, Felipe Balbi, Koen Kooi, linux-omap,
	linux-kernel

Hi Greg,

On Aug 6, 2013, at 12:33 PM, Greg Kroah-Hartman wrote:

> On Tue, Aug 06, 2013 at 10:53:44AM +0300, Pantelis Antoniou wrote:
>> Removing any omap device always resulted in a crash; turns out
>> BUS_NOTIFY_DEL_DEVICE is not the last notifier event sent in the
>> course of removing the device, the correct event is
>> BUS_NOTIFY_UNBOUND_DRIVER, which still is not the right place to
>> perform the cleanup. A device callback handles that properly, as
>> well as making sure the hwmods of the device are shutdown.
>> 
>> Signed-off-by: Pantelis Antoniou <panto@antoniou-consulting.com>
>> ---
>> arch/arm/mach-omap2/omap_device.c | 34 ++++++++++++++++++++++++++++++++--
>> 1 file changed, 32 insertions(+), 2 deletions(-)
>> 
>> diff --git a/arch/arm/mach-omap2/omap_device.c b/arch/arm/mach-omap2/omap_device.c
>> index f33b40c..6dec521 100644
>> --- a/arch/arm/mach-omap2/omap_device.c
>> +++ b/arch/arm/mach-omap2/omap_device.c
>> @@ -178,6 +178,32 @@ odbfd_exit:
>> 	return ret;
>> }
>> 
>> +static void _omap_device_cleanup(struct device *dev)
>> +{
>> +	struct platform_device *pdev = to_platform_device(dev);
>> +	struct omap_device *od;
>> +	struct omap_hwmod *oh;
>> +	int i;
>> +
>> +	od = pdev->archdata.od;
>> +	if (!od)
>> +		return;
>> +
>> +	for (i = 0; i < od->hwmods_cnt; i++) {
>> +
>> +		oh = od->hwmods[i];
>> +
>> +		/* shutdown hwmods */
>> +		omap_hwmod_shutdown(oh);
>> +
>> +		/* we don't remove clocks cause there's no API to do so */
>> +		/* no harm done, since they will not be created next time */
>> +	}
>> +
>> +	/* cleanup the structure now */
>> +	omap_device_delete(od);
>> +}
>> +
>> static int _omap_device_notifier_call(struct notifier_block *nb,
>> 				      unsigned long event, void *dev)
>> {
>> @@ -185,9 +211,13 @@ static int _omap_device_notifier_call(struct notifier_block *nb,
>> 	struct omap_device *od;
>> 
>> 	switch (event) {
>> -	case BUS_NOTIFY_DEL_DEVICE:
>> +	case BUS_NOTIFY_UNBOUND_DRIVER:
>> +		/* NOTIFY_DEL_DEVICE is not the right call...
>> +		 * we use a callback here, to make sure no-one is going to
>> +		 * try to use the omap_device data after they're deleted
>> +		 */
>> 		if (pdev->archdata.od)
>> -			omap_device_delete(pdev->archdata.od);
>> +			device_schedule_callback(dev, _omap_device_cleanup);
> 
> Really?  This is one sign that you are totally using the driver core
> incorrectly.  You shouldn't have to rely on notifier callbacks to handle
> device removals, your bus code should do that for you directly.
> 
> I don't like this at all, sorry.
> 

Don't shoot the messenger please...

This is all about fixing a crash without messing too many things.

> And I was waiting for the day when people started to finally remove
> platform devices from the system, I always thought it would never work
> properly.  Good luck with this, I think you have a lot of work ahead of
> yourself...
> 

I know.

Platform device removal is the wild-wild west...

If I had to make a wish, would be to kill platform_device completely and
integrate all it's functionality in the the vanilla device.

What exactly is a platform device anyway?

> greg k-h

Regards

-- Pantelis


^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 1/5] pdev: Fix platform device resource linking
  2013-08-06  9:36   ` [PATCH 1/5] pdev: Fix platform device resource linking Greg Kroah-Hartman
@ 2013-08-06  9:45     ` Pantelis Antoniou
  2013-08-06 10:15       ` Greg Kroah-Hartman
  0 siblings, 1 reply; 26+ messages in thread
From: Pantelis Antoniou @ 2013-08-06  9:45 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: Tony Lindgren, Russell King, Benoît Coussno, Paul Walmsley,
	Sourav Poddar, Russ Dill, Felipe Balbi, Koen Kooi, linux-omap,
	linux-kernel

Hi Greg,

On Aug 6, 2013, at 12:36 PM, Greg Kroah-Hartman wrote:

> On Tue, Aug 06, 2013 at 10:53:40AM +0300, Pantelis Antoniou wrote:
>> Platform device removal uncovered a number of problems with
>> the way resources are handled in the core platform code.
>> 
>> Resources now form child/parent linkages and this requires
>> proper linking of the resources. On top of that the OF core
>> directly creates it's own platform devices. Simplify things
>> by providing helper functions that manage the linking properly.
> 
> Ugh, the OF core shouldn't be creating platform devices.  Well, yes, I
> know it does that today, but ick, ick, ick.
> 

Yep, ick, ick, ick is the correct form.

>> Two functions are provided:
>> 
>> platform_device_link_resources(), which links all the
>> linkable resources (if not already linked).
>> 
>> and platform_device_unlink_resources(), which unlinks all the
>> resources.
> 
> Why would anyone need to call this?  I'm getting the feeling that OF
> should just have it's own bus of devices to handle this type of mess.
> ACPI is going through the same rewrite for this same type of problem
> (they did things differently.)  I suggest you work with the ACPI
> developers to so the same thing they are, to solve it correctly for
> everyone.
> 

It's the same problem really. Another bus type might not fly well.
The same device driver should be (in theory) be made to work unchanged
either on an OF/ACPI/Fex( :) ) setup.

What would it take to move all this into driver core?

> thanks,
> 
> greg k-h

Regards

-- Pantelis


^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 5/5] arm: omap: Proper cleanups for omap_device
  2013-08-06  9:37     ` Pantelis Antoniou
@ 2013-08-06 10:14       ` Greg Kroah-Hartman
  2013-08-06 13:37         ` Alexander Holler
  2013-08-07  7:44         ` Pantelis Antoniou
  2013-08-07 16:15       ` Tony Lindgren
  1 sibling, 2 replies; 26+ messages in thread
From: Greg Kroah-Hartman @ 2013-08-06 10:14 UTC (permalink / raw)
  To: Pantelis Antoniou
  Cc: Tony Lindgren, Russell King, Benoît Coussno, Paul Walmsley,
	Sourav Poddar, Russ Dill, Felipe Balbi, Koen Kooi, linux-omap,
	linux-kernel

On Tue, Aug 06, 2013 at 12:37:25PM +0300, Pantelis Antoniou wrote:
> > I don't like this at all, sorry.
> > 
> 
> Don't shoot the messenger please...
> 
> This is all about fixing a crash without messing too many things.

I understand, it's not your fault at all.

> > And I was waiting for the day when people started to finally remove
> > platform devices from the system, I always thought it would never work
> > properly.  Good luck with this, I think you have a lot of work ahead of
> > yourself...
> > 
> 
> I know.
> 
> Platform device removal is the wild-wild west...
> 
> If I had to make a wish, would be to kill platform_device completely and
> integrate all it's functionality in the the vanilla device.

YES!!!!

> What exactly is a platform device anyway?

Originally it was a "something that wasn't connected to a bus, but just
had memory-mapped i/o."  Like the PS2 keyboard controller.

Embedded systems got ahold of this and went to town, and made everything
a platform device because they could, and no one was paying attention.

Then OF came along and used it as well, and you know the rest...

I think we need to get the ACPI and OF people, and me, in a room
together at the kernel summit and not let us out until we have this all
worked out.

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 1/5] pdev: Fix platform device resource linking
  2013-08-06  9:45     ` Pantelis Antoniou
@ 2013-08-06 10:15       ` Greg Kroah-Hartman
  2013-08-06 10:19         ` Russell King - ARM Linux
  2013-08-06 10:27         ` Pantelis Antoniou
  0 siblings, 2 replies; 26+ messages in thread
From: Greg Kroah-Hartman @ 2013-08-06 10:15 UTC (permalink / raw)
  To: Pantelis Antoniou
  Cc: Tony Lindgren, Russell King, Benoît Coussno, Paul Walmsley,
	Sourav Poddar, Russ Dill, Felipe Balbi, Koen Kooi, linux-omap,
	linux-kernel

On Tue, Aug 06, 2013 at 12:45:42PM +0300, Pantelis Antoniou wrote:
> Hi Greg,
> 
> On Aug 6, 2013, at 12:36 PM, Greg Kroah-Hartman wrote:
> 
> > On Tue, Aug 06, 2013 at 10:53:40AM +0300, Pantelis Antoniou wrote:
> >> Platform device removal uncovered a number of problems with
> >> the way resources are handled in the core platform code.
> >> 
> >> Resources now form child/parent linkages and this requires
> >> proper linking of the resources. On top of that the OF core
> >> directly creates it's own platform devices. Simplify things
> >> by providing helper functions that manage the linking properly.
> > 
> > Ugh, the OF core shouldn't be creating platform devices.  Well, yes, I
> > know it does that today, but ick, ick, ick.
> > 
> 
> Yep, ick, ick, ick is the correct form.
> 
> >> Two functions are provided:
> >> 
> >> platform_device_link_resources(), which links all the
> >> linkable resources (if not already linked).
> >> 
> >> and platform_device_unlink_resources(), which unlinks all the
> >> resources.
> > 
> > Why would anyone need to call this?  I'm getting the feeling that OF
> > should just have it's own bus of devices to handle this type of mess.
> > ACPI is going through the same rewrite for this same type of problem
> > (they did things differently.)  I suggest you work with the ACPI
> > developers to so the same thing they are, to solve it correctly for
> > everyone.
> > 
> 
> It's the same problem really. Another bus type might not fly well.
> The same device driver should be (in theory) be made to work unchanged
> either on an OF/ACPI/Fex( :) ) setup.

No, that's not quite true, a driver needs to know how to talk to the
bus, as that is how it communicates to the hardware.  It can be done for
different types of busses (see the OHCI USB controller for one example
of this), but a driver will have to know what type of bus it is on in
order to work properly.

> What would it take to move all this into driver core?

What specifically would you move into there?

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 1/5] pdev: Fix platform device resource linking
  2013-08-06 10:15       ` Greg Kroah-Hartman
@ 2013-08-06 10:19         ` Russell King - ARM Linux
  2013-08-07  5:57           ` Greg Kroah-Hartman
  2013-08-06 10:27         ` Pantelis Antoniou
  1 sibling, 1 reply; 26+ messages in thread
From: Russell King - ARM Linux @ 2013-08-06 10:19 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: Pantelis Antoniou, Tony Lindgren, Benoît Coussno,
	Paul Walmsley, Sourav Poddar, Russ Dill, Felipe Balbi, Koen Kooi,
	linux-omap, linux-kernel

On Tue, Aug 06, 2013 at 06:15:27PM +0800, Greg Kroah-Hartman wrote:
> On Tue, Aug 06, 2013 at 12:45:42PM +0300, Pantelis Antoniou wrote:
> > What would it take to move all this into driver core?
> 
> What specifically would you move into there?

A variable to hold the streaming dma_mask, so that its in struct device
rather than having to be separate.  Yes, we can keep the pointer in there
too, because that appears to convey meaning in the kernel today - whether
the device can do DMA or not.

A NULL dma_mask pointer already means it can't and dma_set_mask() always
fails.  A non-NULL dma_mask pointer means the driver can use
dma_set_mask() to set an appropriate mask.

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 1/5] pdev: Fix platform device resource linking
  2013-08-06 10:15       ` Greg Kroah-Hartman
  2013-08-06 10:19         ` Russell King - ARM Linux
@ 2013-08-06 10:27         ` Pantelis Antoniou
  2013-08-07  5:56           ` Greg Kroah-Hartman
  1 sibling, 1 reply; 26+ messages in thread
From: Pantelis Antoniou @ 2013-08-06 10:27 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: Tony Lindgren, Russell King, Benoît Coussno, Paul Walmsley,
	Sourav Poddar, Russ Dill, Felipe Balbi, Koen Kooi, linux-omap,
	linux-kernel, Matt Porter

Hi Greg,

On Aug 6, 2013, at 1:15 PM, Greg Kroah-Hartman wrote:

> On Tue, Aug 06, 2013 at 12:45:42PM +0300, Pantelis Antoniou wrote:
>> Hi Greg,
>> 
>> On Aug 6, 2013, at 12:36 PM, Greg Kroah-Hartman wrote:
>> 
>>> On Tue, Aug 06, 2013 at 10:53:40AM +0300, Pantelis Antoniou wrote:
>>>> Platform device removal uncovered a number of problems with
>>>> the way resources are handled in the core platform code.
>>>> 
>>>> Resources now form child/parent linkages and this requires
>>>> proper linking of the resources. On top of that the OF core
>>>> directly creates it's own platform devices. Simplify things
>>>> by providing helper functions that manage the linking properly.
>>> 
>>> Ugh, the OF core shouldn't be creating platform devices.  Well, yes, I
>>> know it does that today, but ick, ick, ick.
>>> 
>> 
>> Yep, ick, ick, ick is the correct form.
>> 
>>>> Two functions are provided:
>>>> 
>>>> platform_device_link_resources(), which links all the
>>>> linkable resources (if not already linked).
>>>> 
>>>> and platform_device_unlink_resources(), which unlinks all the
>>>> resources.
>>> 
>>> Why would anyone need to call this?  I'm getting the feeling that OF
>>> should just have it's own bus of devices to handle this type of mess.
>>> ACPI is going through the same rewrite for this same type of problem
>>> (they did things differently.)  I suggest you work with the ACPI
>>> developers to so the same thing they are, to solve it correctly for
>>> everyone.
>>> 
>> 
>> It's the same problem really. Another bus type might not fly well.
>> The same device driver should be (in theory) be made to work unchanged
>> either on an OF/ACPI/Fex( :) ) setup.
> 
> No, that's not quite true, a driver needs to know how to talk to the
> bus, as that is how it communicates to the hardware.  It can be done for
> different types of busses (see the OHCI USB controller for one example
> of this), but a driver will have to know what type of bus it is on in
> order to work properly.
> 

In the case of OF & ACPI there is no 'bus'. The device is probably integrated
in the SoC's silicon, but there is absolutely no way to 'probe' for it's existence;
you have to use a-priori knowledge of the SoC's topology in order to configure it
(along with any per-board specific information if there is any kind of shared 
resource configuration - i.e. pinmuxing or something else).

There are the 3 well known methods to do so in the Linux kernel right now:

1) Board files in which the configuration information is stored in the per-board
platform file encoded in platform data structures.

2) OF, in which case the information is provided via the flat device tree blob
the bootloader provides.

3) ACPI in which case the information is provided via the firmware's ACPI tables
(I'm not overly familiar with ACPI, so there might be some more nuance here).

The device driver for all these cases is absolutely the same; the only place where
it differs it's in the way it uses to retrieve that configuration information.

The board file method is pretty much no-go due to the need to support multiple
boards from a single kernel; that leaves OF and ACPI.

>From what I can tell what ACPI supports is a very small subset of what OF can support right now;
that is both number of device drivers, as well as what you can do with device driver
functionality (see things like gpios etc, how much easier is to use with OF).

Since we're in the let's make a wish stage, what I wish for is a board-file & ACPI
translator stage to OF data, and depreciating everything else gradually.
That would kill platform_device and ACPI specific data and move everything
to a common device structure supporting OF for configuration.
 
>> What would it take to move all this into driver core?
> 
> What specifically would you move into there?
> 

Pretty much everything that's in the union of platform_device & whatever 
acpi uses to hold it's configuration info.

> thanks,
> 
> greg k-h

Regards

-- Pantelis


^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 5/5] arm: omap: Proper cleanups for omap_device
  2013-08-06 10:14       ` Greg Kroah-Hartman
@ 2013-08-06 13:37         ` Alexander Holler
  2013-08-07  5:52           ` Greg Kroah-Hartman
  2013-08-07  7:44         ` Pantelis Antoniou
  1 sibling, 1 reply; 26+ messages in thread
From: Alexander Holler @ 2013-08-06 13:37 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: Pantelis Antoniou, Tony Lindgren, Russell King,
	Benoît Coussno, Paul Walmsley, Sourav Poddar, Russ Dill,
	Felipe Balbi, Koen Kooi, linux-omap, linux-kernel

Am 06.08.2013 12:14, schrieb Greg Kroah-Hartman:

>> What exactly is a platform device anyway?
> 
> Originally it was a "something that wasn't connected to a bus, but just
> had memory-mapped i/o."  Like the PS2 keyboard controller.
> 
> Embedded systems got ahold of this and went to town, and made everything
> a platform device because they could, and no one was paying attention.
> 
> Then OF came along and used it as well, and you know the rest...
> 
> I think we need to get the ACPI and OF people, and me, in a room
> together at the kernel summit and not let us out until we have this all
> worked out.

MFD uses platform devices too.

Regards,

Alexander Holler


^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 5/5] arm: omap: Proper cleanups for omap_device
  2013-08-06 13:37         ` Alexander Holler
@ 2013-08-07  5:52           ` Greg Kroah-Hartman
  2013-08-07 15:22             ` Alexander Holler
  0 siblings, 1 reply; 26+ messages in thread
From: Greg Kroah-Hartman @ 2013-08-07  5:52 UTC (permalink / raw)
  To: Alexander Holler
  Cc: Pantelis Antoniou, Tony Lindgren, Russell King,
	Benoît Coussno, Paul Walmsley, Sourav Poddar, Russ Dill,
	Felipe Balbi, Koen Kooi, linux-omap, linux-kernel

On Tue, Aug 06, 2013 at 03:37:13PM +0200, Alexander Holler wrote:
> Am 06.08.2013 12:14, schrieb Greg Kroah-Hartman:
> 
> >> What exactly is a platform device anyway?
> > 
> > Originally it was a "something that wasn't connected to a bus, but just
> > had memory-mapped i/o."  Like the PS2 keyboard controller.
> > 
> > Embedded systems got ahold of this and went to town, and made everything
> > a platform device because they could, and no one was paying attention.
> > 
> > Then OF came along and used it as well, and you know the rest...
> > 
> > I think we need to get the ACPI and OF people, and me, in a room
> > together at the kernel summit and not let us out until we have this all
> > worked out.
> 
> MFD uses platform devices too.

Ugh, I've been avoiding looking at mfd for a long time now, and really
don't want to start now...


^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 1/5] pdev: Fix platform device resource linking
  2013-08-06 10:27         ` Pantelis Antoniou
@ 2013-08-07  5:56           ` Greg Kroah-Hartman
  2013-08-07  7:37             ` Pantelis Antoniou
  0 siblings, 1 reply; 26+ messages in thread
From: Greg Kroah-Hartman @ 2013-08-07  5:56 UTC (permalink / raw)
  To: Pantelis Antoniou
  Cc: Tony Lindgren, Russell King, Benoît Coussno, Paul Walmsley,
	Sourav Poddar, Russ Dill, Felipe Balbi, Koen Kooi, linux-omap,
	linux-kernel, Matt Porter

On Tue, Aug 06, 2013 at 01:27:35PM +0300, Pantelis Antoniou wrote:
> Hi Greg,
> 
> On Aug 6, 2013, at 1:15 PM, Greg Kroah-Hartman wrote:
> 
> > On Tue, Aug 06, 2013 at 12:45:42PM +0300, Pantelis Antoniou wrote:
> >> Hi Greg,
> >> 
> >> On Aug 6, 2013, at 12:36 PM, Greg Kroah-Hartman wrote:
> >> 
> >>> On Tue, Aug 06, 2013 at 10:53:40AM +0300, Pantelis Antoniou wrote:
> >>>> Platform device removal uncovered a number of problems with
> >>>> the way resources are handled in the core platform code.
> >>>> 
> >>>> Resources now form child/parent linkages and this requires
> >>>> proper linking of the resources. On top of that the OF core
> >>>> directly creates it's own platform devices. Simplify things
> >>>> by providing helper functions that manage the linking properly.
> >>> 
> >>> Ugh, the OF core shouldn't be creating platform devices.  Well, yes, I
> >>> know it does that today, but ick, ick, ick.
> >>> 
> >> 
> >> Yep, ick, ick, ick is the correct form.
> >> 
> >>>> Two functions are provided:
> >>>> 
> >>>> platform_device_link_resources(), which links all the
> >>>> linkable resources (if not already linked).
> >>>> 
> >>>> and platform_device_unlink_resources(), which unlinks all the
> >>>> resources.
> >>> 
> >>> Why would anyone need to call this?  I'm getting the feeling that OF
> >>> should just have it's own bus of devices to handle this type of mess.
> >>> ACPI is going through the same rewrite for this same type of problem
> >>> (they did things differently.)  I suggest you work with the ACPI
> >>> developers to so the same thing they are, to solve it correctly for
> >>> everyone.
> >>> 
> >> 
> >> It's the same problem really. Another bus type might not fly well.
> >> The same device driver should be (in theory) be made to work unchanged
> >> either on an OF/ACPI/Fex( :) ) setup.
> > 
> > No, that's not quite true, a driver needs to know how to talk to the
> > bus, as that is how it communicates to the hardware.  It can be done for
> > different types of busses (see the OHCI USB controller for one example
> > of this), but a driver will have to know what type of bus it is on in
> > order to work properly.
> > 
> 
> In the case of OF & ACPI there is no 'bus'. The device is probably integrated
> in the SoC's silicon, but there is absolutely no way to 'probe' for it's existence;
> you have to use a-priori knowledge of the SoC's topology in order to configure it
> (along with any per-board specific information if there is any kind of shared 
> resource configuration - i.e. pinmuxing or something else).

Not all busses need to have the aiblity to "probe" for new devices,
that's not a requirement at all.  Some of them just "know" where the
devices are at in the driver model, and create the devices for the bus
just fine.

So don't think that just because of that lack of probing, they should be
on the "platform" bus at all.  Platform is for the "oh crap, I have no
way to bind this to anything else, make it a platform device then."

> There are the 3 well known methods to do so in the Linux kernel right now:
> 
> 1) Board files in which the configuration information is stored in the per-board
> platform file encoded in platform data structures.
> 
> 2) OF, in which case the information is provided via the flat device tree blob
> the bootloader provides.
> 
> 3) ACPI in which case the information is provided via the firmware's ACPI tables
> (I'm not overly familiar with ACPI, so there might be some more nuance here).
> 
> The device driver for all these cases is absolutely the same; the only place where
> it differs it's in the way it uses to retrieve that configuration information.

Not at all, they all should be differing in how to talk to the hardware,
right?  Or even if not, it should be pretty trivial for all of them to
have drivers that bind to a multiple of different types of busses just
fine, no need to want to abuse the platform code because people feel
lazy.

> > What specifically would you move into there?
> > 
> 
> Pretty much everything that's in the union of platform_device & whatever 
> acpi uses to hold it's configuration info.

Specifically what is that?  I've never had a problem with moving stuff
into struct device that is common among different subsystems that can
share the pointers, as that is exactly what the structure is there for.

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 1/5] pdev: Fix platform device resource linking
  2013-08-06 10:19         ` Russell King - ARM Linux
@ 2013-08-07  5:57           ` Greg Kroah-Hartman
  2013-08-07  8:27             ` Russell King - ARM Linux
  0 siblings, 1 reply; 26+ messages in thread
From: Greg Kroah-Hartman @ 2013-08-07  5:57 UTC (permalink / raw)
  To: Russell King - ARM Linux
  Cc: Pantelis Antoniou, Tony Lindgren, Benoît Coussno,
	Paul Walmsley, Sourav Poddar, Russ Dill, Felipe Balbi, Koen Kooi,
	linux-omap, linux-kernel

On Tue, Aug 06, 2013 at 11:19:14AM +0100, Russell King - ARM Linux wrote:
> On Tue, Aug 06, 2013 at 06:15:27PM +0800, Greg Kroah-Hartman wrote:
> > On Tue, Aug 06, 2013 at 12:45:42PM +0300, Pantelis Antoniou wrote:
> > > What would it take to move all this into driver core?
> > 
> > What specifically would you move into there?
> 
> A variable to hold the streaming dma_mask, so that its in struct device
> rather than having to be separate.  Yes, we can keep the pointer in there
> too, because that appears to convey meaning in the kernel today - whether
> the device can do DMA or not.
> 
> A NULL dma_mask pointer already means it can't and dma_set_mask() always
> fails.  A non-NULL dma_mask pointer means the driver can use
> dma_set_mask() to set an appropriate mask.

Ok, that sounds fine to me, for some reason I thought I said yes to this
a long time ago, did no one ever send me a patch for it?

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 1/5] pdev: Fix platform device resource linking
  2013-08-07  5:56           ` Greg Kroah-Hartman
@ 2013-08-07  7:37             ` Pantelis Antoniou
  2013-08-07 17:13               ` Matt Porter
  0 siblings, 1 reply; 26+ messages in thread
From: Pantelis Antoniou @ 2013-08-07  7:37 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: Tony Lindgren, Russell King, Benoît Coussno, Paul Walmsley,
	Sourav Poddar, Russ Dill, Felipe Balbi, Koen Kooi, linux-omap,
	linux-kernel, Matt Porter

Hi Greg,

On Aug 7, 2013, at 8:56 AM, Greg Kroah-Hartman wrote:

> On Tue, Aug 06, 2013 at 01:27:35PM +0300, Pantelis Antoniou wrote:
>> Hi Greg,
>> 
>> On Aug 6, 2013, at 1:15 PM, Greg Kroah-Hartman wrote:
>> 
>>> On Tue, Aug 06, 2013 at 12:45:42PM +0300, Pantelis Antoniou wrote:
>>>> Hi Greg,
>>>> 
>>>> On Aug 6, 2013, at 12:36 PM, Greg Kroah-Hartman wrote:
>>>> 
>>>>> On Tue, Aug 06, 2013 at 10:53:40AM +0300, Pantelis Antoniou wrote:
>>>>>> Platform device removal uncovered a number of problems with
>>>>>> the way resources are handled in the core platform code.
>>>>>> 
>>>>>> Resources now form child/parent linkages and this requires
>>>>>> proper linking of the resources. On top of that the OF core
>>>>>> directly creates it's own platform devices. Simplify things
>>>>>> by providing helper functions that manage the linking properly.
>>>>> 
>>>>> Ugh, the OF core shouldn't be creating platform devices.  Well, yes, I
>>>>> know it does that today, but ick, ick, ick.
>>>>> 
>>>> 
>>>> Yep, ick, ick, ick is the correct form.
>>>> 
>>>>>> Two functions are provided:
>>>>>> 
>>>>>> platform_device_link_resources(), which links all the
>>>>>> linkable resources (if not already linked).
>>>>>> 
>>>>>> and platform_device_unlink_resources(), which unlinks all the
>>>>>> resources.
>>>>> 
>>>>> Why would anyone need to call this?  I'm getting the feeling that OF
>>>>> should just have it's own bus of devices to handle this type of mess.
>>>>> ACPI is going through the same rewrite for this same type of problem
>>>>> (they did things differently.)  I suggest you work with the ACPI
>>>>> developers to so the same thing they are, to solve it correctly for
>>>>> everyone.
>>>>> 
>>>> 
>>>> It's the same problem really. Another bus type might not fly well.
>>>> The same device driver should be (in theory) be made to work unchanged
>>>> either on an OF/ACPI/Fex( :) ) setup.
>>> 
>>> No, that's not quite true, a driver needs to know how to talk to the
>>> bus, as that is how it communicates to the hardware.  It can be done for
>>> different types of busses (see the OHCI USB controller for one example
>>> of this), but a driver will have to know what type of bus it is on in
>>> order to work properly.
>>> 
>> 
>> In the case of OF & ACPI there is no 'bus'. The device is probably integrated
>> in the SoC's silicon, but there is absolutely no way to 'probe' for it's existence;
>> you have to use a-priori knowledge of the SoC's topology in order to configure it
>> (along with any per-board specific information if there is any kind of shared 
>> resource configuration - i.e. pinmuxing or something else).
> 
> Not all busses need to have the aiblity to "probe" for new devices,
> that's not a requirement at all.  Some of them just "know" where the
> devices are at in the driver model, and create the devices for the bus
> just fine.
> 
> So don't think that just because of that lack of probing, they should be
> on the "platform" bus at all.  Platform is for the "oh crap, I have no
> way to bind this to anything else, make it a platform device then."
> 

I'm not sure if you remember, but a long time ago when OF started getting
into the kernel, there actually was an OCP (On Chip Peripheral) bus, 
and the switch to platform devices was mandated by kernel people, by their
insistance that platform devices would cover every case.

See here:

http://lkml.indiana.edu/hypermail/linux/kernel/0405.1/0930.html

I'm sure Matt Porter can shed some light on the exchange that led to the 
abandonment of the ocp bus concept.

Since I've been down the new 'non hardware' bus road before, it seems that there
is absolutely no consensus about when a new bus is acceptable or not.


>> There are the 3 well known methods to do so in the Linux kernel right now:
>> 
>> 1) Board files in which the configuration information is stored in the per-board
>> platform file encoded in platform data structures.
>> 
>> 2) OF, in which case the information is provided via the flat device tree blob
>> the bootloader provides.
>> 
>> 3) ACPI in which case the information is provided via the firmware's ACPI tables
>> (I'm not overly familiar with ACPI, so there might be some more nuance here).
>> 
>> The device driver for all these cases is absolutely the same; the only place where
>> it differs it's in the way it uses to retrieve that configuration information.
> 
> Not at all, they all should be differing in how to talk to the hardware,
> right?  Or even if not, it should be pretty trivial for all of them to
> have drivers that bind to a multiple of different types of busses just
> fine, no need to want to abuse the platform code because people feel
> lazy.
> 

My point is that there are now a lot of ways to store the configuration options of
a given driver; instead of coming up with different ways to get that depending
on the bus type the platform happens to be using, let's dictate a single method
of getting that information.

I just don't feel that having N new bus types is the way to go there, since each
bus type will require modification in each driver file for the bus type.

OF configuration is pretty much there already, use that as a base and have converters
from whatever else is out there to OF.

>>> What specifically would you move into there?
>>> 
>> 
>> Pretty much everything that's in the union of platform_device & whatever 
>> acpi uses to hold it's configuration info.
> 
> Specifically what is that?  I've never had a problem with moving stuff
> into struct device that is common among different subsystems that can
> share the pointers, as that is exactly what the structure is there for.
> 

Resources (memory ranges, dma, irq, etc).

> thanks,
> 
> greg k-h

Regards

-- Pantelis


^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 5/5] arm: omap: Proper cleanups for omap_device
  2013-08-06 10:14       ` Greg Kroah-Hartman
  2013-08-06 13:37         ` Alexander Holler
@ 2013-08-07  7:44         ` Pantelis Antoniou
  1 sibling, 0 replies; 26+ messages in thread
From: Pantelis Antoniou @ 2013-08-07  7:44 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: Tony Lindgren, Russell King, Benoît Coussno, Paul Walmsley,
	Sourav Poddar, Russ Dill, Felipe Balbi, Koen Kooi, linux-omap,
	linux-kernel

Hi Greg,

On Aug 6, 2013, at 1:14 PM, Greg Kroah-Hartman wrote:

> On Tue, Aug 06, 2013 at 12:37:25PM +0300, Pantelis Antoniou wrote:
>>> I don't like this at all, sorry.
>>> 
>> 

[snip]

>> Don't shoot the messenger please...
>> 
>> This is all about fixing a crash without messing too many things.
> 
> I understand, it's not your fault at all.

Let's start talking about the patchset again.

I know all of this code is rather distasteful but it's only purpose
is to fix a bunch of crashes in a code path that was supposed to be working,
namely the removal of a platform device, in the supposedly well supported
ARM OMAP arch.

Can we agree to at least not crashing until we get to a consensus about
fixing the whole mess properly? This will take at least a few minor
kernel revisions while in the meantime users get crashes everyday.

Regards

-- Pantelis


^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 1/5] pdev: Fix platform device resource linking
  2013-08-07  5:57           ` Greg Kroah-Hartman
@ 2013-08-07  8:27             ` Russell King - ARM Linux
  0 siblings, 0 replies; 26+ messages in thread
From: Russell King - ARM Linux @ 2013-08-07  8:27 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: Pantelis Antoniou, Tony Lindgren, Benoît Coussno,
	Paul Walmsley, Sourav Poddar, Russ Dill, Felipe Balbi, Koen Kooi,
	linux-omap, linux-kernel

On Wed, Aug 07, 2013 at 02:57:00PM +0900, Greg Kroah-Hartman wrote:
> On Tue, Aug 06, 2013 at 11:19:14AM +0100, Russell King - ARM Linux wrote:
> > On Tue, Aug 06, 2013 at 06:15:27PM +0800, Greg Kroah-Hartman wrote:
> > > On Tue, Aug 06, 2013 at 12:45:42PM +0300, Pantelis Antoniou wrote:
> > > > What would it take to move all this into driver core?
> > > 
> > > What specifically would you move into there?
> > 
> > A variable to hold the streaming dma_mask, so that its in struct device
> > rather than having to be separate.  Yes, we can keep the pointer in there
> > too, because that appears to convey meaning in the kernel today - whether
> > the device can do DMA or not.
> > 
> > A NULL dma_mask pointer already means it can't and dma_set_mask() always
> > fails.  A non-NULL dma_mask pointer means the driver can use
> > dma_set_mask() to set an appropriate mask.
> 
> Ok, that sounds fine to me, for some reason I thought I said yes to this
> a long time ago, did no one ever send me a patch for it?

Apparantly not.  I'll see what I can do, and I'll add it to my dma-masks
series, probably below most of the existing patches that are already
there.

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 5/5] arm: omap: Proper cleanups for omap_device
  2013-08-07  5:52           ` Greg Kroah-Hartman
@ 2013-08-07 15:22             ` Alexander Holler
  0 siblings, 0 replies; 26+ messages in thread
From: Alexander Holler @ 2013-08-07 15:22 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: Pantelis Antoniou, Tony Lindgren, Russell King,
	Benoît Coussno, Paul Walmsley, Sourav Poddar, Russ Dill,
	Felipe Balbi, Koen Kooi, linux-omap, linux-kernel

Am 07.08.2013 07:52, schrieb Greg Kroah-Hartman:
> On Tue, Aug 06, 2013 at 03:37:13PM +0200, Alexander Holler wrote:
>> Am 06.08.2013 12:14, schrieb Greg Kroah-Hartman:
>>
>>>> What exactly is a platform device anyway?
>>>
>>> Originally it was a "something that wasn't connected to a bus, but just
>>> had memory-mapped i/o."  Like the PS2 keyboard controller.
>>>
>>> Embedded systems got ahold of this and went to town, and made everything
>>> a platform device because they could, and no one was paying attention.
>>>
>>> Then OF came along and used it as well, and you know the rest...
>>>
>>> I think we need to get the ACPI and OF people, and me, in a room
>>> together at the kernel summit and not let us out until we have this all
>>> worked out.
>>
>> MFD uses platform devices too.
>
> Ugh, I've been avoiding looking at mfd for a long time now, and really
> don't want to start now...
>

I've just mentioned it to suggest that platform devices seem to be used 
all over the kernel as the generic (minimal) form of a device driver. At 
least that is the impression I've got.

Regards,

Alexander Holler


^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 5/5] arm: omap: Proper cleanups for omap_device
  2013-08-06  9:37     ` Pantelis Antoniou
  2013-08-06 10:14       ` Greg Kroah-Hartman
@ 2013-08-07 16:15       ` Tony Lindgren
  2013-08-07 16:23         ` Pantelis Antoniou
  1 sibling, 1 reply; 26+ messages in thread
From: Tony Lindgren @ 2013-08-07 16:15 UTC (permalink / raw)
  To: Pantelis Antoniou
  Cc: Greg Kroah-Hartman, Russell King, Benoît Coussno,
	Paul Walmsley, Sourav Poddar, Russ Dill, Felipe Balbi, Koen Kooi,
	linux-omap, linux-kernel

* Pantelis Antoniou <panto@antoniou-consulting.com> [130806 02:44]:
> On Aug 6, 2013, at 12:33 PM, Greg Kroah-Hartman wrote:
> > On Tue, Aug 06, 2013 at 10:53:44AM +0300, Pantelis Antoniou wrote:
> >> +
> >> static int _omap_device_notifier_call(struct notifier_block *nb,
> >> 				      unsigned long event, void *dev)
> >> {
> >> @@ -185,9 +211,13 @@ static int _omap_device_notifier_call(struct notifier_block *nb,
> >> 	struct omap_device *od;
> >> 
> >> 	switch (event) {
> >> -	case BUS_NOTIFY_DEL_DEVICE:
> >> +	case BUS_NOTIFY_UNBOUND_DRIVER:
> >> +		/* NOTIFY_DEL_DEVICE is not the right call...
> >> +		 * we use a callback here, to make sure no-one is going to
> >> +		 * try to use the omap_device data after they're deleted
> >> +		 */
> >> 		if (pdev->archdata.od)
> >> -			omap_device_delete(pdev->archdata.od);
> >> +			device_schedule_callback(dev, _omap_device_cleanup);
> > 
> > Really?  This is one sign that you are totally using the driver core
> > incorrectly.  You shouldn't have to rely on notifier callbacks to handle
> > device removals, your bus code should do that for you directly.
> > 
> > I don't like this at all, sorry.
> > 
> 
> Don't shoot the messenger please...

As you're inititalizing capebus with DT, let's figure out what if
anything you actually need from omap_device. I'd much rather remove
dependencies than add more.

If you need omap_device for the clocks, there are patches pending to
make them DT only for omaps. And we already have DT based solution for
pins, regulators and DMA.

So what else remains? The pieces needed for runtime PM?
 
> This is all about fixing a crash without messing too many things.

It seems this fix is only needed for supporting out-of-tree code?
These features with omap_device we may not even want to support in
the mainline tree as is being discussed..

Regards,

Tony


^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 5/5] arm: omap: Proper cleanups for omap_device
  2013-08-07 16:15       ` Tony Lindgren
@ 2013-08-07 16:23         ` Pantelis Antoniou
  2013-08-08  7:25           ` Tony Lindgren
  0 siblings, 1 reply; 26+ messages in thread
From: Pantelis Antoniou @ 2013-08-07 16:23 UTC (permalink / raw)
  To: Tony Lindgren
  Cc: Greg Kroah-Hartman, Russell King, Benoît Coussno,
	Paul Walmsley, Sourav Poddar, Russ Dill, Felipe Balbi, Koen Kooi,
	linux-omap, linux-kernel

Hi Tony,

On Aug 7, 2013, at 7:15 PM, Tony Lindgren wrote:

> * Pantelis Antoniou <panto@antoniou-consulting.com> [130806 02:44]:
>> On Aug 6, 2013, at 12:33 PM, Greg Kroah-Hartman wrote:
>>> On Tue, Aug 06, 2013 at 10:53:44AM +0300, Pantelis Antoniou wrote:
>>>> +
>>>> static int _omap_device_notifier_call(struct notifier_block *nb,
>>>> 				      unsigned long event, void *dev)
>>>> {
>>>> @@ -185,9 +211,13 @@ static int _omap_device_notifier_call(struct notifier_block *nb,
>>>> 	struct omap_device *od;
>>>> 
>>>> 	switch (event) {
>>>> -	case BUS_NOTIFY_DEL_DEVICE:
>>>> +	case BUS_NOTIFY_UNBOUND_DRIVER:
>>>> +		/* NOTIFY_DEL_DEVICE is not the right call...
>>>> +		 * we use a callback here, to make sure no-one is going to
>>>> +		 * try to use the omap_device data after they're deleted
>>>> +		 */
>>>> 		if (pdev->archdata.od)
>>>> -			omap_device_delete(pdev->archdata.od);
>>>> +			device_schedule_callback(dev, _omap_device_cleanup);
>>> 
>>> Really?  This is one sign that you are totally using the driver core
>>> incorrectly.  You shouldn't have to rely on notifier callbacks to handle
>>> device removals, your bus code should do that for you directly.
>>> 
>>> I don't like this at all, sorry.
>>> 
>> 
>> Don't shoot the messenger please...
> 
> As you're inititalizing capebus with DT, let's figure out what if
> anything you actually need from omap_device. I'd much rather remove
> dependencies than add more.
> 

There is no such thing as capebus anymore. This is just the path of
removing a platform device, which happens to also be an omap_device.

> If you need omap_device for the clocks, there are patches pending to
> make them DT only for omaps. And we already have DT based solution for
> pins, regulators and DMA.
> 
> So what else remains? The pieces needed for runtime PM?
> 

What happens here is that the omap_device data are freed prematurely and then end up 
used again during the teardown of the platform device.


>> This is all about fixing a crash without messing too many things.
> 
> It seems this fix is only needed for supporting out-of-tree code?
> These features with omap_device we may not even want to support in
> the mainline tree as is being discussed..
> 

What out of tree code? The only thing this patch does is make sure we
don't crash when a perfectly valid call to platform_device_unregister() happens.

Drivers that don't use omap_device work just fine.

> Regards,
> 
> Tony
> 

Regards

-- Pantelis


^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 1/5] pdev: Fix platform device resource linking
  2013-08-07  7:37             ` Pantelis Antoniou
@ 2013-08-07 17:13               ` Matt Porter
  0 siblings, 0 replies; 26+ messages in thread
From: Matt Porter @ 2013-08-07 17:13 UTC (permalink / raw)
  To: Pantelis Antoniou
  Cc: Greg Kroah-Hartman, Tony Lindgren, Russell King,
	Benoît Coussno, Paul Walmsley, Sourav Poddar, Russ Dill,
	Felipe Balbi, Koen Kooi, linux-omap, linux-kernel

[trimmed my old email]

On Wed, Aug 07, 2013 at 10:37:17AM +0300, Pantelis Antoniou wrote:
> Hi Greg,
> 
> On Aug 7, 2013, at 8:56 AM, Greg Kroah-Hartman wrote:
> 
> > On Tue, Aug 06, 2013 at 01:27:35PM +0300, Pantelis Antoniou wrote:
> >> Hi Greg,
> >> 
> >> On Aug 6, 2013, at 1:15 PM, Greg Kroah-Hartman wrote:
> >> 
> >>> On Tue, Aug 06, 2013 at 12:45:42PM +0300, Pantelis Antoniou wrote:
> >>>> Hi Greg,
> >>>> 
> >>>> On Aug 6, 2013, at 12:36 PM, Greg Kroah-Hartman wrote:
> >>>> 
> >>>>> On Tue, Aug 06, 2013 at 10:53:40AM +0300, Pantelis Antoniou wrote:
> >>>>>> Platform device removal uncovered a number of problems with
> >>>>>> the way resources are handled in the core platform code.
> >>>>>> 
> >>>>>> Resources now form child/parent linkages and this requires
> >>>>>> proper linking of the resources. On top of that the OF core
> >>>>>> directly creates it's own platform devices. Simplify things
> >>>>>> by providing helper functions that manage the linking properly.
> >>>>> 
> >>>>> Ugh, the OF core shouldn't be creating platform devices.  Well, yes, I
> >>>>> know it does that today, but ick, ick, ick.
> >>>>> 
> >>>> 
> >>>> Yep, ick, ick, ick is the correct form.
> >>>> 
> >>>>>> Two functions are provided:
> >>>>>> 
> >>>>>> platform_device_link_resources(), which links all the
> >>>>>> linkable resources (if not already linked).
> >>>>>> 
> >>>>>> and platform_device_unlink_resources(), which unlinks all the
> >>>>>> resources.
> >>>>> 
> >>>>> Why would anyone need to call this?  I'm getting the feeling that OF
> >>>>> should just have it's own bus of devices to handle this type of mess.
> >>>>> ACPI is going through the same rewrite for this same type of problem
> >>>>> (they did things differently.)  I suggest you work with the ACPI
> >>>>> developers to so the same thing they are, to solve it correctly for
> >>>>> everyone.
> >>>>> 
> >>>> 
> >>>> It's the same problem really. Another bus type might not fly well.
> >>>> The same device driver should be (in theory) be made to work unchanged
> >>>> either on an OF/ACPI/Fex( :) ) setup.
> >>> 
> >>> No, that's not quite true, a driver needs to know how to talk to the
> >>> bus, as that is how it communicates to the hardware.  It can be done for
> >>> different types of busses (see the OHCI USB controller for one example
> >>> of this), but a driver will have to know what type of bus it is on in
> >>> order to work properly.
> >>> 
> >> 
> >> In the case of OF & ACPI there is no 'bus'. The device is probably integrated
> >> in the SoC's silicon, but there is absolutely no way to 'probe' for it's existence;
> >> you have to use a-priori knowledge of the SoC's topology in order to configure it
> >> (along with any per-board specific information if there is any kind of shared 
> >> resource configuration - i.e. pinmuxing or something else).
> > 
> > Not all busses need to have the aiblity to "probe" for new devices,
> > that's not a requirement at all.  Some of them just "know" where the
> > devices are at in the driver model, and create the devices for the bus
> > just fine.
> > 
> > So don't think that just because of that lack of probing, they should be
> > on the "platform" bus at all.  Platform is for the "oh crap, I have no
> > way to bind this to anything else, make it a platform device then."
> > 
> 
> I'm not sure if you remember, but a long time ago when OF started getting
> into the kernel, there actually was an OCP (On Chip Peripheral) bus, 
> and the switch to platform devices was mandated by kernel people, by their
> insistance that platform devices would cover every case.
> 
> See here:
> 
> http://lkml.indiana.edu/hypermail/linux/kernel/0405.1/0930.html
> 
> I'm sure Matt Porter can shed some light on the exchange that led to the 
> abandonment of the ocp bus concept.

Heh, that OCP support looks a bit antiquated by today's standards. If it
helps, http://lkml.indiana.edu/hypermail/linux/kernel/0501.2/0696.html
is the posting where Kumar starts taking arch/ppc away from using
drivers/ocp/. I can't find any public discussion that led to this, but I
recall the common advice was "just use platform devices". This was,
incidentally, just before the move to arch/powerpc (and DT for all)
began.

Keep in mind that this is 8ish years ago before embedded was
fashionable since we didn't all have Linux machines in our pockets.
I suspect that advice was given because nobody cared about
platform_device removal and it worked for the use cases at the time.

-Matt

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 5/5] arm: omap: Proper cleanups for omap_device
  2013-08-07 16:23         ` Pantelis Antoniou
@ 2013-08-08  7:25           ` Tony Lindgren
  0 siblings, 0 replies; 26+ messages in thread
From: Tony Lindgren @ 2013-08-08  7:25 UTC (permalink / raw)
  To: Pantelis Antoniou
  Cc: Greg Kroah-Hartman, Russell King, Benoît Coussno,
	Paul Walmsley, Sourav Poddar, Russ Dill, Felipe Balbi, Koen Kooi,
	linux-omap, linux-kernel

* Pantelis Antoniou <panto@antoniou-consulting.com> [130807 09:31]:
> Hi Tony,
> 
> On Aug 7, 2013, at 7:15 PM, Tony Lindgren wrote:
> 
> > * Pantelis Antoniou <panto@antoniou-consulting.com> [130806 02:44]:
> >> On Aug 6, 2013, at 12:33 PM, Greg Kroah-Hartman wrote:
> >>> On Tue, Aug 06, 2013 at 10:53:44AM +0300, Pantelis Antoniou wrote:
> >>>> +
> >>>> static int _omap_device_notifier_call(struct notifier_block *nb,
> >>>> 				      unsigned long event, void *dev)
> >>>> {
> >>>> @@ -185,9 +211,13 @@ static int _omap_device_notifier_call(struct notifier_block *nb,
> >>>> 	struct omap_device *od;
> >>>> 
> >>>> 	switch (event) {
> >>>> -	case BUS_NOTIFY_DEL_DEVICE:
> >>>> +	case BUS_NOTIFY_UNBOUND_DRIVER:
> >>>> +		/* NOTIFY_DEL_DEVICE is not the right call...
> >>>> +		 * we use a callback here, to make sure no-one is going to
> >>>> +		 * try to use the omap_device data after they're deleted
> >>>> +		 */
> >>>> 		if (pdev->archdata.od)
> >>>> -			omap_device_delete(pdev->archdata.od);
> >>>> +			device_schedule_callback(dev, _omap_device_cleanup);
> >>> 
> >>> Really?  This is one sign that you are totally using the driver core
> >>> incorrectly.  You shouldn't have to rely on notifier callbacks to handle
> >>> device removals, your bus code should do that for you directly.
> >>> 
> >>> I don't like this at all, sorry.
> >>> 
> >> 
> >> Don't shoot the messenger please...
> > 
> > As you're inititalizing capebus with DT, let's figure out what if
> > anything you actually need from omap_device. I'd much rather remove
> > dependencies than add more.
> > 
> 
> There is no such thing as capebus anymore. This is just the path of
> removing a platform device, which happens to also be an omap_device.

OK, so let's figure out the minimal fixes needed.
 
> >> This is all about fixing a crash without messing too many things.
> > 
> > It seems this fix is only needed for supporting out-of-tree code?
> > These features with omap_device we may not even want to support in
> > the mainline tree as is being discussed..
> > 
> 
> What out of tree code? The only thing this patch does is make sure we
> don't crash when a perfectly valid call to platform_device_unregister() happens.
> 
> Drivers that don't use omap_device work just fine.

So what's the minimal set of fixes then?

Regards,

Tony

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 3/5] omap: Properly handle resources for omap_devices
       [not found]   ` <87a9kt2vd8.fsf@linaro.org>
@ 2013-08-08  9:23     ` Pantelis Antoniou
  2013-08-09 15:16       ` Kevin Hilman
  0 siblings, 1 reply; 26+ messages in thread
From: Pantelis Antoniou @ 2013-08-08  9:23 UTC (permalink / raw)
  To: Kevin Hilman
  Cc: Tony Lindgren, Russell King, Benoît Cousson, Paul Walmsley,
	Greg Kroah-Hartman, Sourav Poddar, Russ Dill, Felipe Balbi,
	Koen Kooi, linux-omap, linux-kernel

Hi Kevin,

On Aug 7, 2013, at 9:45 PM, Kevin Hilman wrote:

> [fixing address for Benoit]
> 
> Pantelis Antoniou <panto@antoniou-consulting.com> writes:
> 
>> omap_device relies on the platform notifier callbacks managing resources
>> behind the scenes. The resources were not properly linked causing crashes
>> when removing the device.
>> 
>> Rework the resource modification code so that linking is performed properly,
>> and make sure that no resources that have no parent (which can happen for DMA
>> & IRQ resources) are ever left for cleanup by the core resource layer.
>> 
>> Signed-off-by: Pantelis Antoniou <panto@antoniou-consulting.com>
> 
> This one failed my "took more than 15 minutes to understand" test.  The
> changelog is rather vague (especially about what "properly" means), and
> the combination of moving code and changing it makes the patch rather
> clunky to read, so I remain a bit confused about what the actual problem
> is.  Please elaborate.
> 
> Also, could you share a crash dump as well as details about how to
> reproduce this problem?
> 
> Thanks,
> 
> Kevin

It's the full patchset that fixes the problem:

Let me illustrate:

The kernel I use is located at:

git@github.com:pantoniou/linux-beagle-track-mainline.git
branch: merge-20130806 (there are topic branches for other stuff too)

The DT overlay we're loading:

> /*
>  * Copyright (C) 2013 CircuitCo
>  *
>  * Virtual cape for UART1 on connector pins P9.24 P9.26
>  *
>  * This program is free software; you can redistribute it and/or modify
>  * it under the terms of the GNU General Public License version 2 as
>  * published by the Free Software Foundation.
>  */
> /dts-v1/;
> /plugin/;
> 
> / {
> 	compatible = "ti,beaglebone", "ti,beaglebone-black";
> 
> 	/* identification */
> 	part-number = "BB-UART1";
> 	version = "00A0";
> 
> 	/* state the resources this cape uses */
> 	exclusive-use =
> 		/* the pin header uses */
> 		"P9.24",	/* uart1_txd */
> 		"P9.26",	/* uart1_rxd */
> 		/* the hardware ip uses */
> 		"uart1";
> 
> 	fragment@0 {
> 		target = <&am33xx_pinmux>;
> 		__overlay__ {
> 			bb_uart1_pins: pinmux_bb_uart1_pins {
> 				pinctrl-single,pins = <
> 					0x184 0x20 /* P9.24 uart1_txd.uart1_txd  OUTPUT  */
> 					0x180 0x20 /* P9.26 uart1_rxd.uart1_rxd  INPUT  */
> 				>;
> 			};
> 		};
> 	};
> 
> 	fragment@1 {
> 		target = <&uart1>;	/* really uart1 */
> 		__overlay__ {
> 			status = "okay";
> 			pinctrl-names = "default";
> 			pinctrl-0 = <&bb_uart1_pins>;
> 		};
> 	};
> };
> 

Nothing complicated; just a serial device.

With the full patchset on a beaglebone and loading a simple case of a UART 'cape'.

> root@beaglebone:/sys/devices/bone_capemgr.4# echo BB-UART1 >slots 
> [   58.546947] bone-capemgr bone_capemgr.4: part_number 'BB-UART1', version 'N/A'
> [   58.578374] bone-capemgr bone_capemgr.4: slot #4: generic override
> [   58.584925] bone-capemgr bone_capemgr.4: bone: Using override eeprom data at slot 4
> [   58.593086] bone-capemgr bone_capemgr.4: slot #4: 'Override Board Name,00A0,Override Manuf,BB-UART1'
> [   58.618455] bone-capemgr bone_capemgr.4: slot #4: Requesting part number/version based 'BB-UART1-00A0.dtbo
> [   58.638258] bone-capemgr bone_capemgr.4: slot #4: Requesting firmware 'BB-UART1-00A0.dtbo' for board-name 'Override Board Name', version '00A0'
> [   58.681259] bone-capemgr bone_capemgr.4: slot #4: dtbo 'BB-UART1-00A0.dtbo' loaded; converting to live tree
> [   58.705075] bone-capemgr bone_capemgr.4: slot #4: #2 overlays
> [   58.735058] 48022000.serial: ttyO1 at MMIO 0x48022000 (irq = 89) is a OMAP UART1
> [   58.757834] bone-capemgr bone_capemgr.4: slot #4: Applied #2 overlays.
> root@beaglebone:/sys/devices/bone_capemgr.4# echo -4 >slots 
> [   62.362519] bone-capemgr bone_capemgr.4: Removed slot #4
> 

Revert "omap: Properly handle resources for omap_devices"
Revert "of: Link platform device resources properly."
Revert "pdev: Fix platform device resource linking"

> root@beaglebone:/sys/devices/bone_capemgr.4# echo BB-UART1 >slots 
> [   39.317978] bone-capemgr bone_capemgr.4: part_number 'BB-UART1', version 'N/A'
> [   39.325630] bone-capemgr bone_capemgr.4: slot #4: generic override
> [   39.332248] bone-capemgr bone_capemgr.4: bone: Using override eeprom data at slot 4
> [   39.340336] bone-capemgr bone_capemgr.4: slot #4: 'Override Board Name,00A0,Override Manuf,BB-UART1'
> [   39.378854] bone-capemgr bone_capemgr.4: slot #4: Requesting part number/version based 'BB-UART1-00A0.dtbo
> [   39.396013] bone-capemgr bone_capemgr.4: slot #4: Requesting firmware 'BB-UART1-00A0.dtbo' for board-name 'Override Board Name', version '00A0'
> [   39.438009] bone-capemgr bone_capemgr.4: slot #4: dtbo 'BB-UART1-00A0.dtbo' loaded; converting to live tree
> [   39.452797] bone-capemgr bone_capemgr.4: slot #4: #2 overlays
> [   39.473180] 48022000.serial: ttyO1 at MMIO 0x48022000 (irq = 89) is a OMAP UART1
> [   39.498641] bone-capemgr bone_capemgr.4: slot #4: Applied #2 overlays.
> root@beaglebone:/sys/devices/bone_capemgr.4# echo -4 >slots
> [   42.233884] Unable to handle kernel NULL pointer dereference at virtual address 00000018
> [   42.242557] pgd = ca91c000
> [   42.245402] [00000018] *pgd=8a97e831, *pte=00000000, *ppte=00000000
> [   42.252062] Internal error: Oops: 17 [#1] SMP ARM
> [   42.256996] Modules linked in: ipv6 autofs4
> [   42.261429] CPU: 0 PID: 269 Comm: sh Not tainted 3.11.0-rc4-00111-g263aa42 #120
> [   42.269098] task: ca995040 ti: ca908000 task.ti: ca908000
> [   42.274786] PC is at __release_resource+0x8/0x44
> [   42.279632] LR is at release_resource+0x1c/0x34
> [   42.284379] pc : [<c003d7a8>]    lr : [<c003dbc0>]    psr: 600f0013
> [   42.284379] sp : ca909e80  ip : cf106658  fp : cf64d000
> [   42.296407] r10: cf49e888  r9 : c04badb0  r8 : 00000001
> [   42.301888] r7 : 00000001  r6 : 00000000  r5 : ca9e2e80  r4 : c06e1000
> [   42.308736] r3 : 00000000  r2 : 00000018  r1 : cf054b88  r0 : ca9e2e80
> [   42.315577] Flags: nZCv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment user
> [   42.323065] Control: 10c5387d  Table: 8a91c019  DAC: 00000015
> [   42.329089] Process sh (pid: 269, stack limit = 0xca908240)
> [   42.334935] Stack: (0xca909e80 to 0xca90a000)
> [   42.339510] 9e80: 00000200 cf49ea00 00000000 c02c6a5c cf49ea00 cf0ad600 00000000 c02c6d28
> [   42.348093] 9ea0: ca8f3200 c03b64c8 ca8f3200 cf49e888 00200200 00100100 cf49e850 c03b6564
> [   42.356674] 9ec0: ca995040 cf49e850 00000001 cf49e858 cf28bc18 cf54c7d8 cf109818 c03b6d20
> [   42.365257] 9ee0: ca8f1810 cf109800 00000000 c02d7b6c 00000004 cf28bc20 cf109810 c02d9194
> [   42.373837] 9f00: cf109810 c06f588c cf64d000 cf28a9c8 ca909f80 00000003 cf54c7c0 cf54c7d8
> [   42.382420] 9f20: c04badb0 cf109818 00000000 c02c1c94 00000003 c012eb44 ca9b12c0 00000003
> [   42.391013] 9f40: 000d6408 ca909f80 000d6408 ca908000 00000003 c00d9538 ca9b12c0 000d6408
> [   42.399594] 9f60: 00000003 ca9b12c0 00000000 00000000 00000000 000d6408 00000003 c00d998c
> [   42.408172] 9f80: 00000000 00000000 00000003 00000003 000d6408 b6ee1a80 00000004 c000dc08
> [   42.416751] 9fa0: 00000000 c000da60 00000003 000d6408 00000001 000d6408 00000003 00000000
> [   42.425334] 9fc0: 00000003 000d6408 b6ee1a80 00000004 00000003 00000003 000d6408 00000000
> [   42.433914] 9fe0: 00000000 bed59984 b6e1db2c b6e7030c 600f0010 00000001 00000000 00000000
> [   42.442522] [<c003d7a8>] (__release_resource+0x8/0x44) from [<c003dbc0>] (release_resource+0x1c/0x34)
> [   42.452231] [<c003dbc0>] (release_resource+0x1c/0x34) from [<c02c6a5c>] (platform_device_del+0x60/0x7c)
> [   42.462104] [<c02c6a5c>] (platform_device_del+0x60/0x7c) from [<c02c6d28>] (platform_device_unregister+0xc/0x18)
> [   42.472807] [<c02c6d28>] (platform_device_unregister+0xc/0x18) from [<c03b64c8>] (of_overlay_device_entry_change.clone.2+0x108/0x15c)
> [   42.485393] [<c03b64c8>] (of_overlay_device_entry_change.clone.2+0x108/0x15c) from [<c03b6564>] (of_overlay_revert_one+0x48/0x1f0)
> [   42.497725] [<c03b6564>] (of_overlay_revert_one+0x48/0x1f0) from [<c03b6d20>] (of_overlay_revert+0x34/0x58)
> [   42.507965] [<c03b6d20>] (of_overlay_revert+0x34/0x58) from [<c02d7b6c>] (bone_capemgr_remove_slot_no_lock+0x44/0xcc)
> [   42.519111] [<c02d7b6c>] (bone_capemgr_remove_slot_no_lock+0x44/0xcc) from [<c02d9194>] (slots_store+0x78/0x394)
> [   42.529797] [<c02d9194>] (slots_store+0x78/0x394) from [<c02c1c94>] (dev_attr_store+0x18/0x24)
> [   42.538840] [<c02c1c94>] (dev_attr_store+0x18/0x24) from [<c012eb44>] (sysfs_write_file+0x108/0x13c)
> [   42.548441] [<c012eb44>] (sysfs_write_file+0x108/0x13c) from [<c00d9538>] (vfs_write+0xd4/0x1cc)
> [   42.557675] [<c00d9538>] (vfs_write+0xd4/0x1cc) from [<c00d998c>] (SyS_write+0x3c/0x60)
> [   42.566083] [<c00d998c>] (SyS_write+0x3c/0x60) from [<c000da60>] (ret_fast_syscall+0x0/0x30)
> [   42.574937] Code: e1a00003 e8bd8030 e5903010 e2832018 (e5933018) 
> [   42.581424] ---[ end trace 61c82b77609dbc03 ]---
> [   42.757607] systemd[1]: serial-getty@ttyO0.service holdoff time over, scheduling restart.
> 
> 

Regards

-- Pantelis


^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 4/5] omap: Avoid crashes in the case of hwmod misconfiguration
       [not found]   ` <87siyl19uk.fsf@linaro.org>
@ 2013-08-08  9:29     ` Pantelis Antoniou
  0 siblings, 0 replies; 26+ messages in thread
From: Pantelis Antoniou @ 2013-08-08  9:29 UTC (permalink / raw)
  To: Kevin Hilman
  Cc: Tony Lindgren, Russell King, Benoît Coussno, Paul Walmsley,
	Greg Kroah-Hartman, Sourav Poddar, Russ Dill, Felipe Balbi,
	Koen Kooi, linux-omap, linux-kernel

Hi Kevin,

On Aug 8, 2013, at 12:15 AM, Kevin Hilman wrote:

> Pantelis Antoniou <panto@antoniou-consulting.com> writes:
> 
>> omap hwmod is really sensitive to hwmod misconfiguration.
>> Getting a minor clock wrong always ended up in a crash.
>> Attempt to be more resilient by not assigning variables with
>> error codes and then attempting to use them.
>> 
>> Without this patch, missing a clock ends up with something like this:
>> omap_hwmod: ehrpwm0: cannot clk_get opt_clk ehrpwm0_tbclk!
> 
> Definitely agree we should not be crashing when given bad data.
> 
> nit Re: "missing clock".  I don't think there will be any crash if a
> clock is missing.  This looks to me more like the clock name is wrong
> (tbclk instead of dbclk?), not missing.
> 

Yes, I'll rephrase.

> [...]
> 
>> index 7341eff..42cb7d4 100644
>> --- a/arch/arm/mach-omap2/omap_hwmod.c
>> +++ b/arch/arm/mach-omap2/omap_hwmod.c
>> @@ -784,7 +784,9 @@ static int _init_interface_clks(struct omap_hwmod *oh)
>> 		if (IS_ERR(c)) {
>> 			pr_warning("omap_hwmod: %s: cannot clk_get interface_clk %s\n",
>> 				   oh->name, os->clk);
>> -			ret = -EINVAL;
>> +			if (ret == 0)
>> +				ret = -EINVAL;
>> +			continue;
> 
> the 'if (ret == 0)' adds confusion IMO.  If we don't care additional
> failures, errors, then just add a 'break' instead of these 3 lines.
> 
> [...]
> 


I tried to carry on as much as possible even on the presence of errors.
The remaining clocks won't be initialized, but that might be OK.

> Kevin

Regards

-- Pantelis


^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 3/5] omap: Properly handle resources for omap_devices
  2013-08-08  9:23     ` [PATCH 3/5] omap: Properly handle resources for omap_devices Pantelis Antoniou
@ 2013-08-09 15:16       ` Kevin Hilman
  2013-08-09 15:32         ` Pantelis Antoniou
  0 siblings, 1 reply; 26+ messages in thread
From: Kevin Hilman @ 2013-08-09 15:16 UTC (permalink / raw)
  To: Pantelis Antoniou
  Cc: Tony Lindgren, Russell King, Benoît Cousson, Paul Walmsley,
	Greg Kroah-Hartman, Sourav Poddar, Russ Dill, Felipe Balbi,
	Koen Kooi, linux-omap, linux-kernel

Pantelis Antoniou <panto@antoniou-consulting.com> writes:

> Hi Kevin,
>
> On Aug 7, 2013, at 9:45 PM, Kevin Hilman wrote:
>
>> [fixing address for Benoit]
>> 
>> Pantelis Antoniou <panto@antoniou-consulting.com> writes:
>> 
>>> omap_device relies on the platform notifier callbacks managing resources
>>> behind the scenes. The resources were not properly linked causing crashes
>>> when removing the device.
>>> 
>>> Rework the resource modification code so that linking is performed properly,
>>> and make sure that no resources that have no parent (which can happen for DMA
>>> & IRQ resources) are ever left for cleanup by the core resource layer.
>>> 
>>> Signed-off-by: Pantelis Antoniou <panto@antoniou-consulting.com>
>> 
>> This one failed my "took more than 15 minutes to understand" test.  The
>> changelog is rather vague (especially about what "properly" means), and
>> the combination of moving code and changing it makes the patch rather
>> clunky to read, so I remain a bit confused about what the actual problem
>> is.  Please elaborate.
>> 
>> Also, could you share a crash dump as well as details about how to
>> reproduce this problem?
>> 
>> Thanks,
>> 
>> Kevin
>
> It's the full patchset that fixes the problem:
>
> Let me illustrate:
>
> The kernel I use is located at:
>
> git@github.com:pantoniou/linux-beagle-track-mainline.git
> branch: merge-20130806 (there are topic branches for other stuff too)

Sorry, I don't have the time to go through a bunch of out of tree
branches to figure this out.  Can you create a simpler test case to
reproduce this?  e.g. Does this happen when building the serial driver
as a module and then removing it?  If not, why not?

Kevin

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 3/5] omap: Properly handle resources for omap_devices
  2013-08-09 15:16       ` Kevin Hilman
@ 2013-08-09 15:32         ` Pantelis Antoniou
  2013-08-09 16:35           ` Kevin Hilman
  0 siblings, 1 reply; 26+ messages in thread
From: Pantelis Antoniou @ 2013-08-09 15:32 UTC (permalink / raw)
  To: Kevin Hilman
  Cc: Tony Lindgren, Russell King, Benoît Cousson, Paul Walmsley,
	Greg Kroah-Hartman, Sourav Poddar, Russ Dill, Felipe Balbi,
	Koen Kooi, linux-omap, linux-kernel

Hi
On Aug 9, 2013, at 6:16 PM, Kevin Hilman wrote:

> Pantelis Antoniou <panto@antoniou-consulting.com> writes:
> 
>> Hi Kevin,
>> 
>> On Aug 7, 2013, at 9:45 PM, Kevin Hilman wrote:
>> 
>>> [fixing address for Benoit]
>>> 
>>> Pantelis Antoniou <panto@antoniou-consulting.com> writes:
>>> 
>>>> omap_device relies on the platform notifier callbacks managing resources
>>>> behind the scenes. The resources were not properly linked causing crashes
>>>> when removing the device.
>>>> 
>>>> Rework the resource modification code so that linking is performed properly,
>>>> and make sure that no resources that have no parent (which can happen for DMA
>>>> & IRQ resources) are ever left for cleanup by the core resource layer.
>>>> 
>>>> Signed-off-by: Pantelis Antoniou <panto@antoniou-consulting.com>
>>> 
>>> This one failed my "took more than 15 minutes to understand" test.  The
>>> changelog is rather vague (especially about what "properly" means), and
>>> the combination of moving code and changing it makes the patch rather
>>> clunky to read, so I remain a bit confused about what the actual problem
>>> is.  Please elaborate.
>>> 
>>> Also, could you share a crash dump as well as details about how to
>>> reproduce this problem?
>>> 
>>> Thanks,
>>> 
>>> Kevin
>> 
>> It's the full patchset that fixes the problem:
>> 
>> Let me illustrate:
>> 
>> The kernel I use is located at:
>> 
>> git@github.com:pantoniou/linux-beagle-track-mainline.git
>> branch: merge-20130806 (there are topic branches for other stuff too)
> 
> Sorry, I don't have the time to go through a bunch of out of tree
> branches to figure this out.  Can you create a simpler test case to
> reproduce this?  e.g. Does this happen when building the serial driver
> as a module and then removing it?  If not, why not?

What 'bunch of out of tree branches'? There is a single merge branch against
current mainline. And no you don't trigger this by removing a module.

platform_driver_unregister() != platform_device_register().

I'll try to figure out something simpler, but it's pretty damn obvious that
something is not right there. Do you think fixing problems in that part
of platform device removal was something I did because I didn't have other
things to do?

It is broken and I'm trying to get it fixed. Are you?

> 
> Kevin

-- Pantelis


^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 3/5] omap: Properly handle resources for omap_devices
  2013-08-09 15:32         ` Pantelis Antoniou
@ 2013-08-09 16:35           ` Kevin Hilman
  2013-08-09 18:08             ` Pantelis Antoniou
  0 siblings, 1 reply; 26+ messages in thread
From: Kevin Hilman @ 2013-08-09 16:35 UTC (permalink / raw)
  To: Pantelis Antoniou
  Cc: Tony Lindgren, Russell King, Benoît Cousson, Paul Walmsley,
	Greg Kroah-Hartman, Sourav Poddar, Russ Dill, Felipe Balbi,
	Koen Kooi, linux-omap, linux-kernel

Pantelis Antoniou <panto@antoniou-consulting.com> writes:

> Hi
> On Aug 9, 2013, at 6:16 PM, Kevin Hilman wrote:
>
>> Pantelis Antoniou <panto@antoniou-consulting.com> writes:
>> 
>>> Hi Kevin,
>>> 
>>> On Aug 7, 2013, at 9:45 PM, Kevin Hilman wrote:
>>> 
>>>> [fixing address for Benoit]
>>>> 
>>>> Pantelis Antoniou <panto@antoniou-consulting.com> writes:
>>>> 
>>>>> omap_device relies on the platform notifier callbacks managing resources
>>>>> behind the scenes. The resources were not properly linked causing crashes
>>>>> when removing the device.
>>>>> 
>>>>> Rework the resource modification code so that linking is performed properly,
>>>>> and make sure that no resources that have no parent (which can happen for DMA
>>>>> & IRQ resources) are ever left for cleanup by the core resource layer.
>>>>> 
>>>>> Signed-off-by: Pantelis Antoniou <panto@antoniou-consulting.com>
>>>> 
>>>> This one failed my "took more than 15 minutes to understand" test.  The
>>>> changelog is rather vague (especially about what "properly" means), and
>>>> the combination of moving code and changing it makes the patch rather
>>>> clunky to read, so I remain a bit confused about what the actual problem
>>>> is.  Please elaborate.
>>>> 
>>>> Also, could you share a crash dump as well as details about how to
>>>> reproduce this problem?
>>>> 
>>>> Thanks,
>>>> 
>>>> Kevin
>>> 
>>> It's the full patchset that fixes the problem:
>>> 
>>> Let me illustrate:
>>> 
>>> The kernel I use is located at:
>>> 
>>> git@github.com:pantoniou/linux-beagle-track-mainline.git
>>> branch: merge-20130806 (there are topic branches for other stuff too)
>> 
>> Sorry, I don't have the time to go through a bunch of out of tree
>> branches to figure this out.  Can you create a simpler test case to
>> reproduce this?  e.g. Does this happen when building the serial driver
>> as a module and then removing it?  If not, why not?
>
> What 'bunch of out of tree branches'? There is a single merge branch against
> current mainline. 

And That branch contains multiple other branches (as you stated yourself
above), many of which look to be out of tree:

$ git log --oneline --merges  panto/master..panto/merge-20130806
031297e Merge branch 'lcdc-fixes' into merge-20130806
44cf018 Merge branch 'usb-fixes' into merge-20130806
3cc739b Merge branch 'tscadc' into merge-20130806
f0ec35d Merge branch 'capemgr' into merge-20130806
4d87712 Merge branch 'general-fixes' into merge-20130806
04535fb Merge branch 'pinctrl-fixes' into merge-20130806
5115b55 Merge branch 'i2c-fixes' into merge-20130806
e96edf5 Merge branch 'capes' into merge-20130806
1eb1779 Merge branch 'dts-fixes' into merge-20130806
ca12149 Merge branch 'mmc-fixes' into merge-20130806
41ed7a0 Merge branch 'pdev-fixes' into merge-20130806
2ba9995 Merge branch 'of-fixes' into merge-20130806
38b5cb4 Merge branch 'dtc-overlays' into merge-20130806
167648d Merge branch 'dtc-fixes' into merge-20130806
2f7de02 Merge branch 'reset' into merge-20130806

$ git log --oneline --no-merges panto/master..panto/merge-20130806 |wc -l
85

> And no you don't trigger this by removing a module.
>
> platform_driver_unregister() != platform_device_register().
>
> I'll try to figure out something simpler, but it's pretty damn obvious that
> something is not right there. 

I agree something does not seem right, and never said otherwise.  I'm
simply looking for an easy way to reproduce it with mainline.  

Since it's so "damn obvious", I'll accept the insult and agree to being
a moron.  Please educate me by making it even more obvious with a way to
reproduce it against mainline.

> Do you think fixing problems in that part of platform device removal
> was something I did because I didn't have other things to do?

Settle down, I said nothing of the sort.

On the contrary, I assumed you were being a generous citizen of the
community and contributing back your fixes, and as a developer of this
part of the kernel, I'm trying to help.  If I didn't care, I wouldn't
have responded in the first place.

> It is broken and I'm trying to get it fixed. Are you?

I was, but your accusatory tone is reducing my motivation to help.

Kevin


^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 3/5] omap: Properly handle resources for omap_devices
  2013-08-09 16:35           ` Kevin Hilman
@ 2013-08-09 18:08             ` Pantelis Antoniou
  0 siblings, 0 replies; 26+ messages in thread
From: Pantelis Antoniou @ 2013-08-09 18:08 UTC (permalink / raw)
  To: Kevin Hilman
  Cc: Tony Lindgren, Russell King, Benoît Cousson, Paul Walmsley,
	Greg Kroah-Hartman, Sourav Poddar, Russ Dill, Felipe Balbi,
	Koen Kooi, linux-omap, linux-kernel

Hi Kevin,

On Aug 9, 2013, at 7:35 PM, Kevin Hilman wrote:

> Pantelis Antoniou <panto@antoniou-consulting.com> writes:
> 
>> Hi
>> On Aug 9, 2013, at 6:16 PM, Kevin Hilman wrote:
>> 
>>> Pantelis Antoniou <panto@antoniou-consulting.com> writes:
>>> 
>>>> Hi Kevin,
>>>> 
>>>> On Aug 7, 2013, at 9:45 PM, Kevin Hilman wrote:
>>>> 
>>>>> [fixing address for Benoit]
>>>>> 
>>>>> Pantelis Antoniou <panto@antoniou-consulting.com> writes:
>>>>> 
>>>>>> omap_device relies on the platform notifier callbacks managing resources
>>>>>> behind the scenes. The resources were not properly linked causing crashes
>>>>>> when removing the device.
>>>>>> 
>>>>>> Rework the resource modification code so that linking is performed properly,
>>>>>> and make sure that no resources that have no parent (which can happen for DMA
>>>>>> & IRQ resources) are ever left for cleanup by the core resource layer.
>>>>>> 
>>>>>> Signed-off-by: Pantelis Antoniou <panto@antoniou-consulting.com>
>>>>> 
>>>>> This one failed my "took more than 15 minutes to understand" test.  The
>>>>> changelog is rather vague (especially about what "properly" means), and
>>>>> the combination of moving code and changing it makes the patch rather
>>>>> clunky to read, so I remain a bit confused about what the actual problem
>>>>> is.  Please elaborate.
>>>>> 
>>>>> Also, could you share a crash dump as well as details about how to
>>>>> reproduce this problem?
>>>>> 
>>>>> Thanks,
>>>>> 
>>>>> Kevin
>>>> 
>>>> It's the full patchset that fixes the problem:
>>>> 
>>>> Let me illustrate:
>>>> 
>>>> The kernel I use is located at:
>>>> 
>>>> git@github.com:pantoniou/linux-beagle-track-mainline.git
>>>> branch: merge-20130806 (there are topic branches for other stuff too)
>>> 
>>> Sorry, I don't have the time to go through a bunch of out of tree
>>> branches to figure this out.  Can you create a simpler test case to
>>> reproduce this?  e.g. Does this happen when building the serial driver
>>> as a module and then removing it?  If not, why not?
>> 
>> What 'bunch of out of tree branches'? There is a single merge branch against
>> current mainline. 
> 
> And That branch contains multiple other branches (as you stated yourself
> above), many of which look to be out of tree:
> 
> $ git log --oneline --merges  panto/master..panto/merge-20130806
> 031297e Merge branch 'lcdc-fixes' into merge-20130806
> 44cf018 Merge branch 'usb-fixes' into merge-20130806
> 3cc739b Merge branch 'tscadc' into merge-20130806
> f0ec35d Merge branch 'capemgr' into merge-20130806
> 4d87712 Merge branch 'general-fixes' into merge-20130806
> 04535fb Merge branch 'pinctrl-fixes' into merge-20130806
> 5115b55 Merge branch 'i2c-fixes' into merge-20130806
> e96edf5 Merge branch 'capes' into merge-20130806
> 1eb1779 Merge branch 'dts-fixes' into merge-20130806
> ca12149 Merge branch 'mmc-fixes' into merge-20130806
> 41ed7a0 Merge branch 'pdev-fixes' into merge-20130806
> 2ba9995 Merge branch 'of-fixes' into merge-20130806
> 38b5cb4 Merge branch 'dtc-overlays' into merge-20130806
> 167648d Merge branch 'dtc-fixes' into merge-20130806
> 2f7de02 Merge branch 'reset' into merge-20130806
> 
> $ git log --oneline --no-merges panto/master..panto/merge-20130806 |wc -l
> 85
> 

Yes, but they have nothing to do with the problem at hand; the only
branch that matters i pdev-fixes.

>> And no you don't trigger this by removing a module.
>> 
>> platform_driver_unregister() != platform_device_register().
>> 
>> I'll try to figure out something simpler, but it's pretty damn obvious that
>> something is not right there. 
> 
> I agree something does not seem right, and never said otherwise.  I'm
> simply looking for an easy way to reproduce it with mainline.  
> 
> Since it's so "damn obvious", I'll accept the insult and agree to being
> a moron.  Please educate me by making it even more obvious with a way to
> reproduce it against mainline.
> 

You can't reproduce it via rmmod'ing a module; it's not the same code path.
That you need to do is to call platform_device_unregister() for the affected
device on a DT enabled board.

The only platform that triggers it currently is the beaglebone with DT
overlays support.

To get it to trigger on something simpler, I'll have to write a pretty
elaborate module where I register the device and then immediately unregister
it.

>> Do you think fixing problems in that part of platform device removal
>> was something I did because I didn't have other things to do?
> 
> Settle down, I said nothing of the sort.
> 
> On the contrary, I assumed you were being a generous citizen of the
> community and contributing back your fixes, and as a developer of this
> part of the kernel, I'm trying to help.  If I didn't care, I wouldn't
> have responded in the first place.
> 

OK

>> It is broken and I'm trying to get it fixed. Are you?
> 
> I was, but your accusatory tone is reducing my motivation to help.
> 

I guess that's a fair accusation; sorry for coming out a bit strong here.

Anyway, I'll see what it takes to come up with a test module that exhibits the issue
easier.

> Kevin
> 

Regards

-- Pantelis


^ permalink raw reply	[flat|nested] 26+ messages in thread

end of thread, other threads:[~2013-08-09 18:08 UTC | newest]

Thread overview: 26+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
     [not found] <1375775624-12250-1-git-send-email-panto@antoniou-consulting.com>
     [not found] ` <1375775624-12250-6-git-send-email-panto@antoniou-consulting.com>
2013-08-06  9:33   ` [PATCH 5/5] arm: omap: Proper cleanups for omap_device Greg Kroah-Hartman
2013-08-06  9:37     ` Pantelis Antoniou
2013-08-06 10:14       ` Greg Kroah-Hartman
2013-08-06 13:37         ` Alexander Holler
2013-08-07  5:52           ` Greg Kroah-Hartman
2013-08-07 15:22             ` Alexander Holler
2013-08-07  7:44         ` Pantelis Antoniou
2013-08-07 16:15       ` Tony Lindgren
2013-08-07 16:23         ` Pantelis Antoniou
2013-08-08  7:25           ` Tony Lindgren
     [not found] ` <1375775624-12250-2-git-send-email-panto@antoniou-consulting.com>
2013-08-06  9:36   ` [PATCH 1/5] pdev: Fix platform device resource linking Greg Kroah-Hartman
2013-08-06  9:45     ` Pantelis Antoniou
2013-08-06 10:15       ` Greg Kroah-Hartman
2013-08-06 10:19         ` Russell King - ARM Linux
2013-08-07  5:57           ` Greg Kroah-Hartman
2013-08-07  8:27             ` Russell King - ARM Linux
2013-08-06 10:27         ` Pantelis Antoniou
2013-08-07  5:56           ` Greg Kroah-Hartman
2013-08-07  7:37             ` Pantelis Antoniou
2013-08-07 17:13               ` Matt Porter
     [not found] ` <1375775624-12250-4-git-send-email-panto@antoniou-consulting.com>
     [not found]   ` <87a9kt2vd8.fsf@linaro.org>
2013-08-08  9:23     ` [PATCH 3/5] omap: Properly handle resources for omap_devices Pantelis Antoniou
2013-08-09 15:16       ` Kevin Hilman
2013-08-09 15:32         ` Pantelis Antoniou
2013-08-09 16:35           ` Kevin Hilman
2013-08-09 18:08             ` Pantelis Antoniou
     [not found] ` <1375775624-12250-5-git-send-email-panto@antoniou-consulting.com>
     [not found]   ` <87siyl19uk.fsf@linaro.org>
2013-08-08  9:29     ` [PATCH 4/5] omap: Avoid crashes in the case of hwmod misconfiguration Pantelis Antoniou

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.