linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] driver: of: overlay: demote message to warning
@ 2022-09-07 23:07 Daniel Walker
  2022-09-07 23:54 ` Frank Rowand
  0 siblings, 1 reply; 13+ messages in thread
From: Daniel Walker @ 2022-09-07 23:07 UTC (permalink / raw)
  To: Pantelis Antoniou, Frank Rowand, Rob Herring
  Cc: xe-linux-external, devicetree, linux-kernel

This warning message shows by default on the vast majority of overlays
applied. Despite the text identifying this as a warning it is marked
with the loglevel for error. At Cisco we filter the loglevels to only
show error messages. We end up seeing this message but it's not really
an error.

For this reason it makes sense to demote the message to the warning
loglevel.

Cc: xe-linux-external@cisco.com
Signed-off-by: Daniel Walker <danielwa@cisco.com>
---
 drivers/of/overlay.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/of/overlay.c b/drivers/of/overlay.c
index bd8ff4df723d..4ae276ed9a65 100644
--- a/drivers/of/overlay.c
+++ b/drivers/of/overlay.c
@@ -358,7 +358,7 @@ static int add_changeset_property(struct overlay_changeset *ovcs,
 	}
 
 	if (!of_node_check_flag(target->np, OF_OVERLAY))
-		pr_err("WARNING: memory leak will occur if overlay removed, property: %pOF/%s\n",
+		pr_warn("WARNING: memory leak will occur if overlay removed, property: %pOF/%s\n",
 		       target->np, new_prop->name);
 
 	if (ret) {
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* Re: [PATCH] driver: of: overlay: demote message to warning
  2022-09-07 23:07 [PATCH] driver: of: overlay: demote message to warning Daniel Walker
@ 2022-09-07 23:54 ` Frank Rowand
  2022-09-08  0:35   ` Daniel Walker
  0 siblings, 1 reply; 13+ messages in thread
From: Frank Rowand @ 2022-09-07 23:54 UTC (permalink / raw)
  To: Daniel Walker, Pantelis Antoniou, Rob Herring
  Cc: xe-linux-external, devicetree, linux-kernel

On 9/7/22 18:07, Daniel Walker wrote:
> This warning message shows by default on the vast majority of overlays
> applied. Despite the text identifying this as a warning it is marked
> with the loglevel for error. At Cisco we filter the loglevels to only
> show error messages. We end up seeing this message but it's not really
> an error.
> 
> For this reason it makes sense to demote the message to the warning
> loglevel.
> 
> Cc: xe-linux-external@cisco.com
> Signed-off-by: Daniel Walker <danielwa@cisco.com>
> ---
>  drivers/of/overlay.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/of/overlay.c b/drivers/of/overlay.c
> index bd8ff4df723d..4ae276ed9a65 100644
> --- a/drivers/of/overlay.c
> +++ b/drivers/of/overlay.c
> @@ -358,7 +358,7 @@ static int add_changeset_property(struct overlay_changeset *ovcs,
>  	}
>  
>  	if (!of_node_check_flag(target->np, OF_OVERLAY))
> -		pr_err("WARNING: memory leak will occur if overlay removed, property: %pOF/%s\n",
> +		pr_warn("WARNING: memory leak will occur if overlay removed, property: %pOF/%s\n",
>  		       target->np, new_prop->name);
>  
>  	if (ret) {

NACK

This is showing a real problem with the overlay.

-Frank

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH] driver: of: overlay: demote message to warning
  2022-09-07 23:54 ` Frank Rowand
@ 2022-09-08  0:35   ` Daniel Walker
  2022-09-08 17:55     ` Frank Rowand
  0 siblings, 1 reply; 13+ messages in thread
From: Daniel Walker @ 2022-09-08  0:35 UTC (permalink / raw)
  To: Frank Rowand
  Cc: Pantelis Antoniou, Rob Herring, xe-linux-external, devicetree,
	linux-kernel

On Wed, Sep 07, 2022 at 06:54:02PM -0500, Frank Rowand wrote:
> On 9/7/22 18:07, Daniel Walker wrote:
> > This warning message shows by default on the vast majority of overlays
> > applied. Despite the text identifying this as a warning it is marked
> > with the loglevel for error. At Cisco we filter the loglevels to only
> > show error messages. We end up seeing this message but it's not really
> > an error.
> > 
> > For this reason it makes sense to demote the message to the warning
> > loglevel.
> > 
> > Cc: xe-linux-external@cisco.com
> > Signed-off-by: Daniel Walker <danielwa@cisco.com>
> > ---
> >  drivers/of/overlay.c | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> > 
> > diff --git a/drivers/of/overlay.c b/drivers/of/overlay.c
> > index bd8ff4df723d..4ae276ed9a65 100644
> > --- a/drivers/of/overlay.c
> > +++ b/drivers/of/overlay.c
> > @@ -358,7 +358,7 @@ static int add_changeset_property(struct overlay_changeset *ovcs,
> >  	}
> >  
> >  	if (!of_node_check_flag(target->np, OF_OVERLAY))
> > -		pr_err("WARNING: memory leak will occur if overlay removed, property: %pOF/%s\n",
> > +		pr_warn("WARNING: memory leak will occur if overlay removed, property: %pOF/%s\n",
> >  		       target->np, new_prop->name);
> >  
> >  	if (ret) {
> 
> NACK
> 
> This is showing a real problem with the overlay.

What's the real problem ?

Daniel

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH] driver: of: overlay: demote message to warning
  2022-09-08  0:35   ` Daniel Walker
@ 2022-09-08 17:55     ` Frank Rowand
  2022-09-12  6:45       ` Frank Rowand
  0 siblings, 1 reply; 13+ messages in thread
From: Frank Rowand @ 2022-09-08 17:55 UTC (permalink / raw)
  To: Daniel Walker
  Cc: Pantelis Antoniou, Rob Herring, xe-linux-external, devicetree,
	linux-kernel

On 9/7/22 19:35, Daniel Walker wrote:
> On Wed, Sep 07, 2022 at 06:54:02PM -0500, Frank Rowand wrote:
>> On 9/7/22 18:07, Daniel Walker wrote:
>>> This warning message shows by default on the vast majority of overlays
>>> applied. Despite the text identifying this as a warning it is marked
>>> with the loglevel for error. At Cisco we filter the loglevels to only
>>> show error messages. We end up seeing this message but it's not really
>>> an error.
>>>
>>> For this reason it makes sense to demote the message to the warning
>>> loglevel.
>>>
>>> Cc: xe-linux-external@cisco.com
>>> Signed-off-by: Daniel Walker <danielwa@cisco.com>
>>> ---
>>>  drivers/of/overlay.c | 2 +-
>>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>>
>>> diff --git a/drivers/of/overlay.c b/drivers/of/overlay.c
>>> index bd8ff4df723d..4ae276ed9a65 100644
>>> --- a/drivers/of/overlay.c
>>> +++ b/drivers/of/overlay.c
>>> @@ -358,7 +358,7 @@ static int add_changeset_property(struct overlay_changeset *ovcs,
>>>  	}
>>>  
>>>  	if (!of_node_check_flag(target->np, OF_OVERLAY))
>>> -		pr_err("WARNING: memory leak will occur if overlay removed, property: %pOF/%s\n",
>>> +		pr_warn("WARNING: memory leak will occur if overlay removed, property: %pOF/%s\n",
>>>  		       target->np, new_prop->name);
>>>  
>>>  	if (ret) {
>>
>> NACK
>>
>> This is showing a real problem with the overlay.
> 
> What's the real problem ?
> 
> Daniel

A memory leak when the overlay is removed.

I'll send a patch to update the overlay file in Documumentation/devicetree/ to provide
more information about this.  If you don't see a patch by tomorrow, feel free to
ping me.

-Frank

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH] driver: of: overlay: demote message to warning
  2022-09-08 17:55     ` Frank Rowand
@ 2022-09-12  6:45       ` Frank Rowand
  2022-09-12 17:05         ` Daniel Walker
  0 siblings, 1 reply; 13+ messages in thread
From: Frank Rowand @ 2022-09-12  6:45 UTC (permalink / raw)
  To: Daniel Walker
  Cc: Pantelis Antoniou, Rob Herring, xe-linux-external, devicetree,
	linux-kernel

On 9/8/22 12:55, Frank Rowand wrote:
> On 9/7/22 19:35, Daniel Walker wrote:
>> On Wed, Sep 07, 2022 at 06:54:02PM -0500, Frank Rowand wrote:
>>> On 9/7/22 18:07, Daniel Walker wrote:
>>>> This warning message shows by default on the vast majority of overlays
>>>> applied. Despite the text identifying this as a warning it is marked
>>>> with the loglevel for error. At Cisco we filter the loglevels to only
>>>> show error messages. We end up seeing this message but it's not really
>>>> an error.
>>>>
>>>> For this reason it makes sense to demote the message to the warning
>>>> loglevel.
>>>>
>>>> Cc: xe-linux-external@cisco.com
>>>> Signed-off-by: Daniel Walker <danielwa@cisco.com>
>>>> ---
>>>>  drivers/of/overlay.c | 2 +-
>>>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>>>
>>>> diff --git a/drivers/of/overlay.c b/drivers/of/overlay.c
>>>> index bd8ff4df723d..4ae276ed9a65 100644
>>>> --- a/drivers/of/overlay.c
>>>> +++ b/drivers/of/overlay.c
>>>> @@ -358,7 +358,7 @@ static int add_changeset_property(struct overlay_changeset *ovcs,
>>>>  	}
>>>>  
>>>>  	if (!of_node_check_flag(target->np, OF_OVERLAY))
>>>> -		pr_err("WARNING: memory leak will occur if overlay removed, property: %pOF/%s\n",
>>>> +		pr_warn("WARNING: memory leak will occur if overlay removed, property: %pOF/%s\n",
>>>>  		       target->np, new_prop->name);
>>>>  
>>>>  	if (ret) {
>>>
>>> NACK
>>>
>>> This is showing a real problem with the overlay.
>>
>> What's the real problem ?
>>
>> Daniel
> 
> A memory leak when the overlay is removed.
> 
> I'll send a patch to update the overlay file in Documumentation/devicetree/ to provide
> more information about this.  If you don't see a patch by tomorrow, feel free to
> ping me.
> 
> -Frank

The good news is that your question prodded me to start improving the in kernel documentation
of overlays.  The promised patch is a rough start at:

   https://lore.kernel.org/all/20220912062615.3727029-1-frowand.list@gmail.com/

The bad news is that what I wrote doesn't explain the memory leak in any more detail.
If an overlay adds a property to a node in the base device tree then the memory
allocated to do the add will not be freed when the overlay is removed.  Since it is
possible to add and remove overlays multiple times, the ensuing size of the memory
leak is potentially unbounded.

-Frank

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH] driver: of: overlay: demote message to warning
  2022-09-12  6:45       ` Frank Rowand
@ 2022-09-12 17:05         ` Daniel Walker
  2022-09-12 20:32           ` Frank Rowand
  0 siblings, 1 reply; 13+ messages in thread
From: Daniel Walker @ 2022-09-12 17:05 UTC (permalink / raw)
  To: Frank Rowand
  Cc: Pantelis Antoniou, Rob Herring, xe-linux-external, devicetree,
	linux-kernel

On Mon, Sep 12, 2022 at 01:45:40AM -0500, Frank Rowand wrote:
> On 9/8/22 12:55, Frank Rowand wrote:
> > On 9/7/22 19:35, Daniel Walker wrote:
> >> On Wed, Sep 07, 2022 at 06:54:02PM -0500, Frank Rowand wrote:
> >>> On 9/7/22 18:07, Daniel Walker wrote:
> >>>> This warning message shows by default on the vast majority of overlays
> >>>> applied. Despite the text identifying this as a warning it is marked
> >>>> with the loglevel for error. At Cisco we filter the loglevels to only
> >>>> show error messages. We end up seeing this message but it's not really
> >>>> an error.
> >>>>
> >>>> For this reason it makes sense to demote the message to the warning
> >>>> loglevel.
> >>>>
> >>>> Cc: xe-linux-external@cisco.com
> >>>> Signed-off-by: Daniel Walker <danielwa@cisco.com>
> >>>> ---
> >>>>  drivers/of/overlay.c | 2 +-
> >>>>  1 file changed, 1 insertion(+), 1 deletion(-)
> >>>>
> >>>> diff --git a/drivers/of/overlay.c b/drivers/of/overlay.c
> >>>> index bd8ff4df723d..4ae276ed9a65 100644
> >>>> --- a/drivers/of/overlay.c
> >>>> +++ b/drivers/of/overlay.c
> >>>> @@ -358,7 +358,7 @@ static int add_changeset_property(struct overlay_changeset *ovcs,
> >>>>  	}
> >>>>  
> >>>>  	if (!of_node_check_flag(target->np, OF_OVERLAY))
> >>>> -		pr_err("WARNING: memory leak will occur if overlay removed, property: %pOF/%s\n",
> >>>> +		pr_warn("WARNING: memory leak will occur if overlay removed, property: %pOF/%s\n",
> >>>>  		       target->np, new_prop->name);
> >>>>  
> >>>>  	if (ret) {
> >>>
> >>> NACK
> >>>
> >>> This is showing a real problem with the overlay.
> >>
> >> What's the real problem ?
> >>
> >> Daniel
> > 
> > A memory leak when the overlay is removed.
> > 
> > I'll send a patch to update the overlay file in Documumentation/devicetree/ to provide
> > more information about this.  If you don't see a patch by tomorrow, feel free to
> > ping me.
> > 
> > -Frank
> 
> The good news is that your question prodded me to start improving the in kernel documentation
> of overlays.  The promised patch is a rough start at:
> 
>    https://lore.kernel.org/all/20220912062615.3727029-1-frowand.list@gmail.com/
> 
> The bad news is that what I wrote doesn't explain the memory leak in any more detail.
> If an overlay adds a property to a node in the base device tree then the memory
> allocated to do the add will not be freed when the overlay is removed.  Since it is
> possible to add and remove overlays multiple times, the ensuing size of the memory
> leak is potentially unbounded.

Isn't this only a problem if you remove the overlay?

if the dt fixup driver does have the ability to remove the overlay doesn't it
have responsibility to free the memory? Or is it impossible to free the memory?

Daniel

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH] driver: of: overlay: demote message to warning
  2022-09-12 17:05         ` Daniel Walker
@ 2022-09-12 20:32           ` Frank Rowand
  2022-09-13  0:51             ` Daniel Walker
  0 siblings, 1 reply; 13+ messages in thread
From: Frank Rowand @ 2022-09-12 20:32 UTC (permalink / raw)
  To: Daniel Walker
  Cc: Pantelis Antoniou, Rob Herring, xe-linux-external, devicetree,
	linux-kernel

On 9/12/22 12:05, Daniel Walker wrote:
> On Mon, Sep 12, 2022 at 01:45:40AM -0500, Frank Rowand wrote:
>> On 9/8/22 12:55, Frank Rowand wrote:
>>> On 9/7/22 19:35, Daniel Walker wrote:
>>>> On Wed, Sep 07, 2022 at 06:54:02PM -0500, Frank Rowand wrote:
>>>>> On 9/7/22 18:07, Daniel Walker wrote:
>>>>>> This warning message shows by default on the vast majority of overlays
>>>>>> applied. Despite the text identifying this as a warning it is marked
>>>>>> with the loglevel for error. At Cisco we filter the loglevels to only
>>>>>> show error messages. We end up seeing this message but it's not really
>>>>>> an error.
>>>>>>
>>>>>> For this reason it makes sense to demote the message to the warning
>>>>>> loglevel.
>>>>>>
>>>>>> Cc: xe-linux-external@cisco.com
>>>>>> Signed-off-by: Daniel Walker <danielwa@cisco.com>
>>>>>> ---
>>>>>>  drivers/of/overlay.c | 2 +-
>>>>>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>>>>>
>>>>>> diff --git a/drivers/of/overlay.c b/drivers/of/overlay.c
>>>>>> index bd8ff4df723d..4ae276ed9a65 100644
>>>>>> --- a/drivers/of/overlay.c
>>>>>> +++ b/drivers/of/overlay.c
>>>>>> @@ -358,7 +358,7 @@ static int add_changeset_property(struct overlay_changeset *ovcs,
>>>>>>  	}
>>>>>>  
>>>>>>  	if (!of_node_check_flag(target->np, OF_OVERLAY))
>>>>>> -		pr_err("WARNING: memory leak will occur if overlay removed, property: %pOF/%s\n",
>>>>>> +		pr_warn("WARNING: memory leak will occur if overlay removed, property: %pOF/%s\n",
>>>>>>  		       target->np, new_prop->name);
>>>>>>  
>>>>>>  	if (ret) {
>>>>>
>>>>> NACK
>>>>>
>>>>> This is showing a real problem with the overlay.
>>>>
>>>> What's the real problem ?
>>>>
>>>> Daniel
>>>
>>> A memory leak when the overlay is removed.
>>>
>>> I'll send a patch to update the overlay file in Documumentation/devicetree/ to provide
>>> more information about this.  If you don't see a patch by tomorrow, feel free to
>>> ping me.
>>>
>>> -Frank
>>
>> The good news is that your question prodded me to start improving the in kernel documentation
>> of overlays.  The promised patch is a rough start at:
>>
>>    https://lore.kernel.org/all/20220912062615.3727029-1-frowand.list@gmail.com/
>>
>> The bad news is that what I wrote doesn't explain the memory leak in any more detail.
>> If an overlay adds a property to a node in the base device tree then the memory
>> allocated to do the add will not be freed when the overlay is removed.  Since it is
>> possible to add and remove overlays multiple times, the ensuing size of the memory
>> leak is potentially unbounded.
> 
> Isn't this only a problem if you remove the overlay?

Yes, but we don't know if the overlay will be removed.  And I will not accept a
change that suppresses the message if there is no expectation to remove the
overlay.

> 
> if the dt fixup driver does have the ability to remove the overlay doesn't it
> have responsibility to free the memory? Or is it impossible to free the memory?

It is difficult due to architectural issues.  Reference counting occurs at the node
level, and not at the property level.  So memory related to properties is freed
when the corresponding overlay node reference count leads to the node being freed.

> 
> Daniel


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH] driver: of: overlay: demote message to warning
  2022-09-12 20:32           ` Frank Rowand
@ 2022-09-13  0:51             ` Daniel Walker
  2022-09-16 22:47               ` Frank Rowand
  0 siblings, 1 reply; 13+ messages in thread
From: Daniel Walker @ 2022-09-13  0:51 UTC (permalink / raw)
  To: Frank Rowand
  Cc: Pantelis Antoniou, Rob Herring, xe-linux-external, devicetree,
	linux-kernel

On Mon, Sep 12, 2022 at 03:32:31PM -0500, Frank Rowand wrote:
> On 9/12/22 12:05, Daniel Walker wrote:
> > On Mon, Sep 12, 2022 at 01:45:40AM -0500, Frank Rowand wrote:
> >> On 9/8/22 12:55, Frank Rowand wrote:
> >>> On 9/7/22 19:35, Daniel Walker wrote:
> >>>> On Wed, Sep 07, 2022 at 06:54:02PM -0500, Frank Rowand wrote:
> >>>>> On 9/7/22 18:07, Daniel Walker wrote:
> >>>>>> This warning message shows by default on the vast majority of overlays
> >>>>>> applied. Despite the text identifying this as a warning it is marked
> >>>>>> with the loglevel for error. At Cisco we filter the loglevels to only
> >>>>>> show error messages. We end up seeing this message but it's not really
> >>>>>> an error.
> >>>>>>
> >>>>>> For this reason it makes sense to demote the message to the warning
> >>>>>> loglevel.
> >>>>>>
> >>>>>> Cc: xe-linux-external@cisco.com
> >>>>>> Signed-off-by: Daniel Walker <danielwa@cisco.com>
> >>>>>> ---
> >>>>>>  drivers/of/overlay.c | 2 +-
> >>>>>>  1 file changed, 1 insertion(+), 1 deletion(-)
> >>>>>>
> >>>>>> diff --git a/drivers/of/overlay.c b/drivers/of/overlay.c
> >>>>>> index bd8ff4df723d..4ae276ed9a65 100644
> >>>>>> --- a/drivers/of/overlay.c
> >>>>>> +++ b/drivers/of/overlay.c
> >>>>>> @@ -358,7 +358,7 @@ static int add_changeset_property(struct overlay_changeset *ovcs,
> >>>>>>  	}
> >>>>>>  
> >>>>>>  	if (!of_node_check_flag(target->np, OF_OVERLAY))
> >>>>>> -		pr_err("WARNING: memory leak will occur if overlay removed, property: %pOF/%s\n",
> >>>>>> +		pr_warn("WARNING: memory leak will occur if overlay removed, property: %pOF/%s\n",
> >>>>>>  		       target->np, new_prop->name);
> >>>>>>  
> >>>>>>  	if (ret) {
> >>>>>
> >>>>> NACK
> >>>>>
> >>>>> This is showing a real problem with the overlay.
> >>>>
> >>>> What's the real problem ?
> >>>>
> >>>> Daniel
> >>>
> >>> A memory leak when the overlay is removed.
> >>>
> >>> I'll send a patch to update the overlay file in Documumentation/devicetree/ to provide
> >>> more information about this.  If you don't see a patch by tomorrow, feel free to
> >>> ping me.
> >>>
> >>> -Frank
> >>
> >> The good news is that your question prodded me to start improving the in kernel documentation
> >> of overlays.  The promised patch is a rough start at:
> >>
> >>    https://lore.kernel.org/all/20220912062615.3727029-1-frowand.list@gmail.com/
> >>
> >> The bad news is that what I wrote doesn't explain the memory leak in any more detail.
> >> If an overlay adds a property to a node in the base device tree then the memory
> >> allocated to do the add will not be freed when the overlay is removed.  Since it is
> >> possible to add and remove overlays multiple times, the ensuing size of the memory
> >> leak is potentially unbounded.
> > 
> > Isn't this only a problem if you remove the overlay?
> 
> Yes, but we don't know if the overlay will be removed.  And I will not accept a
> change that suppresses the message if there is no expectation to remove the
> overlay.
 
I haven't researched the whole overlay system but there was one removal function
that I noted, I think in the link you provided above, called
of_overlay_remove(). It appears to call free_overlay_changeset() which calls kfree().

so your API seems to deal with freeing the memory. I would think the expectation is that
people using the API would free the overlay thru your API.

The only in tree usage of your API (besides the unit test) was drm/rcar-du which
had no ability to remove the overlay that I can see. That component of the driver was
removed several months ago.

> > 
> > if the dt fixup driver does have the ability to remove the overlay doesn't it
> > have responsibility to free the memory? Or is it impossible to free the memory?
> 
> It is difficult due to architectural issues.  Reference counting occurs at the node
> level, and not at the property level.  So memory related to properties is freed
> when the corresponding overlay node reference count leads to the node being freed.

How does of_overlay_remove() work then? It seems like it might not be possible
to do overlay removal, but your code has removal functions. I also see this one
of_overlay_remove_all() ..

It seems like your API supports removal. Is there an issue where your API is
maybe not complete or maybe doesn't currentily work ?

Maybe you could add a flag or other indicator which would indicate the overlay will never be
removed. Then your code could rely on this property to inform on if the author
has consider the removal issues related to overlays.

Daniel

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH] driver: of: overlay: demote message to warning
  2022-09-13  0:51             ` Daniel Walker
@ 2022-09-16 22:47               ` Frank Rowand
  2022-09-16 22:56                 ` Daniel Walker
  0 siblings, 1 reply; 13+ messages in thread
From: Frank Rowand @ 2022-09-16 22:47 UTC (permalink / raw)
  To: Daniel Walker
  Cc: Pantelis Antoniou, Rob Herring, xe-linux-external, devicetree,
	linux-kernel

On 9/12/22 19:51, Daniel Walker wrote:
> On Mon, Sep 12, 2022 at 03:32:31PM -0500, Frank Rowand wrote:
>> On 9/12/22 12:05, Daniel Walker wrote:
>>> On Mon, Sep 12, 2022 at 01:45:40AM -0500, Frank Rowand wrote:
>>>> On 9/8/22 12:55, Frank Rowand wrote:
>>>>> On 9/7/22 19:35, Daniel Walker wrote:
>>>>>> On Wed, Sep 07, 2022 at 06:54:02PM -0500, Frank Rowand wrote:
>>>>>>> On 9/7/22 18:07, Daniel Walker wrote:
>>>>>>>> This warning message shows by default on the vast majority of overlays
>>>>>>>> applied. Despite the text identifying this as a warning it is marked
>>>>>>>> with the loglevel for error. At Cisco we filter the loglevels to only
>>>>>>>> show error messages. We end up seeing this message but it's not really
>>>>>>>> an error.
>>>>>>>>
>>>>>>>> For this reason it makes sense to demote the message to the warning
>>>>>>>> loglevel.
>>>>>>>>
>>>>>>>> Cc: xe-linux-external@cisco.com
>>>>>>>> Signed-off-by: Daniel Walker <danielwa@cisco.com>
>>>>>>>> ---
>>>>>>>>  drivers/of/overlay.c | 2 +-
>>>>>>>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>>>>>>>
>>>>>>>> diff --git a/drivers/of/overlay.c b/drivers/of/overlay.c
>>>>>>>> index bd8ff4df723d..4ae276ed9a65 100644
>>>>>>>> --- a/drivers/of/overlay.c
>>>>>>>> +++ b/drivers/of/overlay.c
>>>>>>>> @@ -358,7 +358,7 @@ static int add_changeset_property(struct overlay_changeset *ovcs,
>>>>>>>>  	}
>>>>>>>>  
>>>>>>>>  	if (!of_node_check_flag(target->np, OF_OVERLAY))
>>>>>>>> -		pr_err("WARNING: memory leak will occur if overlay removed, property: %pOF/%s\n",
>>>>>>>> +		pr_warn("WARNING: memory leak will occur if overlay removed, property: %pOF/%s\n",
>>>>>>>>  		       target->np, new_prop->name);
>>>>>>>>  
>>>>>>>>  	if (ret) {
>>>>>>>
>>>>>>> NACK
>>>>>>>
>>>>>>> This is showing a real problem with the overlay.
>>>>>>
>>>>>> What's the real problem ?
>>>>>>
>>>>>> Daniel
>>>>>
>>>>> A memory leak when the overlay is removed.
>>>>>
>>>>> I'll send a patch to update the overlay file in Documumentation/devicetree/ to provide
>>>>> more information about this.  If you don't see a patch by tomorrow, feel free to
>>>>> ping me.
>>>>>
>>>>> -Frank
>>>>
>>>> The good news is that your question prodded me to start improving the in kernel documentation
>>>> of overlays.  The promised patch is a rough start at:
>>>>
>>>>    https://lore.kernel.org/all/20220912062615.3727029-1-frowand.list@gmail.com/
>>>>
>>>> The bad news is that what I wrote doesn't explain the memory leak in any more detail.
>>>> If an overlay adds a property to a node in the base device tree then the memory
>>>> allocated to do the add will not be freed when the overlay is removed.  Since it is
>>>> possible to add and remove overlays multiple times, the ensuing size of the memory
>>>> leak is potentially unbounded.
>>>
>>> Isn't this only a problem if you remove the overlay?
>>
>> Yes, but we don't know if the overlay will be removed.  And I will not accept a
>> change that suppresses the message if there is no expectation to remove the
>> overlay.
>  
> I haven't researched the whole overlay system but there was one removal function
> that I noted, I think in the link you provided above, called
> of_overlay_remove(). It appears to call free_overlay_changeset() which calls kfree().

free_overlay_changeset() frees data structures related to the changeset.  It does not
free memory allocated to the devicetree.  The memory allocated to the devicetree is
freed as a result of the decreasing reference counts on the devicetree data.

> 
> so your API seems to deal with freeing the memory. I would think the expectation is that
> people using the API would free the overlay thru your API.

Yes, the way to remove an overlay is of_overlay_remove() or of_overlay_remove_all().

> 
> The only in tree usage of your API (besides the unit test) was drm/rcar-du which
> had no ability to remove the overlay that I can see. That component of the driver was
> removed several months ago.

rcar was an early overlay use that was grandfathered in for a long time until
we were finally able to remove it.  It was not a model for how to use overlays.

> 
>>>
>>> if the dt fixup driver does have the ability to remove the overlay doesn't it
>>> have responsibility to free the memory? Or is it impossible to free the memory?
>>
>> It is difficult due to architectural issues.  Reference counting occurs at the node
>> level, and not at the property level.  So memory related to properties is freed
>> when the corresponding overlay node reference count leads to the node being freed.
> 
> How does of_overlay_remove() work then? It seems like it might not be possible
> to do overlay removal, but your code has removal functions. I also see this one
> of_overlay_remove_all() ..
> 
> It seems like your API supports removal. Is there an issue where your API is
> maybe not complete or maybe doesn't currentily work ?

The overlay implementation is definitely not complete.  For some details please see:

   https://elinux.org/Device_Tree_Reference#Overlays

and:

   https://elinux.org/Frank%27s_Evolving_Overlay_Thoughts

If you have not read those two links recently, _please_ read them for a better
understanding of the state of overlay implementation!


> 
> Maybe you could add a flag or other indicator which would indicate the overlay will never be
> removed. Then your code could rely on this property to inform on if the author
> has consider the removal issues related to overlays.

No.  I guess I wasn't clear enough above, where I said:

   "And I will not accept a
    change that suppresses the message if there is no expectation to remove the
    overlay."

There are multiple reasons for this, but the most fundamental is that if a
new overlay is not removable, then any overlay already applied can not be
removed (because overlays must be removed in the reverse order that they
are applied).  It would be incredibly bad architecture to allow an overlay
to block another overlay from being removed.

-Frank

> 
> Daniel


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH] driver: of: overlay: demote message to warning
  2022-09-16 22:47               ` Frank Rowand
@ 2022-09-16 22:56                 ` Daniel Walker
  2022-09-17  2:47                   ` Frank Rowand
  0 siblings, 1 reply; 13+ messages in thread
From: Daniel Walker @ 2022-09-16 22:56 UTC (permalink / raw)
  To: Frank Rowand
  Cc: Pantelis Antoniou, Rob Herring, xe-linux-external, devicetree,
	linux-kernel

On Fri, Sep 16, 2022 at 05:47:54PM -0500, Frank Rowand wrote:
> > 
> > Maybe you could add a flag or other indicator which would indicate the overlay will never be
> > removed. Then your code could rely on this property to inform on if the author
> > has consider the removal issues related to overlays.
> 
> No.  I guess I wasn't clear enough above, where I said:
> 
>    "And I will not accept a
>     change that suppresses the message if there is no expectation to remove the
>     overlay."
> 
> There are multiple reasons for this, but the most fundamental is that if a
> new overlay is not removable, then any overlay already applied can not be
> removed (because overlays must be removed in the reverse order that they
> are applied).  It would be incredibly bad architecture to allow an overlay
> to block another overlay from being removed.

So how about an option to turn off removable overlays entirely? As far as I can
tell it's not used currently by the tiny number of implementation I've seen.

Cisco doesn't need it, and we could have a smaller kernel without it.

The issue is that the error log on blast is log level abuse in my opinion. If
there's no way to fix it, it should not be an error.

Daniel

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH] driver: of: overlay: demote message to warning
  2022-09-16 22:56                 ` Daniel Walker
@ 2022-09-17  2:47                   ` Frank Rowand
  2022-09-17  3:26                     ` Daniel Walker
  0 siblings, 1 reply; 13+ messages in thread
From: Frank Rowand @ 2022-09-17  2:47 UTC (permalink / raw)
  To: Daniel Walker
  Cc: Pantelis Antoniou, Rob Herring, xe-linux-external, devicetree,
	linux-kernel

On 9/16/22 17:56, Daniel Walker wrote:
> On Fri, Sep 16, 2022 at 05:47:54PM -0500, Frank Rowand wrote:
>>>
>>> Maybe you could add a flag or other indicator which would indicate the overlay will never be
>>> removed. Then your code could rely on this property to inform on if the author
>>> has consider the removal issues related to overlays.
>>
>> No.  I guess I wasn't clear enough above, where I said:
>>
>>    "And I will not accept a
>>     change that suppresses the message if there is no expectation to remove the
>>     overlay."
>>
>> There are multiple reasons for this, but the most fundamental is that if a
>> new overlay is not removable, then any overlay already applied can not be
>> removed (because overlays must be removed in the reverse order that they
>> are applied).  It would be incredibly bad architecture to allow an overlay
>> to block another overlay from being removed.
> 
> So how about an option to turn off removable overlays entirely? As far as I can
> tell it's not used currently by the tiny number of implementation I've seen.
> 
> Cisco doesn't need it, and we could have a smaller kernel without it.
> 
> The issue is that the error log on blast is log level abuse in my opinion. If
> there's no way to fix it, it should not be an error.

The way to fix it is to not have a construct in the overlay that triggers the
message.  In other words, do not add a property to a pre-existing node.  (At
least I think that is what is the underlying cause, if I recall correctly.)

-Frank

> 
> Daniel


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH] driver: of: overlay: demote message to warning
  2022-09-17  2:47                   ` Frank Rowand
@ 2022-09-17  3:26                     ` Daniel Walker
  2022-09-26 22:29                       ` Rob Herring
  0 siblings, 1 reply; 13+ messages in thread
From: Daniel Walker @ 2022-09-17  3:26 UTC (permalink / raw)
  To: Frank Rowand
  Cc: Pantelis Antoniou, Rob Herring, xe-linux-external, devicetree,
	linux-kernel

On Fri, Sep 16, 2022 at 09:47:19PM -0500, Frank Rowand wrote:
> On 9/16/22 17:56, Daniel Walker wrote:
> > On Fri, Sep 16, 2022 at 05:47:54PM -0500, Frank Rowand wrote:
> >>>
> >>> Maybe you could add a flag or other indicator which would indicate the overlay will never be
> >>> removed. Then your code could rely on this property to inform on if the author
> >>> has consider the removal issues related to overlays.
> >>
> >> No.  I guess I wasn't clear enough above, where I said:
> >>
> >>    "And I will not accept a
> >>     change that suppresses the message if there is no expectation to remove the
> >>     overlay."
> >>
> >> There are multiple reasons for this, but the most fundamental is that if a
> >> new overlay is not removable, then any overlay already applied can not be
> >> removed (because overlays must be removed in the reverse order that they
> >> are applied).  It would be incredibly bad architecture to allow an overlay
> >> to block another overlay from being removed.
> > 
> > So how about an option to turn off removable overlays entirely? As far as I can
> > tell it's not used currently by the tiny number of implementation I've seen.
> > 
> > Cisco doesn't need it, and we could have a smaller kernel without it.
> > 
> > The issue is that the error log on blast is log level abuse in my opinion. If
> > there's no way to fix it, it should not be an error.
> 
> The way to fix it is to not have a construct in the overlay that triggers the
> message.  In other words, do not add a property to a pre-existing node.  (At
> least I think that is what is the underlying cause, if I recall correctly.)
> 
> -Frank

Here's the check,

 if (!of_node_check_flag(target->np, OF_OVERLAY))

If the print shows when the modifications is made to a non-overlay, I'm not
sure how you could construct a device tree where you only modify other overlays.

It seems like this should print on the vast majority of overlays.

Daniel

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH] driver: of: overlay: demote message to warning
  2022-09-17  3:26                     ` Daniel Walker
@ 2022-09-26 22:29                       ` Rob Herring
  0 siblings, 0 replies; 13+ messages in thread
From: Rob Herring @ 2022-09-26 22:29 UTC (permalink / raw)
  To: Daniel Walker
  Cc: Frank Rowand, Pantelis Antoniou, xe-linux-external, devicetree,
	linux-kernel

On Fri, Sep 16, 2022 at 10:26 PM Daniel Walker <danielwa@cisco.com> wrote:
>
> On Fri, Sep 16, 2022 at 09:47:19PM -0500, Frank Rowand wrote:
> > On 9/16/22 17:56, Daniel Walker wrote:
> > > On Fri, Sep 16, 2022 at 05:47:54PM -0500, Frank Rowand wrote:
> > >>>
> > >>> Maybe you could add a flag or other indicator which would indicate the overlay will never be
> > >>> removed. Then your code could rely on this property to inform on if the author
> > >>> has consider the removal issues related to overlays.
> > >>
> > >> No.  I guess I wasn't clear enough above, where I said:
> > >>
> > >>    "And I will not accept a
> > >>     change that suppresses the message if there is no expectation to remove the
> > >>     overlay."
> > >>
> > >> There are multiple reasons for this, but the most fundamental is that if a
> > >> new overlay is not removable, then any overlay already applied can not be
> > >> removed (because overlays must be removed in the reverse order that they
> > >> are applied).  It would be incredibly bad architecture to allow an overlay
> > >> to block another overlay from being removed.
> > >
> > > So how about an option to turn off removable overlays entirely? As far as I can
> > > tell it's not used currently by the tiny number of implementation I've seen.
> > >
> > > Cisco doesn't need it, and we could have a smaller kernel without it.
> > >
> > > The issue is that the error log on blast is log level abuse in my opinion. If
> > > there's no way to fix it, it should not be an error.
> >
> > The way to fix it is to not have a construct in the overlay that triggers the
> > message.  In other words, do not add a property to a pre-existing node.  (At
> > least I think that is what is the underlying cause, if I recall correctly.)
> >
> > -Frank
>
> Here's the check,
>
>  if (!of_node_check_flag(target->np, OF_OVERLAY))
>
> If the print shows when the modifications is made to a non-overlay, I'm not
> sure how you could construct a device tree where you only modify other overlays.
>
> It seems like this should print on the vast majority of overlays.

There is essentially zero support in the kernel for nodes to change
once they are in use (typ. bound to a driver), and I don't see us ever
supporting that use case. Unless shown otherwise, I don't think that
is a good split between a base DT and overlay either.

What I've said multiple times for supporting runtime overlays, is that
it needs to be restricted to adding/removing whole nodes/subtrees.

Rob

^ permalink raw reply	[flat|nested] 13+ messages in thread

end of thread, other threads:[~2022-09-26 22:30 UTC | newest]

Thread overview: 13+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-09-07 23:07 [PATCH] driver: of: overlay: demote message to warning Daniel Walker
2022-09-07 23:54 ` Frank Rowand
2022-09-08  0:35   ` Daniel Walker
2022-09-08 17:55     ` Frank Rowand
2022-09-12  6:45       ` Frank Rowand
2022-09-12 17:05         ` Daniel Walker
2022-09-12 20:32           ` Frank Rowand
2022-09-13  0:51             ` Daniel Walker
2022-09-16 22:47               ` Frank Rowand
2022-09-16 22:56                 ` Daniel Walker
2022-09-17  2:47                   ` Frank Rowand
2022-09-17  3:26                     ` Daniel Walker
2022-09-26 22:29                       ` Rob Herring

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).