From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Message-ID: <53AA07A1.20001@roeck-us.net> Date: Tue, 24 Jun 2014 16:20:01 -0700 From: Guenter Roeck MIME-Version: 1.0 Subject: Re: [PATCH 1/6] of: Do not free memory at of_node_release References: <1403430039-15085-1-git-send-email-pantelis.antoniou@konsulko.com> <1403430039-15085-2-git-send-email-pantelis.antoniou@konsulko.com> <20140624141004.9E989C40B84@trevor.secretlab.ca> <0E39F7E8-1141-4D6D-A251-3F0DDEA9AA29@konsulko.com> <20140624202121.7A8E3C40D39@trevor.secretlab.ca> <44A964DB-2C39-4E5B-BD7B-61973942B25A@konsulko.com> <20140624203357.GA23620@roeck-us.net> In-Reply-To: Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit To: Pantelis Antoniou Cc: Grant Likely , Rob Herring , Stephen Warren , Matt Porter , Koen Kooi , Greg Kroah-Hartman , Alison Chaiken , Dinh Nguyen , Jan Lubbe , Alexander Sverdlin , Michael Stickel , Dirk Behme , Alan Tull , Sascha Hauer , Michael Bohan , Ionut Nicu , Michal Simek , Matt Ranostay , Joel Becker , devicetree@vger.kernel.org, Wolfram Sang , linux-i2c@vger.kernel.org, Mark Brown , linux-spi@vger.kernel.org, linux-kernel@vger.kernel.org, Pete Popov , Dan Malek , Georgi Vlaev List-ID: On 06/24/2014 02:02 PM, Pantelis Antoniou wrote: > Hi Guenter, > > On Jun 24, 2014, at 11:33 PM, Guenter Roeck wrote: > >> On Tue, Jun 24, 2014 at 11:23:35PM +0300, Pantelis Antoniou wrote: >>> Hi Grant, >>> >>> On Jun 24, 2014, at 11:21 PM, Grant Likely wrote: >>> >>>> On Tue, 24 Jun 2014 17:23:35 +0300, Pantelis Antoniou wrote: >>>>> Hi Grant, >>>>> >>>>> On Jun 24, 2014, at 5:10 PM, Grant Likely wrote: >>>>> >>>>>> On Sun, 22 Jun 2014 12:40:34 +0300, Pantelis Antoniou wrote: >>>>>>> The life-cycle of nodes and properties does not allow us to release >>>>>>> the memory taken by a device_node. Pointer to properties and nodes >>>>>>> might still be in use in drivers, so any memory free'ing is dangerous. >>>>>>> >>>>>>> Simply move all the properties to the deadprops list, and the node >>>>>>> itself to of_alldeadnodes until the life-cycles issues are resolved. >>>>>> >>>>>> Ummm. this looks wrong. The release function is supposed to be the place >>>>>> to do the freeing, and with our discussion the other day about moving to >>>>>> rcu, but keeping of_node_get/put() for anything that needs to hold a >>>>>> long term reference, that means the lifecycle issues are pretty much >>>>>> resolved. I don't think this patch is necessary. >>>>>> >>>>> >>>>> Well, I thought about it too. This is the culmination of that process :) >>>>> >>>>> The problem is not the node life-cycle, it's the properties. We can't >>>>> tell for sure who and where has a pointer to the properties of the node, >>>>> and when we free the node, the current code iterates over the properties >>>>> list and frees them. This works only because in my tests, of_node_release >>>>> only gets called when using overlays created nodes. >>>> >>>> If a caller want's to use the value of a property, then it needs to >>>> of_node_get() the node. Property lifecycle is strictly attached to the >>>> lifecycle of it's node. >>>> >>> >>> Taking bets if that's always the case? >>> >> For my part I prefer to take the bet (and handle the consequences) over >> a potential memory leak. >> > > It's your call really. But let me just demonstrate why this is harder than it appears, > with a small example. > Nothing is easy in life. > The of_node_get() method is not invoked for instance when reading string properties. > > The of_property_read_string() method will return a pointer to the string property value. > That means that potentially all drivers that call of_property_read_string() will have to > be converted to taking a reference of the node using of_node_get(), or the of_property_read_string() > must do so in the background. > Or the caller must duplicate the string if it is used later on. Yes, I understand that there is potential for problems. I am still more concerned about memory leaks, though. Given that, I am actually more concerned about code which does not call of_node_put() even if it should, and I suspect we may have to spend some time going through the code to identify and fix as many of those as we can. Again, keep in mind that the systems I am dealing with tend to run for months if not years without reboot. Guenter